consideration of addressable links #16

Open
tomkralidis opened this issue Feb 15, 2020 · 12 comments

@tomkralidis

tomkralidis commented Feb 15, 2020

cc @efucile @alexandreleroux

In the context of WIS discovery metadata, distribution links provide a valuable means of guiding users to interact with data.

Consider advertising a WMS map link within a dataset's discovery metadata:

{
    "rel" : "items",
    "type" : "image/png",
    "title" : "OGC:WMS map of this dataset",
    "href" : "https://example.org/wms?service=WMS&version=1.3.0&request=GetMap&layer={layer}&bbox={miny},{minx},{maxy},{maxx}&format={format}&crs={crs}&width={width}&height={height}",
    "templated" : true
}
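
For illustration, a client could turn such a templated link into a concrete request by substituting values for the `{...}` variables. This is only a sketch: it assumes the simple `{name}` placeholders shown above (not full RFC 6570 URI Templates), the href is adapted from the example above, and the `expand` helper and all parameter values are made up.

```python
# Hypothetical client-side expansion of a templated link.
# Assumes only simple {name} placeholders; the values below are made up.
link = {
    "rel": "items",
    "type": "image/png",
    "href": (
        "https://example.org/wms?service=WMS&version=1.3.0&request=GetMap"
        "&layer={layer}&bbox={miny},{minx},{maxy},{maxx}"
        "&format={format}&crs={crs}&width={width}&height={height}"
    ),
    "templated": True,
}


def expand(link: dict, **values) -> str:
    """Return the href with every {name} placeholder replaced."""
    if not link.get("templated"):
        return link["href"]
    # str.format raises KeyError if a placeholder has no value.
    return link["href"].format(**values)


url = expand(
    link,
    layer="metar", miny=40, minx=-80, maxy=50, maxx=-60,
    format="image/png", crs="EPSG:4326", width=800, height=400,
)
print(url)
```

A production client would also percent-encode the substituted values; that step is left out here to keep the sketch short.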

Can we accomplish something similar with AMQP or MQTT? That is, provide a URI which allows a user to click/subscribe (albeit in a known way). There appear to be URI conventions for both RabbitMQ and MQTT, but they only go as far as the connection. There may be value in specifying this further, for example:

{
    "rel" : "items",
    "type" : "application/json;subtype=x-wmo-wis",
    "title" : "AMQPS feed of this dataset",
    "href" : "amqps://anonymous@example.org/my-vhost/my-exchange/my-topic/my-subtopic"
}

Of course, non-trivial cases would go through other arrangements, but the above may be helpful as a low-barrier entry point to advertising PubSub in WIS metadata.
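
As a rough sketch of what a client would have to do with such an href, the snippet below splits it into connection and subscription parts. The path layout `/<vhost>/<exchange>/<topic>/<subtopic>/...` is an assumption read off the example above, not an existing convention, and `parse_feed_uri` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def parse_feed_uri(href: str) -> dict:
    """Split a feed URI into connection and subscription parts.

    Assumed (hypothetical) path layout:
    /<vhost>/<exchange>/<topic>/<subtopic>/...
    """
    parts = urlsplit(href)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 3:
        raise ValueError("expected /vhost/exchange/topic[/subtopic...]")
    vhost, exchange, *topic = segments
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,  # None when the URI does not give one
        "vhost": vhost,
        "exchange": exchange,
        "topic": topic,
    }


feed = parse_feed_uri(
    "amqps://anonymous@example.org/my-vhost/my-exchange/my-topic/my-subtopic"
)
print(feed)
```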

@petersilva
Contributor

petersilva commented Feb 15, 2020

as discussed, some relevant background:

Summary of the above, drawn from only two brokers, though others follow the same "trend" (really a lack of one):

Each broker implements different and conflicting URI/URL conventions.

  • There is no way to combine them and stay consistent among them. (There is no interoperable URI for AMQP.)
  • Some of the URI specifications include query language ( ?a=b&.... ) which is not very REST... would prefer static URI scheme. They do that because AMQP has a heck of a lot of options, and I guess it seemed like an easy way for them to make all the settings. We have the opportunity to establish conventions to avoid having to encode all that in the URL.
  • Others are rather incomplete, for example for rabbitmq, the URL is only to connect to the broker (ie. one logs in, and that's it... the rest is done by subsequent requests.) You cannot give enough information in the URL to specify a complete feed.

So... if we want to define a feed URI we kind of have full freedom, because there is nothing to adopt:

  • yes of course the basics [amqps|mqtts]://user:pw@host:port/ all of that is clear and easy.

the rest... well...

  • I am constantly annoyed by vhost. A virtual host... in apache, or in DNS or on other non messaging domains, is just an alternate host name... ie. it is an aspect of provisioning that outside users never see. If you want to connect to a vhost then the host parameter would just be replaced by vhost... this thing that is vhost in AMQP ... it makes no sense to me. Different brokers have different default vhosts. on rabbitmq it is '/', on qpid it is 'default'. The rabbitmq default of '/' is especially troublesome because of the conflict with path separators. In ten years of admittedly narrowly scoped work, we have never understood any need for vhosts. I would like to ignore vhosts completely, and just not specify them, but I don't know if that will work. I have ignored it working with rabbitmq and the wmo_mesh example which connects to multiple MQTT brokers without issue...

  • the work on the canadian stack proposed in this project maps concepts so that mqtt and amqp brokers can co-exist and messages can be passed between them (using smart clients that speak both protocols) with the message body being unchanged. the main mapping is to denote the AMQP exchange name as the root of the MQTT topic tree. ( https://github.com/MetPX/sarracenia/blob/master/doc/sr_postv3.7.rst#mapping-to-mqtt )

so ideally, the rest of the url would be something like:

exchange/topic/topic/filename or in MQTT just the topic hierarchy.

My first guess would be to use MQTT syntax as-is for the topic hierarchy. An important aspect of any pub-sub mechanism is wildcarding. The current proposals for topic hierarchies include dates, so wildcards are needed frequently and must be allowed for. MQTT uses '/' as the topic separator (AMQP uses '.'), '+' as the single-level wildcard ('*' in AMQP), and '#' as the match-the-rest-of-the-tree wildcard (the same as in AMQP).
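
If MQTT syntax were adopted as the common form, an AMQP client would need to translate it before binding. A minimal sketch of that translation, assuming no topic level contains a literal '.' (the `mqtt_to_amqp` name and the sample filter are illustrative only):

```python
def mqtt_to_amqp(topic_filter: str) -> str:
    """Translate an MQTT topic filter into an AMQP topic binding key.

    '/' becomes '.', '+' becomes '*', and '#' is kept unchanged.
    Assumes no topic level contains a literal '.'.
    """
    levels = topic_filter.split("/")
    if any("." in level for level in levels):
        raise ValueError("a '.' inside a topic level cannot be mapped")
    return ".".join("*" if level == "+" else level for level in levels)


print(mqtt_to_amqp("v03/post/+/DWD/#"))
```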

That might be enough for simple cases... but I am not sure. I need to play around a bit to see what the above means in practice. Things I am wondering about:

  • Are + and # a problem in URLs? Do we have to URL-encode them, or can they appear as-is?

  • v03.post prefix... where does it go... just in topics? It is considered best practice to include a version tag in APIs, and thus in URLs in the REST case.

  • baseUrl... where does it go?

  • have not dealt with or either.
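
On the first question, a quick check with Python's standard `urllib.parse` suggests that '#' cannot stay as-is, because generic URI syntax treats it as the start of the fragment. '+' is permitted in a path, but form-style decoders sometimes read it as a space, so encoding it as well is the cautious choice. The host and topic below are placeholders.

```python
from urllib.parse import quote, unquote, urlsplit

topic = "v03/post/+/DWD/#"

# Left as-is, '#' starts the URI fragment, so the wildcard drops out of the path.
raw = urlsplit("mqtts://example.org/" + topic)
print(raw.path, repr(raw.fragment))

# Percent-encoding everything except '/' keeps both wildcards inside the path.
encoded = quote(topic, safe="/")
print(encoded)

# A client recovers the original filter by decoding the path.
parsed = urlsplit("mqtts://example.org/" + encoded)
recovered = unquote(parsed.path.lstrip("/"))
print(recovered)
```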

that's all for now...

@petersilva
Contributor

@josusky of interest.

@petersilva
Contributor

thinking about it... can ignore baseURL, because it will come from the messages, after connection.

@josusky
Contributor

josusky commented Feb 17, 2020

Hi Tom, Hi Peter,
Before I dive into technical details I would like to understand the big picture. Are you suggesting special feeds (brokers + topics ...) that publish some kind of "metadata" (i.e. notifications that new data sources (producers/publishers) have appeared, and where one can subscribe to start receiving the new data)?
And perhaps you expect to periodically re-publish "metadata" of all existing (known) data sources?
And is your heresy going so far as to suggest that these "metadata" will not be in the form of XML that conforms to a set of sophisticated ISO standards, that even the experts do not know how to correctly validate, but will instead take the form of simplistic JSON?
How dare you? If we were in a British film you would certainly be an Australian :-)
It is Monday morning, so I might be reading it all wrong - but I like it. Just please confirm (you know: "me native speaker be not").

Now when it comes to URLs, if we have full control over its elements then no encoding is needed. For example I do not think that we need wild cards (+, #, ...) when publishing information about a new data source - I expect the source to publish each type of data under a very concrete topic. The wild cards are usually used by the consumers to simplify subscription to multiple types of data. The only problematic part is the date and that could be addressed by templating that Tom suggested.
Anyway, if we do not have full control over some elements, then we need to resort to encoding. In that case we need to state which part is encoded. For example, the path element that contains the topic is URL-encoded (let's say, because it may contain slashes). The "consumer" then needs to first split the URL into its components and decode the part(s) that need decoding. Now that I have re-read what I wrote and remembered that the whole thing is sent as JSON, I think we could forget about URL encoding completely if we structure it like this:

{
    "rel" : "items",
    "type" : "application/json;subtype=x-wmo-wis",
    "title" : "AMQP feed of a super cool dataset",
    "broker": "amqps://anonymous:nopwd@example.org:12345",
    "vhost": "if-need-at-all",
    "topic": "topic.in.its.true.form/what/ever/it/*/is"
}

I am not sure what the "rel" is, and I can imagine splitting the "broker" even more, but my main point here is that instead of constructing a complex (and possibly invalid) URL that the consumer needs to parse before actual use, we can provide each element of information separately.

@tomkralidis
Author

tomkralidis commented Feb 17, 2020

@josusky thanks for the feedback. The big picture would be to have PubSub links available from WIS discovery metadata in as actionable a form as possible (publish/find/bind). Example:

  • NMHS x provides WIS discovery metadata for their METARs. The WIS discovery metadata record would have links (via /gmd:distributionInfo//gmd:transferOptions//gmd:onLine/gmd:CI_OnlineResource) for:
    • a web accessible folder of the METARs
    • an OGC:WMS layer
    • a PubSub mechanism
  • client y searches a GISC and finds the WIS discovery metadata per above
  • client y subscribes to the PubSub mechanism

On WIS discovery metadata: the current offering is WCMP (XML), however things are slowly evolving towards JSON. In theory, given that ISO 19115 (which WCMP is a profile of) is an abstract specification, one could define a JSON encoding for it (as ISO 19139 does for XML). The next generation OGC Catalogue standards (OGC CSW basically becoming OGC API - Records) are making way for much simpler APIs as well as JSON as a core representation. Actually, this is happening for numerous OGC API standards (see https://ogcapi.ogc.org for more info).

Back to links, here is the current thinking around representing links in JSON in OGC API standards. For WIS purposes, we could extend as we want but putting in a proper URI would help with broad interoperability. Let's work on something in between to balance complexity and practicality.

The bigger picture (to be set up in another project/thread) is to set up a WIS 2 pilot between a few of us, using the evolving OGC API standard for discovery (OGC API - Records) for WIS metadata in a JSON encoding. Having actionable PubSub would be a huge win to demonstrate easy APIs and easy discovery metadata representations that lower the barrier to access for users. I don't see XML or WCMP going away anytime soon for advanced use, but there is a huge opportunity for "the rest of us".

@petersilva
Contributor

petersilva commented Feb 17, 2020

What @josusky is saying works too. If topic is a separate entry, then yes, it could be interpreted based on the protocol specified in the broker, or even handed off, as-is. One of the main conclusions I have drawn from this project is that being multi-protocol is good.

Using the original mapping used in this project so far, it would look like:


{
    "rel" : "items",
    "type" : "application/json;subtype=x-wmo-wis",
    "title" : "Feed of a super cool dataset",
    "broker": "amqps://anonymous:nopwd@example.org:12345",
    "vhost": "if-need-at-all",
    "exchange": "xpublic", 
    "topic": [ "v03.post.*.DWD.its.true.form.#", "v03.post.*.CMC.its.true.form.#" ]
}

As you can see in the topic header here, the dot separator would make a really ugly URI, so putting it in a separate topic header makes a great deal of sense, and we don't need any URL-encoding then.

AMQP requires an exchange to be specified to perform a binding (in our applications we invented a convention of using "xpublic". Vendor implementations vary in their default names, so no portable solution is possible).

The client connects to the broker, declares a (convention determined) named queue, and then makes a binding between the exchange and the queue, using the topics.
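
The effect of those bindings can be sketched without a broker. Each pattern in the `topic` array becomes one binding on the same queue, so a message is delivered when any pattern matches its routing key. The matcher below follows AMQP topic-exchange wildcard semantics ('*' matches exactly one word, '#' matches zero or more); the function names and sample routing keys are hypothetical.

```python
def amqp_topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP topic binding pattern matches a routing key."""
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False
        return (p[i] == "*" or p[i] == k[j]) and match(i + 1, j + 1)

    return match(0, 0)


def delivered(bindings: list, routing_key: str) -> bool:
    """One queue, several bindings: any matching pattern delivers (OR)."""
    return any(amqp_topic_matches(b, routing_key) for b in bindings)


bindings = [
    "v03.post.*.DWD.its.true.form.#",
    "v03.post.*.CMC.its.true.form.#",
]
print(delivered(bindings, "v03.post.20200217.DWD.its.true.form.a.b"))
print(delivered(bindings, "v03.post.20200217.EGRR.its.true.form.a.b"))
```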

Turning the topic into a JSON array implements OR quite elegantly. When processed and re-published by an MQTT broker, using this project's mapping (AMQP exchange -> top of topic hierarchy), it would look like so:


{
    "rel" : "items",
    "type" : "application/json;subtype=x-wmo-wis",
    "title" : "Feed of a super cool dataset",
    "broker": "mqtts://anonymous:nopwd@example.org:5678",
    "vhost": "if-need-at-all",
    "topic": [ "xpublic/v03/post/+/DWD/its/true/form/#", "xpublic/v03/post/+/CMC/its/true/form/#" ]
}

In the MQTT case, the client connects to the broker with a client_id and subscribes to the topics.

If (as is likely) someone wants to add another protocol later, and it needs some other fields, this method leaves our options a lot more open, and it eliminates the need for encoding conventions.
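
The AMQP-to-MQTT rewrite of the link shown above is mechanical, which a short sketch can make concrete. It assumes the mapping described in this thread (exchange becomes the first topic level, '.' becomes '/', '*' becomes '+', '#' is unchanged); `amqp_link_to_mqtt` and its `mqtt_broker` parameter are hypothetical.

```python
def amqp_link_to_mqtt(link: dict, mqtt_broker: str) -> dict:
    """Re-express an AMQP feed link for an MQTT broker.

    Mapping assumed from the thread: exchange -> first topic level,
    '.' -> '/', '*' -> '+', '#' unchanged.
    """
    def convert(binding_key: str) -> str:
        words = ["+" if w == "*" else w for w in binding_key.split(".")]
        return "/".join([link["exchange"], *words])

    # MQTT has no exchange, so that field is folded into the topics.
    mqtt = {k: v for k, v in link.items() if k != "exchange"}
    mqtt["broker"] = mqtt_broker
    mqtt["topic"] = [convert(t) for t in link["topic"]]
    return mqtt


amqp_link = {
    "rel": "items",
    "type": "application/json;subtype=x-wmo-wis",
    "title": "Feed of a super cool dataset",
    "broker": "amqps://anonymous:nopwd@example.org:12345",
    "vhost": "if-need-at-all",
    "exchange": "xpublic",
    "topic": ["v03.post.*.DWD.its.true.form.#", "v03.post.*.CMC.its.true.form.#"],
}
mqtt_link = amqp_link_to_mqtt(amqp_link, "mqtts://anonymous:nopwd@example.org:5678")
print(mqtt_link["topic"])
```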

@tomkralidis
Author

Here's an example from DWD (thanks @kaiwirt) in https://gisc.dwd.de/wisportal/#SearchPlace:q?pid=sd1065_wmo_test

        <gmd:onLine>
          <gmd:CI_OnlineResource>
            <gmd:linkage>
              <gmd:URL>amqps://oflkd013.dwd.de:5671</gmd:URL>
            </gmd:linkage>
            <gmd:protocol>
              <gco:CharacterString>AMQPS</gco:CharacterString>
            </gmd:protocol>
            <gmd:name>
              <gco:CharacterString>exchange: netcdf_pilot, routing_key: v03/WIS/de/offenbach_met_com_centre/observation/sea/surface/</gco:CharacterString>
            </gmd:name>
            <gmd:description>
              <gco:CharacterString>WMO Information System, pub/sub messaging for new meteorological data, download products/data via link contained in the message (baseUrl+relPath). Please ask GISC Offenbach for registration to get username/password. Topic structure based on https://github.com/wmo-im/GTStoWIS2</gco:CharacterString>
            </gmd:description>
          </gmd:CI_OnlineResource>
        </gmd:onLine>

So perhaps the equivalent in next generation WIS metadata could be:

{
  "rel": "items",
  "type": "OASIS:AMQPS",
  "title": "cool feed",
  "href": "amqps://oflkd013.dwd.de:5671",
  "wmo:exchange": "netcdf_pilot",
  "wmo:routingKey": "v03/WIS/de/offenbach_met_com_centre/observation/sea/surface/"
}
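
To show how mechanical the conversion from the ISO XML could be, here is a rough sketch using Python's standard library. It relies on the `exchange: X, routing_key: Y` text convention that DWD used in `gmd:name` above, which is not standardized; the function name is hypothetical, and `type` and `title` are left out for brevity.

```python
import json
import re
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

XML = """
<gmd:CI_OnlineResource xmlns:gmd="http://www.isotc211.org/2005/gmd"
                       xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:linkage><gmd:URL>amqps://oflkd013.dwd.de:5671</gmd:URL></gmd:linkage>
  <gmd:protocol><gco:CharacterString>AMQPS</gco:CharacterString></gmd:protocol>
  <gmd:name>
    <gco:CharacterString>exchange: netcdf_pilot, routing_key: v03/WIS/de/offenbach_met_com_centre/observation/sea/surface/</gco:CharacterString>
  </gmd:name>
</gmd:CI_OnlineResource>
"""


def online_resource_to_link(xml_text: str) -> dict:
    """Build a JSON-style link from a gmd:CI_OnlineResource element."""
    root = ET.fromstring(xml_text.strip())
    href = root.findtext("gmd:linkage/gmd:URL", namespaces=NS).strip()
    name = root.findtext("gmd:name/gco:CharacterString", namespaces=NS).strip()
    # Relies on the free-text convention used in the DWD record.
    m = re.fullmatch(r"exchange:\s*([^,\s]+),\s*routing_key:\s*(\S+)", name)
    if m is None:
        raise ValueError("name does not follow 'exchange: X, routing_key: Y'")
    return {
        "rel": "items",
        "href": href,
        "wmo:exchange": m.group(1),
        "wmo:routingKey": m.group(2),
    }


link = online_resource_to_link(XML)
print(json.dumps(link, indent=2))
```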

Thoughts?

@petersilva
Contributor

For AMQP there is also the concept of a vhost... some brokers (RabbitMQ) include it after the port in the href (see the link higher in the thread).

@tomkralidis
Author

Would vhost be better off as (1) an optional property, or (2) left to the provider to add to the (required) href property if needed?
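
For comparison, option 2 could follow RabbitMQ's URI convention, which puts the vhost in the path after the port and percent-encodes it, so the default vhost `/` becomes `%2F`. The helpers below are a hypothetical sketch of that round trip; treating an empty path as "no vhost given" is a simplification.

```python
from urllib.parse import quote, unquote, urlsplit


def href_with_vhost(broker: str, vhost: str) -> str:
    """Option 2: append the vhost as a single percent-encoded path segment."""
    return broker.rstrip("/") + "/" + quote(vhost, safe="")


def vhost_from_href(href: str):
    """Return the decoded vhost, or None when the href carries no path."""
    path = urlsplit(href).path
    return unquote(path[1:]) if len(path) > 1 else None


# Option 1: vhost as its own optional property, no encoding needed.
option_1 = {"href": "amqps://oflkd013.dwd.de:5671", "vhost": "/"}

# Option 2: vhost folded into the href.
option_2 = {"href": href_with_vhost("amqps://oflkd013.dwd.de:5671", "/")}

print(option_2["href"])
print(vhost_from_href(option_2["href"]))
```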

@josusky
Contributor

josusky commented Sep 22, 2021

I am not sure about the vhost, but the "routingKey" is more usually referred to as the topic (see GTStoWIS2).

@josusky
Contributor

josusky commented Sep 22, 2021

Thus perhaps it could be, (in accordance with @petersilva 's example):

{
  "rel": "items",
  "type": "OASIS:AMQPS",
  "title": "cool feed",
  "href": "amqps://oflkd013.dwd.de:5671",
  "vhost": "if-need-at-all",
  "exchange": "netcdf_pilot",
  "wmo:topic": "v03.WIS.de.offenbach_met_com_centre.observation.sea.surface",
  "messageFormat" : "application/json; subtype=x-wmo-wis"
}

@tomkralidis Note that I have removed the "wmo" prefix from the exchange, as that is not a WMO-specific but rather an AMQP-specific thing, same as "vhost". The "topic" is a WMO-specific thing, so it may deserve a namespace prefix. And I have changed "/" to ".", as that is the delimiter for AMQP. But we could decide to always use "/" as a WMO standard that is translated for the underlying pub-sub protocol if needed.
In Peter's example, the "type" indicated the actual type of messages that the service sends. I think that needs to be preserved somewhere too, as AMQP brokers are used to provide other types (formats) of messages too. Therefore, I have added it as "messageFormat".

(added a v to the version spec at the start of the topic)

@tomkralidis
Author

Update: in OGC API specifications, a link type is the actual MIME type. So using @josusky's latest example:

{
  "rel": "items",
  "type": "application/json",
  "title": "cool feed",
  "href": "amqps://oflkd013.dwd.de:5671",
  "vhost": "if-need-at-all",
  "exchange": "netcdf_pilot",
  "wmo:topic": "v03.WIS.de.offenbach_met_com_centre.observation.sea.surface"
}
