annotate record associations with service details to facilitate service binding #95

Closed
pvgenuchten opened this issue Feb 20, 2021 · 33 comments

Comments

@pvgenuchten
Contributor

pvgenuchten commented Feb 20, 2021

Tom mentioned this aspect in the final demo of the February 2021 sprint: the record model should contain enough service details that a client can bind to the relevant data in a service. This should be the case for OGC API, but also for the OWS stack. (I prefer that this goes in core, but it could also be an extra conformance class.)

Key aspects to binding to a service are:

  • what type of service is it
  • which are the relevant layernames/featuretypes

Current associations follow the link model, properties are:

  • href (url) -> link to the service
  • rel -> list of available rel values seems already extended, maybe we can extend it with "service", "alternate", "attachment", to indicate if the target is a service, website or file-download
  • type -> seems best option to capture service type
  • hreflang -> language of the target
  • title -> populate with layername/featuretype, in the Netherlands we have in iso19139 a similar convention to populate the name element with the layer/featuretype, could be valuable to adopt
  • templated (boolean)

In the meeting it was suggested that the url could also be templated so a client could create a valid request by replacing placeholders in the template with valid parameters, e.g.:

<link href="http://example.com?request=getmap&layer=foo&style=foo&bbox={{bbox}}&width={{width}}&height={{height}}&format=png&crs={{crs}}&version=1.3.0&service=wms" />
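
A minimal sketch of how a client might fill such a template, assuming the double-brace `{{name}}` placeholder syntax used in the example above. The function name `fill_template` and the sample values are illustrative assumptions, not part of any specification.

```python
import re


def fill_template(template: str, values: dict) -> str:
    """Replace each {{name}} placeholder with values[name]; fail on missing names."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name!r}")
        return str(values[name])

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


# Template copied from the example link above.
template = (
    "http://example.com?request=getmap&layer=foo&style=foo"
    "&bbox={{bbox}}&width={{width}}&height={{height}}"
    "&format=png&crs={{crs}}&version=1.3.0&service=wms"
)

# Hypothetical values chosen by the client.
url = fill_template(template, {
    "bbox": "-180,-90,180,90",
    "width": 800,
    "height": 400,
    "crs": "CRS:84",
})
```

Raising on a missing placeholder, rather than leaving it in the URL, keeps the client from sending a request that the service would reject anyway.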

We need a codelist for the service types. URIs would make sense, but placing a URL in an attribute is not practical. Various initiatives exist for such a list, see INSPIRE and cat-interop. Notice that both use the identifiers from the OGC definition service.

A complicating aspect is that the OWS stack throws an error when the endpoint is called without the required parameters. Within the INSPIRE best practices, there is a convention to always link to {service}?service=xxx&request=getcapabilities, so the link will always return a valid response. The record model could recommend this for linking to OWS services or, more generally, state that a link URL should be in such a form that it doesn't return error 500.
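
As an illustration of that convention, here is a small sketch that turns a bare OWS endpoint into a GetCapabilities link. The helper name `capabilities_url` is an assumption made for this example; it uses only the Python standard library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def capabilities_url(endpoint: str, service: str) -> str:
    """Return the endpoint with service=<service>&request=GetCapabilities set.

    Other query parameters are kept. Existing service/request parameters are
    replaced; their names are compared case-insensitively.
    """
    parts = urlsplit(endpoint)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ("service", "request")
    ]
    kept += [("service", service), ("request", "GetCapabilities")]
    return urlunsplit(parts._replace(query=urlencode(kept)))


link = capabilities_url("http://example.com/ows", "WMS")
```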

For OGC API, do we need a convention to link to the landing page of the service and select the relevant collection with the title attribute, like we do in OWS, or can we link directly to /landing/collections/foo (or do we allow both)? In either case the client would need to deduce from the URL structure where the OpenAPI document lives, in order to discover the available encodings, queryables, etc.
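
One alternative to deducing the location from the URL structure: OGC API landing pages advertise their API definition through a link with rel `service-desc`, so a client that has the landing page can follow that link. A rough sketch, assuming the landing page has already been fetched and parsed into a dict; the sample content and the name `find_api_definition` are hypothetical.

```python
def find_api_definition(landing_page: dict):
    """Return the href of the first link with rel 'service-desc', else None."""
    for link in landing_page.get("links", []):
        if link.get("rel") == "service-desc":
            return link.get("href")
    return None


# Hypothetical landing page content, reduced to its links.
landing_page = {
    "links": [
        {"rel": "self", "href": "http://example.com/ogcapi"},
        {"rel": "service-desc", "href": "http://example.com/ogcapi/api"},
        {"rel": "data", "href": "http://example.com/ogcapi/collections"},
    ]
}

api_definition = find_api_definition(landing_page)
```

This only helps when the association points at the landing page; a link straight to a collection would still need one extra hop.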

@pvretano
Contributor

22-FEB-2021: Peter made the comment that OGC APIs should provide "default" responses (like features at the /items endpoint) to make binding to the resource easier. Clemens: We are not sure if there is anything we need to do yet in core about this aspect but the general goal of being able to bind to a resource discovered via the catalogue is good.

@pvgenuchten
Contributor Author

pvgenuchten commented Feb 23, 2021

Some of this discussion has already taken place in #13. The suggestion by @dr-shorthair to use a conformsTo element (from DCAT) to indicate the type of service link could be an alternative to the suggested use of < link type="OGC:WMS"/ >. Type seems reserved for IANA media types, which services are not.

Services providing answers with default parameters for sure is a good idea, however it will not work on links to OWS services, which will probably be around for a while.

@dr-shorthair

Not limited to IANA. The range of dcterms:conformsTo is dcterms:Standard.
So this just entails that the object of a triple with this predicate is of rdf:type dcterms:Standard alongside any other type that it has. (i.e. the intersection of the types is not the empty set)

@pvgenuchten
Contributor Author

pvgenuchten commented Feb 24, 2021

An experiment to annotate OGC API associations with a JSON-LD context could be:

{
"@context": {
    ...
    "associations": "dcat:service",
    "href": "dcat:endpointURL",
    "title": "dct:title",
    "id": "dcat:servesDataset",
    "schema": "dct:conformsTo",
    "description": "dcat:endpointDescription"
    ...
},
...
"associations": [{
  "@type": "dcat:DataService",
  "@id": "http://example.com/foo",
  "href": "http://example.com/my-service",
  "title": "My favourite service",
  "schema": "https://inspire.ec.europa.eu/applicationschema/ad",
  "id": "12",
  "description": "http://www.opengis.net/def/serviceType/ogc/wms"
}]
...
}

@andrea-perego

@dr-shorthair said:

Not limited to IANA. The range of dcterms:conformsTo is dcterms:Standard.
So this just entails that the object of a triple with this predicate is of rdf:type dcterms:Standard alongside any other type that it has. (i.e. the intersection of the types is not the empty set)

@pvgenuchten , about this, you can find examples on the use of dct:conformsTo together with URIs from INSPIRE and cat-interop in DCAT 2 and GeoDCAT-AP (where dct:conformsTo is used to map gmd:protocol):

https://www.w3.org/TR/vocab-dcat-2/#ex-service-gsa

https://semiceu.github.io/GeoDCAT-AP/releases/2.0.0/#ex-data-service

@andrea-perego

@pvgenuchten said:

An experiment to annotate OGC API associations with a json-ld-context could be:

[...]

Please note that dcat:endpointDescription is meant to point to a (possibly machine-actionable) description of the endpoint, which in this case should be a GetCapabilities request to http://example.com/my-service.

In DCAT the protocol is specified with dct:conformsTo, as noted earlier.

@pvgenuchten
Contributor Author

pvgenuchten commented Feb 24, 2021

Thank you @andrea-perego, I missed that aspect,
so the revised example would be

{"@context": {
    "associations": "dcat:service",
    "href": "dcat:endpointURL",
    "title": "dct:title",
    "id": "dcat:servesDataset",
    "type": "dct:conformsTo",
    "description": "dcat:endpointDescription"
},
"associations": [{
  "@type": "dcat:DataService",
  "@id": "http://example.com/foo",
  "href": "http://example.com/my-wms-service",
  "title": "My favorite service",
  "type": "http://www.opengis.net/def/serviceType/ogc/wms",
  "id": "12",
  "description": "http://example.com/my-wms-service?service=wms&request=GetCapabilities"
},{
  "@type": "dcat:DataService",
  "@id": "http://example.com/foo2",
  "href": "http://example.com/ogcapi",
  "title": "My favorite ogcapi service",
  "type": "http://www.opengis.net/def/interface/ogcapi-features",
  "id": "Adresses",
  "description": "http://example.com/ogcapi/openapi"
}
]}
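
To make the intent of the context concrete, here is a toy sketch that renames the keys of one association to the terms the context maps them to. It is not a JSON-LD processor (a real one would also resolve prefixes and handle nesting); the function name `apply_context` is an assumption, and the context uses the spelling `dcat:endpointURL` as defined in DCAT.

```python
def apply_context(context: dict, node: dict) -> dict:
    """Rename keys that the context maps to a term; keep '@' keywords and unmapped keys."""
    return {context.get(key, key): value for key, value in node.items()}


context = {
    "href": "dcat:endpointURL",
    "title": "dct:title",
    "id": "dcat:servesDataset",
    "type": "dct:conformsTo",
    "description": "dcat:endpointDescription",
}

association = {
    "@type": "dcat:DataService",
    "@id": "http://example.com/foo",
    "href": "http://example.com/my-wms-service",
    "title": "My favorite service",
    "type": "http://www.opengis.net/def/serviceType/ogc/wms",
    "id": "12",
    "description": "http://example.com/my-wms-service?service=wms&request=GetCapabilities",
}

mapped = apply_context(context, association)
```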

@cportele
Member

@pvgenuchten - I am not sure what you are trying to do, but

{
  "@type": "dcat:DataService",
  "@id": "http://example.com/foo",
  "href": "http://example.com/my-wms-service",
  "title": "My favourite service",
  "type": "http://www.opengis.net/def/serviceType/ogc/wms",
  "id": "12",
  "description": "http://example.com/my-wms-service?service=wms&request=GetCapabilities"
}

is not an association. An association in Records is a web link to another resource, not a DCAT resource. There is no id or description in a web link (of course you can add them, but don't expect others to process them) and type is a media type hint. However, every link must have a rel property. Maybe you had better add another record property/queryable that is specific to your use case?

@pvgenuchten
Contributor Author

pvgenuchten commented Feb 24, 2021

For me this is an exploration on how to model service binding details compatible with OGC API Records, but adopting some of the DCAT linkage conventions. For me an important aspect is to allow implementors to implement both, so they don't contradict each other.

Ok, so if link@type requires a mediatype (which would need a repetition for each of the mediatypes supported), there should be an alternate property to contain a reference to the protocol, e.g.

{
  "href": "http://example.com/wms",
  "title": "My favorite wms service",
  "type": "image/png",
  "rel": "service",
  "protocol": "http://www.opengis.net/def/serviceType/ogc/wms",
  "id": "12",
  "description": "http://example.com/wms?service=wms&request=GetCapabilities"
}
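
A short sketch of what a client gains from such a property: it can pick the associations it knows how to handle by comparing identifiers, without parsing URLs. The `protocol` key is the property proposed in this comment, not an established part of the link model, and the second sample entry is a hypothetical addition for contrast.

```python
WMS = "http://www.opengis.net/def/serviceType/ogc/wms"


def links_for_protocol(associations: list, protocol: str) -> list:
    """Return the service links whose 'protocol' equals the given identifier."""
    return [
        link for link in associations
        if link.get("rel") == "service" and link.get("protocol") == protocol
    ]


associations = [
    {
        "href": "http://example.com/wms",
        "title": "My favorite wms service",
        "type": "image/png",
        "rel": "service",
        "protocol": WMS,
    },
    {
        # Hypothetical plain download link without a protocol.
        "href": "http://example.com/data.zip",
        "rel": "enclosure",
        "type": "application/zip",
    },
]

wms_links = links_for_protocol(associations, WMS)
```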

@pvgenuchten
Contributor Author

An association in Records is a web link to another resource, not a DCAT resource.

When reading about associations in the spec, my impression is that this is the record aspect that deals with the service links (the transferoptions in iso19139), which is an essential aspect of metadata: indicating where and how the described resource can be accessed.

@pvretano
Contributor

pvretano commented Feb 24, 2021

@pvgenuchten my thinking about this goes something like this ...

  1. A catalogue would have records that describe datasets (among other things).
  2. A dataset record might include links in the associations section to get the dataset as a bulk download, and/or it might include links in the associations section to other records in the catalogue that describe services that might also be used to access the data in the dataset.
  3. Such a service record might look like the following JSON fragment ...
  4. The main features of this JSON fragment are:
    (a) links to the various service endpoints (for getting capabilities, maps, etc.) are included in the associations section
    (b) these association links are templated links
    (c) I have added a parameters section using a subset of OpenAPI schema for describing the parameters used in the templates
    (d) The links section has links to records describing the datasets that the service offers.

Additional musings ...

  1. I think that the records should include a reference to an OPENAPI document describing the WMS (in this example).
  2. I think, however, there is value in having enough information in the record itself to bind to the service hence my use of templated links with the additional parameters property to describe the substitution variables in the binding templates.
{
    "id": "...",
    "type": "Feature",
    "geometry": { ... },
    "properties": {
        "record-created": "2021-02-08",
        "record-updated": "2021-02-08",
        "type": "http://www.opengis.net/def/serviceType/ogc/wms",
        "title": "Service Offering Total Ozone - Daily Observations",
        "description": "A measurement of the total amount of ...",
        "keywords": [ "total", "ozone", "level 1.0", "column", "dobson",
                      "brewer", "saoz" ],
        "language": "en",
        "externalId": "urn:x-wmo:md:int.wmo.wis::...",
        "created": "2015-01-23",
        "updated": "2015-01-23",
        "publisher": "https://woudc.org",
        "themes": [
            {
                "scheme": "https://geo.woudc.org/codelists.xml#...",
                "concepts": [ "dobson", "brewer", "vassey", "pion", "microtops",
                              "spectral", "hoelper", "saoz", "filter" ]
            },
            {
                "scheme": "https://wis.wmo.int/2012/codelists/WMOCode...",
                "concepts": [ "atmosphericComposition", "pollution",
                              "observationPlatform", "rocketSounding" ]
            }
        ],
        "formats": [ "KML", "PNG", "JPEG", "GIF", "PDF", "SVG", "TIFF" ],
        "contactPoint": "https://woudc.org/contact.php",
        "license": "https://woudc.org/about/data-policy.php",
        "rights": null,
        "extent": [
            {
                "spatial": {
                    "bbox": [ -180, -90, 180, 90 ],
                    "crs": "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
                },
                "temporal": {
                    "interval": [ [ "1924-08-18", null ] ],
                    "trs": "http://www.opengis.net/def/uom/ISO-8601/0/Gregorian"
                }
            }
        ],
        "associations": [
           {
              "rel": "service-desc",
              "type": "application/xml",
              "title": "OGC Web Map Service (WMS)",
              "href": "https://geo.woudc.org/ows?service=WMS&version=1.3.0&request=GetCapabilities&format={format}&updatesequence={updseq}",
              "templated": true
           },
           {
              "rel": "map",
              "type": "image/png",
              "title": "OGC Web Map Service (WMS)",
              "href": "https://geo.woudc.org/ows?service=WMS&version=1.3.0&request=GetMap&crs={crs}&bbox={bbox}&layers=totalozone&width={width}&height={height}&format=image/png",
              "templated": true
           },
            ... additional links for the other service endpoints (e.g. GetStyles, etc.)...
        ],
        "parameters": [
           {
              "name": "layers",
              "description": "...",
              "required": true,
              "schema": {
                 "type": "array",
                 "items": {
                    "type": "string",
                    "enum": ["totalozoneobs","totalozone","umkehrn14-1","umkehrn14-2","lidar","ozonesonde","rocketsonde","multiband","spectral","broadband","uv_index_hourly","contributors","stations","instruments","notifications","metrics_data_distribution_station","metrics_data_distribution_dataset","metrics_data_distribution_contributor","metrics_data_contribution","metrics_ftp_access_dataset","metrics_ftp_access_station","metrics_ftp_access_instrument","metrics_waf_access_dataset","metrics_waf_access_station","metrics_waf_access_instrument","metrics_ows_access_dataset","metrics_ows_access_station","metrics_ows_access_instrument","metrics_data_distribution_dataset_frequency","metrics_data_distribution_dataset_summary","metrics_data_distribution_station_frequency","metrics_data_distribution_station_summary","metrics_data_distribution_contributor_network_summary","metrics_data_distribution_contributor_network_frequency","metrics_network_station_count","metrics_waf_access_dataset_summary","metrics_waf_access_instrument_summary","metrics_waf_access_station_summary","metrics_ows_access_dataset_summary","metrics_ows_access_instrument_summary","metrics_ows_access_station_summary","filelist","ndacc","eubrewnet"]
                 }
              }
           },
           {
              "name": "styles",
              "description": "...",
              "required": true,
              "schema": {
                 "type": "array",
                 "items": {
                    "type": "string",
                    "enum": [ "default" ]
                 }
              }
           },
           {
              "name": "crs",
              "description": "...",
              "required": true,
              "schema": {
                 "type": "string",
                 "enum": ["EPSG:4326", ... ]
              }
           },
           {
              "name": "bbox",
              "description": "...",
              "required": true,
              "schema": {
                 "type": "array",
                 "items": {
                    "type": "number",
                    "format": "double"
                 },
                 "minItems": 4,
                 "maxItems": 4
              }
           },
           {
              "name": "width",
              "description": "...",
              "required": true,
              "schema": {
                 "type": "number",
                 "format": "integer",
                 "minimum": 600,
                 "maximum": 5000
              }
           },
           {
              "name": "height",
              "description": "...",
              "required": true,
              "schema": {
                 "type": "number",
                 "format": "integer",
                 "minimum": 600,
                 "maximum": 5000
              }
           },
           {
              "name": "format",
              "description": "...",
              "required": true,
              "schema": {
                 "type": "string",
                 "enum": ["image/png","image/jpeg","image/gif","image/jpg"]
              }
           },
           {
              "name": "transparent",
              "description": "...",
              "required": false,
              "schema": {
                 "type": "boolean",
                 "default": false
              }
           },
           {
              "name": "bgcolor",
              "description": "...",
              "required": false,
              "schema": {
                 "type": "string"
              }
           },
           {
              "name": "exceptions",
              "description": "...",
              "required": false,
              "schema": {
                 "type": "string",
                 "enum": ["xml","json"],
                 "default": "xml"
              }
           }
        ]
    },
    "links": [
       {
          "href": "http://...",
          "rel": "related",
          "type": "application/geo+json",
          "title": "Link to catalogue record for totalozone dataset/layer"
       },
       {
          "href": "http://...",
          "rel": "related",
          "type": "application/geo+json",
          "title": "Link to catalogue record for instruments dataset/layer"
       }, ... additional links to records describing offered datasets/layers ...
    ]
}
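
To show how a client could use the parameters section before building a request, here is a rough sketch that checks candidate values against a few of the schema keywords used above (required, enum, minimum, maximum). It is not a JSON Schema validator, the function name `check_parameters` is an assumption, and the sample list repeats only three of the parameters from the record.

```python
def check_parameters(parameters: list, values: dict) -> list:
    """Return a list of problems; an empty list means the values pass these checks."""
    problems = []
    for param in parameters:
        name = param["name"]
        schema = param.get("schema", {})
        if name not in values:
            if param.get("required", False):
                problems.append(f"missing required parameter: {name}")
            continue
        value = values[name]
        if "enum" in schema and value not in schema["enum"]:
            problems.append(f"{name}: {value!r} is not an allowed value")
        if "minimum" in schema and value < schema["minimum"]:
            problems.append(f"{name}: {value} is below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            problems.append(f"{name}: {value} is above maximum {schema['maximum']}")
    return problems


parameters = [
    {"name": "width", "required": True,
     "schema": {"type": "number", "format": "integer", "minimum": 600, "maximum": 5000}},
    {"name": "format", "required": True,
     "schema": {"type": "string", "enum": ["image/png", "image/jpeg", "image/gif", "image/jpg"]}},
    {"name": "transparent", "required": False,
     "schema": {"type": "boolean", "default": False}},
]

ok = check_parameters(parameters, {"width": 800, "format": "image/png"})
bad = check_parameters(parameters, {"width": 100})
```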

@pvretano
Contributor

Further to my previous post, all the binding information to the service could, of course, be included directly in a dataset record but I figured that the dataset descriptions and the service descriptions would be separate records that are joined via links in the links section of each record. This, of course, is a decision that a community of use would make about how they want to map resource description into the catalogue and it depends on your view of the world.

The important information from my last post is how I envision that binding information would be included in a record.

@pvgenuchten
Contributor Author

Thanx @pvretano, it could work for me.
What is less optimal in this approach is that there is no option to indicate the protocol. For OWS resources I can extract it from the &service=xxx part of the templated URL, which is not ideal, but ok-ish. Doing the same for OGC APIs is more challenging: I would have to request the /collections/foo endpoint to learn, via its itemType property, what type of items it contains.

@pvretano
Contributor

@pvgenuchten when you say protocol do you mean the HTTP METHOD (GET/PUT/PATCH/POST/DELETE) or a reference to the specific service specification (e.g. WMS, WFS, etc.) that defines the protocol?

@pvgenuchten
Contributor Author

pvgenuchten commented Feb 24, 2021

It is a term that is regularly used to indicate the type of service (OGC or otherwise): wms, wfs, ogc-api-features, sparql, jdbc-sql, ldap, git, etc.
Some iso19139 metadata profiles use the online-resource@protocol value to capture that element. I'm neutral about what we call it, but to me it is a valuable aspect to capture: it avoids having to set up algorithms that parse URLs or sniff endpoints to find out what type of service is at the other end.

@pvretano
Contributor

@pvgenuchten ah, ok. I have updated the example above to use the identifier http://www.opengis.net/def/serviceType/ogc/wms from the OGC definition server as the value of the resource type property.

@pvgenuchten
Contributor Author

ah, so in this case you are describing a service, having a type=wms.
What i'm hoping for is to describe a dataset which has a number of associations, each having a type (wms/wfs/ogcapi-f/ogcapi-m)

@pvgenuchten
Contributor Author

pvgenuchten commented Feb 25, 2021

There are actually 7 types of links for which an example could be included in the spec on how to annotate it.

  • Links to alternate representations of the record (iso19115, dcat, schema.org, atom, xml, json, html)
  • Link to related records, describing siblings, collections, projects, services, etc
  • Link to the original location of the record, in case it is imported/harvested from elsewhere
  • Link to a document/webpage with more information about the dataset or access to the dataset
  • Link to a (web)application which facilitates access to a dataset
  • Link to a location to download a distribution of the dataset
  • Link to a service endpoint, to interact with the dataset

Parameters that can be used to facilitate these cases are:

  • Associations or Links section
  • rel attribute (e.g. record, service, document, alternate, etc)
  • type -> mediatype

Or define new parameters to differentiate

@pvgenuchten
Contributor Author

record.yaml introduces a list of formats in which the record is available. I suggest moving format to a property of link (type).

@pvretano pvretano added this to To do in Part 1: Core via automation Mar 8, 2021
@pvretano
Contributor

pvretano commented Mar 8, 2021

08-MAR-2021: @pvretano will modify the schema of the link structure to allow an OPTIONAL parameters section where substitution variables for templated links can be defined, thus providing the final bit of information a client requires to link to a resource such as an old-style W*S service. @pvretano will also provide examples.
@kalxas pointed out that the parameters section could also include parameters that are not actually part of the URL template which would be useful in allowing the client to add additional parameters without them being specifically listed in the template.

@cportele
Member

Looking at the code snippet, the proposal seems to be heavily influenced by the desire to support the WxS standards and with special rules (e.g., required). If this moves forward, I think it should be in another part; it should at least be in a separate requirements class.

If we put something in Part 1, then it should be of general use and consistent with RFC 6570 (URI templates). Maybe something along these lines:

  • To keep processing simple, we should restrict expressions to a single variable with no operators or modifiers.
  • A templated link (templated is true) can have another member, variables.
  • variables is an object where each member is one variable. The value is a schema for the variable. Valid types are "integer", "number", "boolean", "string" or an array of such values (separated by comma).

The "map" example could then be expressed as shown below. This does not require any special knowledge and could be used in other contexts in OGC API standards, too.

{
   "rel": "map",
   "type": "image/png",
   "title": "Ozone map",
   "href": "https://geo.woudc.org/ows?service=WMS&version=1.3.0&request=GetMap&crs={crs}&bbox={bbox}&layers=totalozone&width={width}&height={height}&format=image/png",
   "templated": true,
   "variables": {
      "crs": {
         "description": "...",
         "type": "string",
         "enum": [ "EPSG:4326", "EPSG:3857" ]
      },
      "bbox": {
         "description": "...",
         "type": "array",
         "items": {
            "type": "number",
            "format": "double"
         },
         "minItems": 4,
         "maxItems": 4
      },
      "width": {
         "description": "...",
         "type": "number",
         "format": "integer",
         "minimum": 600,
         "maximum": 5000
      },
      "height": {
         "description": "...",
         "type": "number",
         "format": "integer",
         "minimum": 600,
         "maximum": 5000
      }
   }
}
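
A minimal sketch of how a client could expand such a link under the restrictions listed above (one variable per expression, no operators or modifiers, arrays joined with commas). Values are percent-encoded the way RFC 6570 simple string expansion does, leaving only unreserved characters as-is. The function name `expand` and the sample values are assumptions; the link is a shortened copy of the example, with the variable schemas elided.

```python
import re
from urllib.parse import quote


def expand(link: dict, values: dict) -> str:
    """Expand each {name} expression in link['href'] with the matching value."""
    def encode(value) -> str:
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, (list, tuple)):
            return ",".join(encode(item) for item in value)
        return quote(str(value), safe="")

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for variable {name!r}")
        return encode(values[name])

    return re.sub(r"\{(\w+)\}", substitute, link["href"])


link = {
    "rel": "map",
    "href": "https://geo.woudc.org/ows?service=WMS&version=1.3.0&request=GetMap"
            "&crs={crs}&bbox={bbox}&layers=totalozone"
            "&width={width}&height={height}&format=image/png",
    "templated": True,
    "variables": {"crs": {}, "bbox": {}, "width": {}, "height": {}},
}

url = expand(link, {
    "crs": "EPSG:4326",
    "bbox": [-180, -90, 180, 90],
    "width": 800,
    "height": 600,
})
```

Note that the colon in the CRS value is percent-encoded, which is what simple expansion prescribes for reserved characters.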

@tomkralidis
Contributor

Can we change variables to parameters? We've just had a similar discussion in today's EDR API SWG meeting.

@cportele
Member

@tomkralidis - In general, we could pick any name we want, but why the overloaded parameters? This seems to be used in too many different meanings. RFC 6570 calls them variables (or expression, which is a superset, but if we would restrict the use of expressions to variables we could also be more precise by using the stricter term). I think it is always the safest choice and a good practice to use the proper term.

@pvretano
Contributor

pvretano commented Mar 18, 2021

@cportele my feeling is that this should be in Part 1 because of the need to be able to bind to found resources, especially services and particularly WxS services (which will be around for a bit), without the need to jump through too many hoops. Templated links with a variables section gather all the information necessary to bind into one convenient bundle.

However, I was planning to use a separate, optional, conformance class for this capability. I'll add it as a separate conformance class and if we want to remove it, it should be easy enough.

Question: is there some inherent benefit of an object with keys being the variable names rather than an array of objects where the variable name is specified using a "name" or "id" key? Just curious. I suppose the key approach lets you ask the parser to just pull "variables.bbox" rather than searching the array for the appropriate variable .. so that is something.

@tomkralidis because of RFC 6570 and OpenSearch I think variables is a better name.

On a related note, I think templated links with a variable section should be something in Common ... eventually anyway.

@tomkralidis
Contributor

Thanks @cportele / @pvretano . Fair enough for variables, then. Also +1 for this getting into Common at some point (something we want to implement in EDR just the same).

@cportele
Member

@pvretano @tomkralidis - We seem to be approaching a common understanding:

Yes, the variables member should eventually be specified in Common Core (where the Link object is defined, with template support, just until now without any explanation of the variables). But as usual we should drive the development from one of the resources, in this case Records and then move the common concept to the Common series.

And if we can agree on a general approach consistent with RFC 6570, then I am all for including it in Part 1. I was just concerned about adding something that is quite specific to the WxS KVP bindings and where software has to parse the URI template and potentially add or remove parameters from a URI template.

On the object vs array question: if there is a unique key, like in this case, it makes the structure clearer and, as you say, access by key is easier and faster than in the array approach where you have to iterate through the array.
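
The difference between the two layouts, sketched with reduced, hypothetical variable definitions:

```python
# Object layout: the variable name is the key.
variables_as_object = {
    "bbox": {"type": "array"},
    "width": {"type": "number"},
}

# Array layout: the variable name is carried in a "name" member.
variables_as_array = [
    {"name": "bbox", "type": "array"},
    {"name": "width", "type": "number"},
]

# Direct lookup by key.
bbox_from_object = variables_as_object["bbox"]

# Linear search through the array for the matching name.
bbox_from_array = next(v for v in variables_as_array if v["name"] == "bbox")
```

The object layout also rules out duplicate variable names by construction, whereas the array layout would need an extra uniqueness rule.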

@rob-metalinkage

+1 for variables and templating to be a common canonical pattern - this is exactly the approach I landed on looking at how to make a distributed approach to data access a decade ago: https://confluence.csiro.au/public/SIRF/datanetwork-api

I guess the only question is whether it should be in core, or whether you might need to support both ultralight services with little or no self-description (because they are not intended to be interoperable outside closed systems) and self-describing services.

consider a DCAT resource describing a service via dcat:conformsTo - is there value in being able to say it supports self-description or not?

@dr-shorthair

conformsTo is a Dublin Core term, not DCAT - dcterms:conformsTo

@pvretano pvretano moved this from To do to In progress in Part 1: Core Mar 22, 2021
@pvretano
Contributor

I have updated the schema for links to include a variables section. For now the "variables" key is of type object, but I am trying to define a JSON Schema of JSON Schema that defines the specified subset of JSON Schema that we will support. We have mentioned this concept of a "subset of JSON Schema" in various contexts, so I imagine this might be useful elsewhere.

@pvretano
Contributor

pvretano commented May 3, 2021

03-MAY-2021: This has been implemented in the current draft (i.e. the addition of the variables section). We will wait till the next SWG meeting for more feedback and then close.

@pvretano
Contributor

pvretano commented Mar 2, 2022

@pvgenuchten can we close this? If yes, please do so. If not, please update with what you think we need to do with this issue. Thanks.

@pvretano pvretano moved this from In Review to Waiting for Input/feedback in Part 1: Core Mar 2, 2022
@pvretano pvretano moved this from Waiting for Input/feedback to In Review in Part 1: Core Mar 2, 2022
@nmtoken

nmtoken commented Aug 3, 2022

Not sure if this references the same issue, but shouldn't the Associations simple and URI Template Example in https://docs.ogc.org/DRAFTS/20-004.html include either a styles, sld or sld_body parameter? Otherwise the GetMap request will be invalid.

@pvgenuchten
Contributor Author

hi @pvretano, yes please close, thank you all for the discussion!

+1 nmtoken, the styles param seems required
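
A quick sketch of a check that would catch this, assuming the mandatory WMS 1.3.0 GetMap parameters (VERSION, REQUEST, LAYERS, STYLES, CRS, BBOX, WIDTH, HEIGHT, FORMAT) and treating sld or sld_body as acceptable substitutes for styles, as suggested above. The function name `missing_getmap_parameters` is an assumption; parameter names are compared case-insensitively, and template expressions such as {crs} count as present.

```python
from urllib.parse import parse_qsl, urlsplit

REQUIRED = {"version", "request", "layers", "crs", "bbox", "width", "height", "format"}
STYLE_ALTERNATIVES = {"styles", "sld", "sld_body"}


def missing_getmap_parameters(href: str) -> set:
    """Return the lowercase names of mandatory GetMap parameters absent from href."""
    query = urlsplit(href).query
    # keep_blank_values so that an empty "styles=" (default style) counts as present.
    names = {key.lower() for key, _ in parse_qsl(query, keep_blank_values=True)}
    missing = REQUIRED - names
    if not names & STYLE_ALTERNATIVES:
        missing.add("styles")
    return missing


# The templated GetMap link from earlier in this thread.
href = (
    "https://geo.woudc.org/ows?service=WMS&version=1.3.0&request=GetMap"
    "&crs={crs}&bbox={bbox}&layers=totalozone"
    "&width={width}&height={height}&format=image/png"
)

missing = missing_getmap_parameters(href)
```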

Labels
None yet
Projects
Part 1: Core
  
Done

8 participants