Paginating and streaming nodes and links #75

Closed
duncandewhurst opened this issue Sep 6, 2022 · 3 comments

@duncandewhurst
Collaborator

duncandewhurst commented Sep 6, 2022

We already have an open issue about formats for packaging multiple networks. However, we need to consider how a single network could be served via an API or streamed.

In Open Fibre, a single publisher is likely to publish just one network, but that network might be large (many links and nodes). How to package and stream multiple networks is therefore less important than how to serve a single large network via an API or as a stream.

Given that we expect most networks to be relatively small, I think that we should prioritise embedding data on nodes and links as the primary way of publishing OFDS data, but provide support for separate endpoints/bulk files for nodes and links to support pagination and streaming.

Proposal

Embedded nodes and links should be published in Network.nodes and Network.links, respectively. These fields may be omitted from API responses and large bulk files.

Links to API endpoints or bulk files for nodes and links may be provided in Network.relatedResources, which is an array of JSON Hyper-Schema Link Description Objects. Ideally, it would be named .links, but that would clash with Network.links. Alternative name suggestions are welcome.

The 'type' of each link should be provided in relatedResources.rel. Its values are extension relationship types as recommended in JSON Hyper-Schema section 6.2.4, constructed according to the 'tag' URI scheme: tagURI = "tag:" taggingEntity ":" specific [ "#" fragment ].

The taggingEntity component is 'opentelecomdata.net,2022' based on the email address used in Contact. It should be noted that tags are not dereferenceable: the taggingEntity component merely serves as a way of making the tag unique, so this does not place any requirements on the opentelecomdata.net domain, other than it being assigned to the tagging entity on 1st Jan 2022.
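As a minimal sketch of how such rel values could be built and taken apart, the following Python uses a simplified pattern for the tag URI scheme. The helper names make_tag and parse_tag are hypothetical, and the regular expression is looser than the full grammar of the scheme, so treat it as an illustration rather than a validator.

```python
import re

# Simplified pattern for tag URIs: tag:<authority>,<date>:<specific>[#<fragment>]
# Assumption: this is looser than the full grammar of the tag URI scheme.
TAG_PATTERN = re.compile(
    r"^tag:(?P<authority>[^,:]+),"
    r"(?P<date>\d{4}(?:-\d{2}(?:-\d{2})?)?):"
    r"(?P<specific>[^#]*)"
    r"(?:#(?P<fragment>.*))?$"
)


def make_tag(authority, date, specific, fragment=None):
    """Build a tag URI from its components (hypothetical helper)."""
    tag = f"tag:{authority},{date}:{specific}"
    if fragment is not None:
        tag += f"#{fragment}"
    return tag


def parse_tag(tag):
    """Split a tag URI into its components, or raise ValueError."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a tag URI: {tag!r}")
    return match.groupdict()


rel = make_tag("opentelecomdata.net", "2022", "nodesAPI")
parts = parse_tag(rel)
```

Because tags are only identifiers, neither function makes a network request; comparing rel values is plain string comparison.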

{
  "nodes": [
    {...},
    {...}
  ],
  "links": [
    {...},
    {...}
  ],
  "relatedResources": [
    {
      "href": "http://example.com/api/nodes",
      "rel": "tag:opentelecomdata.net,2022:nodesAPI"
    },
    {
      "href": "http://example.com/nodes.jsonl",
      "rel": "tag:opentelecomdata.net,2022:nodesFile"
    },
    {
      "href": "http://example.com/api/links",
      "rel": "tag:opentelecomdata.net,2022:linksAPI"
    },
    {
      "href": "http://example.com/links.jsonl",
      "rel": "tag:opentelecomdata.net,2022:linksFile"
    }
  ]
}
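A consumer of this structure might proceed roughly as follows: use the embedded nodes if they are present, otherwise look up a bulk file by its rel value and read it line by line. This is only a sketch. The function names, the open_lines callback (a caller-supplied function that returns the lines of a resource) and the "id" field on nodes are assumptions made for illustration, and no HTTP request is made.

```python
import json

NODES_FILE = "tag:opentelecomdata.net,2022:nodesFile"


def find_related(network, rel):
    """Return the href of every related resource with the given rel."""
    return [
        resource["href"]
        for resource in network.get("relatedResources", [])
        if resource.get("rel") == rel
    ]


def iter_json_lines(lines):
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def iter_nodes(network, open_lines):
    """Yield embedded nodes if present, else nodes from the first bulk file.

    open_lines is a hypothetical callback: href -> iterable of text lines.
    """
    if "nodes" in network:
        yield from network["nodes"]
        return
    hrefs = find_related(network, NODES_FILE)
    if hrefs:
        yield from iter_json_lines(open_lines(hrefs[0]))


# Usage with an in-memory stand-in for the bulk file.
network = {
    "relatedResources": [
        {"href": "http://example.com/nodes.jsonl", "rel": NODES_FILE}
    ]
}
fake_files = {"http://example.com/nodes.jsonl": ['{"id": "1"}\n', '{"id": "2"}\n']}
nodes = list(iter_nodes(network, fake_files.__getitem__))
```

Reading the file as a generator keeps memory use flat for large networks, which is the point of offering a JSON Lines file alongside the embedded arrays.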

Other approaches considered

Use oneOf

Allow Network.nodes and Network.links to be one of: embedded data, a link to an endpoint, or a link to a bulk file.

The problem with this approach is that it makes using and validating data difficult since one field can have various different types.
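To illustrate the difficulty, here is a sketch of the branching that every consumer would need under a oneOf design. The shapes are assumptions made for the example (a list for embedded data, an object with an href for a link), not something defined in the schema.

```python
def classify_nodes_field(value):
    """Report which of the assumed oneOf shapes a Network.nodes value has."""
    if isinstance(value, list):
        # Assumed shape: embedded node objects.
        return "embedded"
    if isinstance(value, dict) and "href" in value:
        # Assumed shape: a link. Telling an API endpoint apart from a
        # bulk file would need yet another convention on top of this.
        return "link"
    raise TypeError(f"unsupported nodes value: {type(value).__name__}")


kinds = [
    classify_nodes_field([{"id": "1"}]),
    classify_nodes_field({"href": "http://example.com/api/nodes"}),
]
```

Every tool that touches nodes or links would have to repeat this check before doing anything else, and a validator would need a separate subschema per shape.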

Handle links to endpoints and bulk files outside the schema

Provide guidance on how to serve data via API and how to publish large bulk files, e.g.

  • large bulk files: omit .nodes and .links from network.json and provide separate nodes.jsonl and links.jsonl files.
  • API: omit .nodes and .links from the data returned by your network endpoint and provide separate /nodes and /links endpoints.
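For the bulk file case, a publisher's export step could look something like the sketch below. The function name split_network and the sample field values are hypothetical; the function returns strings rather than writing files so that the caller decides where network.json, nodes.jsonl and links.jsonl are stored.

```python
import json


def split_network(network):
    """Return (network_json, nodes_jsonl, links_jsonl) as strings.

    The network document omits .nodes and .links; each node and link
    becomes one line of its JSON Lines file.
    """
    slim = {key: value for key, value in network.items()
            if key not in ("nodes", "links")}
    nodes_jsonl = "".join(json.dumps(node) + "\n"
                          for node in network.get("nodes", []))
    links_jsonl = "".join(json.dumps(link) + "\n"
                          for link in network.get("links", []))
    return json.dumps(slim), nodes_jsonl, links_jsonl


# Sample data with made-up identifiers.
network = {
    "id": "net-1",
    "nodes": [{"id": "a"}, {"id": "b"}],
    "links": [{"id": "ab"}],
}
network_json, nodes_jsonl, links_jsonl = split_network(network)
```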

The problem with this approach is that it makes discovery and aggregation difficult: there is no single point of access where a person (or, more likely, a computer) could get all information on a network without having to consult some other, non-standardised source to find out what other files and APIs exist.

Use the JSON:API approach

JSON:API link objects (https://jsonapi.org/format/#document-links) allow a meta member, which can contain information on file type and potentially other information about the linked resource, such as counts and API usage docs.

The problem with this approach is that link keys need to be IANA Link Relation Types, no relation types match the desired semantics, and, according to json-api/json-api#1076 (comment):

users aren't currently allowed to add their own keys to links objects either

duncandewhurst added this to the Alpha milestone Sep 6, 2022
@duncandewhurst
Collaborator Author

The taggingEntity component is 'opentelecomdata.net,2022' based on the email address used in Contact. It should be noted that tags are not dereferenceable: the taggingEntity component merely serves as a way of making the tag unique, so this does not place any requirements on the opentelecomdata.net domain, other than it being assigned to the tagging entity on 1st Jan 2022.

@stevesong please could you confirm that you're happy with this? The rules for minting tags are described in more detail here.

duncandewhurst changed the title from "Support for pagination and streaming" to "Paginating and streaming nodes and links" Sep 9, 2022
duncandewhurst modified the milestones: Alpha, Beta Sep 14, 2022
@duncandewhurst
Collaborator Author

The alpha schema and codelists added in #101 reflect the latest proposal in this issue. #90 adds related documentation.

This issue will remain open against the beta milestone to gather feedback from the alpha consultation.

@duncandewhurst
Collaborator Author

We've not heard any further feedback on this issue so I'm going to close it for now.
