Automatically fetch URIs objects when framing #126

Closed
sharpaper opened this issue Jun 7, 2020 · 5 comments

Comments

@sharpaper

Let's say that I have a jsonld document that contains a property like

"knows": "https://example.org/alice"

I would like to apply a frame to this document such that https://example.org/alice is automatically fetched and inserted into the final document, that is:

"knows": {
    "@id": "https://example.org/alice",
    "name": "Alice"
}

I've seen in the jsonld spec that it's possible to use @embed, but my question is: do I have to manually download the https://example.org/alice document, add it to the jsonld document, and only then apply the frame? Or can pyld automatically fetch the document if I use @embed?

@davidlehn
Member

The @embed flag isn't used for that type of resource loading. I think you would have to load such resources yourself. The framing code can be used to merge and reshape the data for you though.

@sharpaper
Author

Thanks for the reply.

  • Is there any other way than @embed to work with frames that will automatically fetch remote documents? Or is it completely outside the specification of jsonld?
  • "The framing code can be used to merge..." what would be an example of merging? How can I merge 2 separate docs and apply 1 frame (with pyld)?

@davidlehn
Member

  • Re remote doc fetching: As a general problem, that's difficult to solve, because you can't know which resources to load. In a small structure it may seem obvious, but if you expanded the data so there were properties linking off to lots of other resources, how would you know which ones to load? So I think that's going to have to be handled at the application level. You could scan your data for particular properties and load them. And once you have all that data loaded... on to part 2:
  • Here's a framing example. It's a bit old, there may be better ways to do this now, but hopefully it gets the idea across. You externally load all your people docs (like alice), make a @graph container doc with all the other docs, frame it, and extract and build your output:
```python
#!/usr/bin/env python
# coding=utf-8

import json
import os
import sys

# Set up the import path so this runs from pyld/tmp/example/.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../..', 'lib'))
from pyld import jsonld

# Shared context: "knows" values are IRIs (node references).
ctx = {
    "knows": {"@type": "@id", "@id": "https://example.com/knows"},
    "name": "https://example.com/name"
}
# The original document, which only references alice by IRI.
doc = {
    "@context": ctx,
    "knows": "https://example.org/alice"
}
# The alice document, loaded externally by the application.
alice = {
    "@id": "https://example.org/alice",
    "name": "Alice"
}

# Frame asking for referenced nodes to always be embedded.
frame = {
    "@context": ctx,
    "@embed": "@always"
}
# Container document that merges doc and alice into a single graph.
data = {
    "@context": ctx,
    "@graph": [
        doc,
        alice
    ]
}

print('---DATA---')
print(json.dumps(data, indent=2))

framed = jsonld.frame(data, frame)
print('---FRAMED---')
print(json.dumps(framed, indent=2))

# Extract the first framed node and reattach the context to build the output.
output = framed["@graph"][0].copy()
output["@context"] = framed["@context"]
print('---OUTPUT---')
print(json.dumps(output, indent=2))
```

Output:

```
---DATA---
{
  "@context": {
    "knows": {
      "@type": "@id",
      "@id": "https://example.com/knows"
    },
    "name": "https://example.com/name"
  },
  "@graph": [
    {
      "@context": {
        "knows": {
          "@type": "@id",
          "@id": "https://example.com/knows"
        },
        "name": "https://example.com/name"
      },
      "knows": "https://example.org/alice"
    },
    {
      "@id": "https://example.org/alice",
      "name": "Alice"
    }
  ]
}
---FRAMED---
{
  "@context": {
    "knows": {
      "@type": "@id",
      "@id": "https://example.com/knows"
    },
    "name": "https://example.com/name"
  },
  "@graph": [
    {
      "knows": {
        "@id": "https://example.org/alice",
        "name": "Alice"
      }
    },
    {
      "@id": "https://example.org/alice",
      "name": "Alice"
    }
  ]
}
---OUTPUT---
{
  "knows": {
    "@id": "https://example.org/alice",
    "name": "Alice"
  },
  "@context": {
    "knows": {
      "@type": "@id",
      "@id": "https://example.com/knows"
    },
    "name": "https://example.com/name"
  }
}
```
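
The "scan your data for particular properties and load them" step from the first bullet could look roughly like the sketch below. It is only an illustration: `collect_references`, `build_graph_document`, and the `load` callable are hypothetical names, not part of pyld, and the in-memory `store` stands in for whatever fetching the application actually does (HTTP, a cache, local files). The container document it builds has the same shape as `data` in the example above, so it could then be passed to `jsonld.frame`.

```python
import json
from typing import Any, Callable, Iterable, List


def collect_references(node: Any, properties: Iterable[str]) -> List[str]:
    """Return string values of `properties` found in `node`, in first-seen order."""
    wanted = set(properties)
    found: List[str] = []

    def visit(value: Any) -> None:
        if isinstance(value, dict):
            for key, child in value.items():
                if key == "@context":
                    # Term definitions are not data; skip them.
                    continue
                if key in wanted:
                    items = child if isinstance(child, list) else [child]
                    for item in items:
                        if isinstance(item, str) and item not in found:
                            found.append(item)
                visit(child)
        elif isinstance(value, list):
            for item in value:
                visit(item)

    visit(node)
    return found


def build_graph_document(doc: dict, ctx: dict, properties: Iterable[str],
                         load: Callable[[str], dict]) -> dict:
    """Load every referenced document and wrap everything in one @graph."""
    loaded = [load(iri) for iri in collect_references(doc, properties)]
    return {"@context": ctx, "@graph": [doc, *loaded]}


ctx = {
    "knows": {"@type": "@id", "@id": "https://example.com/knows"},
    "name": "https://example.com/name",
}
doc = {"@context": ctx, "knows": "https://example.org/alice"}

# Assumption: a dict lookup stands in for real dereferencing.
store = {
    "https://example.org/alice": {
        "@id": "https://example.org/alice",
        "name": "Alice",
    }
}

data = build_graph_document(doc, ctx, ["knows"], store.__getitem__)
print(json.dumps(data, indent=2))
```

Which properties to follow (here only `knows`) stays an application decision, which is the point made above: the library cannot guess which links are worth loading.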

@sharpaper
Author

Thank you, that makes sense. However, I think that a flag to automatically fetch remote documents might not be too difficult. What I mean is that @embed right now works like this: if the document contains a term "knows": "URL", both "@embed": "@always" and "@embed": "@never" produce the output "knows": "URL". However, if the document contains the actual node, "@embed": "@never" produces "knows": "URL" whereas "@embed": "@always" produces "knows": { "@id":..., "name":... }. You're right that frames can be pretty complex but, I think, if "@embed": "@always" is already producing a URL in the output, it would be useful to have a flag that automatically tries to fetch that particular document. Or am I wrong?

@davidlehn
Member

Doing that type of processing is beyond the scope of the API here.

Perhaps it's worth exploring a multi-step transformation in your app to get the same effect. Frame the data to find all the "knows" data, dereference that data in your app as appropriate, and merge the original data and all the "knows" results with a new frame. I think the output would be the same, it's just pushing the details and complexity up to the app.
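
A minimal sketch of that multi-step flow, assuming the framing call and the dereferencing are both passed in as callables. `frame_with_dereferencing`, `fetch`, and `frame_fn` are hypothetical names; in practice `frame_fn` would presumably be `jsonld.frame` from pyld and `fetch` whatever loader the app uses. Using `{}` as a wildcard to select nodes that have the property is an assumption about how the frame is matched and should be checked against the pyld version in use.

```python
from typing import Any, Callable, Dict, List

Json = Dict[str, Any]


def frame_with_dereferencing(doc: Json, ctx: Json, prop: str,
                             fetch: Callable[[str], Json],
                             frame_fn: Callable[[Json, Json], Json]) -> Json:
    # Step 1: frame the data to find the nodes that carry `prop`.
    selected = frame_fn(doc, {"@context": ctx, prop: {}})
    # Some versions may return a single node instead of an @graph array.
    nodes = selected.get("@graph", [selected])

    # Step 2: collect the IRIs those nodes point at (plain string or {"@id": ...}).
    iris: List[str] = []
    for node in nodes:
        value = node.get(prop)
        for item in value if isinstance(value, list) else [value]:
            iri = item.get("@id") if isinstance(item, dict) else item
            if isinstance(iri, str) and iri not in iris:
                iris.append(iri)

    # Step 3: dereference in the app; error handling is left to `fetch`.
    fetched = [fetch(iri) for iri in iris]

    # Step 4: merge the original data with the results and frame again.
    merged = {"@context": ctx, "@graph": [doc, *fetched]}
    return frame_fn(merged, {"@context": ctx, "@embed": "@always"})
```

Called as `frame_with_dereferencing(doc, ctx, "knows", fetch, jsonld.frame)`, this keeps the decision about what to load, and how, in application code rather than in the framing API.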
