Load a remote context via an unsupported network protocol #1650

Closed
anatoly-scherbakov opened this issue Jan 8, 2022 · 8 comments
Labels
enhancement (New feature or request)
format: JSON-LD (Related to JSON-LD format)
marked for closing (The issue or PR will be closed soon if no further feedback is provided)
networking (Related to networking)

Comments

@anatoly-scherbakov
Contributor

Under RDFLib 6.0.1, I am trying the following example:

import json

from rdflib import Graph


def test_rdflib_jsonld():
    graph = Graph()
    graph.parse(
        data=json.dumps({
            '@context': {
                '@import': 'ipfs://foo/bar/baz.json',
            },
        }),
        format='json-ld',
    )

which obviously fails:

tests/test_experiments/test_rdflib_jsonld.py:5 (test_rdflib_jsonld)
test_rdflib_jsonld.py:8: in test_rdflib_jsonld
    graph.parse(
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/graph.py:1258: in parse
    parser.parse(source, self, **args)  # type: ignore[call-arg]
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/plugins/parsers/jsonld.py:125: in parse
    to_rdf(data, conj_sink, base, context_data, version, generalized_rdf)
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/plugins/parsers/jsonld.py:144: in to_rdf
    return parser.parse(data, context, dataset)
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/plugins/parsers/jsonld.py:164: in parse
    context.load(local_context, context.base)
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/plugins/shared/jsonld/context.py:362: in load
    self._read_source(source, source_url, referenced_contexts)
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/plugins/shared/jsonld/context.py:426: in _read_source
    imported = self._fetch_context(
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/plugins/shared/jsonld/context.py:413: in _fetch_context
    source = source_to_json(source_url)
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/plugins/shared/jsonld/util.py:34: in source_to_json
    source = create_input_source(source, format="json-ld")
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/parser.py:325: in create_input_source
    ) = _create_input_source_from_location(
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/parser.py:374: in _create_input_source_from_location
    input_source = URLInputSource(absolute_location, format)
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/parser.py:218: in __init__
    file = _urlopen(req)
../../../../.pyenv/versions/octadocs/lib/python3.8/site-packages/rdflib/parser.py:206: in _urlopen
    return urlopen(req)
../../../../.pyenv/versions/3.8.1/lib/python3.8/urllib/request.py:222: in urlopen
    return opener.open(url, data, timeout)
../../../../.pyenv/versions/3.8.1/lib/python3.8/urllib/request.py:525: in open
    response = self._open(req, data)
../../../../.pyenv/versions/3.8.1/lib/python3.8/urllib/request.py:547: in _open
    return self._call_chain(self.handle_open, 'unknown',
../../../../.pyenv/versions/3.8.1/lib/python3.8/urllib/request.py:502: in _call_chain
    result = func(*args)
../../../../.pyenv/versions/3.8.1/lib/python3.8/urllib/request.py:1390: in unknown_open
    raise URLError('unknown url type: %s' % type)
E   urllib.error.URLError: <urlopen error unknown url type: ipfs>

My question is not specifically about adding IPFS support to RDFLib but about RDFLib's extensibility without modifying its core code base.

Question. Would it be possible to add support for new network protocols or new mechanisms of retrieval for JSON-LD contexts to RDFLib without modifying its source code? Is there a mechanism like loaders (https://github.com/digitalbazaar/pyld/blob/master/lib/pyld/documentloader/requests.py) in pyld? Or, perhaps, pyld loaders can be somehow reused when parsing JSON-LD documents with rdflib?

P.S. As a workaround, one could expand() the source JSON-LD document with pyld, supplying an appropriate custom loader, and then use the result as input for RDFLib, but I'd be interested to see if there is a better solution.

@nicholascar
Member

> Would it be possible to add support for new network protocols or new mechanisms of retrieval for JSON-LD contexts to RDFLib without modifying its source code?

I don't think so! I'm not aware that this issue has come up before. We recently removed all use of fancier network packages like requests and fell back to urllib in the Standard Library to reduce dependencies, so the code base is a bit dumber, I suppose, and unlikely to handle fancy things like this.

But if you just wrapped the current import function with one that could handle other protocols, I'm sure that would be fine.

@anatoly-scherbakov
Contributor Author

@nicholascar thank you for the response!

> But if you just wrapped the current import function with one that could handle other protocols

I must confess I am not sure I understand, so let me rephrase. As a JSON-LD feature, @context allows specifying the URL of a context to read, and it seems to me that JSON-LD support in rdflib is hard-coded to only support HTTP(S) or local filesystem paths.

Is it possible to override that? Is there a hook in rdflib where a developer can insert their own downloader of @context URLs, say, based on the URL scheme?

@aucampia added the "enhancement" label on Jan 16, 2022
@nicholascar
Member

> it seems to me that JSON-LD support in rdflib is hard-coded to only support HTTP(S) or local filesystem paths.

Yes, I think that's the case.

> Is it possible to override that? Is there a hook in rdflib where a developer can insert their own downloader of @context URLs, say, based on the URL scheme?

The rdflib code base has no planned mechanism for this, so I think the method that reads the @context value will need to be extended to handle things other than file paths or HTTP(S) links.

So overall, you will have to find that function and add new logic to it!

@aucampia added the "format: JSON-LD" label on May 17, 2022
@aucampia
Member

@aucampia added the "networking" label on Aug 21, 2022
@aucampia
Member

aucampia commented Mar 25, 2023

@anatoly-scherbakov as far as I can tell, it should work fine to just install a custom URL opener that supports, for example, IPFS, using urllib.request.install_opener. If this does not work, I would be happy to fix anything wrong with it.
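
A minimal sketch of that suggestion, using only the standard library. The handler class, the helper function, and the in-memory document table are hypothetical names introduced for illustration; a real handler would talk to an IPFS node or gateway instead of a dict. It relies on urllib dispatching to a handler method named `<scheme>_open`, and on rdflib reaching `urllib.request.urlopen`, as the traceback above shows for 6.0.1.

```python
import io
import json
import urllib.error
import urllib.request
from email.message import Message
from urllib.response import addinfourl


class InMemoryIPFSHandler(urllib.request.BaseHandler):
    """Hypothetical handler: serves ipfs:// URLs from a dict, not a real IPFS node."""

    def __init__(self, documents):
        self.documents = documents  # maps full URL -> response body (bytes)

    def ipfs_open(self, req):
        # urllib picks this method for the "ipfs" scheme because of its name.
        url = req.full_url
        if url not in self.documents:
            raise urllib.error.URLError(f"no document registered for {url}")
        headers = Message()
        headers["Content-Type"] = "application/ld+json"
        return addinfourl(io.BytesIO(self.documents[url]), headers, url, 200)


def install_ipfs_opener(documents):
    """Install a process-wide opener so urlopen() accepts ipfs:// URLs."""
    opener = urllib.request.build_opener(InMemoryIPFSHandler(documents))
    urllib.request.install_opener(opener)
    return opener


if __name__ == "__main__":
    context = {"@context": {"name": "http://schema.org/name"}}
    install_ipfs_opener({
        "ipfs://foo/bar/baz.json": json.dumps(context).encode("utf-8"),
    })
    with urllib.request.urlopen("ipfs://foo/bar/baz.json") as response:
        print(json.load(response))
```

Note that `install_opener` changes the default opener for the whole process, so every later `urlopen` call goes through it, not only the ones rdflib makes.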

@aucampia added the "marked for closing" label on Mar 25, 2023
@aucampia
Member

I would also be open to pull requests that implement custom handlers, like IPFS, in rdflib._contrib.
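
As a rough idea of what such a handler might look like, here is a sketch that resolves `ipfs://` URLs through an HTTP gateway. Nothing here is existing rdflib code: the class name, the helper function, and the choice of `https://ipfs.io` as the default gateway are all assumptions made for illustration.

```python
import urllib.request
from urllib.parse import urlsplit

# Assumption: a public HTTP gateway; swap in whichever gateway or local node you trust.
DEFAULT_GATEWAY = "https://ipfs.io"


def ipfs_to_gateway_url(url, gateway=DEFAULT_GATEWAY):
    """Rewrite ipfs://<cid>/<path> to <gateway>/ipfs/<cid>/<path>."""
    parts = urlsplit(url)
    if parts.scheme != "ipfs" or not parts.netloc:
        raise ValueError(f"not an ipfs:// URL: {url!r}")
    # netloc (not hostname) keeps the original case, which matters for CIDs.
    return f"{gateway.rstrip('/')}/ipfs/{parts.netloc}{parts.path}"


class IPFSGatewayHandler(urllib.request.BaseHandler):
    """Hypothetical handler that resolves ipfs:// URLs through an HTTP gateway."""

    def __init__(self, gateway=DEFAULT_GATEWAY):
        self.gateway = gateway

    def ipfs_open(self, req):
        # Delegate the rewritten URL to the opener this handler is attached to,
        # so the standard HTTP(S) handler performs the actual request.
        return self.parent.open(
            ipfs_to_gateway_url(req.full_url, self.gateway),
            timeout=req.timeout,
        )
```

It would be registered with `urllib.request.install_opener(urllib.request.build_opener(IPFSGatewayHandler()))`. One caveat of this design: the response reports the gateway URL rather than the original `ipfs://` one, which may matter wherever relative references are resolved against the response URL.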

@anatoly-scherbakov
Contributor Author

Thank you. At the moment I do not have a particular use case for this but I will keep this pointer in mind. Thanks!

@aucampia
Member

Closing this, as we have identified how this can be done using Python's standard library.
