Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not automatically resolving remote references (in 4.18) breaks backwards compatibility #1089

Closed
joooeey opened this issue Apr 18, 2023 · 7 comments
Labels
Bug Something doesn't work the way it should.

Comments

@joooeey
Copy link

joooeey commented Apr 18, 2023

Even the most basic use case of fetching a HTTPS ref fails. I.e. with the simple schema:

{"$ref": "https://geojson.org/schema/Point.json"}

I've tried a few others.

Python 3.11.2 (main, Mar 27 2023, 23:42:44) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import jsonschema
>>> jsonschema.validate(
...     instance={"type": "Point", "coordinates": [0, 1]},
...     schema={"$ref": "https://geojson.org/schema/Point.json"},
... )
Traceback (most recent call last):
  File "/home/lukas/anaconda3/envs/json/lib/python3.11/site-packages/jsonschema/validators.py", line 406, in _validate_reference
    resolved = self._resolver.lookup(ref)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lukas/anaconda3/envs/json/lib/python3.11/site-packages/referencing/_core.py", line 582, in lookup
    raise exceptions.Unresolvable(ref=ref) from None
referencing.exceptions.Unresolvable: https://geojson.org/schema/Point.json

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/lukas/anaconda3/envs/json/lib/python3.11/site-packages/jsonschema/validators.py", line 1263, in validate
    error = exceptions.best_match(validator.iter_errors(instance))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lukas/anaconda3/envs/json/lib/python3.11/site-packages/jsonschema/exceptions.py", line 424, in best_match
    best = next(errors, None)
           ^^^^^^^^^^^^^^^^^^
  File "/home/lukas/anaconda3/envs/json/lib/python3.11/site-packages/jsonschema/validators.py", line 331, in iter_errors
    for error in errors:
  File "/home/lukas/anaconda3/envs/json/lib/python3.11/site-packages/jsonschema/_validators.py", line 284, in ref
    yield from validator._validate_reference(ref=ref, instance=instance)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lukas/anaconda3/envs/json/lib/python3.11/site-packages/jsonschema/validators.py", line 408, in _validate_reference
    raise exceptions._WrappedReferencingError(err)
jsonschema.exceptions._WrappedReferencingError: <exception str() failed>
>>> 

In 4.17.3, the jsonschema.validate(...) call prints nothing (returns None) as expected.

@Julian
Copy link
Member

Julian commented Apr 18, 2023

Oh, right, this is definitely semi-intentional (see the docs) -- as doing so is a security risk in general which has only stuck around this long for backwards compat -- I did certainly forget that if you provide neither a ref resolver nor a registry (for the new API) that this does immediately break that compatibility.

Will have to think about what to do / whether to re-introduce a deprecation period for this case.

Though you should certainly stop doing it!

@Julian Julian added the Bug Something doesn't work the way it should. label Apr 18, 2023
@Julian Julian changed the title RefResolver is not working with remote refs in 4.18a01, 4.18a02, 4.18a03 and 4.18a04 Not automatically resolving remote references (in 4.18) breaks backwards compatibility Apr 18, 2023
@joooeey
Copy link
Author

joooeey commented Apr 18, 2023

I see. I might have a look at this once 4.18.0 is final. For my use case, security risks appear minimal. I would hope that this library itself is protected against JSON-injection attacks?

I am however very concerned about readability and quick editing. Therefore, I might implement a registry that ignores tag: URIs, fetches https: URIs, and caches the files somewhere in the Python package, so that users do not need an internet connection.

@Julian

This comment was marked as off-topic.

@joooeey

This comment was marked as off-topic.

@Julian

This comment was marked as off-topic.

@joooeey

This comment was marked as off-topic.

Julian added a commit that referenced this issue Apr 25, 2023
Changing this without deprecation is backwards incompatible, so we
re-introduce a warning.

This only applies if you have not configured neither a Registry nor a legacy
RefResolver. Both of the former 2 cases already have 'correct' behavior (the
former will not automatically retrieve references and is not backwards
incompatible as it is a new API, and the latter will do so but is already
fully deprecated by this release).

Cloess: #1089
@Julian
Copy link
Member

Julian commented Apr 25, 2023

OK, this has now been re-enabled (with specific deprecation warnings for the behavior even if you weren't configuring RefResolver as in your example).

It will stick around until RefResolver is removed (at which point we'll fall back to referencing.Registry's behavior which is to not do this retrieval).

@Julian Julian closed this as completed Apr 25, 2023
Julian added a commit that referenced this issue Jun 12, 2023
In other words when someone does *not* do Validator(..., registry=...)

This both limits the case which we *do* still perform automatic remote
ref retrieval (in a way that still preserves backwards compatibility
given that the registry argument did not previously exist, so someone
using it must be doing something new) as well as fixes an issue with
creating a validator with a registry that has a retrieve function.

Previously the latter raised a ValueError (coming from referencing which
doesn't want you combining registries unless they agree about retrieval
functions. See the added test(s) for details).

Fixes the issue mentioned here:

#1065 (comment)

Refs: #1089
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something doesn't work the way it should.
Projects
None yet
Development

No branches or pull requests

2 participants