[#3688] Fix some edge cases when parsing JSON Schemas #3937

Merged: 2 commits merged into master from json-schema-edge-cases on Feb 28, 2024

Conversation

@Viicos (Contributor) commented Feb 26, 2024

Part of #3688

  • type can be an array of type names, not only a single string
  • An exception was raised when iterating over a reference that points to a non-object subschema
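
To illustrate the first point: in JSON Schema, type may be either a single string or an array of strings. A minimal sketch of normalizing both forms into a list could look like the following. The helper name normalize_type is hypothetical and is not taken from this PR; only the type_list name appears in the reviewed code.

```python
from typing import Any


def normalize_type(json_schema: dict[str, Any]) -> list[str]:
    """Return the schema's `type` as a list (hypothetical helper, not the PR's code)."""
    type_ = json_schema.get("type", [])
    # A single type name such as "object" becomes a one-element list.
    if isinstance(type_, str):
        return [type_]
    return list(type_)


type_list = normalize_type({"type": ["object", "null"]})
print("object" in type_list)
```

With this shape, a membership test works the same way for both spellings of type, and an absent type simply yields an empty list.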

codecov bot commented Feb 26, 2024

Codecov Report

Attention: Patch coverage is 96.15385%, with 1 line in your changes missing coverage. Please review.

Project coverage is 96.07%. Comparing base (1673fbc) to head (02ac8df).
Report is 24 commits behind head on master.

Files                                                   Patch %   Lines
...s/registrations/contrib/objects_api/json_schema.py   96.15%    0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3937      +/-   ##
==========================================
- Coverage   96.35%   96.07%   -0.29%     
==========================================
  Files         715      715              
  Lines       22415    22437      +22     
  Branches     2574     2575       +1     
==========================================
- Hits        21599    21556      -43     
- Misses        565      626      +61     
- Partials      251      255       +4     


Comment on lines 86 to 87
if "object" in type_list and "properties" in json_schema:
    required = json_schema.get("required", [])
Member

The type can be absent; this is valid:

required:
  - foo
properties:
  foo:
    type: string

and is equivalent to:

required:
  - foo
type: object  # or ['object']
properties:
  foo:
    type: string

@Viicos (Contributor, Author) commented Feb 27, 2024

I'm not sure this is the case. If the type is absent, any data should validate. A thread is available here.

Unfortunately, this isn't explicitly stated in the spec (or I couldn't find it). However, the introduction seems to imply it:

The most basic schema is a blank JSON object, which constrains nothing, allows anything, and describes nothing:
{}
By adding validation keywords to the schema, you can apply constraints to an instance. For example, you can use the type keyword to constrain an instance to an object, array, string, number, boolean, or null.

I'm surprised to see that the Python jsonschema library does not respect this, nor does this online validator, but I guess we'll have to comply with how jsonschema does the validation.

@Viicos (Contributor, Author) commented Feb 27, 2024

Hmm, in fact, apparently not: see OAI/OpenAPI-Specification#1388 (comment).

So to be precise:

required:
  - foo
properties:
  foo:
    type: string

is equivalent to:

required:
  - foo
type: ["object", "array", "number", ...]
properties:
  foo:
    type: string

and, depending on the type of the instance being validated, only the matching constraints (required/properties/minItems/...) apply.
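
The behaviour described above can be sketched with a deliberately tiny validator. This is an illustration under stated assumptions, not the jsonschema library: the validates function and the _TYPE_CHECKS table are hypothetical, and only type, required, and properties are handled. The point it shows is that required and properties constrain object instances only, so a schema without type accepts any non-object value.

```python
from typing import Any

# Only a few type names are supported in this illustrative subset.
_TYPE_CHECKS = {
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
    "string": lambda v: isinstance(v, str),
}


def validates(schema: dict[str, Any], instance: Any) -> bool:
    """Hypothetical, heavily simplified validator (not the jsonschema library)."""
    if "type" in schema:
        types = schema["type"]
        types = [types] if isinstance(types, str) else types
        if not any(_TYPE_CHECKS[t](instance) for t in types):
            return False
    # `required` and `properties` only constrain instances that are objects.
    if isinstance(instance, dict):
        if any(key not in instance for key in schema.get("required", [])):
            return False
        for key, subschema in schema.get("properties", {}).items():
            if key in instance and not validates(subschema, instance[key]):
                return False
    return True


untyped = {"required": ["foo"], "properties": {"foo": {"type": "string"}}}
typed = {"type": "object", **untyped}

print(validates(untyped, "not an object"))  # True: nothing restricts the type
print(validates(typed, "not an object"))    # False: `type` rejects it
print(validates(untyped, {}))               # False: `foo` is required
```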

Member

Ah yeah, I missed some nuance, TIL!

I was using that same JSON schema validator, so I definitely didn't use the spec or a Python library to validate my statement.

I think we should take a pragmatic approach here for our practical means. Maybe it's a good idea to create an issue in the object types API to show some warnings if such schemas get uploaded (missing a type key) because they may unintentionally be too loose.
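
As a rough idea of what such a warning could be based on, the sketch below walks a schema and reports every subschema that declares properties without a type key. The function name iter_untyped_object_schemas and the path format are assumptions made for illustration; nothing here reflects an existing check in the object types API.

```python
from typing import Any, Iterator


def iter_untyped_object_schemas(schema: dict[str, Any], path: str = "#") -> Iterator[str]:
    """Yield JSON-pointer-like paths of subschemas that set `properties` but no `type`."""
    properties = schema.get("properties")
    if not isinstance(properties, dict):
        return
    if "type" not in schema:
        yield path
    # Descend into nested property definitions.
    for name, subschema in properties.items():
        if isinstance(subschema, dict):
            yield from iter_untyped_object_schemas(subschema, f"{path}/properties/{name}")


schema = {
    "properties": {
        "address": {"properties": {"street": {"type": "string"}}},
        "name": {"type": "string"},
    },
}
for location in iter_untyped_object_schemas(schema):
    print(f"warning: {location} has `properties` but no `type`")
```

This sketch only follows properties; a real check would presumably also need to look inside items, references, and combinators.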

@Viicos (Contributor, Author) commented Feb 27, 2024

Yes, I ended up not relying on the type value. As a sanity check, I only assert that it is correctly set to object when it is provided and properties is set.

Comment on lines 95 to 96
case {"type": "object"}:
    yield from _iter_json_schema(v, json_path)
Member

Suggested change
- case {"type": "object"}:
-     yield from _iter_json_schema(v, json_path)
+ case {"type": "object"} | {"properties": dict()}:
+     yield from _iter_json_schema(v, json_path)
+ case {"type": [*types]} if "object" in types:
+     yield from _iter_json_schema(v, json_path)

I think?

This would also need the check that object is in the nested type array.

The pattern matching itself can probably be done in some clever way, or perhaps you would do the matching only on the properties key and add an assert inside the match:

type_ = v.get("type", "object")
assert isinstance(type_, (str, list))
assert type_ == "object" or "object" in type_

@sergei-maertens merged commit f3377de into master on Feb 28, 2024
24 of 26 checks passed
@sergei-maertens deleted the json-schema-edge-cases branch on February 28, 2024 at 16:50