Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RecursionError with new refactored serialization #28741

Closed
2 tasks done
uranusjr opened this issue Jan 5, 2023 · 3 comments
Closed
2 tasks done

RecursionError with new refactored serialization #28741

uranusjr opened this issue Jan 5, 2023 · 3 comments
Assignees
Labels
area:core kind:bug This is a clearly a bug
Milestone

Comments

@uranusjr
Copy link
Member

uranusjr commented Jan 5, 2023

Apache Airflow version

main (development)

What happened

>>> from airflow.utils.json import XComEncoder
>>> XComEncoder().encode(set())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/airflow/airflow/utils/json.py", line 91, in encode
    return super().encode(o)
  File "/usr/local/lib/python3.7/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/lib/python3.7/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/opt/airflow/airflow/utils/json.py", line 82, in default
    return serialize(o)
  File "/opt/airflow/airflow/serialization/serde.py", line 97, in serialize
    if isinstance(o, _primitives):
RecursionError: maximum recursion depth exceeded in __instancecheck__

What you think should happen instead

The encoder should correctly encode a value, or raise TypeError if it is not encodable.

How to reproduce

Simply use XComEncoder to encode anything that’s not encodable by the built-in JSONEncoder (i.e. something that’d go through the custom default hook).

Operating System

Any

Versions of Apache Airflow Providers

Irrelevant

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else

The problem here is because serialize does not guarantee to return a JSON-serializable object. So when default receives an set, it returns a set, and the encoder would try to serialize that set again, calling default, and eventually the stack explodes.

The simplest solution would be to add an additional flag to serialize so it can return a JSON-compatible type if requested.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@uranusjr uranusjr added kind:bug This is a clearly a bug area:core labels Jan 5, 2023
@uranusjr uranusjr self-assigned this Jan 5, 2023
@uranusjr
Copy link
Member Author

uranusjr commented Jan 5, 2023

Problematic code introduced in #28067.

@potiuk
Copy link
Member

potiuk commented Jan 17, 2023

cc: @bolkedebruin

@potiuk potiuk added this to the Airflow 2.6.0 milestone Jan 17, 2023
@bolkedebruin
Copy link
Contributor

This is not an issue anymore

from airflow.utils.json import XComEncoder
XComEncoder().encode(set())

'{"__classname__": "builtins.set", "__version__": 1, "__data__": []}'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:core kind:bug This is a clearly a bug
Projects
None yet
4 participants