Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom deprecation proxy implementing __format__ #19734

Closed
wants to merge 1 commit into from

Conversation

uranusjr
Copy link
Member

Fix #19716.

@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Nov 21, 2021
@uranusjr
Copy link
Member Author

Hmm, we need to jump through some hoops serialising this into virtualenv.

@potiuk
Copy link
Member

potiuk commented Nov 21, 2021

Idea @uranusjr: seems that all those "deprecated" objects are datetime objects. Why don't we simply create a "deprecated datetime" which extends directly from datetime and raise deprecation warning at use?

We could override the class instance type to pass all the checks for isinstance and issubclass (obviously). This is less "generic" than using lazy_proxy, but it should behave much more "backwards-compatible" and serialization problems would be gone.

We do not really use the "lazyness" of the proxy, this is perfectly fine to instantiate those objects at creation time (they will probably take less memory than the proxy anyway).

@potiuk
Copy link
Member

potiuk commented Nov 21, 2021

The only problem is to generate the "warning" at first use, but I bet this could be easily implemented by a custom get_attr() method that could keep a boolean flag whether warning was emited or not from this particular object. Sounds easy.

@uranusjr
Copy link
Member Author

That wouldn’t help the serialisation issue though because those custom datetimes are still not unpicklable without Airflow being available in the virtual environment.

@potiuk
Copy link
Member

potiuk commented Nov 21, 2021

That wouldn’t help the serialisation issue though because those custom datetimes are still not unpicklable without Airflow being available in the virtual environment.

If we serialize it as a plain datetime it would actually.

The only reason we need a "proxy" are the warnings on use. but serializing/deserializing as plain datetime is exactly what we want IMHO. It could also be done with the custom proxy - but the more 'custom' code we add to the proxy, the less useful using the `lazy proxy' is.

@uranusjr
Copy link
Member Author

If we serialize it as a plain datetime it would actually.

How? We can’t just serialise that plain datetime object into the virtual environment, because that’s make it impossible to show the lazy deprecation warning to the user. We must serialise something custom. But doing that would cause deserialisation issues because code inside the virtual environment does not have access to that custom object’s definition.

@potiuk
Copy link
Member

potiuk commented Nov 22, 2021

How? We can’t just serialise that plain datetime object into the virtual environment, because that’s make it impossible to show the lazy deprecation warning to the user. We must serialise something custom. But doing that would cause deserialisation issues because code inside the virtual environment does not have access to that custom object’s definition.

We don't have to log the warning while in the venv. The warning will be logged while serializing - we do not have to addtionally log the warning while the task executes.

@uranusjr
Copy link
Member Author

Serialisation does not know exactly what variables will be used by the user inside the environment. We can’t just log everything because it’d emit a ton of superfulous warnings for all those deprecated variables even if the user never uses them.

@potiuk
Copy link
Member

potiuk commented Nov 22, 2021

Serialisation does not know exactly what variables will be used by the user inside the environment. We can’t just log everything because it’d emit a ton of superfulous warnings for all those deprecated variables even if the user never uses them.

Oh yeah, indeed. Good point.

I think the reasonable approach will be then to employ custom serialization step and add a small package (locally created dynamically so that we do not have to add it to PyPI) that will automatically be added while venv is created. I think will have to do the same for the @task.docker decorator as well.

@ashb
Copy link
Member

ashb commented Nov 22, 2021

Instead of a custom proxy object for some values could we not make a custom context dict class that returns lazily evaluates and returns actual objects instead?

Cos there was also another problem where previois_executiob_date is None was never true (because the lazy object proxy can't catch this case) - if there is this there could be more bugs.

@potiuk
Copy link
Member

potiuk commented Nov 22, 2021

Instead of a custom proxy object for some values could we not make a custom context dict class that returns lazily evaluates and returns actual objects instead?

Yeah. Agree - if we could make it works across seralization in virtualenv/docker cases that woudl be best apprach.

@uranusjr
Copy link
Member Author

But that custom context object will still be iterated through too eagerly when being serialised for the virtual environment and wouldn’t solve the actual problem at hand. But maybe a combination of the two approaches would work best: a custom context object that emits deprecation warnings on access but returns plain built-in objects, and a tiny module injected into the virtual environment to support deserialisation of that custom context object.

@uranusjr uranusjr deleted the context-proxy-format branch April 15, 2022 06:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
full tests needed We need to run full set of tests for this PR to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Airflow 2.2.2] execution_date Proxy object - str formatting error
3 participants