AIP-58: Add object storage backend for xcom #37058
This adds the possibility to store XComs on an object-storage-supported backend.
Made a first pass
It looks good! I'm just wondering if we should move the introduced XCom backend and its configurations to the common.io provider 🤔
Some comments. Looks good overall. Not tested by running though, was just reading.
@hussein-awala as mentioned in the topic start, we can go one of two ways:
If we choose 2, then it makes sense to move it into common.io (with all the extras that requires, because it is a 'new' capability for a provider). As it is backwards compatible and by default does the same as BaseXCom (storing data in the db), option 1 might make more sense. Then it should not go in common.io. For now (and to avoid all the work that comes with a provider adjustment) I have assumed option 1 down the line.
I think I'd support @hussein-awala's idea here. There is a great feature which is non-obvious when things are added in a provider, namely dependency independence and the ability to upgrade/downgrade separately from Airflow. And it's not much different from what we already have with the secret backends, which are also implemented in providers, so this one would just follow suit. Binding such "auxiliary" functionality into Airflow core is also against our "Airflow-as-a-platform" approach. Without my suggestions applied (where I considered adding a DB change to make it clearer what is local and what is remote), in my view this one simply builds on top of the Public API that we already have, and there is no particular reason it should be in "core" Airflow.
Minor comment.
(@bolkedebruin, moving this back to 2.9 as we won't ship this in a core patch release)
@jedcunningham it was agreed to have this in 2.8.2 as we consider this a bug fix more than a feature release. What has changed? Edit: I'm okay with targeting 2.9; it was just in this thread that we decided to use 2.8.2. However, in my code for the provider I'm checking against 2.9, so it's fine with me, just unexpected.
Sorry, that decision was in a collapsed discussion I didn't expand :) It doesn't "feel" like a bugfix to me, but I'm happy to step back on this. @ephraimbuddy is the rm ultimately.
O yes, the "expansion bug" ;-). I'm fine either way. Targeting 2.8.2 will require a code change.
@ephraimbuddy @bolkedebruin - I want to promote this PR - just confirming that it's going out in 2.8.2 for real? I agree with @jedcunningham that this feels like a new feature to me, but I can never object to people getting something early.
Side comment: We can sneakily pretend it's a bug-fix (as if it should have behaved this way already, we just had not realized that). We had a lot of discussion on semver and classifying things, and I am always of the position that SemVer is about the "intent", not about technically breaking things or technically adding new methods and fields, and that we can make individual decisions to classify some features or changes differently (mostly based on our assessment of how likely it is that things will get broken, or how much this new feature is really "new"). I think people take "compatibility" far too seriously and far too "technically" :).
Just changing >= 2.9.0, right? That's the only thing to change as far as I understand, @bolkedebruin?
@potiuk the code assumes >= 2.9.0 now, so no code change is required for targeting 2.9.0 from my point of view. Ok, missed part of the thread: if targeting 2.8.2 we need to change the code in the provider to check for 2.8.2+: https://github.com/apache/airflow/blob/main/airflow/providers/common/io/xcom/__init__.py#L26 If targeting 2.9.0 no change is required. I'm okay either way. As we have marked Object Storage experimental, and adding the API call to XCom can really be considered a bug fix that has no effect on existing installations, I think 2.8.2 is fine.
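For anyone following along, the kind of minimum-version gate being discussed could look roughly like the sketch below. This is not the code at the linked line; `release_tuple` and `supports_object_storage_xcom` are hypothetical names, and ignoring pre-release suffixes such as `.dev0` is an assumption made for the illustration.

```python
import re


def release_tuple(version: str) -> tuple:
    """Parse the leading 'major.minor.patch' part of a version string.

    Assumption: suffixes such as '.dev0' or 'rc1' are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Cannot parse version: {version!r}")
    return tuple(int(part) for part in match.groups())


def supports_object_storage_xcom(installed: str, minimum: str = "2.9.0") -> bool:
    """Hypothetical gate: True if the installed release is at least `minimum`."""
    return release_tuple(installed) >= release_tuple(minimum)


# Retargeting the release only means changing the `minimum` argument.
print(supports_object_storage_xcom("2.8.2"))           # gate at 2.9.0
print(supports_object_storage_xcom("2.8.2", "2.8.2"))  # gate at 2.8.2
```

Comparing integer tuples rather than raw strings matters here: as strings, "2.10.0" would sort before "2.9.0".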
Ok. Thought a bit and I propose we take it slower (reassigned it to 2.9.0). I think we have some other, more important issues to solve for 2.8.*, and it was a bit "bending" the rules to put it at 2.8.2. Also, the likely stability fixes to be implemented as a result of fsspec/universal_pathlib#173 and the upcoming 0.2 release of universal pathlib (also see #37311) are quite a bit more important to handle.
Fine with me!
This is an initial implementation and I am open to feedback (duh ;-) ). The BaseXCom is a bit messy and required some bug fixes. Hence, I have currently implemented the backend separately from BaseXCom, although functionally it is quite easy to merge both.
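To make the idea concrete for readers, here is a minimal, self-contained sketch of offloading a value to object storage and keeping only a reference in the database. It does not use Airflow and is not the implementation in this PR: the in-memory `OBJECT_STORE`, the `REF_PREFIX` marker, and the `threshold` parameter are all assumptions made for illustration.

```python
import json
import uuid
from typing import Optional

# Hypothetical stand-in for an object store (e.g. a bucket); not an Airflow API.
OBJECT_STORE = {}

# Hypothetical marker: a stored string with this prefix points at an object.
# Caveat of this sketch: a user string starting with the prefix would be misread.
REF_PREFIX = "objectstore://"


def serialize_value(value, threshold: Optional[int] = None) -> str:
    """Return the string that would be written to the XCom database row."""
    payload = json.dumps(value)
    # Without a threshold, behave like a plain database-backed XCom.
    if threshold is None or len(payload.encode()) < threshold:
        return payload
    key = f"xcom/{uuid.uuid4().hex}.json"
    OBJECT_STORE[key] = payload.encode()
    # Only the reference goes into the database.
    return json.dumps(REF_PREFIX + key)


def deserialize_value(stored: str):
    """Resolve a database value, fetching from the object store if needed."""
    value = json.loads(stored)
    if isinstance(value, str) and value.startswith(REF_PREFIX):
        return json.loads(OBJECT_STORE[value[len(REF_PREFIX):]].decode())
    return value
```

With no threshold set, every value stays in the database, which mirrors the backwards-compatible default described in this thread.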
Notes:
get_one
purge: function that is called when deleting / clearing XComs so a custom backend can purge the XCom when required. This has also been reported as a bug: "XCom delete in ui and db clean do not trigger delete for custom backend" #31774
common.io: but given that I'd like to merge it with the standard BaseXCom I thought to put it where it is now. Can be discussed of course.
@potiuk @dstandish @hussein-awala @uranusjr
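The purge note above can be illustrated with a small sketch: the code path that deletes XCom rows calls a backend hook first, so a backend that stores data elsewhere can clean up after itself. Everything here (`XComRow`, `DatabaseOnlyBackend`, `ObjectStoreBackend`, `clear_xcoms`, `REMOTE_OBJECTS`) is a hypothetical stand-in, not the Airflow API or the code in this PR.

```python
from dataclasses import dataclass

# Hypothetical object store holding one offloaded XCom value.
REMOTE_OBJECTS = {"xcom/run1/task_a.json": b"[1, 2, 3]"}


@dataclass
class XComRow:
    key: str
    value: str  # for offloaded values, the object store key


class DatabaseOnlyBackend:
    @staticmethod
    def purge(row: XComRow) -> None:
        """Nothing to clean up: the value lives in the row itself."""


class ObjectStoreBackend(DatabaseOnlyBackend):
    @staticmethod
    def purge(row: XComRow) -> None:
        """Remove the remote object that the row points at, if any."""
        REMOTE_OBJECTS.pop(row.value, None)


def clear_xcoms(rows: list, backend) -> list:
    """Call the backend hook for every row, then drop the rows."""
    for row in rows:
        backend.purge(row)
    return []  # the rows themselves are deleted afterwards
```

Without such a hook, deleting the database rows would leave the remote objects orphaned, which is the behaviour reported in #31774.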