-
Notifications
You must be signed in to change notification settings - Fork 13.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AIRFLOW-4374] Make enum-like-classes inherit from enum #5302
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5302 +/- ##
==========================================
- Coverage 79.03% 79.02% -0.01%
==========================================
Files 481 481
Lines 30201 30206 +5
==========================================
+ Hits 23868 23869 +1
- Misses 6333 6337 +4
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General concern (I understand it was a deliberate choice after as described in the PR description)
I think this might be a super-disruptive change - pretty much all DAGs with non-default triggering rules have to be changed to work with this change,. I think it might be not enough to mention that in the updating but possibly also to keep backwards-compatibility for a foreseeable future. There are likely thousands of DAGs in production for many companies and that single change might hold those companies from migrating to 2.0 (even though it's fairly easy to have a "global" change, I am sure many of those companies will prefer to do a gradual migration).
Maybe we should consider backwards-compatibility by converting the strings to Enums and writing "deprecated" warning rather than forbidding the strings? It will not complicate the code too much and it is much nicer approach for anyone trying to manage many DAGs in their production systems and be able to migrate to Airflow 2.0.
Just a thought that crossed my mind - I think we should at least discuss it and make sure that we accept consequences of this change.
Yes I realise this potentially introduces many breaking DAGs, therefore it should be merged into Airflow 2.0 at the least. My main reason for this change is there are so many different ways of writing Airflow code, it can be quite confusing to users (in my experience), therefore if we provide only a single possible way of providing TriggerRules and WeightRules, it should unify and improve user experience. As an intermediate, in the BaseOperator init we could do something like this, to continue support for strings: if trigger_rule not in TriggerRule:
try:
# trigger rule values such as "all_done" will be converted to the TriggerRule enum
trigger_rule = TriggerRule(trigger_rule)
# show deprecation warning here
except ValueError:
# trigger_rule object is neither TriggerRule enum or string known to TriggerRule
raise AirflowException(...) What do you think? |
Yeah. I think that could be much more friendly. Maybe I'd rather check for the type of the value, and add Type annotation Union[TriggerRule, string] to indicate that we support both for now. On the other hand, the question is when such deprecation warning should turn into error - if not now with 2.0.0. So I am a bit on the fence here (though leaning towards backwards-compatibility). Maybe others can also chime-in here with their thoughts and experience how painful it will be for the users to convert ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about making State an enum too?
DOWNSTREAM = 'downstream' | ||
UPSTREAM = 'upstream' | ||
ABSOLUTE = 'absolute' | ||
|
||
_ALL_WEIGHT_RULES = set() # type: Set[str] | ||
|
||
@classmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this now a static method? (Or use for wr in cls
?)
@@ -737,28 +737,28 @@ dependency settings. | |||
|
|||
All operators have a ``trigger_rule`` argument which defines the rule by which | |||
the generated task get triggered. The default value for ``trigger_rule`` is | |||
``all_success`` and can be defined as "trigger this task when all directly | |||
``TriggerRule.ALL_SUCCESS`` and can be defined as "trigger this task when all directly | |||
upstream tasks have succeeded". All other rules described here are based | |||
on direct parent tasks and are values that can be passed to any operator | |||
while creating tasks: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably need to mention in this doc that these aren't strings but constants/enums from module X.
self.assertTrue(TriggerRule.ONE_FAILED in TriggerRule) | ||
self.assertTrue(TriggerRule.NONE_FAILED in TriggerRule) | ||
self.assertTrue(TriggerRule.NONE_SKIPPED in TriggerRule) | ||
self.assertTrue(TriggerRule.DUMMY in TriggerRule) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests seem unnecessary now - it seems impossible they would ever fail.
warnings.warn( | ||
"Passing TriggerRule as a string is deprecated. " | ||
"Instead the TriggerRule enum should be used: {tr}.".format(tr=trigger_rule), | ||
category=DeprecationWarning, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
category=DeprecationWarning, | |
category=DeprecationWarning, | |
stacklevel=2, |
That way the line number will be reported from a useful line number (i.e. the DAG file), not this file.
@@ -24,6 +24,21 @@ assists users migrating to a new version. | |||
|
|||
## Airflow Master | |||
|
|||
### TriggerRules and WeightRules only by enum members, not by string value | |||
Passing a string value to `trigger_rule` and `weight_rule` in operators has been removed and only the enum member can be provided. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't true (anymore?) - it's just deprecated but not removed.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
not stale |
@BasPH is this something we want to get into 2.0 since its breaking change? |
@BasPH @ashb What's the status of this? We should either close or choose a release to target (2.2 or 3.0). @andrewgodwin IIRC did you work on something similar? |
Yes, my related PR is #15285 - different classes though it looks like? |
Note: breaking change!
This PR creates enum classes for
TriggerRule
andWeightRule
. Enums simplify the code base and improve in terms of memory usage & performance.I had to make a breaking change choice:
In the BaseOperator we currently support passing both a string (e.g.
"all_done"
) and enum member (e.g.TriggerRule.ALL_DONE
). When using enums, you cannot pass the string value without doing some try except, e.g (in BaseOperator.__init__):I suggest to remove support for passing the string value, this was only used in the qubole example operator and that way we can unify the way for providing trigger_rules and weight_rules. This helps with typing and prevents errors while developing since the IDE can hint enum members. So:
Make sure you have checked all steps below.
Jira
Description
Python 3.4 introduces the Enum type, which simplifies the WeightRule and TriggerRule class a lot.
is_valid
methods were removed because selecting an invalid Enum member will raise an AttributeError.Tests
Commits
Documentation
Code Quality
flake8