[AIRFLOW-4374] Make enum-like-classes inherit from enum #5302

Closed

BasPH
Contributor

@BasPH BasPH commented May 18, 2019

Note: breaking change!

This PR creates enum classes for TriggerRule and WeightRule. Enums simplify the code base and improve memory usage and performance.

I had to make a choice that results in a breaking change:
In the BaseOperator we currently support passing both a string (e.g. "all_done") and an enum member (e.g. TriggerRule.ALL_DONE). With enums, a string value can only be accepted by wrapping the conversion in a try/except, e.g. (in BaseOperator.__init__):

trigger_rule = "all_done"
if trigger_rule not in TriggerRule:
    try:
        trigger_rule = TriggerRule(trigger_rule)
    except ValueError:
        raise AirflowException(
            "The trigger_rule must be one of {all_triggers}, "
            "'{d}.{t}'; received '{tr}'."
            .format(all_triggers=TriggerRule.all_triggers(), d=dag.dag_id if dag else "", t=task_id, tr=trigger_rule))

I suggest removing support for passing the string value. It was only used in the qubole example operator, and removing it gives us a single way of providing trigger_rules and weight_rules. This helps with typing and prevents errors while developing, since the IDE can hint enum members. So:

# Not possible anymore
task = DummyOperator(... trigger_rule="all_done" ...)
task = DummyOperator(... weight_rule="downstream" ...)

# Only valid way of setting trigger_rule/weight_rule
task = DummyOperator(... trigger_rule=TriggerRule.ALL_DONE ...)
task = DummyOperator(... weight_rule=WeightRule.DOWNSTREAM ...)
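
For context, here is a minimal sketch of what enum-based rule classes could look like and why plain strings stop working. The member lists are abbreviated to names that appear in this conversation, and the actual definitions in airflow/utils/trigger_rule.py and airflow/utils/weight_rule.py may differ.

```python
from enum import Enum


class TriggerRule(Enum):
    # Abbreviated: only some of the members mentioned in this conversation.
    ALL_SUCCESS = "all_success"
    ALL_DONE = "all_done"
    ONE_FAILED = "one_failed"


class WeightRule(Enum):
    DOWNSTREAM = "downstream"
    UPSTREAM = "upstream"
    ABSOLUTE = "absolute"


rule = TriggerRule.ALL_DONE
# Members are singletons, so identity comparison works.
assert rule is TriggerRule.ALL_DONE
# A plain Enum member does not compare equal to its string value,
# which is why code comparing against "all_done" would break.
assert rule != "all_done"
# The underlying string is still available through .value.
assert rule.value == "all_done"
```

Any code that compares a rule against a string literal would need to compare against the member (or its `.value`) instead.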

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"
    • https://issues.apache.org/jira/browse/AIRFLOW-4374
    • In case you are fixing a typo in the documentation you can prepend your commit with [AIRFLOW-XXX], code changes always need a Jira issue.
    • In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal (AIP).
    • In case you are adding a dependency, check if the license complies with the ASF 3rd Party License Policy.

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Python 3.4 introduced the Enum type, which simplifies the WeightRule and TriggerRule classes a lot. The is_valid methods were removed because selecting an invalid Enum member raises an AttributeError.
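
As a small illustration of the two failure modes, assuming a stand-in `WeightRule` enum rather than the real Airflow class: accessing a member name that does not exist raises AttributeError, while looking up a member by an unknown value raises ValueError. The helper `lookup_errors` is hypothetical and only exists to make the behaviour visible.

```python
from enum import Enum


class WeightRule(Enum):
    DOWNSTREAM = "downstream"
    UPSTREAM = "upstream"
    ABSOLUTE = "absolute"


def lookup_errors():
    """Return the exception type names raised by the two kinds of invalid lookup."""
    errors = []
    try:
        WeightRule.SIDEWAYS  # no member with this name
    except AttributeError as exc:
        errors.append(type(exc).__name__)
    try:
        WeightRule("sideways")  # no member with this value
    except ValueError as exc:
        errors.append(type(exc).__name__)
    return errors
```

The second case is the one that matters when a string such as "all_done" arrives from user code and has to be converted.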

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to an appropriate release

Code Quality

  • Passes flake8

@BasPH BasPH changed the title [AIRFLOW-4374] Make enum-like-classes inherit from enum [WIP][AIRFLOW-4374] Make enum-like-classes inherit from enum May 18, 2019
@codecov-io

codecov-io commented May 19, 2019

Codecov Report

Merging #5302 into master will decrease coverage by <.01%.
The diff coverage is 89.65%.

@@            Coverage Diff             @@
##           master    #5302      +/-   ##
==========================================
- Coverage   79.03%   79.02%   -0.01%     
==========================================
  Files         481      481              
  Lines       30201    30206       +5     
==========================================
+ Hits        23868    23869       +1     
- Misses       6333     6337       +4
Impacted Files                                          Coverage Δ
...ow/contrib/example_dags/example_qubole_operator.py   0% <0%> (ø) ⬆️
airflow/example_dags/example_skip_dag.py                95.23% <100%> (+0.23%) ⬆️
airflow/utils/weight_rule.py                            100% <100%> (ø) ⬆️
airflow/example_dags/example_branch_operator.py         100% <100%> (ø) ⬆️
airflow/utils/trigger_rule.py                           100% <100%> (ø) ⬆️
airflow/models/baseoperator.py                          93.56% <84.61%> (-0.41%) ⬇️
airflow/contrib/operators/ssh_operator.py               82.27% <0%> (-1.27%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 827d6d4...359e967. Read the comment docs.

Member

@potiuk potiuk left a comment

General concern (I understand it was a deliberate choice, as described in the PR description)

I think this might be a super-disruptive change: pretty much all DAGs with non-default trigger rules have to be changed to work with it. I think it might not be enough to mention that in the updating notes; we should possibly also keep backwards compatibility for the foreseeable future. There are likely thousands of DAGs in production at many companies, and that single change might hold those companies back from migrating to 2.0 (even though it's fairly easy to make a "global" change, I am sure many of those companies will prefer a gradual migration).

Maybe we should consider backwards compatibility by converting the strings to Enums and emitting a "deprecated" warning rather than forbidding the strings? It will not complicate the code too much, and it is a much nicer approach for anyone who manages many DAGs in production and wants to be able to migrate to Airflow 2.0.

Just a thought that crossed my mind - I think we should at least discuss it and make sure that we accept consequences of this change.

@BasPH
Contributor Author

BasPH commented May 19, 2019

Yes, I realise this potentially breaks many DAGs, so it should not be merged before Airflow 2.0.

My main reason for this change is that there are so many different ways of writing Airflow code that it can be quite confusing to users (in my experience). If we provide only a single way of specifying TriggerRules and WeightRules, it should unify and improve the user experience.

As an intermediate step, in the BaseOperator init we could do something like this to continue supporting strings:

if trigger_rule not in TriggerRule:
    try:
        # trigger rule values such as "all_done" will be converted to the TriggerRule enum
        trigger_rule = TriggerRule(trigger_rule)
        # show deprecation warning here
    except ValueError:
        # trigger_rule is neither a TriggerRule member nor a string known to TriggerRule
        raise AirflowException(...)

What do you think?

@potiuk
Member

potiuk commented May 19, 2019

Yeah. I think that could be much more friendly. Maybe I'd rather check the type of the value, and add the type annotation Union[TriggerRule, str] to indicate that we support both for now.
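
A rough sketch of the type-check variant, under some assumptions: `coerce_trigger_rule` is a hypothetical helper (the PR does this inline in BaseOperator.__init__), the enum is abbreviated, and a plain ValueError stands in for AirflowException so the snippet runs on its own. One practical reason to prefer `isinstance` is that an `in` check against the enum class with a plain string raises TypeError on Python 3.8 through 3.11.

```python
import warnings
from enum import Enum
from typing import Union


class TriggerRule(Enum):
    # Abbreviated stand-in for the real enum.
    ALL_SUCCESS = "all_success"
    ALL_DONE = "all_done"


def coerce_trigger_rule(trigger_rule: Union[TriggerRule, str]) -> TriggerRule:
    """Hypothetical helper: accept an enum member or its legacy string value."""
    if isinstance(trigger_rule, TriggerRule):
        return trigger_rule
    try:
        converted = TriggerRule(trigger_rule)
    except ValueError:
        valid = ", ".join(repr(tr.value) for tr in TriggerRule)
        # The PR raises AirflowException; ValueError keeps this sketch standalone.
        raise ValueError(
            "The trigger_rule must be one of {}; received {!r}.".format(valid, trigger_rule)
        ) from None
    warnings.warn(
        "Passing trigger_rule as a string is deprecated; use {} instead.".format(converted),
        category=DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller of this helper
    )
    return converted
```

With this shape, `coerce_trigger_rule("all_done")` returns `TriggerRule.ALL_DONE` and emits a DeprecationWarning, while passing the member itself is silent.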

On the other hand, the question is when such a deprecation warning should turn into an error, if not now with 2.0.0. So I am a bit on the fence here (though leaning towards backwards compatibility).

Maybe others can also chime in here with their thoughts and experience on how painful it will be for users to convert?

@BasPH BasPH changed the title [WIP][AIRFLOW-4374] Make enum-like-classes inherit from enum [AIRFLOW-4374] Make enum-like-classes inherit from enum May 19, 2019
Member

@ashb ashb left a comment

What about making State an enum too?

DOWNSTREAM = 'downstream'
UPSTREAM = 'upstream'
ABSOLUTE = 'absolute'

_ALL_WEIGHT_RULES = set() # type: Set[str]

@classmethod
Member

Isn't this now a static method? (Or use for wr in cls?)
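
A sketch of the `for wr in cls` option, assuming the method is named `all_weight_rules` and returns the string values (the actual name and return type in the PR may differ). Iterating over an Enum class yields its members, so the separate cached set is not needed.

```python
from enum import Enum
from typing import Set


class WeightRule(Enum):
    DOWNSTREAM = "downstream"
    UPSTREAM = "upstream"
    ABSOLUTE = "absolute"

    @classmethod
    def all_weight_rules(cls) -> Set[str]:
        # Iterating an Enum class yields its members in definition order.
        return {wr.value for wr in cls}
```

Keeping it a classmethod that uses `cls` avoids hard-coding the class name inside the method body.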

@@ -737,28 +737,28 @@ dependency settings.

All operators have a ``trigger_rule`` argument which defines the rule by which
the generated task get triggered. The default value for ``trigger_rule`` is
-``all_success`` and can be defined as "trigger this task when all directly
+``TriggerRule.ALL_SUCCESS`` and can be defined as "trigger this task when all directly
upstream tasks have succeeded". All other rules described here are based
on direct parent tasks and are values that can be passed to any operator
while creating tasks:
Member

We probably need to mention in this doc that these aren't strings but constants/enums from module X.

self.assertTrue(TriggerRule.ONE_FAILED in TriggerRule)
self.assertTrue(TriggerRule.NONE_FAILED in TriggerRule)
self.assertTrue(TriggerRule.NONE_SKIPPED in TriggerRule)
self.assertTrue(TriggerRule.DUMMY in TriggerRule)
Member

These tests seem unnecessary now - it seems impossible they would ever fail.

warnings.warn(
"Passing TriggerRule as a string is deprecated. "
"Instead the TriggerRule enum should be used: {tr}.".format(tr=trigger_rule),
category=DeprecationWarning,
Member

Suggested change
- category=DeprecationWarning,
+ category=DeprecationWarning,
+ stacklevel=2,

That way the warning will be reported against a useful line (i.e. the DAG file), not this file.
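
To make the effect of `stacklevel` concrete, here is a small standalone sketch; the function names are hypothetical and not part of Airflow. With the default, the warning is attributed to the line that calls `warnings.warn`; with `stacklevel=2` it is attributed to the line in the calling function.

```python
import warnings


def warn_default():
    # Attributed to this line, inside the "library" function.
    warnings.warn("deprecated", category=DeprecationWarning)


def warn_caller():
    # stacklevel=2 attributes the warning to whoever called this function.
    warnings.warn("deprecated", category=DeprecationWarning, stacklevel=2)


def reported_lines():
    """Return the line number recorded for each of the two warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        warn_default()
        warn_caller()
    return [w.lineno for w in caught]
```

The first recorded line falls inside `warn_default`, the second inside `reported_lines`, which plays the role of the user's DAG file here.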

@@ -24,6 +24,21 @@ assists users migrating to a new version.

## Airflow Master

### TriggerRules and WeightRules only by enum members, not by string value
Passing a string value to `trigger_rule` and `weight_rule` in operators has been removed and only the enum member can be provided.
Member

This isn't true (anymore?) - it's just deprecated but not removed.

@stale

stale bot commented Sep 3, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Sep 3, 2019
@BasPH
Contributor Author

BasPH commented Sep 6, 2019

not stale

@stale stale bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Sep 6, 2019
@potiuk potiuk added the pinned Protect from Stalebot auto closing label Sep 6, 2019
@ryw
Member

ryw commented Sep 2, 2020

@BasPH is this something we want to get into 2.0 since it's a breaking change?

@ryw ryw removed the area:docs label Oct 27, 2020
@jhtimmins
Contributor

@BasPH @ashb What's the status of this? We should either close or choose a release to target (2.2 or 3.0).

@andrewgodwin IIRC did you work on something similar?

@andrewgodwin
Contributor

andrewgodwin commented Jun 18, 2021

Yes, my related PR is #15285 - different classes though it looks like?

@potiuk potiuk closed this Sep 18, 2021
potiuk pushed a commit that referenced this pull request Feb 20, 2022
closes: #19905
related: #5302,#18627

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>

Labels
pinned Protect from Stalebot auto closing

8 participants