Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#8525 Add SQL Branch Operator #8942

Merged
merged 9 commits into from
Jun 1, 2020

Conversation

samuelkhtu
Copy link
Contributor

SQL Branch Operator allow user to execute a SQL query in any supported backend to decide which
branch of the DAG to follow.

The SQL branch operator expects SQL query to return any of the following:

  • Boolean: True/False
  • Integer: 0/1
  • String: true/y/yes/1/on/false/n/no/0/off

Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@boring-cyborg
Copy link

boring-cyborg bot commented May 21, 2020

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, pylint and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://apache-airflow-slack.herokuapp.com/

@samuelkhtu samuelkhtu changed the title Add SQL Branch Operator #8525 #8525 Add SQL Branch Operator May 21, 2020
Copy link
Contributor

@eladkal eladkal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is very useful!

Tested several scenarios with Presto. Works as expected.
I think it's worth also changing the background color. It's very hard to read black text with bold red background
Screen Shot 2020-05-21 at 9 59 50
Screen Shot 2020-05-21 at 10 44 40

airflow/operators/sql_branch_operator.py Outdated Show resolved Hide resolved
airflow/operators/sql_branch_operator.py Outdated Show resolved Hide resolved
airflow/operators/sql_branch_operator.py Outdated Show resolved Hide resolved
@samuelkhtu
Copy link
Contributor Author

Hi @eladkal , the PR is ready to review again. Please take a look when you have a chance. Thanks!

tests/operators/test_sql_branch_operator.py Outdated Show resolved Hide resolved
airflow/operators/sql_branch_operator.py Outdated Show resolved Hide resolved
Copy link
Contributor

@eladkal eladkal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM left few minor comments

tested with Presto db
Screen Shot 2020-05-24 at 11 50 51

airflow/operators/sql_branch_operator.py Outdated Show resolved Hide resolved
@samuelkhtu
Copy link
Contributor Author

samuelkhtu commented May 24, 2020

Hi @eladkal , Thank you again for the review. The latest commit included changes for the last set of comments. Please take a look when you have a chance. Thanks.

Copy link
Contributor

@eladkal eladkal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

One last observation about the file name sql_branch_operator.py. By accepting this PR Airflow will have 2 sql related operator files in the core (check_operator.py, sql_branch_operator.py) so maybe the file name should be changed sql_branch_operator.py -> sql.py so that In the future check_operator.py can be deprecated by move the operators into sql.py Similar to the deprecation of python_operator.py by moving classes to python.py
Lets wait to see what others think of that.

@samuelkhtu
Copy link
Contributor Author

LGTM.

One last observation about the file name sql_branch_operator.py. By accepting this PR Airflow will have 2 sql related operator files in the core (check_operator.py, sql_branch_operator.py) so maybe the file name should be changed sql_branch_operator.py -> sql.py so that In the future check_operator.py can be deprecated by move the operators into sql.py Similar to the deprecation of python_operator.py by moving classes to python.py
Lets wait to see what others think of that.

Thanks @eladkal. If everyone decided to merge the operators into one, shouldn't we use a separate PR instead? I am new to the community, so I am not sure what is the next step? Is there anyone I need to talk to in order to finish this PR? Thanks.

@samuelkhtu samuelkhtu mentioned this pull request May 26, 2020
@mik-laj
Copy link
Member

mik-laj commented May 27, 2020

@samuelkhtu we can do it in a separate PR. This will allow us to build a better git history.

@samuelkhtu
Copy link
Contributor Author

@samuelkhtu we can do it in a separate PR. This will allow us to build a better git history.

Thanks @mik-laj and @eladkal , can someone help and approve this PR? I am ready to move on to the next issue. Thanks.

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really great and sounds super useful - just two nits -if you can take a look

airflow/operators/sql_branch_operator.py Outdated Show resolved Hide resolved
airflow/operators/sql_branch_operator.py Show resolved Hide resolved
Samuel Tu and others added 9 commits June 1, 2020 12:14
SQL Branch Operator allow user to execute a SQL query in any supported backend to decide which
branch to follow. The SQL branch operator expect query to return True/False (Boolean) or
0/1 (Integer) or true/y/yes/1/on/false/n/no/0/off (String).
1. Modify foreground color to be more visible.
2. Replace hook.get_records with hook.get_first as SQL branch operator expects single row return.
3. Modify follow_task_ids_if_true/false parameters with List[str] type hint instead of str.
1. Add tuple and list return data type to SQL branch operator.
2. Add unit test for single return value
1. Query get_first should return row['column'] instead.
2. Fixed unit test to return list of result instead of list of list of result.
1. Remove unused import from unit test file.
2. Add check to verify required SQL paramter.
3. Remove blank line between param and parm type docstring.
@potiuk potiuk merged commit 55b9b8f into apache:master Jun 1, 2020
@boring-cyborg
Copy link

boring-cyborg bot commented Jun 1, 2020

Awesome work, congrats on your first merged pull request!

@potiuk
Copy link
Member

potiuk commented Jun 1, 2020

Nice @samuelkhtu !

@samuelkhtu
Copy link
Contributor Author

Nice @samuelkhtu !

Thank you everyone's help! @potiuk @mik-laj @eladkal

@samuelkhtu samuelkhtu deleted the 8525-SQLBranchOperator branch June 1, 2020 19:38
@potiuk potiuk added this to the Airflow 1.10.11 milestone Jun 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants