[AIRFLOW-2715] Pick up the region setting while launching Dataflow templates #4139

Merged
merged 1 commit into from Nov 25, 2018

Contributor

@janhicken janhicken commented Nov 6, 2018

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"

Description

  • Here are some details about my PR, including screenshots of any UI changes:
    To launch an instance of a Dataflow template in the configured region,
    the API service.projects().locations().templates() has to be used
    instead of service.projects().templates(). Otherwise, all jobs will
    always be started in us-central1.

If no region is configured, the default region us-central1 is used.
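
As a rough sketch of the call described above (not the hook's actual code): the helper name `launch_template`, the constant `DEFAULT_REGION`, and the project, bucket, and job names are assumptions made up for illustration, and a `MagicMock` stands in for the real `googleapiclient` service object so the snippet runs offline.

```python
from unittest import mock

# Assumption: constant name chosen for this sketch only.
DEFAULT_REGION = "us-central1"


def launch_template(service, project, gcs_path, body, region=None):
    """Launch a Dataflow template in `region`, or in the default region."""
    location = region or DEFAULT_REGION
    # The regional endpoint takes the region as the `location` argument.
    request = service.projects().locations().templates().launch(
        projectId=project,
        location=location,
        gcsPath=gcs_path,
        body=body,
    )
    return request.execute()


# Stand-in for the Dataflow service object (hypothetical values throughout).
service = mock.MagicMock()
launch_template(
    service,
    "my-project",
    "gs://my-bucket/templates/my-template",
    {"jobName": "example-job"},
    region="europe-west1",
)

# Handle to the mocked launch() call, to inspect what was requested.
launch_call = (
    service.projects.return_value
    .locations.return_value
    .templates.return_value
    .launch
)
print(launch_call.call_args)
```

Passing `region=None` makes the same call with `location="us-central1"`, which matches the fallback described above.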

To make matters worse, the polling for the job status already honors the
region parameter, so in the current implementation it searches for the job
in the configured region while the job actually runs in us-central1.
Because the job's status is never found, the corresponding Airflow task
will hang.
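
The mismatch can be illustrated with a toy in-memory model. This is only a simulation under stated assumptions: `jobs_by_region`, `launch_ignoring_region`, `launch_in_region`, and `poll_status` are hypothetical names, and no Dataflow API is called.

```python
# Hypothetical model: each region only knows about the jobs started in it.
jobs_by_region = {"us-central1": {}, "europe-west1": {}}


def launch_ignoring_region(job_name):
    """Behaviour before the fix: the job always lands in us-central1."""
    jobs_by_region["us-central1"][job_name] = "JOB_STATE_RUNNING"


def launch_in_region(job_name, region):
    """Behaviour after the fix: the job starts in the configured region."""
    jobs_by_region[region][job_name] = "JOB_STATE_RUNNING"


def poll_status(job_name, region):
    """Polling honors the region; returns None if the job is not found."""
    return jobs_by_region[region].get(job_name)


configured_region = "europe-west1"

# Before: launch and polling disagree about the region.
launch_ignoring_region("job-a")
status_before = poll_status("job-a", configured_region)

# After: both use the configured region.
launch_in_region("job-b", configured_region)
status_after = poll_status("job-b", configured_region)

print(status_before, status_after)
```

In this model `status_before` stays `None` however often it is polled, which corresponds to the hanging task, while `status_after` is found right away.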

This PR is a second approach and follow-up of #4125

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    tests.contrib.hooks.test_gcp_dataflow_hook.DataFlowTemplateHookTest has been modified

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

Code Quality

  • Passes flake8

@kaxil
Member

kaxil commented Nov 6, 2018

Can you add this to DataflowTemplateOperator?

@janhicken
Contributor Author

Do you mean some documentation?

@kaxil
Member

kaxil commented Nov 7, 2018

Can you add a test for the following scenario?

  • When the user doesn't set the region parameter?

@janhicken
Contributor Author

janhicken commented Nov 8, 2018

This is already done in tests.contrib.hooks.test_gcp_dataflow_hook.DataFlowTemplateHookTest

@Fokko
Contributor

Fokko commented Nov 13, 2018

@janhicken Restarting Travis, can you rebase onto master? It might be that master was failing at that time.

@codecov-io

codecov-io commented Nov 13, 2018

Codecov Report

Merging #4139 into master will decrease coverage by 4.53%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #4139      +/-   ##
==========================================
- Coverage   77.66%   73.12%   -4.54%     
==========================================
  Files         199      199              
  Lines       16290    17807    +1517     
==========================================
+ Hits        12652    13022     +370     
- Misses       3638     4785    +1147

Impacted Files                                   Coverage Δ
airflow/www_rbac/security.py                     63.88% <0%> (-28.73%) ⬇️
airflow/operators/docker_operator.py             75% <0%> (-22.68%) ⬇️
airflow/operators/bash_operator.py               70% <0%> (-21.38%) ⬇️
airflow/models.py                                70.95% <0%> (-21.29%) ⬇️
airflow/api/common/experimental/trigger_dag.py   80.39% <0%> (-19.61%) ⬇️
airflow/www_rbac/views.py                        65.24% <0%> (-7.08%) ⬇️
airflow/jobs.py                                  77.36% <0%> (+0.27%) ⬆️

Powered by Codecov. Last update e6291e8...75f8141. Read the comment docs.

@janhicken
Contributor Author

Rebase has been done

@Fokko
Contributor

Fokko commented Nov 18, 2018

Rerunning the failed tests.

@bolkedebruin bolkedebruin merged commit 9f7f5e4 into apache:master Nov 25, 2018
jlricon pushed a commit to jlricon/incubator-airflow that referenced this pull request Nov 25, 2018
* [AIRFLOW-3336] Add new TriggerRule for 0 upstream failures (apache#4182)

Add new TriggerRule that triggers only if all upstream do not fail (success or skipped tasks are allowed)

* [AIRFLOW-2715] Use region setting when launching Dataflow templates (apache#4139)

tmiller-msft pushed a commit to cse-airflow/incubator-airflow that referenced this pull request Nov 27, 2018
elizabethhalper pushed a commit to cse-airflow/incubator-airflow that referenced this pull request Dec 7, 2018
aliceabe pushed a commit to aliceabe/incubator-airflow that referenced this pull request Jan 3, 2019
ashb pushed a commit to ashb/airflow that referenced this pull request Mar 19, 2019
wmorris75 pushed a commit to modmed/incubator-airflow that referenced this pull request Jul 29, 2019
5 participants