[AIRFLOW-5428] Dataflow with one job is not done correctly #6036

Merged
potiuk merged 1 commit into apache:master on Sep 15, 2019

Conversation

mik-laj
Member

@mik-laj mik-laj commented Sep 6, 2019

Detailed information is available here: #4633 (comment)

[2019-09-06 03:20:51,974] {taskinstance.py:1042} ERROR - string indices must be integers
Traceback (most recent call last):
  File "/opt/airflow/airflow/models/taskinstance.py", line 917, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/opt/airflow/airflow/gcp/operators/dataflow.py", line 216, in execute
    self.jar, self.job_class, True, self.multiple_jobs)
  File "/opt/airflow/airflow/gcp/hooks/dataflow.py", line 372, in start_java_dataflow
    self._start_dataflow(variables, name, command_prefix, label_formatter, multiple_jobs)
  File "/opt/airflow/airflow/contrib/hooks/gcp_api_base_hook.py", line 307, in wrapper
    return func(self, *args, **kwargs)
  File "/opt/airflow/airflow/gcp/hooks/dataflow.py", line 327, in _start_dataflow
    variables['region'], self.poll_sleep, job_id, self.num_retries, multiple_jobs) \
  File "/opt/airflow/airflow/gcp/hooks/dataflow.py", line 76, in __init__
    self._jobs = self._get_jobs()
  File "/opt/airflow/airflow/gcp/hooks/dataflow.py", line 138, in _get_jobs
    self._job_id, job['name']
TypeError: string indices must be integers
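
For readers unfamiliar with this error, the sketch below shows one common way it arises: code written for a list of job dicts receives a single job dict, so iteration yields string keys and `job['name']` then indexes a string. This is an illustration only, not the actual Airflow hook code; `names_of`, `as_job_list`, and the job payload are hypothetical names and data invented for the example.

```python
# Hypothetical stand-ins for illustration; this is NOT the Airflow hook code.

def names_of(jobs):
    """Collect job names, assuming `jobs` is a list of job dicts."""
    return [job["name"] for job in jobs]


def as_job_list(jobs):
    """Normalize a single job dict into a one-element list of job dicts."""
    return [jobs] if isinstance(jobs, dict) else list(jobs)


# Made-up payload standing in for a single-job response.
single_job = {"id": "job-1", "name": "example-job"}

error = None
try:
    # Iterating over a dict yields its keys (strings), so job["name"]
    # tries to index a string with a string and raises TypeError.
    names_of(single_job)
except TypeError as exc:
    error = str(exc)

# Wrapping the single job in a list restores the expected shape.
fixed = names_of(as_job_list(single_job))
```

Normalizing at the boundary keeps the rest of the code free of single-job special cases; whether the merged change takes this exact approach is not stated in this conversation.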

CC: @Fokko You did a review of the changes that introduced these regressions. Could you check if I missed something?

I would like to test this using system tests. Unfortunately, they are not finished yet, but I am working hard on them (#6035). Once I confirm that the code works in all cases, this change can be merged.


Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"
    • https://issues.apache.org/jira/browse/AIRFLOW-5428
    • In case you are fixing a typo in the documentation you can prepend your commit with [AIRFLOW-XXX], code changes always need a Jira issue.
    • In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal (AIP).
    • In case you are adding a dependency, check if the license complies with the ASF 3rd Party License Policy.

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain docstrings that explain what they do
    • If you implement backwards-incompatible changes, please leave a note in the Updating.md so we can assign it to an appropriate release

@mik-laj mik-laj requested a review from Fokko September 6, 2019 17:09
@mik-laj mik-laj added the provider:google Google (including GCP) related issues label Sep 6, 2019
@mik-laj mik-laj self-assigned this Sep 6, 2019
@potiuk
Member

potiuk commented Sep 8, 2019

Needs a rebase, @mik-laj :(

@mik-laj
Member Author

mik-laj commented Sep 10, 2019

I rebased, but I am still working on the system tests.

@mik-laj mik-laj changed the title from "[AIRFLOW-5428][WIP] Dataflow with one job is not done correctly" to "[AIRFLOW-5428] Dataflow with one job is not done correctly" Sep 13, 2019
@mik-laj
Member Author

mik-laj commented Sep 13, 2019

I finished the system tests.

@codecov-io

Codecov Report

Merging #6036 into master will decrease coverage by 4.32%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #6036      +/-   ##
==========================================
- Coverage   80.03%   75.71%   -4.33%     
==========================================
  Files         594      607      +13     
  Lines       34773    36785    +2012     
==========================================
+ Hits        27832    27850      +18     
- Misses       6941     8935    +1994

Impacted Files                                          Coverage Δ
airflow/gcp/hooks/dataflow.py                           85.16% <100%> (+9.56%) ⬆️
airflow/gcp/example_dags/example_bigquery.py            63.21% <0%> (-36.79%) ⬇️
airflow/www/app.py                                      70.44% <0%> (-26.12%) ⬇️
...ontrib/operators/bigquery_table_delete_operator.py   85.71% <0%> (-6.6%) ⬇️
airflow/contrib/operators/bigquery_to_bigquery.py       87.87% <0%> (-5.67%) ⬇️
airflow/contrib/operators/bigquery_to_gcs.py            88.23% <0%> (-5.52%) ⬇️
airflow/contrib/operators/bigquery_get_data.py          80% <0%> (-4.22%) ⬇️
airflow/contrib/operators/gcs_list_operator.py          88% <0%> (-3.67%) ⬇️
airflow/contrib/operators/gcs_operator.py               88.88% <0%> (-3.42%) ⬇️
airflow/contrib/operators/gcs_delete_operator.py        89.65% <0%> (-3.21%) ⬇️
... and 26 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update dd175fa...8c3ae60.

@potiuk potiuk merged commit 52d9e6a into apache:master Sep 15, 2019
Labels
provider:google Google (including GCP) related issues
3 participants