[AIRFLOW-6038] AWS DataSync example_dags added #6675

Closed
baolsen wants to merge 11 commits into apache:master from baolsen:datasync-example-dags

Contributor

baolsen commented Nov 27, 2019

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"
    • https://issues.apache.org/jira/browse/AIRFLOW-XXX
    • In case you are fixing a typo in the documentation you can prepend your commit with [AIRFLOW-XXX], code changes always need a Jira issue.
    • In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal (AIP).
    • In case you are adding a dependency, check if the license complies with the ASF 3rd Party License Policy.

Description

  • Here are some details about my PR, including screenshots of any UI changes:
    Added Amazon AWS how-to documentation scaffolding, plus example DAGs for AWS DataSync Operators with their respective how-to guides.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Documentation & examples.

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain docstrings that explain what they do
    • If you implement backwards-incompatible changes, please leave a note in Updating.md so we can assign it to an appropriate release

Contributor Author

baolsen commented Nov 27, 2019

EDIT: Managed to figure this out; it was the imports in the example_dags.

Hi @potiuk

Please may I ask for your assistance with this.
My build is failing due to a cyclic import (it was failing even before your recent fixes).

I am not sure where to start debugging.
I think the root cause is the new documentation files I've added, but I'm not sure why this would cause cyclic import problems in other files and I have no idea which one of my files is causing the issue.

Any advice would be appreciated.

Perhaps it is how I am importing "airflow" and "airflow exceptions" in my example dags?

Here is where the build is failing during "static checks":
************* Module airflow.example_dags.example_http_operator
airflow/example_dags/example_http_operator.py:1:0: R0401: Cyclic import (airflow.executors -> airflow.executors.kubernetes_executor -> airflow.kubernetes.pod_generator) (cyclic-import)
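
For readers unfamiliar with this pylint message, here is a minimal sketch of how such a cycle can be found in an import graph. The module names come from the message above, but the graph itself is a hand-written assumption rather than something extracted from the Airflow source. In particular, the edge from `airflow.kubernetes.pod_generator` back to `airflow.executors` is assumed, on the reading that the reported chain closes on its first module. `find_cycle` is a hypothetical helper, not part of pylint or Airflow.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one import cycle as a list of module names, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    state = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GREY
        path.append(node)
        for dep in graph.get(node, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so we have closed a loop
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in list(graph):
        if state[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hand-written graph: "module -> modules it imports" (an assumption, see above).
imports = {
    "airflow.executors": ["airflow.executors.kubernetes_executor"],
    "airflow.executors.kubernetes_executor": ["airflow.kubernetes.pod_generator"],
    "airflow.kubernetes.pod_generator": ["airflow.executors"],  # assumed closing edge
}

cycle = find_cycle(imports)
print(" -> ".join(cycle) if cycle else "no cycle")
```

The depth-first search marks modules that are on the current path; meeting one of them again means the imports loop back on themselves.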

mik-laj added the "provider:amazon AWS/Amazon - related issues" label Nov 27, 2019
@codecov-io

Codecov Report

Merging #6675 into master will decrease coverage by 0.42%.
The diff coverage is 0%.

@@            Coverage Diff             @@
##           master    #6675      +/-   ##
==========================================
- Coverage   83.86%   83.43%   -0.43%     
==========================================
  Files         668      670       +2     
  Lines       37537    37590      +53     
==========================================
- Hits        31479    31362     -117     
- Misses       6058     6228     +170
Impacted Files                                          Coverage Δ
...mazon/aws/example_dags/example_datasync_complex.py   0% <0%> (ø)
...amazon/aws/example_dags/example_datasync_simple.py   0% <0%> (ø)
airflow/providers/amazon/aws/operators/datasync.py      25.9% <0%> (ø) ⬆️
airflow/kubernetes/volume_mount.py                      44.44% <0%> (-55.56%) ⬇️
airflow/kubernetes/volume.py                            52.94% <0%> (-47.06%) ⬇️
airflow/kubernetes/pod_launcher.py                      45.25% <0%> (-46.72%) ⬇️
airflow/kubernetes/refresh_config.py                    50.98% <0%> (-23.53%) ⬇️
...rflow/contrib/operators/kubernetes_pod_operator.py   78.2% <0%> (-20.52%) ⬇️
airflow/configuration.py                                89.13% <0%> (-3.63%) ⬇️
... and 2 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ac2d0be...833fae7.

Member

potiuk left a comment

Hello @baolsen. Thanks for all the effort to make AWS operators better :). Finally we have someone who takes care of that!

However, I think this might be a good opportunity to simplify those DataSync operators. I think using complex logic in the DAG (the decide task) is not needed as long as we make the AWS operators idempotent on their own, as we did with the GCP operators. Please take a look at my comments and see what you think.
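
To illustrate the idea of an operator that is idempotent on its own, here is a minimal "get or create" sketch. Everything in it is hypothetical: `FakeTaskRegistry` is an in-memory stand-in for a remote service and `get_or_create_task` is an illustrative helper. Neither is a DataSync, boto3, or Airflow API, and the actual operator may be structured quite differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class FakeTaskRegistry:
    """Hypothetical in-memory stand-in for a remote service (not a real AWS API)."""

    tasks: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def find(self, source: str, destination: str) -> Optional[str]:
        # Look up an existing task for this source/destination pair.
        return self.tasks.get((source, destination))

    def create(self, source: str, destination: str) -> str:
        # Register a new task and return its made-up identifier.
        task_id = f"task-{len(self.tasks) + 1}"
        self.tasks[(source, destination)] = task_id
        return task_id


def get_or_create_task(registry: FakeTaskRegistry, source: str, destination: str) -> str:
    """Reuse a matching task if one exists; otherwise create it."""
    existing = registry.find(source, destination)
    if existing is not None:
        return existing
    return registry.create(source, destination)


registry = FakeTaskRegistry()
first = get_or_create_task(registry, "s3://example-src", "s3://example-dst")
second = get_or_create_task(registry, "s3://example-src", "s3://example-dst")
print(first == second, len(registry.tasks))
```

Because the lookup happens inside the helper, running it twice leaves a single task behind, so the DAG would not need a separate branch to decide whether to create one.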

Member

We do not need this line any more (Python 3).

Contributor Author

baolsen commented Dec 10, 2019

Build is passing but I'm closing this PR for now while I test against my real-world AWS account.

Contributor Author

baolsen commented Dec 10, 2019

Re-opened as #6773

4 participants