Skip to content

Add google_api_to_s3_transfer example dags and system tests#8581

Merged
feluelle merged 1 commit intoapache:masterfrom
feluelle:tests/google-api-to-s3-transfer
May 7, 2020
Merged

Add google_api_to_s3_transfer example dags and system tests#8581
feluelle merged 1 commit intoapache:masterfrom
feluelle:tests/google-api-to-s3-transfer

Conversation

@feluelle
Copy link
Member

This PR / change adds google_api_to_s3_transfer systems tests by running the newly added example dags. It also adds a aws system helper which can be used for providing aws context like connection or a s3 bucket for example to the system tests so you do not need to care about how to create or delete resources.


Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

cc @xinbinhuang @mustafagok - PTAL :)

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it !

@potiuk
Copy link
Member

potiuk commented Apr 27, 2020

❤️

@potiuk
Copy link
Member

potiuk commented Apr 27, 2020

There are some problems with test collection though :( see the GA errors..

@xinbinhuang xinbinhuang mentioned this pull request Apr 29, 2020
5 tasks
@xinbinhuang
Copy link
Contributor

Awesome! Thanks for creating the aws_system_helper.py!

@feluelle
Copy link
Member Author

There are some problems with test collection though :( see the GA errors..

..due to my changes? I committed my code suggestion to check if the errors are persistent.

@feluelle
Copy link
Member Author

 E   airflow.exceptions.AirflowException: Invalid arguments were passed to BranchPythonOperator (task_id: check_and_transform_video_ids). Invalid arguments were:
E   *args: ()
E   **kwargs: {'provide_context': True}

That's because I need provide_context=True in backported packages for 1.10.*

@potiuk When I run backport packages script in breeze with airflow version 2.0dev it does not add {'provide_context': True} to the backported packages.

How do I get it to work for both 2.0 and 1.10.* ?

@feluelle
Copy link
Member Author

Do we currently miss to generate backport packages for core operators which are not included in providers like branch operator?

@potiuk
Copy link
Member

potiuk commented Apr 29, 2020

@feluelle - We don't have (and did not plan to) have backported packages for the core operators. They are far too much "core", we think and tied to the airflow core.

We could reconsider that - but for now, - those operators are so "core" that we left them in the place they are (almost - we have some deprecation notices) because we think if we change it, it will mean far, far far too many changes across 100% of DAGs out there and we wanted to avoid that.

The way we handle it now - we have the bowler automated refactoring when the packages are being prepared and we do some small refactorings like that. Those are very small changes:

You can see some examples here:
https://github.com/apache/airflow/blob/master/backport_packages/setup_backport_packages.py#L167

I think @turbaszek can help with it as he wrote most of the bowler rules.

@potiuk
Copy link
Member

potiuk commented Apr 29, 2020

And @feluelle -> actually, it should be very easy. You do not need to have provide_context in those operators. We already have the rule to add "provide_context" to Python operator automatically via the bowler rule I linked when the packages are prepared so the only change will be to add it to BranchPythonOperator as well.

@potiuk
Copy link
Member

potiuk commented Apr 29, 2020

And maybe there are other *Python operators as well :)?

@potiuk
Copy link
Member

potiuk commented Apr 29, 2020

@turbaszek -> I just looked at the code - should not that be removed?

I think we should add provide_context to all *Python operators out there :). No matter where they are used.

    .is_filename(include=r"mlengine_operator_utils.py$")  

@feluelle
Copy link
Member Author

Thanks @potiuk for the information. So would I, for now, just add the file to the is_filename expression or should we really run it on all *PythonOperators. PTAL @turbaszek :)

@feluelle feluelle force-pushed the tests/google-api-to-s3-transfer branch from 5526d2c to fc47514 Compare April 30, 2020 07:48
Comment on lines +254 to +259
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I added this specific case here. If we want to remove the .is_filename filter this should be a separate PR. WDYT @potiuk ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth limiting changes and not changing all files automatically. If we need to change another example, it will be easy to add here., ..so LGTM

@mustafagok
Copy link
Contributor

I see that you have much bigger problems, I would like to suggest a small change, about naming "aws_system_helpers.py" as "amazon_system_helpers.py" since it contains AmazonSystemTest class and @pytest.mark.system("amazon")

- add amazon system helper for easier testing amazon aws systems / services
- fix TESTING docs
@feluelle feluelle force-pushed the tests/google-api-to-s3-transfer branch from fc47514 to 14fa416 Compare May 4, 2020 18:26
@feluelle
Copy link
Member Author

feluelle commented May 4, 2020

I see that you have much bigger problems, I would like to suggest a small change, about naming "aws_system_helpers.py" as "amazon_system_helpers.py" since it contains AmazonSystemTest class and @pytest.mark.system("amazon")

Fixed. 👍

@feluelle
Copy link
Member Author

feluelle commented May 5, 2020

@mik-laj or @turbaszek want to have a last look? I would like to merge it (Jarek approved it already) but I also would like to have an opinion on the change in setup_backport_packages.py.

@feluelle feluelle merged commit ff5b701 into apache:master May 7, 2020
@feluelle
Copy link
Member Author

feluelle commented May 7, 2020

The docs build failed on master due to missing :doc:How to use <howto/operator/amazon/aws/google_api_to_s3_transfer> - follow-up PR is coming. #8761

I am wondering why the build docs task succeeded on this branch but fails on master.

@mik-laj
Copy link
Member

mik-laj commented May 7, 2020

You didn't do rebase before merging it. This check has been added 3 days ago. 923f423

@feluelle
Copy link
Member Author

feluelle commented May 7, 2020

Thank you Kamil 👍 :)

@feluelle feluelle deleted the tests/google-api-to-s3-transfer branch May 7, 2020 10:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:dev-tools provider:amazon AWS/Amazon - related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

Comments