Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrating Google AutoML example_dags to sys tests #32368

Merged
merged 3 commits into from
Jul 7, 2023

Conversation

amoghrajesh
Copy link
Contributor

related: #32295

Moving the example_dags for google cloud sensors to sys tests.
This is a step by step effort to migrate all the google cloud tests to sys tests.

^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:providers area:system-tests provider:google Google (including GCP) related issues labels Jul 5, 2023
@amoghrajesh amoghrajesh requested a review from potiuk July 5, 2023 10:52
ENV_ID = os.environ.get("SYSTEM_TESTS_ENV_ID")
DAG_ID = "example_automl_classification"
GCP_PROJECT_ID = os.environ.get("SYSTEM_TESTS_GCP_PROJECT", "default")
GCP_AUTOML_LOCATION = "us-central1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we hardcode these values? Shouldn't it be fetched from env var?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was following the convention of the other system tests written, for example in #30003 and also in the automl folder. Does it make sense to auto hard code them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc: @VladaZakharova might help here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main principle is to keep these tests self-sufficient. But they need an external resource and infra on which those resources run. We can control those resources, like creating a bucket on the fly, but if a user has a certain infra, IMO it would be best to give flexibility to the user to set up the test given the infra rather than looking at the test and setting up the infra. That would end up making these tests less useful wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice point. Any suggestions that i can follow? It will help me tweak the DAG according to your use cases

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, all the tests look good, except for the first point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Do you mean to say that we should have it as:

GCP_PROJECT_ID = os.environ.get("GCP_PROJECT_ID", "your-project-id")
GCP_AUTOML_LOCATION = os.environ.get("GCP_AUTOML_LOCATION", "us-central1")
GCP_AUTOML_TEXT_CLS_BUCKET = os.environ.get("GCP_AUTOML_TEXT_CLS_BUCKET", "gs://INVALID BUCKET NAME")

And so on?

@amoghrajesh amoghrajesh requested a review from Adaverse July 6, 2023 06:09
@amoghrajesh
Copy link
Contributor Author

@Adaverse I have handled your review comments in the newest commit.
cc @potiuk @VladaZakharova

Comment on lines 106 to +107
# create_dataset_task >> delete_datasets_task

Copy link
Contributor

@Adaverse Adaverse Jul 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we need a watcher wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can have a watcher too for each of these tests

@amoghrajesh amoghrajesh requested a review from Adaverse July 7, 2023 10:12
Copy link
Contributor

@Adaverse Adaverse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The migration part looks fine, given the example dags had no issue, to begin with.

@potiuk potiuk merged commit 6c854dc into apache:main Jul 7, 2023
42 checks passed
@amoghrajesh amoghrajesh mentioned this pull request Jul 7, 2023
81 tasks
syedahsn pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Jul 11, 2023

---------

Co-authored-by: Amogh Desai <adesai@adesai-MBP16.local>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers area:system-tests provider:google Google (including GCP) related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants