Skip to content

[AIRFLOW-1273] Add Google Cloud ML version and model operators#2379

Closed
N3da wants to merge 1 commit intoapache:masterfrom
N3da:master
Closed

[AIRFLOW-1273] Add Google Cloud ML version and model operators#2379
N3da wants to merge 1 commit intoapache:masterfrom
N3da:master

Conversation

@N3da
Copy link
Contributor

@N3da N3da commented Jun 19, 2017

JIRA

Description

Includes the following changes:

  • gcp_cloudml_hooks.py to work Google Cloud ML engine Rest APIs for model and versions (docs:
    https://cloud.google.com/ml-engine/reference/rest/)
  • cloudml_operator.py which contains CloudMLVersionOperator and CloudMLModelOperators
  • unit tests for the added hooks.
  • Adding a default value for google_cloud_default connection.

Tests

  • Added unit tests for all added methods in gcp_cloudml_hooks.py.
    [Link to Travis:
    https://travis-ci.org/N3da/incubator-airflow/builds/244668457]

@mention-bot
Copy link

@N3da, thanks for your PR! By analyzing the history of the files in this pull request, we identified @artwr, @jlowin and @criccomini to be potential reviewers.

@N3da N3da changed the title Add Google Cloud ML version and model operators [AIRFLOW-1273] Add Google Cloud ML version and model operators Jun 19, 2017
@codecov-io
Copy link

codecov-io commented Jun 19, 2017

Codecov Report

Merging #2379 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2379      +/-   ##
==========================================
- Coverage   69.26%   69.24%   -0.02%     
==========================================
  Files         146      146              
  Lines       11231    11232       +1     
==========================================
- Hits         7779     7778       -1     
- Misses       3452     3454       +2
Impacted Files Coverage Δ
airflow/utils/db.py 84.42% <100%> (+0.12%) ⬆️
airflow/jobs.py 74.87% <0%> (-0.41%) ⬇️
airflow/models.py 87.27% <0%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e870a8e...4eb5466. Read the comment docs.

@N3da N3da force-pushed the master branch 3 times, most recently from 3b44196 to 349adce Compare June 20, 2017 21:42
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can import the logging level from settings.py (settings.LOGGING_LEVEL)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use .format

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use .format

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idem

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idem

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

settings.LOGGING_LEVEL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@N3da N3da force-pushed the master branch 3 times, most recently from 956bbeb to 12c871b Compare June 22, 2017 18:42
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: newline after """

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some pydocs here would be helpful. The pydocs from these classes gets rolled into the public documentation automatically, so being extra verbose is actually good for the community.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some pydocs here would be helpful. The pydocs from these classes gets rolled into the public documentation automatically, so being extra verbose is actually good for the community.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how useful this is. Was there any particular reason you added it? If you want to keep it, having an example extra={} defined (as with beeline above) could be of some use, though it'd point to some service account json file that doesn't actually exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. This was needed for the unit tests which start from the empty db, and it would throw when initializing the gcp_cloudml_hook without having a value for google_cloud_default (and the tests would be skipped). Not sure if the extra args would be useful though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: newline after """

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to consider: Airflow operators, themselves, have a retry setting that developers can use in the case of a failure. If you simply allow failures to propagate up, the task will fail, and the developer can decide whether the task should retry. It might be cleaner/easier to remove this retry logic, and just allow failures to go up the stack to Airflow, where config can dictate how to handle failures.

It is nice to have slightly more sophisticated logic, as you do, to handle retry HTTP error codes, and fail outright on others, so I could see an argument for keeping this. Just something to consider.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Chris. The purpose of retry here is slightly different. Some of our http calls (for example for creating a Version, return a long running [Operation](https://cloud.google.com/ml-engine/reference/rest/Shared.Types/ListOperationsResponse#Operation) which the user needs to keep polling to determine when it's done. This function is to avoid polling every n seconds which might hit user's quota restrictions. In the case of errors we mostly don't want to add extra retry logic on top of existing generic logic (except for the 429 status case which can be retried).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought renamed it to _poll_... instead of _retry_... to make it more clear.

Includes Google Cloud ML hooks for version and model operations,
and their unit tests.

https://issues.apache.org/jira/browse/AIRFLOW-1273
@criccomini
Copy link
Contributor

LGTM merged!

@asfgit asfgit closed this in 534a0e0 Jun 27, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants