Skip to content

[AIRFLOW-1272] Google Cloud ML Batch Prediction Operator#2390

Closed
jiwang576 wants to merge 13 commits intoapache:masterfrom
jiwang576:GCP_CML_batch_prediction
Closed

[AIRFLOW-1272] Google Cloud ML Batch Prediction Operator#2390
jiwang576 wants to merge 13 commits intoapache:masterfrom
jiwang576:GCP_CML_batch_prediction

Conversation

@jiwang576
Copy link

@jiwang576 jiwang576 commented Jun 22, 2017

Dear Airflow maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

  • My PR addresses the following AIRFLOW-1272 issues and references them in the PR title.

Description

  • Here are some details about my PR, including screenshots of any UI changes:

  • Added CloudMLBatchPredictionOperator, which is a wrapper around the Google Cloud ML Batch Prediction API.

  • Augmented the CloudMLHook in PR#2379, which handles submitting job request to Google Cloud ML Engine.

Tests

  • My PR adds the following unit tests:

  • tests.contrib.operators.test_cloudml_operator:CloudMLBatchPredictionOperatorTest

  • tests.contrib.hooks.test_gcp_cloudml_hook:TestCloudMLHook, which builds upon the same test class in PR#2379

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

There should be an Operator for the Google Cloud ML Batch Prediction
Job, which supports executing Tensorflow neural network prediction as a
service.
Documentation is here:
https://cloud.google.com/ml-engine/docs/how-tos/batch-predict.
@mention-bot
Copy link

@jiwang576, thanks for your PR! By analyzing the history of the files in this pull request, we identified @artwr, @jlowin and @criccomini to be potential reviewers.

@codecov-io
Copy link

codecov-io commented Jun 22, 2017

Codecov Report

Merging #2390 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2390   +/-   ##
=======================================
  Coverage   69.27%   69.27%           
=======================================
  Files         146      146           
  Lines       11224    11224           
=======================================
  Hits         7775     7775           
  Misses       3449     3449

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e414844...c65b8d9. Read the comment docs.

"""CloudML job operations helper class."""

def __init__(self, cloudml, project_name, job_id, job_spec=None):
assert project_name is not None and project_name is not ''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should compared against '' with != since logically we are checking equality, not identity. I'm not even sure that '' is guaranteed to be interned.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! The fix was committed.

Changed precondition assertions inside the gcp_cloudml_hook.py.
@criccomini
Copy link
Contributor

Please fix merge conflicts

@criccomini
Copy link
Contributor

criccomini commented Jun 27, 2017

Please fix all comments that are:

"""like this

to be

"""
like this
"""

merge_conn(
models.Connection(
conn_id='bigquery_default', conn_type='bigquery'))
merge_conn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes these are duplicate code when PR2379 was not merged into Airflow. I will get rid of those.
Thank you!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conn_id='presto_default', conn_type='presto',
host='localhost',
schema='hive', port=3400))
merge_conn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why two google_cloud_default entries?


def create_job(self, project_name, job):
"""
Creates a CloudML Job, and returns the Job object, which can be waited
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"which can be waited upon"

Based on the return line below cloudml_job.wait_for_done(10), it looks to me like the waiting is done before create_job returns.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch! We wrote a different docstring for that. Hopefully it reveals clearly what it does.

Copy link
Contributor

@criccomini criccomini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rebase looks kind of funky. It's showing a bunch of #2379. Is everything as intended?

"""

template_fields = [
'_model',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably want project_name templated, too

Copy link
Author

@jiwang576 jiwang576 Jun 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Chris! the project_name is not intended for templating on purpose.

To answer the question of the funky rebase, I changed some of the code related to PR2379 for styles only. The main code changes are on the CloudMLBatchOperator (PR2390).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks for clarification.

Copy link
Contributor

@criccomini criccomini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, pretty good shape. Just a few nits/questions.

job_id)
except errors.HttpError as e:
if e.resp.status == 404:
logging.error(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really an error? Seems like info is more appropriate

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks for the suggestion!

logging.error(
'Job with job_id {} does not exist. Will create it.'
.format(job_id))
finished_prediction_job = hook.create_job(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should wait_for_job_done be called here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusing interface. It's not needed as create_job is calling wait_for_job_done under the cover.
To make it clearer we refactored the hook interface and how the operator uses it. Now only create_job (which creates and wait until a job is finished) is "exposed" and used by the operator.

merge_conn(
models.Connection(
conn_id='bigquery_default', conn_type='bigquery'))
merge_conn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ji Wang added 5 commits June 27, 2017 21:23
This commit simplified the CloudMLHook class regarding job requests.
Public interface of the hook now only exposes the create_job API,
which contains the logic of creating the job, test for validation, and
polling for the job until it's finished.
@criccomini
Copy link
Contributor

cc @N3da I'm going to give you a day to look at this, since it's touching some of your stuff. I'll merge tomorrow if I don't hear from you.

@N3da
Copy link
Contributor

N3da commented Jun 28, 2017

@criccomini Thanks for the heads up. Ship it! :)

@criccomini
Copy link
Contributor

@jiwang576 OK to merge this? Seems like you're still committing

@jiwang576
Copy link
Author

jiwang576 commented Jun 28, 2017

@criccomini Hi Chris, it's good to go. @N3da and I talked offline and I made some minor changes. Thanks for asking!

@asfgit asfgit closed this in e92d6bf Jun 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants