
add dataproc_jars to templated fields in relevant dataproc operators #3305

Closed
mchalek wants to merge 1 commit into master from mchalek/template-dataproc-jars

Conversation

mchalek
Contributor

@mchalek mchalek commented May 2, 2018

JIRA

Description

  • Here are some details about my PR, including screenshots of any UI changes:

This is a very minor change that adds one field (dataproc_jars) to the list of templated fields for the DataProc operators that submit jobs requiring JAR files. The attribute is always called dataproc_jars inside the operators, even though each operator's constructor uses a different argument name for it. The change may therefore look slightly confusing: it might appear that the templated field name should vary by operator, but it does not.
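
To illustrate why the same name works for every operator, here is a minimal, self-contained sketch. It does not use Airflow or Jinja: `SketchHadoopOperator` and `render_fields` are hypothetical stand-ins, the regex substitution is only a toy replacement for real Jinja rendering, and the bucket path is made up. The point it tries to show is that the entries in `template_fields` are looked up as instance attribute names, so the constructor argument name does not matter.

```python
import re


class SketchHadoopOperator:
    """Hypothetical stand-in for an operator; not the real Airflow class."""

    # Entries here are *attribute* names on the instance,
    # not constructor argument names.
    template_fields = ['cluster_name', 'dataproc_jars']

    def __init__(self, cluster_name, dataproc_hadoop_jars=None):
        self.cluster_name = cluster_name
        # The constructor argument is operator-specific, but it is
        # stored under the shared attribute name dataproc_jars.
        self.dataproc_jars = dataproc_hadoop_jars


def render_fields(task, context):
    """Toy renderer: replaces {{ key }} with context[key] in each field."""

    def render(value):
        if isinstance(value, str):
            return re.sub(
                r'\{\{\s*(\w+)\s*\}\}',
                lambda m: str(context[m.group(1)]),
                value,
            )
        if isinstance(value, list):
            return [render(v) for v in value]
        return value  # leave None and other types untouched

    for name in task.template_fields:
        setattr(task, name, render(getattr(task, name)))


task = SketchHadoopOperator(
    cluster_name='etl-{{ ds }}',
    dataproc_hadoop_jars=['gs://example-bucket/jobs/{{ ds }}/job.jar'],
)
render_fields(task, {'ds': '2018-05-02'})
print(task.dataproc_jars)
```

In this sketch, removing `'dataproc_jars'` from `template_fields` would leave the `{{ ds }}` placeholder in the JAR URI unrendered, which is the behavior this PR is meant to change in the real operators.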

Tests

  • My PR does not add tests because this is such a small change, and there are currently no tests in place for the templated fields. However, we have been running Airflow at Etsy with a subset of these changes (only the change for the DataProcHadoopOperator) successfully for about a month.

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • No documentation added because none exists which specifies the templated field list for dataproc operators.

Code Quality

  • Passes git diff upstream/master -u -- "*.py" | flake8 --diff

This commit makes it possible to use jinja templates when passing
JAR file URIs to the DataProc operators that require JAR files,
specifically the DataProc Hive, Pig, SparkSql, Spark, Hadoop and
PySpark operators.
@codecov-io

Codecov Report

Merging #3305 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3305   +/-   ##
=======================================
  Coverage   75.87%   75.87%           
=======================================
  Files         197      197           
  Lines       14710    14710           
=======================================
  Hits        11161    11161           
  Misses       3549     3549

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 71954a5...200fc1b. Read the comment docs.

Contributor

@Fokko Fokko left a comment

LGTM

@asfgit asfgit closed this in c5fa8cd May 3, 2018
@mchalek
Contributor Author

mchalek commented May 3, 2018

Awesome thanks @Fokko !

aliceabe pushed a commit to aliceabe/incubator-airflow that referenced this pull request Jan 3, 2019
This commit makes it possible to use jinja templates when passing
JAR file URIs to the DataProc operators that require JAR files,
specifically the DataProc Hive, Pig, SparkSql, Spark, Hadoop and
PySpark operators.

Closes apache#3305 from mchalek/template-dataproc-jars