Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Google Deployment Manager Hook #9159

Merged
merged 7 commits into from
Jul 15, 2020

Conversation

SamWheating
Copy link
Contributor

@SamWheating SamWheating commented Jun 5, 2020

Contributing a hook I made for interacting with Google Deployment Manager - I use it in a DAG which cleans up test deployments nightly.

In its current state, this hook only supports the deployments.list and deployments.delete endpoints in but support for other available endpoints can be added later.


Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@boring-cyborg boring-cyborg bot added the provider:google Google (including GCP) related issues label Jun 5, 2020
@boring-cyborg
Copy link

boring-cyborg bot commented Jun 5, 2020

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, pylint and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://apache-airflow-slack.herokuapp.com/

SamWheating and others added 3 commits June 5, 2020 16:27
Co-authored-by: Ephraim Anierobi <4122866+ephraimbuddy@users.noreply.github.com>
:param order_by: A field name to order by, ex: "creationTimestamp desc"
:type order_by: Optional[str]
:param page_token: specifies a page_token to use
:type page_token: str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other handles, we always return all pages. I don't think the user should worry about pagination. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option comes straight from the API:

maxResults, unsigned integer:
The maximum number of results per page that should be returned. If the number of available results is larger than maxResults, Compute Engine returns a nextPageToken that can be used to get the next page of results in subsequent list requests. Acceptable values are 0 to 500, inclusive. (Default: 500)

So there's no way to get all of the entries, though I would assume its super rare to have >500 deployments in the same project. Can we leave this param in just so the hook doesn't become unusable past 500 deployments?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh now I see what you mean - thanks for the examples, that really helps. Will update and push changes now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok - I've updated the hook to fetch all deployments. I also removed the ability to specify maxResults, since it actually specifies max results per page which is irrelevant if we're fetching all pages.


@mock.patch("airflow.providers.google.cloud.hooks.gdm.GoogleDeploymentManagerHook.get_conn")
def test_list_deployments_with_params(self, mock_get_conn):
_ = self.gdm_hook.list_deployments(project_id=TEST_PROJECT, deployment_filter=FILTER,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add assertion for return value?

:param project_id: The project ID for this request.
:type project_id: str
:param deployment_filter: A filter expression which limits resources returned in the response.
:type filter: str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
:type filter: str
:type deployment_filter: str

@SamWheating SamWheating requested a review from mik-laj June 11, 2020 18:19
@mik-laj
Copy link
Member

mik-laj commented Jul 10, 2020

I wonder what the status of this change is. Do you need help?

@SamWheating
Copy link
Contributor Author

@mik-laj I've made the requested changes, does it look acceptable to you?

{'id': 'deployment2', 'name': 'test-deploy2'}])

@mock.patch("airflow.providers.google.cloud.hooks.gdm.GoogleDeploymentManagerHook.get_conn")
def test_delete_deployment(self, mock_get_conn):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test when this operation fails?

Copy link
Member

@mik-laj mik-laj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Only one nit.

@mik-laj mik-laj merged commit 9f01795 into apache:master Jul 15, 2020
@boring-cyborg
Copy link

boring-cyborg bot commented Jul 15, 2020

Awesome work, congrats on your first merged pull request!

@mik-laj
Copy link
Member

mik-laj commented Jul 15, 2020

Thank you for this contribution. What are your plans for the next contribution? :-D

manesioz added a commit to manesioz/airflow that referenced this pull request Jul 16, 2020
* Tests should also be triggered when there is just setup.py change (apache#9690)

So far tests were not triggered when only requirements changed,
but this is quite needed in fact.

* Update FlaskAppBuilder to v3 (apache#9648)

* Some Pylint fixes in airflow/models/taskinstance.py (apache#9674)

* Update migrations to ensure compatibility with Airflow 1.10.* (apache#9660)

closes apache#9640

* Fix _process_executor_events method to use in-memory try_number (apache#9692)

* use the correct claim name in the webserver (apache#9688)

* Update Thumbtack points of contact in Airflow Users list (apache#9701)

The previously-listed person is no longer at the company

* generate go client from openapi spec (apache#9502)

* generate go client from openapi spec

* move openapi codegen to seperate workflow

* [AIRFLOW-XXXX] Remove unnecessary docstring in AWSAthenaOperator

* Add health API endpoint  (apache#8144) (apache#9277)

* Add AWS StepFunctions integrations to the aws provider (apache#8749)

* Move gcs & wasb task handlers to their respective provider packages (apache#9714)

* Allow AWSAthenaHook to get more than 1000/first page of results (apache#6075)

Co-authored-by: Dylan Joss <dylanjoss@gmail.com>

* Add Dag Runs CRUD endpoints (apache#9473)

* Make airflow/migrations/env.py Pylint Compatible (apache#9670)

* Get Airflow configs with sensitive data from Secret Backends (apache#9645)

* YAML file supports extra json parameters (apache#9549)

Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
Co-authored-by: Vinay <vinay@synctactic.ai>
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>

* fix grammar in prereq tasks gcp operator docs (apache#9728)

* Add The Climate Corporation to user list (apache#9726)

* Add Qingping Hou to committers list (apache#9725)

* Add new fantastic team member of Polidea. (apache#9724)

* Error in description after deployment (apache#9723)

* Error in description after deployment
Co-authored-by: Daniel Debny <daniel.debny@polidea.com>

* Skip one version of Python for each test.

Skip one version of Python for each test.

* Add read-only endpoints for DAG Model (apache#9045)

Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>
Co-authored-by: Tomek Urbaszek <tomasz.urbaszek@polidea.com>

* Ensure Kerberos token is valid in SparkSubmitOperator before running `yarn kill` (apache#9044)

do a kinit before yarn kill if keytab and principal is provided

* Update example DAG for AI Platform operators (apache#9727)

* Fix warning about incompatible plugins (apache#9704)

One condition was bad and warns when the plugin is for admin and FAB flask.

* Update local_task_job.py (apache#9746)

Removing the suicide joke.

* Tests are working for newly added backport providers (apache#9739)

* Tests are working for newly added backport providers

* Pre-create Celery db result tables before running Celery worker (apache#9719)

Otherwise at large scale this can end up with some tasks failing as they
try to create the result table at the same time.

This was always possible before, just exceedingly rare, but in large
scale performance testing where I create a lot of tasks quickly
(especially in my HA testing) I hit this a few times.

This is also only a problem for fresh installs/clean DBs, as once these
tables exist the possible race goes away.

This is the same fix from apache#8909, just for runtime, not test time.

* Support extra config options for Sentry (apache#8911)

For now only dsn can be configured through the airflow.cfg. Need support 'http_proxy' option for example (it can't be configured through the environment variables). This change implements solution for supporting all existed (and maybe future) options for sentry configuration.

* Use namedtuple for TaskInstanceKeyType (apache#9712)

* Use namedtuple for TaskInstanceKeyType

* Add TargetQueryValue to KEDA Autoscaler (apache#9748)

Co-authored-by: Daniel Imberman <daniel@astronomer.io>

* Add unit tests for mlengine_operator_utils (apache#9702)

* Mask other forms of password arguments in SparkSubmitOperator (apache#9615)

This is a follow-up to apache#6917 before modifying the masking code.
Related: apache#9595.

* Use absolute paths in howto guides (apache#9758)

* Fix StackdriverTaskHandler + add system tests (apache#9761)

Co-authored-by: Tomek Urbaszek <tomasz.urbaszek@polidea.com>
Co-authored-by: Tomek Urbaszek <tomasz.urbaszek@polidea.com>

* Check project structure in sensors/transfers directories (apache#9764)

* Add tests for yandex hook (apache#9665)

* improve type hinting for celery provider (apache#9762)

* Add ME-Br to who uses Airflow list (apache#9770)

* Add 1.10.11 Changelog & Update UPDATING.md (apache#9757)

* Links Breeze documentation to new Breeze video (apache#9768)

* Fix is_terminal_support_colors functtion (apache#9734)

* Add type hinting for discord provider (apache#9773)

* Fix typo in the word "Airflow" (apache#9772)

* Add Google Stackdriver link (apache#9765)

* Improve type hinting to provider microsoft  (apache#9774)

* Unit tests jenkins hook (apache#9767)

* Fixes failing formatting of DAG file containing {} in docstring (apache#9779)

* Upgrade to latest isort (5.0.8) (apache#9782)

* Add API Endpoint - DagRuns Batch (apache#9556)

Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>

* Improve typing coverage in scheduler_job.py (apache#9783)

* Enable pretty output in mypy (apache#9785)

* provide_session keep return type (apache#9787)

* Refactor Google operators guides (apache#9766)

* Refactor Google guides

* fixup! Refactor Google guides

* fixup! fixup! Refactor Google guides

* Fix small errors in image building documentation (apache#9792)

* Backfill reset_dagruns set DagRun to NONE state (apache#9756)

* Add DAG Source endpoint (apache#9322)

* The group of embedded DAGs should be root to be OpenShift compatible (apache#9794)

* Add docs for replace_microseconds parameters in trigger DAG endpoint (apache#9793)

* Add multiple file upload functionality to GCS hook (apache#8849)

Co-authored-by: Timothy Healy <healz@timothys-air.lan>

* Keep functions signatures in decorators (apache#9786)

* Use paths relative to root docs dir  in *include directives (apache#9797)

* Add Migration guide from the experimental API to the REST API (apache#9771)


Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>

* Update paths in .github/boring-cyborg.yml (apache#9799)

* Update paths in .github/boring-cyborg.yml

* fixup! Update paths in .github/boring-cyborg.yml

* Minor typo fix in OpenAPI specification (apache#9809)

* Enable annotations to be added to the webserver service (apache#9776)

* Make airflow package type check compatible (apache#9791)

* Update README to add Py 3.8 in supported versions (apache#9804)

* Remove unnecessary comprehension (apache#9805)

* Add type annotations for redis provider (apache#9815)

* Remove package.json and yarn.lock from the prod image (apache#9814)

Closes apache#9810

* For now cloud tools are not needed in CI (apache#9818)

Currently there is "unbound" variable error printed in CI logs
because of that.

* Python 3.8.4 release breaks our builds (apache#9820)

* Allow `replace` flag in gcs_to_gcs operator. (apache#9667)

* Allow `replace` flag in gcs_to_gcs operator.
If we are not replacing, list all files in the Destination GCS bucket and only keep those files which are present in Source GCS bucket and not in Destination GCS bucket

* Add kylin operator (apache#9149)

Co-authored-by: yongheng.liu <yongheng.liu@kyligence.io>

* Fix SqlAlchemy-Flask failure with python 3.8.4 (apache#9821)

* Add API Reference docs (redoc) to sphinx (apache#9806)

* Add Google Deployment Manager Hook (apache#9159)

Co-authored-by: Ephraim Anierobi <4122866+ephraimbuddy@users.noreply.github.com>

* Remove HTTP guide index in docs (apache#9796)

* Improve type hinting to provider cloudant (apache#9825)

Co-authored-by: Refael Y <refael@seadata.co.il>

* Add option to delete by prefix to S3DeleteObjectsOperator (apache#9350)

Co-authored-by: Felix Uellendall <feluelle@users.noreply.github.com>

* Add CloudVisionDeleteReferenceImageOperator  (apache#9698)

* Add note in Updating.md about the change in `run_as_user` default (apache#9822)

Until Airflow 1.10.10 the default run_as_user config (https://airflow.readthedocs.io/en/1.10.10/configurations-ref.html#run-as-user) which defaulted it to root user `0` (https://github.com/apache/airflow/blob/96697180d79bfc90f6964a8e99f9dd441789177c/airflow/contrib/executors/kubernetes_executor.py#L295-L301)

In Airflow 1.10.11 we changed it to `50000`

* Improve typing in airflow/models/pool.py (apache#9835)

* Remove global variable with API auth backend (apache#9833)

* Fix Writing Serialized Dags to DB (apache#9836)

* Update gcp to google in docs (apache#9839)

Co-authored-by: Ashwin Shankar <ashankar@slack-corp.com>

* BigQueryTableExistenceSensor needs to specify keyword arguments (apache#9832)

* Add guide for AI Platform (previously Machine Learning Engine) Operators  (apache#9798)

* Change DAG.clear to take dag_run_state (apache#9824)

* Change DAG.clear to take dag_run_state

* fix lint

* fix tests

* assign var

* extend original clause

* Rename DagBag.store_serialized_dags to Dagbag.read_dags_from_db (apache#9838)

* Update more occurrences of gcp to google (apache#9842)

* Add Dynata to the Airflow users list (apache#9846)

* Fix S3FileTransformOperator to support S3 Select transformation only (apache#8936)

Documentation for S3FileTransformOperator states that users
can skip transformation script if S3 Select experession is
specified, but in this case the created file is always
zero bytes long.

This fix changes the behaviour, so in case of no transformation
given, the source file (a result of S3Select) is uploaded.

* Fix DagRun.conf when using trigger_dag API (apache#9853)

fixes apache#9852

* Helm chart can now place arbitrary config settings in to airflow.cfg (apache#9816)

Rather than only allowing specific pre-determined config settings, this
change allows the user to place _any_ config setting they like in the
generated airflow.cfg, including overwriting the "generated defaults".

This providers a nicer interface for the users of the chart (even if the
could already set these via the env vars).

* Fix typo in datafusion operator (apache#9859)

Co-authored-by: michalslowikowski00 <michal.slowikowski@polidea.com>

* Fix Experimental API Client (apache#9849)

* Add imagePullSecrets to the create user job (apache#9802)

So that it can pull the specified image from a private registry.

* Group CI scripts in subdirectories (apache#9653)

Reviewed the scripts and removed some of the old unused ones.

Co-authored-by: Jarek Potiuk <jarek.potiuk@polidea.com>
Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>
Co-authored-by: Aneesh Joseph <admin@coderplus.com>
Co-authored-by: Dylan Joss <dylanjoss@gmail.com>
Co-authored-by: QP Hou <qph@scribd.com>
Co-authored-by: Cooper Gillan <cooper.gillan@gmail.com>
Co-authored-by: Omair Khan <omairkhan064@gmail.com>
Co-authored-by: chamcca <40579012+chamcca@users.noreply.github.com>
Co-authored-by: lindsable <47788186+lindsable@users.noreply.github.com>
Co-authored-by: Vinay G B <vinaygb665@gmail.com>
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
Co-authored-by: Vinay <vinay@synctactic.ai>
Co-authored-by: Vismita Uppalli <32617204+vuppalli@users.noreply.github.com>
Co-authored-by: Jeff Melching <jmelching@users.noreply.github.com>
Co-authored-by: Daniel Debny <34335344+debek@users.noreply.github.com>
Co-authored-by: James Timmins <james@astronomer.io>
Co-authored-by: Tomek Urbaszek <tomasz.urbaszek@polidea.com>
Co-authored-by: Morgan Racine <griffonlord@gmail.com>
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
Co-authored-by: Bolgov Andrey <anbolgov@gmail.com>
Co-authored-by: Daniel Imberman <daniel.imberman@gmail.com>
Co-authored-by: Daniel Imberman <daniel@astronomer.io>
Co-authored-by: chipmyersjr <chip.myersjr@gmail.com>
Co-authored-by: Jacek Kołodziej <kolodziejj@gmail.com>
Co-authored-by: Kanthi <subkanthi@gmail.com>
Co-authored-by: morrme <morrme@users.noreply.github.com>
Co-authored-by: Nitai Bezerra da Silva <nitaibezerra@protonmail.com>
Co-authored-by: Rafferty Chen <raffertychen@gmail.com>
Co-authored-by: Mauricio De Diana <mdediana@gmail.com>
Co-authored-by: Guilherme Da Silva Gonçalves <GUISIGO@GMAIL.COM>
Co-authored-by: takunnithan <takunnithan@gmail.com>
Co-authored-by: Chao-Han Tsai <milton0825@gmail.com>
Co-authored-by: Tobiasz Kędzierski <tobiasz.kedzierski@polidea.com>
Co-authored-by: Tim Healy <healyt22@gmail.com>
Co-authored-by: Timothy Healy <healz@timothys-air.lan>
Co-authored-by: Adam Dobrawy <ad-m@users.noreply.github.com>
Co-authored-by: Vicken Simonian <vsimon@gmail.com>
Co-authored-by: Alexander Sutcliffe <41974784+scrambldchannel@users.noreply.github.com>
Co-authored-by: royberkoweee <61996194+royberkoweee@users.noreply.github.com>
Co-authored-by: yongheng.liu <56812134+liuyonghengheng@users.noreply.github.com>
Co-authored-by: yongheng.liu <yongheng.liu@kyligence.io>
Co-authored-by: Sam Wheating <samwheating@gmail.com>
Co-authored-by: Ephraim Anierobi <4122866+ephraimbuddy@users.noreply.github.com>
Co-authored-by: rafyzg <43478973+rafyzg@users.noreply.github.com>
Co-authored-by: Refael Y <refael@seadata.co.il>
Co-authored-by: Shoichi Kagawa <e411z7t40w@gmail.com>
Co-authored-by: Felix Uellendall <feluelle@users.noreply.github.com>
Co-authored-by: ashwinshankar77 <ashwinshankar77@gmail.com>
Co-authored-by: Ashwin Shankar <ashankar@slack-corp.com>
Co-authored-by: Nathan Hadfield <nathan.hadfield@king.com>
Co-authored-by: Neil Bhandari <52170027+neil3handari@users.noreply.github.com>
Co-authored-by: Mariusz Strzelecki <szczeles@gmail.com>
Co-authored-by: Michał Słowikowski <michalslowikowski00@gmail.com>
Co-authored-by: michalslowikowski00 <michal.slowikowski@polidea.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
provider:google Google (including GCP) related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants