Add ability for Dataproc operator to create cluster in GKE cluster #22852

Merged 7 commits on Apr 25, 2022

MaksYermak
Contributor

Adds the ability for the Dataproc operator to create a Dataproc cluster in a Google Kubernetes Engine (GKE) cluster. This PR also updates the google-cloud-container library version and the GKE operator code that uses this library.

Co-authored-by: Wojciech Januszek <januszek@google.com>
Co-authored-by: Lukasz Wyszomirski <wyszomirski@google.com>
Co-authored-by: Maksim Yermakou <maksimy@google.com>
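
For orientation, the kind of `virtual_cluster_config` value this PR lets the operator accept can be sketched as a plain dictionary. This is a minimal sketch, not code from the PR: the helper `build_virtual_cluster_config` and every resource name are hypothetical, and the field names are assumed to follow the Dataproc `VirtualClusterConfig` message, so they should be checked against the API reference before use.

```python
from typing import Any, Dict


def build_virtual_cluster_config(
    project_id: str,
    region: str,
    gke_cluster_name: str,
    node_pool_name: str,
    staging_bucket: str,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a Dataproc-on-GKE virtual cluster config.

    All field names are assumptions based on the Dataproc VirtualClusterConfig
    message; none of them are taken from this PR's diff.
    """
    # Fully qualified name of the GKE cluster that will host Dataproc.
    gke_cluster = f"projects/{project_id}/locations/{region}/clusters/{gke_cluster_name}"
    return {
        "staging_bucket": staging_bucket,
        "kubernetes_cluster_config": {
            "gke_cluster_config": {
                "gke_cluster_target": gke_cluster,
                "node_pool_target": [
                    {
                        "node_pool": f"{gke_cluster}/nodePools/{node_pool_name}",
                        "roles": ["DEFAULT"],
                    }
                ],
            },
            # Placeholder component version, assumed for illustration only.
            "kubernetes_software_config": {"component_version": {"SPARK": "3"}},
        },
    }
```

A dictionary like this would be passed as `virtual_cluster_config` in place of `cluster_config`.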



@boring-cyborg boring-cyborg bot added provider:cncf-kubernetes Kubernetes provider related issues area:providers kind:documentation provider:google Google (including GCP) related issues labels Apr 8, 2022
@@ -292,7 +292,9 @@ def create_cluster(
         region: str,
         project_id: str,
         cluster_name: str,
-        cluster_config: Union[Dict, Cluster],
+        cluster_config: Union[Dict, Cluster, None],
Member

Suggested change
-        cluster_config: Union[Dict, Cluster, None],
+        cluster_config: Union[Dict, Cluster, None] = None,
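
The suggestion matters because `None` in a type annotation does not by itself make an argument optional; only a default value does. A minimal sketch with two hypothetical stand-in functions (not the hook's real signature) shows the difference:

```python
from typing import Dict, Optional


def create_without_default(cluster_name: str, cluster_config: Optional[Dict]) -> Optional[Dict]:
    """None is an allowed value, but the caller must still pass it explicitly."""
    return cluster_config


def create_with_default(cluster_name: str, cluster_config: Optional[Dict] = None) -> Optional[Dict]:
    """The default lets callers omit cluster_config entirely."""
    return cluster_config


# Omitting the argument only works when a default is declared.
try:
    create_without_default("demo")
    omitted_ok = True
except TypeError:
    omitted_ok = False

print(omitted_ok)                   # False
print(create_with_default("demo"))  # None
```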

-        cluster_config: Union[Dict, Cluster],
+        cluster_config: Union[Dict, Cluster, None],
+        virtual_cluster_config: Optional[Dict] = None,
+        run_in_gke_cluster: Optional[bool] = False,
Member

Suggested change
-        run_in_gke_cluster: Optional[bool] = False,

Comment on lines 335 to 348
        cluster = (
            {
                "project_id": project_id,
                "cluster_name": cluster_name,
                "virtual_cluster_config": virtual_cluster_config,
            }
            if run_in_gke_cluster
            else {
                "project_id": project_id,
                "cluster_name": cluster_name,
                "config": cluster_config,
                "labels": labels,
            }
        )
Member

Suggested change
-        cluster = (
-            {
-                "project_id": project_id,
-                "cluster_name": cluster_name,
-                "virtual_cluster_config": virtual_cluster_config,
-            }
-            if run_in_gke_cluster
-            else {
-                "project_id": project_id,
-                "cluster_name": cluster_name,
-                "config": cluster_config,
-                "labels": labels,
-            }
-        )
+        cluster = {
+            "project_id": project_id,
+            "cluster_name": cluster_name,
+            "virtual_cluster_config": virtual_cluster_config,
+        }
+        if virtual_cluster_config is not None:
+            cluster['virtual_cluster_config'] = virtual_cluster_config
+        if cluster_config is not None:
+            cluster['config'] = cluster_config

Member

This way, the user has full control over the parameters.

Contributor Author

I have removed the run_in_gke_cluster flag from the code.
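
The approach discussed in this thread, adding each config section only when the caller supplied it so that no `run_in_gke_cluster` flag is needed, can be sketched as a stand-alone function. This is an illustration rather than the merged hook code: `build_cluster_payload` is a hypothetical name, the unconditional `virtual_cluster_config` entry from the suggestion is left out so that the `None` checks take effect, and handling `labels` the same way is an assumption.

```python
from typing import Any, Dict, Optional


def build_cluster_payload(
    project_id: str,
    cluster_name: str,
    cluster_config: Optional[Dict[str, Any]] = None,
    virtual_cluster_config: Optional[Dict[str, Any]] = None,
    labels: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the cluster request body from optional parts."""
    # The identifiers are always present.
    cluster: Dict[str, Any] = {"project_id": project_id, "cluster_name": cluster_name}
    # Optional sections are added only when the caller provided them.
    if virtual_cluster_config is not None:
        cluster["virtual_cluster_config"] = virtual_cluster_config
    if cluster_config is not None:
        cluster["config"] = cluster_config
    if labels is not None:  # assumption: labels are treated like the other optional keys
        cluster["labels"] = labels
    return cluster


# A GKE-backed request carries only the virtual cluster section.
gke_payload = build_cluster_payload(
    "my-project", "demo", virtual_cluster_config={"staging_bucket": "my-bucket"}
)
print(sorted(gke_payload))
```

With this shape, the caller decides which kind of cluster to create simply by choosing which config argument to pass.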

@MaksYermak MaksYermak force-pushed the dataproc-gke-operators branch 2 times, most recently from 894b036 to a83eddd Compare April 12, 2022 15:43
@potiuk
Member

potiuk commented Apr 13, 2022

Needs some fixes.

@MaksYermak MaksYermak force-pushed the dataproc-gke-operators branch 2 times, most recently from 583fe3a to 060d548 Compare April 19, 2022 09:22
@MaksYermak
Contributor Author

@potiuk could you review this PR one more time?

@potiuk potiuk merged commit 1e9765b into apache:main Apr 25, 2022
@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Apr 25, 2022
@github-actions

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Apr 26, 2022
Labels
area:providers changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) full tests needed We need to run full set of tests for this PR to merge kind:documentation provider:cncf-kubernetes Kubernetes provider related issues provider:google Google (including GCP) related issues

4 participants