This repository has been archived by the owner on Nov 29, 2023. It is now read-only.

Documentation for passing the cluster configuration to update_cluster #111

Closed
go-dustin opened this issue Dec 8, 2020 · 6 comments · Fixed by #218
Labels
api: dataproc Issues related to the googleapis/python-dataproc API. type: docs Improvement to the documentation for an API. type: question Request for information or clarification. Not an issue.

Comments

go-dustin commented Dec 8, 2020

When updating a running cluster, you are required to pass in the cluster configuration. The documentation isn't clear on what that should be and assumes you already know the shape of the protobuf message. That creates a circular problem: if you don't know what the protobuf is supposed to look like, you have no point of reference and won't know what to pass. We could use an example explaining how this works.

"If a dict is provided, it must be of the same form as the protobuf message Cluster"

response = client.update_cluster(project_id, region, cluster_name, cluster, update_mask)


In the meantime, can someone please explain how this works? Ideally I'd be able to get the config from an existing cluster and then pass that, together with the update_mask, to increase or decrease the number of workers.

@product-auto-label product-auto-label bot added the api: dataproc Issues related to the googleapis/python-dataproc API. label Dec 8, 2020
@go-dustin go-dustin changed the title Better documentation for update_cluster Documentation for passing the cluster configuration to update_cluster Dec 8, 2020
@meredithslota meredithslota added type: docs Improvement to the documentation for an API. type: question Request for information or clarification. Not an issue. labels Dec 8, 2020
bradmiro (Contributor) commented Dec 8, 2020

Hey @go-dustin, thanks for your feedback here. I had to do some digging myself to figure out the correct way to do this and agree this is a great opportunity for us to improve our documentation. The following worked for me, in this case to double the number of workers on the cluster.

from google.cloud import dataproc_v1 as dataproc

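project_id = "your-project-id"  # replace with your project ID
region = "your-region"  # replace with the cluster's region
cluster_name = "your-cluster-name"  # replace with the cluster's name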

client = dataproc.ClusterControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

cluster = client.get_cluster(
    project_id=project_id, region=region, cluster_name=cluster_name
)

new_num_instances = cluster.config.worker_config.num_instances * 2
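# FieldMask.paths is a list of the field paths to update; the new value
# itself is read from the cluster object passed to update_cluster.
mask = {"paths": ["config.worker_config.num_instances"]}
cluster.config.worker_config.num_instances = new_num_instances  # Must update the cluster info itself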

operation = client.update_cluster(
    project_id=project_id, 
    region=region, 
    cluster=cluster,
    cluster_name=cluster_name,
    update_mask=mask
)

updated_cluster = operation.result()

print(updated_cluster.config.worker_config.num_instances == new_num_instances)  # Should be True
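
On the "If a dict is provided, it must be of the same form as the protobuf message Cluster" wording: the dict just mirrors the field names of the Cluster message, nested the same way. Below is a minimal sketch of that shape for the worker-count case. `build_scale_request` is a hypothetical helper, not part of google-cloud-dataproc, and it assumes the service only reads the fields named in the update mask, so a partial cluster dict is enough. I have not verified that assumption against every field, so treat it as a starting point.

```python
def build_scale_request(project_id, region, cluster_name, num_workers):
    """Build keyword arguments for ClusterControllerClient.update_cluster.

    Hypothetical helper for illustration; not part of google-cloud-dataproc.
    """
    if num_workers < 0:
        raise ValueError("num_workers must be non-negative")

    # Dict form of the Cluster protobuf message: keys match the message's
    # field names (Cluster.config -> ClusterConfig.worker_config ->
    # InstanceGroupConfig.num_instances).
    cluster = {
        "project_id": project_id,
        "cluster_name": cluster_name,
        "config": {"worker_config": {"num_instances": num_workers}},
    }

    # Dict form of FieldMask: a list of dotted paths, relative to Cluster.
    update_mask = {"paths": ["config.worker_config.num_instances"]}

    return {
        "project_id": project_id,
        "region": region,
        "cluster_name": cluster_name,
        "cluster": cluster,
        "update_mask": update_mask,
    }


request_kwargs = build_scale_request("my-project", "us-central1", "my-cluster", 4)
print(request_kwargs["cluster"])
```

With a client constructed as above, this would be passed as `client.update_cluster(**request_kwargs)`. Fetching the cluster with `get_cluster` first and modifying it, as in the snippet above, is the safer route if you are unsure which fields the update needs.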

Let me know if this helps or if I can answer any further questions :)

@go-dustin
Author

@bradmiro this was exactly what I needed, thank you very much!

@go-dustin go-dustin reopened this Dec 10, 2020
@bradmiro
Contributor

Glad to hear it @go-dustin! Closing this, feel free to reopen if you run into more issues!

@go-dustin
Author

This answered my question, but shouldn't we leave this open so the documentation gets updated?

@bradmiro
Contributor

Hey @go-dustin, sorry for the delayed reply. I am tracking this internally to get our documentation updated :) Thanks again!

loferris (Contributor) commented Jul 2, 2021

I'm creating a PR to add the above fix to our samples documentation!
