Add Vertex Step Operator network parameter #2398

@@ -122,9 +122,38 @@ more about how ZenML builds these images and how you can customize them.

#### Additional configuration

You can specify the service account, network and reserved IP ranges to use for the VertexAI `CustomJob` by passing the `service_account`, `network` and `reserved_ip_ranges` parameters to the `step-operator register` command:

```shell
# --service_account, --network and --reserved_ip_ranges are optional and apply to the
# VertexAI CustomJob; pass multiple reserved IP ranges as one comma-separated value.
zenml step-operator register <STEP_OPERATOR_NAME> \
    --flavor=vertex \
    --project=<GCP_PROJECT> \
    --region=<REGION> \
    --service_account=<SERVICE_ACCOUNT> \
    --network=<NETWORK> \
    --reserved_ip_ranges=<RESERVED_IP_RANGES>
```
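
As a rough illustration of the value formats these flags expect, the sketch below builds a full network name in the `projects/12345/global/networks/myVPC` form given in the config docstring and joins several range names into one comma-separated string. Both helper names (`vertex_network_name`, `reserved_ip_ranges_flag`) and the sample range names are hypothetical and not part of ZenML.

```python
from typing import List


def vertex_network_name(project_number: str, network_name: str) -> str:
    """Build a full Compute Engine network name (hypothetical helper)."""
    return f"projects/{project_number}/global/networks/{network_name}"


def reserved_ip_ranges_flag(range_names: List[str]) -> str:
    """Join reserved IP range names into one comma-separated flag value.

    No spaces are added around the commas, so splitting the value on ","
    gives back exactly the names that were passed in.
    """
    return ",".join(range_names)


# Sample values are assumptions for illustration only.
network = vertex_network_name("12345", "myVPC")
ranges = reserved_ip_ranges_flag(["range-a", "range-b"])

print(f"--network={network}")
print(f"--reserved_ip_ranges={ranges}")
```

The printed strings can be pasted into the `zenml step-operator register` command in place of `<NETWORK>` and `<RESERVED_IP_RANGES>`.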

For additional configuration of the Vertex step operator, you can pass `VertexStepOperatorSettings` when defining or
running your pipeline.

```python
from zenml import step
from zenml.integrations.gcp.flavors.vertex_step_operator_flavor import VertexStepOperatorSettings

@step(
    step_operator="<NAME>",
    settings={
        "step_operator.vertex": VertexStepOperatorSettings(
            accelerator_type="NVIDIA_TESLA_T4",  # see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/MachineSpec#AcceleratorType
            accelerator_count=1,
            machine_type="n1-standard-2",  # see https://cloud.google.com/vertex-ai/docs/training/configure-compute#machine-types
            disk_type="pd-ssd",  # see https://cloud.google.com/vertex-ai/docs/training/configure-storage#disk-types
            disk_size_gb=100,  # see https://cloud.google.com/vertex-ai/docs/training/configure-storage#disk-size
        )
    },
)
def trainer(...) -> ...:
    """Train a model."""
    # This step will be executed in Vertex.
```

Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration\_code\_docs/integrations-gcp/#zenml.integrations.gcp.flavors.vertex\_step\_operator\_flavor.VertexStepOperatorSettings)
for a full list of available attributes and [this docs page](/docs/book/user-guide/advanced-guide/pipelining-features/pipeline-settings.md) for
more information on how to specify settings.

13 changes: 13 additions & 0 deletions src/zenml/integrations/gcp/flavors/vertex_step_operator_flavor.py
@@ -71,6 +71,13 @@ class VertexStepOperatorConfig( # type: ignore[misc] # https://github.com/pydan
    Attributes:
        region: Region name, e.g., `europe-west1`.
        encryption_spec_key_name: Encryption spec key name.
        network: The full name of the Compute Engine network to which the Job should be peered.
            For example, projects/12345/global/networks/myVPC
        reserved_ip_ranges: A list of names for the reserved ip ranges under the VPC network that can be used
            for this job. If set, we will deploy the job within the provided ip ranges. Otherwise, the job
            will be deployed to any ip ranges under the provided VPC network.
        service_account: Specifies the service account for workload run-as account. Users submitting jobs
            must have act-as permission on this run-as account.
    """

    region: str
@@ -79,6 +86,12 @@ class VertexStepOperatorConfig( # type: ignore[misc] # https://github.com/pydan
    # will be applied to all Vertex AI resources if set
    encryption_spec_key_name: Optional[str] = None

    network: Optional[str] = None

    reserved_ip_ranges: Optional[str] = None

    service_account: Optional[str] = None

    @property
    def is_remote(self) -> bool:
        """Checks if this stack component is running remotely.
@@ -250,7 +250,14 @@ def launch(
                "boot_disk_size_gb": settings.boot_disk_size_gb,
            },
        }
    ],
    "service_account": self.config.service_account,
    "network": self.config.network,
    "reserved_ip_ranges": (
        self.config.reserved_ip_ranges.split(",")
        if self.config.reserved_ip_ranges
        else []
    ),
},
"labels": job_labels,
"encryption_spec": {
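
To make the effect of the `launch` change easier to see in isolation, here is a minimal sketch that mirrors the three new `job_spec` entries outside of ZenML. `StandInConfig` and `job_spec_extras` are hypothetical names introduced for illustration; only the field names and the comma-splitting expression are taken from the diff above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class StandInConfig:
    """Hypothetical stand-in for the three fields added to VertexStepOperatorConfig."""

    network: Optional[str] = None
    reserved_ip_ranges: Optional[str] = None
    service_account: Optional[str] = None


def job_spec_extras(config: StandInConfig) -> Dict[str, Any]:
    """Mirror the entries the diff adds to the job spec (illustrative only)."""
    return {
        "service_account": config.service_account,
        "network": config.network,
        # The comma-separated string becomes a list; unset or empty gives [].
        "reserved_ip_ranges": (
            config.reserved_ip_ranges.split(",")
            if config.reserved_ip_ranges
            else []
        ),
    }


configured = job_spec_extras(
    StandInConfig(
        network="projects/12345/global/networks/myVPC",
        reserved_ip_ranges="range-a,range-b",  # sample names, assumed
    )
)
defaults = job_spec_extras(StandInConfig())

print(configured)
print(defaults)
```

Because the value is split on `","` without stripping whitespace, a value such as `"range-a, range-b"` would yield a second entry with a leading space, so the range names are best passed without spaces.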