Merged
8 changes: 5 additions & 3 deletions goldens/Basic_cluster_create.txt
@@ -39,17 +39,19 @@ kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-
[XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
gcloud beta container clusters describe golden-cluster --region us-central1 --project golden-project --format="value(currentMasterVersion)"
[XPK] Creating 1 node pool or pools of tpu7x-8
-We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8')
+We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8', requires_placement_policy=True)
[XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --region=us-central1 --format="csv[no-heading](name)"
[XPK] Creating 1 node pool or pools of tpu7x-8
-Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8')
+Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8', requires_placement_policy=True)
[XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
gcloud beta container node-pools describe 0 --cluster golden-cluster --project=golden-project --region=us-central1 --format="value(locations)"
[XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
kubectl get configmap golden-cluster-resources-configmap -o=custom-columns="ConfigData:data" --no-headers=true
[XPK] Existing node pool names ['0']
-[XPK] To complete NodepoolCreate-golden-cluster-np-0 we are executing gcloud beta container node-pools create golden-cluster-np-0 --region=us-central1 --cluster=golden-cluster --project=golden-project --node-locations=us-central1-a --machine-type=tpu7x-standard-4t --host-maintenance-interval=AS_NEEDED --spot --enable-gvnic --node-version=0 --num-nodes=1 --scopes=storage-full,gke-default,"https://www.googleapis.com/auth/cloud-platform" --placement-type=COMPACT --max-pods-per-node 15 --tpu-topology=2x2x1
+[XPK] Task: `Retrieve resource policy` is implemented by the following command not running since it is a dry run.
+gcloud compute resource-policies describe golden-cluster-placement-policy --project=golden-project --region=us-central1
+[XPK] To complete NodepoolCreate-golden-cluster-np-0 we are executing gcloud beta container node-pools create golden-cluster-np-0 --region=us-central1 --cluster=golden-cluster --project=golden-project --node-locations=us-central1-a --machine-type=tpu7x-standard-4t --host-maintenance-interval=AS_NEEDED --spot --placement-policy=golden-cluster-placement-policy --enable-gvnic --node-version=0 --num-nodes=1 --scopes=storage-full,gke-default,"https://www.googleapis.com/auth/cloud-platform" --placement-type=COMPACT --max-pods-per-node 15 --tpu-topology=2x2x1
[XPK] Breaking up a total of 1 commands into 1 batches
[XPK] Pretending all the jobs succeeded
[XPK] Create or delete node pool request complete.
4 changes: 2 additions & 2 deletions goldens/Cluster_create_private.txt
@@ -41,13 +41,13 @@ kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-
[XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
gcloud beta container clusters describe golden-cluster-private --region us-central1 --project golden-project --format="value(currentMasterVersion)"
[XPK] Creating 1 node pool or pools of v5p-8
-We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v5p-slice', gce_machine_type='ct5p-hightpu-4t', chips_per_vm=4, accelerator_type=1, device_type='v5p-8')
+We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v5p-slice', gce_machine_type='ct5p-hightpu-4t', chips_per_vm=4, accelerator_type=1, device_type='v5p-8', requires_placement_policy=False)
[XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
gcloud beta container node-pools list --cluster golden-cluster-private --project=golden-project --region=us-central1 --format="csv[no-heading](name)"
[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
[XPK] Creating 1 node pool or pools of v5p-8
-Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v5p-slice', gce_machine_type='ct5p-hightpu-4t', chips_per_vm=4, accelerator_type=1, device_type='v5p-8')
+Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v5p-slice', gce_machine_type='ct5p-hightpu-4t', chips_per_vm=4, accelerator_type=1, device_type='v5p-8', requires_placement_policy=False)
[XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
gcloud beta container node-pools describe 0 --cluster golden-cluster-private --project=golden-project --region=us-central1 --format="value(locations)"
[XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
4 changes: 2 additions & 2 deletions goldens/Cluster_create_with_gb200-4.txt
@@ -39,13 +39,13 @@ kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-
[XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
gcloud beta container clusters describe golden-cluster --region us-central1 --project golden-project --format="value(currentMasterVersion)"
[XPK] Creating 1 node pool or pools of gb200-4
-We assume that the underlying system is: SystemCharacteristics(topology='1x72', vms_per_slice=1, gke_accelerator='nvidia-gb200', gce_machine_type='a4x-highgpu-4g', chips_per_vm=4, accelerator_type=2, device_type='gb200-4')
+We assume that the underlying system is: SystemCharacteristics(topology='1x72', vms_per_slice=1, gke_accelerator='nvidia-gb200', gce_machine_type='a4x-highgpu-4g', chips_per_vm=4, accelerator_type=2, device_type='gb200-4', requires_placement_policy=True)
[XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --region=us-central1 --format="csv[no-heading](name)"
[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
[XPK] Creating 1 node pool with 2 nodes of gb200-4
-Underlyingly, we assume that means: SystemCharacteristics(topology='1x72', vms_per_slice=1, gke_accelerator='nvidia-gb200', gce_machine_type='a4x-highgpu-4g', chips_per_vm=4, accelerator_type=2, device_type='gb200-4')
+Underlyingly, we assume that means: SystemCharacteristics(topology='1x72', vms_per_slice=1, gke_accelerator='nvidia-gb200', gce_machine_type='a4x-highgpu-4g', chips_per_vm=4, accelerator_type=2, device_type='gb200-4', requires_placement_policy=True)
[XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
gcloud beta container node-pools describe 0 --cluster golden-cluster --project=golden-project --region=us-central1 --format="value(locations)"
[XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
8 changes: 5 additions & 3 deletions goldens/NAP_cluster-create.txt
@@ -39,17 +39,19 @@ kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-
[XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
gcloud beta container clusters describe golden-cluster --region us-central1 --project golden-project --format="value(currentMasterVersion)"
[XPK] Creating 1 node pool or pools of tpu7x-8
-We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8')
+We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8', requires_placement_policy=True)
[XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --region=us-central1 --format="csv[no-heading](name)"
[XPK] Creating 1 node pool or pools of tpu7x-8
-Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8')
+Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8', requires_placement_policy=True)
[XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
gcloud beta container node-pools describe 0 --cluster golden-cluster --project=golden-project --region=us-central1 --format="value(locations)"
[XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
kubectl get configmap golden-cluster-resources-configmap -o=custom-columns="ConfigData:data" --no-headers=true
[XPK] Existing node pool names ['0']
-[XPK] To complete NodepoolCreate-golden-cluster-np-0 we are executing gcloud beta container node-pools create golden-cluster-np-0 --region=us-central1 --cluster=golden-cluster --project=golden-project --node-locations=us-central1-a --machine-type=tpu7x-standard-4t --host-maintenance-interval=AS_NEEDED --enable-gvnic --node-version=0 --num-nodes=1 --scopes=storage-full,gke-default,"https://www.googleapis.com/auth/cloud-platform" --placement-type=COMPACT --max-pods-per-node 15 --tpu-topology=2x2x1
+[XPK] Task: `Retrieve resource policy` is implemented by the following command not running since it is a dry run.
+gcloud compute resource-policies describe golden-cluster-placement-policy --project=golden-project --region=us-central1
+[XPK] To complete NodepoolCreate-golden-cluster-np-0 we are executing gcloud beta container node-pools create golden-cluster-np-0 --region=us-central1 --cluster=golden-cluster --project=golden-project --node-locations=us-central1-a --machine-type=tpu7x-standard-4t --host-maintenance-interval=AS_NEEDED --placement-policy=golden-cluster-placement-policy --enable-gvnic --node-version=0 --num-nodes=1 --scopes=storage-full,gke-default,"https://www.googleapis.com/auth/cloud-platform" --placement-type=COMPACT --max-pods-per-node 15 --tpu-topology=2x2x1
[XPK] Breaking up a total of 1 commands into 1 batches
[XPK] Pretending all the jobs succeeded
[XPK] Create or delete node pool request complete.
8 changes: 5 additions & 3 deletions goldens/NAP_cluster-create_with_pathways.txt
@@ -39,17 +39,19 @@ kubectl wait deployment/coredns --for=condition=Available=true --namespace=kube-
[XPK] Task: `Determine current gke master version` is implemented by the following command not running since it is a dry run.
gcloud beta container clusters describe golden-cluster --region us-central1 --project golden-project --format="value(currentMasterVersion)"
[XPK] Creating 1 node pool or pools of tpu7x-8
-We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8')
+We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8', requires_placement_policy=True)
[XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --region=us-central1 --format="csv[no-heading](name)"
[XPK] Creating 1 node pool or pools of tpu7x-8
-Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8')
+Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=1, device_type='tpu7x-8', requires_placement_policy=True)
[XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
gcloud beta container node-pools describe 0 --cluster golden-cluster --project=golden-project --region=us-central1 --format="value(locations)"
[XPK] Task: `GKE Cluster Get ConfigMap` is implemented by the following command not running since it is a dry run.
kubectl get configmap golden-cluster-resources-configmap -o=custom-columns="ConfigData:data" --no-headers=true
[XPK] Existing node pool names ['0']
-[XPK] To complete NodepoolCreate-golden-cluster-np-0 we are executing gcloud beta container node-pools create golden-cluster-np-0 --region=us-central1 --cluster=golden-cluster --project=golden-project --node-locations=us-central1-a --machine-type=tpu7x-standard-4t --host-maintenance-interval=AS_NEEDED --enable-gvnic --node-version=0 --num-nodes=1 --scopes=storage-full,gke-default,"https://www.googleapis.com/auth/cloud-platform" --placement-type=COMPACT --max-pods-per-node 15 --tpu-topology=2x2x1
+[XPK] Task: `Retrieve resource policy` is implemented by the following command not running since it is a dry run.
+gcloud compute resource-policies describe golden-cluster-placement-policy --project=golden-project --region=us-central1
+[XPK] To complete NodepoolCreate-golden-cluster-np-0 we are executing gcloud beta container node-pools create golden-cluster-np-0 --region=us-central1 --cluster=golden-cluster --project=golden-project --node-locations=us-central1-a --machine-type=tpu7x-standard-4t --host-maintenance-interval=AS_NEEDED --placement-policy=golden-cluster-placement-policy --enable-gvnic --node-version=0 --num-nodes=1 --scopes=storage-full,gke-default,"https://www.googleapis.com/auth/cloud-platform" --placement-type=COMPACT --max-pods-per-node 15 --tpu-topology=2x2x1
[XPK] To complete NodepoolCreate-cpu-np we are executing gcloud beta container node-pools create cpu-np --node-version=0 --cluster=golden-cluster --project=golden-project --node-locations=us-central1-a --region=us-central1 --num-nodes=1 --machine-type=n2-standard-64 --scopes=storage-full,gke-default,"https://www.googleapis.com/auth/cloud-platform" --enable-autoscaling --min-nodes=1 --max-nodes=20
[XPK] Breaking up a total of 2 commands into 1 batches
[XPK] Pretending all the jobs succeeded
4 changes: 1 addition & 3 deletions src/xpk/core/nodepool.py
@@ -268,9 +268,7 @@ def run_gke_node_pool_create_command(
     return 1

   placement_args = ''
-  if system.accelerator_type == AcceleratorType['GPU'] and is_topology_valid(
-      system.topology
-  ):
+  if system.requires_placement_policy and is_topology_valid(system.topology):
     placement_policy = f'{args.cluster}-placement-policy'
     ensure_resource_policy_exists(placement_policy, args, system.topology)
     placement_args = f' --placement-policy={placement_policy}'
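
The hunk above replaces a check on the accelerator type with a per-system flag, which is why the tpu7x goldens now gain a `--placement-policy` argument while the v5p golden does not. A minimal standalone sketch of that decision follows. It is not the xpk implementation: the trimmed `SystemCharacteristics`, the body of `is_topology_valid`, and the helper `build_placement_args` are all assumptions made for illustration, and only the field name `requires_placement_policy` and the argument format come from the diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemCharacteristics:
  """Hypothetical, trimmed-down stand-in for xpk's SystemCharacteristics."""

  topology: str
  accelerator_type: int
  device_type: str
  # Field name taken from the goldens; the default here is an assumption.
  requires_placement_policy: bool = False


def is_topology_valid(topology: str) -> bool:
  """Assumed check: at least two positive integers joined by 'x'."""
  parts = topology.split('x')
  return len(parts) > 1 and all(p.isdigit() and int(p) > 0 for p in parts)


def build_placement_args(cluster: str, system: SystemCharacteristics) -> str:
  """Hypothetical helper mirroring the condition in the hunk above."""
  if system.requires_placement_policy and is_topology_valid(system.topology):
    return f' --placement-policy={cluster}-placement-policy'
  return ''


# Values copied from the goldens in this change.
tpu7x = SystemCharacteristics('2x2x1', 1, 'tpu7x-8', True)
v5p = SystemCharacteristics('2x2x1', 1, 'v5p-8', False)
gb200 = SystemCharacteristics('1x72', 2, 'gb200-4', True)
```

With these inputs, `build_placement_args('golden-cluster', tpu7x)` and the `gb200` case both yield the placement-policy argument, while the `v5p` case yields an empty string. Keying the decision on a flag carried by each system, rather than on the accelerator type, lets a TPU such as tpu7x opt in without widening the condition for every TPU.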