Support multiple tpunodes / multiple runs of tpu_app #129

Closed
concretevitamin opened this issue Dec 28, 2021 · 0 comments · Fixed by #168

@concretevitamin
Member

sky tpunode  # ...Success.

sky tpunode -c new 

# The second command, however, reuses the TPU name "sky_tpu",
# which produces:

I 12-28 11:58:55 cloud_vm_ray_backend.py:560] Launching on GCP us-central1 (us-central1-a)
E 12-28 11:59:03 cloud_vm_ray_backend.py:511] Updated property [core/project].
E 12-28 11:59:03 cloud_vm_ray_backend.py:511] ERROR: (gcloud.compute.tpus.create) ALREADY_EXISTS: Resource 'projects/intercloud-320520/locations/us-central1-a/nodes/sky_tpu' already exists
E 12-28 11:59:03 cloud_vm_ray_backend.py:511] - '@type': type.googleapis.com/google.rpc.ResourceInfo
E 12-28 11:59:03 cloud_vm_ray_backend.py:511]   resourceName: projects/intercloud-320520/locations/us-central1-a/nodes/sky_tpu
E 12-28 11:59:03 cloud_vm_ray_backend.py:511]
I 12-28 11:59:03 cloud_vm_ray_backend.py:518] TPU sky_tpu already exists; skipped creation.

This means two host VMs are created, but they connect (to be verified) to the same underlying TPU, named "sky_tpu".
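
One possible direction, sketched here only as an illustration and not as the change that eventually closed this issue: derive the TPU name from the cluster name so that each `sky tpunode -c <name>` invocation gets its own TPU instead of the shared "sky_tpu". The helper `tpu_name_for_cluster` and the `sky-tpu-` prefix are hypothetical names chosen for this sketch; they are not part of the existing code base.

```python
import re


def tpu_name_for_cluster(cluster_name: str) -> str:
    """Hypothetical helper: build a per-cluster TPU name.

    Assumption: using a distinct TPU name per cluster avoids the
    ALREADY_EXISTS collision on the shared name "sky_tpu".
    """
    # Lowercase and replace anything outside [a-z0-9-] with a hyphen,
    # then trim hyphens from both ends.
    safe = re.sub(r'[^a-z0-9-]', '-', cluster_name.lower()).strip('-')
    if not safe:
        raise ValueError(f'Cannot derive a TPU name from {cluster_name!r}')
    # The prefix is an illustrative choice, not an existing convention.
    return f'sky-tpu-{safe}'


if __name__ == '__main__':
    # Two different cluster names map to two different TPU names.
    print(tpu_name_for_cluster('new'))
    print(tpu_name_for_cluster('My_Cluster'))
```

Because the mapping is deterministic, relaunching the same cluster would resolve to the same TPU name, while a different cluster name would request a separate TPU. Cluster names that differ only in characters replaced by the sanitizing step would still collide, so a real fix would need to handle that case.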
