[Controller] Supporting multiple controllers when ClusterOwnerIdentity changes #3371

Open
romilbhardwaj opened this issue Mar 26, 2024 · 1 comment

Comments

@romilbhardwaj
Collaborator

When the cluster owner identity changes (e.g., the user switches AWS accounts, changes GCP projects, or switches Kubernetes contexts) while the spot/serve controller is still running under the previous identity, new sky serve up and sky spot launch invocations fail with:

sky.exceptions.ClusterOwnerIdentityMismatchError: 'sky-serve-controller-<hash>' (Kubernetes) is owned by account ['old_id'], but the activated account is ['new_id'].

In this case, we should launch a new controller under the new identity while letting the old one run.

Note that this would also need careful UX handling: sky status, sky serve status and sky spot queue should print a clear message stating a) that multiple controllers are running and b) which identity the currently shown services/jobs belong to.
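To make the UX point concrete, here is a rough sketch of what such a status message could look like. It is not SkyPilot code: `format_status`, its arguments, and the message wording are all hypothetical, and it assumes each service can be tagged with the identity that owns its controller.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def format_status(services: List[Tuple[str, str]], active_identity: str) -> str:
    """Render a status listing for the active identity.

    `services` is a hypothetical list of (service_name, owner_identity)
    pairs; the real status commands do not expose data in this shape.
    """
    # Group service names by the identity that owns their controller.
    by_identity: Dict[str, List[str]] = defaultdict(list)
    for name, owner in services:
        by_identity[owner].append(name)

    lines = []
    # a) Tell the user when more than one controller is running.
    if len(by_identity) > 1:
        lines.append(
            f'Note: {len(by_identity)} controllers are running under '
            'different identities.')
    # b) Say which identity the listed services belong to.
    lines.append(f'Showing services for identity: {active_identity}')
    for name in sorted(by_identity.get(active_identity, [])):
        lines.append(f'  {name}')
    return '\n'.join(lines)


if __name__ == '__main__':
    print(format_status([('svc-a', 'old_id'), ('svc-b', 'new_id')], 'new_id'))
```

The same grouping would apply to jobs in sky spot queue; only the listing under the header changes.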

@romilbhardwaj
Collaborator Author

A prototype for k8s is implemented here: b7c57b3. This overrides the controller name to be unique per-identity, and as a result multiple controllers can be launched.
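A minimal sketch of the per-identity naming idea, for illustration only. This is not the code in the linked commit: the function name, the prefix constant, and the choice of a truncated SHA-256 digest are assumptions made here.

```python
import hashlib
from typing import List

# Hypothetical prefix; the actual controller name format may differ.
CONTROLLER_PREFIX = 'sky-serve-controller'


def controller_name_for_identity(identity: List[str], user_hash: str) -> str:
    """Return a controller name that is stable for a given identity.

    `identity` is the list of account identifiers (e.g. ['new_id']) and
    `user_hash` stands in for the existing per-user <hash> suffix.
    """
    # Hash the identity so the name stays short and cluster-name safe.
    joined = ','.join(identity)
    identity_hash = hashlib.sha256(joined.encode('utf-8')).hexdigest()[:8]
    return f'{CONTROLLER_PREFIX}-{user_hash}-{identity_hash}'


if __name__ == '__main__':
    # Different identities map to different controller names, so a new
    # controller can be launched while the old one keeps running.
    print(controller_name_for_identity(['old_id'], 'abcd1234'))
    print(controller_name_for_identity(['new_id'], 'abcd1234'))
```

Because the name depends only on the identity and the user hash, switching back to the old identity resolves to the old controller again.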

Gotchas:

  • Service names must be unique across all identities (namespaces).
  • sky serve status only shows services in the current namespace specified in the kube context. To view service status from other namespaces, the user must switch kubecontext and rerun sky serve status.

We likely need a more robust fix for this.
