Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[k8s] Share SSH jump pod #2826

Merged
merged 49 commits into from
Feb 21, 2024
Merged

Conversation

kbrgl
Copy link
Contributor

@kbrgl kbrgl commented Nov 28, 2023

This PR closes #2598 and adds plumbing to share the SSH jump pod across a Kubernetes namespace.

  1. Per-user secrets for the user's public SSH key are replaced with a single secret object, sky-ssh-key, that is created once and patched when future attempts to setup Kubernetes authentication are made.
  2. For initial jump pod creation, the manifest runs cat /etc/secret-volume/ssh-key-* > ~/.ssh/authorized_keys to set up the initial user's SSH key.
  3. When a new user tries to authenticate, the patched secret is re-mounted at /etc/secret-volume and this change is picked up by the LCM script which reruns cat /etc/secret-volume/ssh-key-* > ~/.ssh/authorized_keys.

The LCM script now uses 2 threads to poll: one to manage the lifecycle and terminate the jump pod, and one to reload keys on an interval.

Testing

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: bash tests/backward_compatibility_tests.sh

Tested locally.

cc @romilbhardwaj

@kbrgl kbrgl changed the title Share k8s SSH jump pod [k8s] Share k8s SSH jump pod Dec 20, 2023
@kbrgl kbrgl marked this pull request as ready for review December 20, 2023 09:09
@kbrgl kbrgl changed the title [k8s] Share k8s SSH jump pod [k8s] Share SSH jump pod Dec 20, 2023
Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks @kbrgl! Took a skim through the code and it looks good. Left some nits. Will test it and approve if all is good.

sky/utils/kubernetes_utils.py Outdated Show resolved Hide resolved
sky/clouds/kubernetes.py Outdated Show resolved Hide resolved
sky/utils/kubernetes/ssh_jump_lifecycle_manager.py Outdated Show resolved Hide resolved
sky/utils/kubernetes/ssh_jump_lifecycle_manager.py Outdated Show resolved Hide resolved
sky/utils/kubernetes/ssh_jump_lifecycle_manager.py Outdated Show resolved Hide resolved
sky/utils/kubernetes/ssh_jump_lifecycle_manager.py Outdated Show resolved Hide resolved
kbrgl and others added 4 commits December 24, 2023 10:58
Co-authored-by: Romil Bhardwaj <romil.bhardwaj@gmail.com>
Co-authored-by: Romil Bhardwaj <romil.bhardwaj@gmail.com>
Co-authored-by: Romil Bhardwaj <romil.bhardwaj@gmail.com>
Co-authored-by: Romil Bhardwaj <romil.bhardwaj@gmail.com>
Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kbrgl! Did some manual testing - seems to be working nicely. Backward compatibility has some issues, see comments.

sky/utils/kubernetes/ssh_jump_lifecycle_manager.py Outdated Show resolved Hide resolved
sky/authentication.py Outdated Show resolved Hide resolved
sky/utils/kubernetes/ssh_jump_lifecycle_manager.py Outdated Show resolved Hide resolved
@romilbhardwaj romilbhardwaj added this to the v0.5 milestone Feb 8, 2024
…ilot into kabir/issue-2598-jump

# Conflicts:
#	sky/clouds/kubernetes.py
#	sky/templates/kubernetes-ray.yml.j2
…o kabir/issue-2598-jump

# Conflicts:
#	sky/provision/kubernetes/instance.py
#	sky/provision/kubernetes/utils.py
Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kbrgl! Pushed some nits and merge conflict fixes. Manually tested backward compat, now running smoke tests.

Once everything is ready, will merge this PR and push new container images to our image registry.

Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kbrgl! Smoke tests pass. Manually tested backward compatibility. Building and updating our docker image, will merge after that's done.

@romilbhardwaj
Copy link
Collaborator

romilbhardwaj commented Feb 21, 2024

  • Running k8s smoke tests - smoke tests pass, merging now.

@romilbhardwaj romilbhardwaj merged commit f3f6566 into skypilot-org:master Feb 21, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[k8s] Share SSH jump pod across users in the same namespace
2 participants