Revamped SSH Configuration Storage #2706
Thanks for submitting the PR @Vaibhav2001! The refactoring is very important to keep the ssh config modularized. Left several comments.
Thanks for the update @Vaibhav2001! The code looks mostly good to me. Left several comments for making the code cleaner.
Thanks for updating this PR @Vaibhav2001! Left several comments. There might be some issues with multi-node clusters. Could you test with multiple nodes as well?
Thanks for the fix @Vaibhav2001! Left several comments. : )
Thanks for the quick update @Vaibhav2001! I just checked the PR again, but still found several issues with docker and undefined variables. I posted a proposed fix in the comment below. Could you help test the following:
- With the current PR: launch a normal (multi-node) cluster, re-launch the cluster, test SSH, stop the cluster, check SSH, restart the cluster, check SSH, then terminate the cluster and check SSH.
- With the current PR: launch a (multi-node) cluster with a docker image (example), check SSH, stop and restart, check SSH.
- With the master branch: launch a cluster, switch to the current PR, test sky exec, launch again, check SSH...
- With the master branch: launch a cluster with a docker image, then switch to the current PR.
It would be nice if you can figure out some additional tests to make sure this PR is correct before we can merge this. : )
Running these tests right now
Thanks for the update @Vaibhav2001! The code looks good to me! Once the tests pass, it should be good to ship.
I ran these tests and it seems to be working as intended.
This is very useful @Vaibhav2001!
Description:
This is in relation to #1765
Overview:
This PR changes SSHConfigHelper so that the host details for each cluster are stored in a dedicated file within the .sky/generated/ssh folder. The user's primary SSH configuration file now contains an Include statement that references this folder.
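The layout described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function names, the exact Include glob, and the fields written to each cluster file are all assumptions. The one detail taken from OpenSSH itself is that an Include line placed after a Host block only applies within that block, so the sketch prepends it to the main config.

```python
from pathlib import Path


def ensure_include(main_config: Path, generated_dir: Path) -> None:
    """Prepend an Include line for generated_dir to main_config if missing.

    Hypothetical helper: the glob form `<dir>/*` is an assumption.
    """
    include_line = f"Include {generated_dir}/*\n"
    existing = main_config.read_text() if main_config.exists() else ""
    if include_line in existing:
        return  # Already present, so repeated calls change nothing.
    main_config.parent.mkdir(parents=True, exist_ok=True)
    # Prepend so the Include is not scoped to an earlier Host block.
    main_config.write_text(include_line + "\n" + existing)


def write_cluster_config(generated_dir: Path, cluster_name: str, ip: str,
                         user: str, key_path: str) -> Path:
    """Write one config file for one cluster (hypothetical field set)."""
    generated_dir.mkdir(parents=True, exist_ok=True)
    path = generated_dir / cluster_name
    path.write_text(f"Host {cluster_name}\n"
                    f"  HostName {ip}\n"
                    f"  User {user}\n"
                    f"  IdentityFile {key_path}\n")
    return path
```

With this shape, removing a cluster only means deleting its file; the main config is left alone after the Include line has been added once.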
Backward Compatibility:
For users adding existing clusters, the previous configuration for those clusters will be removed from the main configuration file, and a new configuration file will be created in the .sky/generated/ssh folder.
This ensures that removing clusters that were created prior to this update works as expected.
Testing:
Extensive manual testing was conducted, as the existing tests were found to be inadequate for this specific use case. Testing primarily involved adding and removing clusters using the sky launch and sky down commands, and checking that a file is created and deleted for each cluster accordingly. To verify backward compatibility, clusters were added on the master branch, and then this branch was checked out to confirm the expected behavior.
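The manual checks described above could also be approximated by a small automated check along these lines. It is only a sketch run against a temporary directory, not a test from this PR: the helper names, the file layout, and the way a legacy Host block is recognized (an exact `Host <name>` line, case-sensitive) are all assumptions.

```python
import tempfile
from pathlib import Path


def add_cluster(generated_dir: Path, name: str, ip: str) -> Path:
    """Write one config file per cluster (hypothetical layout)."""
    generated_dir.mkdir(parents=True, exist_ok=True)
    path = generated_dir / name
    path.write_text(f"Host {name}\n  HostName {ip}\n")
    return path


def remove_cluster(generated_dir: Path, name: str) -> None:
    """Delete the cluster's config file; a missing file is not an error."""
    (generated_dir / name).unlink(missing_ok=True)


def strip_legacy_block(main_config: Path, name: str) -> None:
    """Drop a pre-existing `Host <name>` block from the main config."""
    kept, skipping = [], False
    for line in main_config.read_text().splitlines(keepends=True):
        if line.strip().startswith("Host "):
            # A new Host line starts a block; skip it only if it is ours.
            skipping = line.split()[1:] == [name]
        if not skipping:
            kept.append(line)
    main_config.write_text("".join(kept))


def check_lifecycle() -> bool:
    """Simulate: legacy entry exists, migrate it, then remove the cluster."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        main_config = root / "config"
        generated = root / "generated" / "ssh"
        # State left behind by the old behavior: entries in the main config.
        main_config.write_text("Host old-cluster\n  HostName 10.0.0.1\n"
                               "Host unrelated\n  HostName 10.0.0.2\n")
        strip_legacy_block(main_config, "old-cluster")
        path = add_cluster(generated, "old-cluster", "10.0.0.1")
        text = main_config.read_text()
        migrated = ("old-cluster" not in text and "Host unrelated" in text
                    and path.exists())
        remove_cluster(generated, "old-cluster")
        return migrated and not path.exists()
```

A check like this covers only the file bookkeeping; it does not replace the real sky launch and sky down runs, which also exercise actual SSH connectivity.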
Please Review: