[AWS/LLM] Remove the conda channel that is not working on AWS and fix dependencies in LLMs #3206

Merged
Michaelvll merged 4 commits into master from fix-aws-conda
Feb 22, 2024

@Michaelvll Michaelvll commented Feb 21, 2024

sky launch -c test-mixtral --use-spot ./llm/mixtral/serve.yaml

The command above fails on our current master because conda cannot access the channel "https://aws-ml-conda-ec2.s3.us-west-2.amazonaws.com" when running conda create.

Fixes #3195

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
    • sky launch -c test-mixtral --cloud aws --use-spot ./llm/mixtral/serve.yaml
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: bash tests/backward_comaptibility_tests.sh

@Michaelvll Michaelvll changed the title from "[AWS] Remove the conda channel that is not working on AWS" to "[AWS/LLM] Remove the conda channel that is not working on AWS and fix dependencies in LLMs" Feb 21, 2024

@romilbhardwaj romilbhardwaj left a comment

Thanks @Michaelvll! BTW this might also be a signal that we should update our gcp base images soon (gcloud compute images list --project deeplearning-platform-release --format="value(NAME)" --no-standard-images no longer lists common-cpu-v20231105-debian-11-py310 as an image)...
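
The image check mentioned above can be scripted. This is a minimal sketch that assumes an authenticated gcloud CLI; `image_listed` is a hypothetical helper name and is not part of SkyPilot.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if the exact image name appears as a
# full line in the list read from stdin, and non-zero otherwise.
image_listed() {
  grep -Fxq -- "$1"
}

# Usage (requires gcloud; shown as a comment so the block stays side-effect free):
#   gcloud compute images list \
#     --project deeplearning-platform-release \
#     --format="value(NAME)" --no-standard-images \
#     | image_listed common-cpu-v20231105-debian-11-py310 \
#     && echo "still listed" || echo "no longer listed"
```

Matching whole lines (`-x`) with a fixed string (`-F`) avoids false positives from image names that share a prefix.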

Comment on lines +153 to +154
# Line "conda config --remove channels": remove the default channel set in the default AWS image as it cannot be accessed.
# UnavailableInvalidChannel: HTTP 404 NOT FOUND for channel <https://aws-ml-conda-ec2.s3.us-west-2.amazonaws.com>

Do we know if this is required even for newer images, such as
common-cpu-v20240128-debian-11-py310? Can leave a TODO here to check and remove this line in the future.

Good point! Since this line is for AWS, I tried the latest AWS Deep Learning image, and it also has this channel in its default conda config. But indeed, we should update the AMIs.
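
For reference, the removal discussed in this thread could be guarded so that it becomes a no-op on images that do not ship the channel. This is a sketch, not the code merged in this PR. `has_channel` and `remove_broken_channel` are hypothetical names, and the sketch assumes the channel is set in the user-level conda config under exactly this URL.

```shell
#!/usr/bin/env bash
# Channel URL taken from the UnavailableInvalidChannel error quoted in this PR.
BROKEN_CHANNEL="https://aws-ml-conda-ec2.s3.us-west-2.amazonaws.com"

# Hypothetical helper: returns 0 if the channel list on stdin mentions
# the given channel as a fixed substring.
has_channel() {
  grep -Fq -- "$1"
}

# Hypothetical helper: remove the channel only when conda is installed and
# the channel is actually configured, so newer images without it are untouched.
remove_broken_channel() {
  if ! command -v conda >/dev/null 2>&1; then
    echo "conda not found; nothing to do"
    return 0
  fi
  if conda config --show channels | has_channel "$BROKEN_CHANNEL"; then
    # Assumption: the entry is in the user-level config and matches this URL exactly.
    conda config --remove channels "$BROKEN_CHANNEL"
  else
    echo "channel not configured; nothing to do"
  fi
}
```

Calling `remove_broken_channel` before `conda create` would then be safe on both old and new images, which could make a later TODO cleanup less urgent.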

@Michaelvll Michaelvll merged commit 9d05415 into master Feb 22, 2024
19 checks passed
@Michaelvll Michaelvll deleted the fix-aws-conda branch February 22, 2024 07:00
Development

Successfully merging this pull request may close these issues.

Creating user-level Ray on AWS doesn't seem to work
2 participants