[Ported][CI/Build] Share HuggingFace downloads between test runs #4874

Closed
wants to merge 22 commits into from
DarkLight1337
Member

@DarkLight1337 DarkLight1337 commented May 17, 2024

The models tests keep getting interrupted (presumably due to running too long). This PR attempts to reduce the running time via:

  • Sharing the HuggingFace cache between Kubernetes containers during CI by storing it in a hostPath volume.
  • Disabling graph construction (considering that the vLLM model is only run once per test, not including the profile run).
  • Reusing the HuggingFace cache between Docker containers via a shared volume.
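
As a rough illustration of the two cache-sharing ideas above, here is a minimal sketch that builds a Pod manifest with a hostPath volume and the equivalent `docker run` arguments. The host directory, mount point, volume name, image, and test command are all assumptions made for the example and are not taken from the actual CI configuration; `HF_HOME` is the environment variable HuggingFace libraries read to locate their cache.

```python
import json

# Assumed paths for illustration only; the real CI may use different ones.
HOST_CACHE_DIR = "/mnt/hf-cache"
CONTAINER_CACHE_DIR = "/root/.cache/huggingface"


def pod_spec_with_hf_cache(image: str, command: list[str]) -> dict:
    """Build a Pod manifest that mounts a hostPath volume as the HF cache."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"generateName": "models-test-"},  # hypothetical name
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "test",
                "image": image,
                "command": command,
                # Point HuggingFace libraries at the mounted directory.
                "env": [{"name": "HF_HOME", "value": CONTAINER_CACHE_DIR}],
                "volumeMounts": [
                    {"name": "hf-cache", "mountPath": CONTAINER_CACHE_DIR}
                ],
            }],
            "volumes": [{
                "name": "hf-cache",
                # hostPath keeps the data on the node, so it survives the Pod.
                "hostPath": {"path": HOST_CACHE_DIR, "type": "DirectoryOrCreate"},
            }],
        },
    }


def docker_run_args(image: str, command: list[str]) -> list[str]:
    """Build `docker run` arguments that bind-mount the same host directory."""
    return [
        "docker", "run", "--rm",
        "-v", f"{HOST_CACHE_DIR}:{CONTAINER_CACHE_DIR}",
        "-e", f"HF_HOME={CONTAINER_CACHE_DIR}",
        image, *command,
    ]


if __name__ == "__main__":
    # Hypothetical image and test command.
    print(json.dumps(pod_spec_with_hf_cache("vllm-test:latest", ["pytest", "tests/models"]), indent=2))
    print(" ".join(docker_run_args("vllm-test:latest", ["pytest", "tests/models"])))
```

Note that a hostPath volume is local to a node, so the cache is only reused when a later run is scheduled onto the same machine.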

@DarkLight1337 DarkLight1337 marked this pull request as draft May 17, 2024 03:00
@DarkLight1337
Member Author

Using eager mode doesn't seem to lead to a significant improvement. The bottleneck seems to be downloading the models, so we should parallelize that step.
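
For illustration, parallelizing the downloads could look roughly like the sketch below. `prefetch_models` is a hypothetical helper, not something in this PR; the `download` callable is injected so the sketch stays self-contained, and in practice it could be something like `huggingface_hub.snapshot_download`, which takes a repo ID and returns the local path.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def prefetch_models(
    repo_ids: Iterable[str],
    download: Callable[[str], str],
    max_workers: int = 4,
) -> dict[str, str]:
    """Download each repo concurrently and return {repo_id: local_path}."""
    # De-duplicate while preserving order so no model is fetched twice.
    ids = list(dict.fromkeys(repo_ids))
    if not ids:
        return {}
    # Downloads are I/O-bound, so threads are enough to overlap them.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        paths = list(pool.map(download, ids))
    return dict(zip(ids, paths))
```

Any exception raised by a single download propagates out of `prefetch_models`, so a failed fetch still fails the run instead of being silently skipped.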

@DarkLight1337
Member Author

DarkLight1337 commented May 17, 2024

Tbh it is probably better if we have a way to avoid re-downloading the models each time. Any thoughts?

@rkooo567 rkooo567 self-assigned this May 17, 2024
@DarkLight1337
Member Author

DarkLight1337 commented May 21, 2024

I'm not that experienced in Kubernetes but from my understanding, placing the HuggingFace cache inside a Volume should avoid the need to redownload the models when tests are run again in the same Pod.

@rkooo567 is it possible on your end to force the CI to run on the same Pod so we can test whether the cache actually works in this way?
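
One way to check whether the cache is actually being reused would be to look for the model in the cache directory before the tests start. The sketch below assumes the standard HuggingFace hub layout (`<HF_HOME>/hub/models--<org>--<name>/snapshots/<revision>`) and ignores other overrides such as `HF_HUB_CACHE`; the function names are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional


def hub_cache_dir(hf_home: Optional[str] = None) -> Path:
    """Resolve the hub cache directory from an explicit path or HF_HOME."""
    home = (
        hf_home
        or os.environ.get("HF_HOME")
        or os.path.expanduser("~/.cache/huggingface")
    )
    return Path(home) / "hub"


def is_model_cached(repo_id: str, hf_home: Optional[str] = None) -> bool:
    """Return True if at least one snapshot of repo_id exists in the cache."""
    # The hub stores "org/name" as a folder named "models--org--name".
    repo_dir = hub_cache_dir(hf_home) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())
```

Logging the result of `is_model_cached` at the start of a CI job would show whether a given run started with a warm or cold cache. It only checks that a snapshot folder exists, not that every file in it is complete.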

@rkooo567
Collaborator

Hmm I am not super familiar with how CI works actually (idk if we even use k8s under the hood). cc @simon-mo for thoughts..

@DarkLight1337 DarkLight1337 requested a review from khluu June 5, 2024 00:22
@DarkLight1337
Member Author

DarkLight1337 commented Jun 5, 2024

@khluu since you're involved with CI, can you help out with this? Particularly the part concerning Kubernetes.

@simon-mo
Collaborator

simon-mo commented Jun 5, 2024

I believe we should download the model each time. @robertgshaw2-neuralmagic mentioned that putting them on NFS is a bit tricky because it might hit the rate limit.

@simon-mo
Collaborator

simon-mo commented Jun 5, 2024

hostPath is a possible workaround.

@rkooo567
Collaborator

rkooo567 commented Jun 5, 2024

> Disabling graph construction (considering that the vLLM model is only run once per test, not including the profile run).

Hmm I am against this. Imo we should test the default config for those tests (especially the test_model)

@DarkLight1337
Member Author

DarkLight1337 commented Jun 19, 2024

I have updated this PR to work on the AWS pipeline.

Looks like this shaved around 10 minutes off the duration of model tests. Going to rerun the test just to be sure.

@DarkLight1337 DarkLight1337 changed the title from "[Draft][CI/Build] Optimize models tests" to "[Draft][CI/Build] Avoid re-downloading models from HuggingFace" Jun 19, 2024
@DarkLight1337
Member Author

DarkLight1337 commented Jun 19, 2024

> Looks like this shaved around 10 minutes off the duration of model tests. Going to rerun the test just to be sure.

This doesn't seem to be the case anymore. It's hard to determine the real effect since the test runs aren't necessarily performed on the same machine (from my understanding).

@DarkLight1337
Member Author

Due to #5757, I have moved this PR to vllm-project/ci-infra#8.

@DarkLight1337 DarkLight1337 changed the title from "[Draft][CI/Build] Avoid re-downloading models from HuggingFace" to "[Ported][CI/Build] Avoid re-downloading models from HuggingFace" Jun 25, 2024
@DarkLight1337 DarkLight1337 changed the title from "[Ported][CI/Build] Avoid re-downloading models from HuggingFace" to "[Ported][CI/Build] Share HuggingFace downloads between test runs" Jun 25, 2024
@DarkLight1337 DarkLight1337 deleted the optimize-models-tests branch June 27, 2024 09:26
3 participants