Add CudaDevicePlacementMixin class #436

Merged
merged 11 commits into from
Mar 19, 2024


gabrielmbmb
Member

@gabrielmbmb gabrielmbmb commented Mar 18, 2024

Description

This PR adds the CudaDevicePlacementMixin class and updates vLLM, TransformersLLM and LlamaCppLLM to use it. The class accepts a list of cuda_devices (device ids), which is then used to set the CUDA_VISIBLE_DEVICES environment variable for each process. Each LLM therefore sees only the devices listed in its cuda_devices, so the user can choose which CUDA devices each LLM uses.
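To illustrate the idea, here is a minimal sketch of how a class could turn a list of device ids into CUDA_VISIBLE_DEVICES. The class name `CudaDevicePlacementSketch` and the method `set_cuda_visible_devices` are hypothetical and are not the code added by this PR; only `cuda_devices` and the environment variable come from the description above.

```python
import os
from typing import List


class CudaDevicePlacementSketch:
    """Hypothetical stand-in for the mixin; not the implementation in this PR."""

    def __init__(self, cuda_devices: List[int]) -> None:
        self.cuda_devices = cuda_devices

    def set_cuda_visible_devices(self) -> None:
        # Should run inside the worker process, before any CUDA library is
        # initialised, so that the process only sees the listed devices.
        os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(
            str(device_id) for device_id in self.cuda_devices
        )


llm = CudaDevicePlacementSketch(cuda_devices=[0, 2])
llm.set_cuda_visible_devices()
print(os.environ["CUDA_VISIBLE_DEVICES"])  # 0,2
```

Because each step runs in its own process, setting the variable there should not affect the devices visible to other LLMs.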

In addition, this mixin has an auto mode, which assigns one of the CUDA devices to each LLM while avoiding overlap, i.e. loading more than one model on the same device. It's a bit naive, but we can improve it in the future to take the model size and other factors into account.
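A naive auto assignment of this kind could look roughly like the sketch below. The helper `assign_free_device` and the `device_map` structure are assumptions made for illustration and are not taken from the PR; in a multi-process pipeline the mapping would have to be shared between processes, whereas a plain dict is used here to keep the example self-contained.

```python
from typing import List, MutableMapping


def assign_free_device(
    llm_name: str,
    device_map: MutableMapping[str, List[int]],
    available_devices: List[int],
) -> int:
    """Claim the first device id not yet used by another LLM (hypothetical helper)."""
    # Collect every device id already claimed by some LLM.
    taken = {d for devices in device_map.values() for d in devices}
    for device_id in available_devices:
        if device_id not in taken:
            device_map[llm_name] = [device_id]
            return device_id
    raise RuntimeError(f"No free CUDA device left for '{llm_name}'")


device_map: dict = {}
first = assign_free_device("llm_a", device_map, available_devices=[0, 1])
second = assign_free_device("llm_b", device_map, available_devices=[0, 1])
print(first, second)  # 0 1
```

The sketch ignores model size and free memory, which matches the "naive" caveat above: it only guarantees that two LLMs never claim the same device.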

Apart from that this PR also changes:

  • When creating the multiprocessing.Pool, only len(self.dag) processes are created, which avoids spawning processes that won't do anything.
  • Fixes an error in which information about the batches generated by leaf steps wasn't serialized by the _BatchManager, because we serialized only when adding a batch and not when registering it.
  • Removes the unneeded call to self._stop when registering the stop signal for the pipeline, which paves the way for a more graceful stop that will be tackled in another PR.
  • Removes unnecessary Locks that were being used to synchronize access to the manager.dicts (manager.dict already handles that).

@gabrielmbmb gabrielmbmb changed the title from "Add CUDALLM base class" to "Add CudaDevicePlacementMixin class" Mar 19, 2024
@gabrielmbmb gabrielmbmb self-assigned this Mar 19, 2024
@gabrielmbmb gabrielmbmb added the enhancement New feature or request label Mar 19, 2024
@gabrielmbmb gabrielmbmb added this to the 1.0.0 milestone Mar 19, 2024
@gabrielmbmb gabrielmbmb marked this pull request as ready for review March 19, 2024 13:31
@gabrielmbmb gabrielmbmb merged commit 11139bf into core-refactor Mar 19, 2024
4 checks passed
@gabrielmbmb gabrielmbmb deleted the cudallm branch March 19, 2024 14:02
Labels
enhancement New feature or request
Projects
Status: Done