Add `CudaDevicePlacementMixin` class #436

gabrielmbmb · 2024-03-18T16:58:50Z

Description

This PR adds the `CudaDevicePlacementMixin` class and updates `vLLM`, `TransformersLLM` and `LlamaCppLLM` to use it. The class accepts a list of `cuda_devices` (the ids of the devices), which is then used to set the `CUDA_VISIBLE_DEVICES` environment variable in each process. This way, each LLM only sees the devices listed in its `cuda_devices`, so the user can choose which CUDA devices each LLM uses.
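As a rough illustration of the mechanism described above, the sketch below turns a list of device ids into a `CUDA_VISIBLE_DEVICES` value. The class name `DevicePlacementSketch` and the method name `set_cuda_visible_devices` are hypothetical and chosen for this example only; the real mixin in this PR may be structured differently.

```python
import os
from typing import List


class DevicePlacementSketch:
    """Hypothetical stand-in for the mixin; not the code added by this PR."""

    def __init__(self, cuda_devices: List[int]) -> None:
        self.cuda_devices = cuda_devices

    def set_cuda_visible_devices(self) -> None:
        # The variable only takes effect if it is set in the process that
        # loads the model, before any CUDA library is initialized there.
        os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(
            str(device_id) for device_id in self.cuda_devices
        )


placement = DevicePlacementSketch(cuda_devices=[0, 2])
placement.set_cuda_visible_devices()
```

Because each step runs in its own process, setting the variable there restricts only that process, which is what lets different LLMs be pinned to different devices.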

In addition, the mixin has an `auto` mode, which assigns a different CUDA device to each LLM to avoid overlap, i.e. loading more than one model on the same device. It's a bit naive, but we can improve it in the future to take into account the model size and other factors.
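The naive assignment can be pictured as "take the first device nobody else has claimed". The sketch below is an assumption about how such a strategy could look, not the implementation in this PR: `assign_free_device` is a hypothetical helper, the device count is passed in rather than queried from the driver, and a plain `dict` stands in for whatever shared mapping the processes would use.

```python
from typing import Dict, List


def assign_free_device(
    device_map: Dict[str, List[int]], llm_name: str, device_count: int
) -> int:
    """Reserve the first device id that no other LLM has claimed (hypothetical helper)."""
    # Collect every device id already reserved by some LLM.
    taken = {
        device_id for devices in device_map.values() for device_id in devices
    }
    for device_id in range(device_count):
        if device_id not in taken:
            device_map[llm_name] = [device_id]
            return device_id
    raise RuntimeError(
        f"No free CUDA device left for '{llm_name}' ({device_count} in total)."
    )


device_map: Dict[str, List[int]] = {}
first = assign_free_device(device_map, "llm_a", device_count=2)
second = assign_free_device(device_map, "llm_b", device_count=2)
```

This ignores how much memory each model needs, which is the limitation the description calls naive.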

Apart from that, this PR also:

- Creates only `len(self.dag)` processes when creating the `multiprocessing.Pool`, avoiding processes that won't do anything.
- Fixes an error in which the information of the batches generated by leaf steps wasn't being serialized by the `_BatchManager`, as we were only serializing when adding a batch and not when registering it too.
- Removes the unneeded call to `self._stop` when registering the stop signal for the pipeline, paving the way for a more graceful stop that will be tackled in another PR.
- Removes unnecessary `Lock`s that were being used to synchronize access to the `manager.dict`s (`manager.dict` already handles that).

gabrielmbmb added 7 commits March 18, 2024 13:24

Add CUDALLM class

bb505c3

Fix CUDA error

e44bd3e

Merge branch 'core-refactor' into cudallm

b8b3092

Fix _cache calls

b6021a5

Add pynvml dep

947d96e

Rename to CudaDevicePlacementMixin

0dbeaf3

Add unit tests

00741d7

gabrielmbmb changed the title ~~Add CUDALLM base class~~ Add CudaDevicePlacementMixin class Mar 19, 2024

Make pynvml optional

8ac28f5

gabrielmbmb requested review from alvarobartt and plaguss March 19, 2024 13:16

gabrielmbmb self-assigned this Mar 19, 2024

gabrielmbmb added the enhancement label Mar 19, 2024

gabrielmbmb added this to the 1.0.0 milestone Mar 19, 2024

Mock pynvml package

efce579

gabrielmbmb marked this pull request as ready for review March 19, 2024 13:31

gabrielmbmb added 2 commits March 19, 2024 13:44

Add raising ImportError if cuda_devices == "auto"

dba6477

Fix unit test

5a27cd8

gabrielmbmb merged commit 11139bf into core-refactor Mar 19, 2024
4 checks passed

gabrielmbmb deleted the cudallm branch March 19, 2024 14:02
