Strange CUDA out of memory issue #13

Open
GreatGBL opened this issue Jun 12, 2024 · 4 comments

@GreatGBL

Strange CUDA out of memory issue: my program used to run, but it now fails with an OOM error both locally and on Colab.
Code:

pip install 'automatikz[pdf] @ git+https://github.com/potamides/AutomaTikZ'
!git clone https://github.com/potamides/AutomaTikZ
!pip install -e AutomaTikZ[webui]

from automatikz.infer import TikzGenerator, load
generate = TikzGenerator(*load("nllg/tikz-clima-7b"), stream=True)

OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB. GPU

@potamides (Owner) commented Jun 12, 2024

If it has worked previously then it is indeed strange. Can you try downgrading transformers to 4.28 and/or torch to 2.0?
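In Colab that would be something along these lines (the exact patch versions are just a suggestion on my part):

!pip install "transformers==4.28.1" "torch==2.0.1"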

@GreatGBL (Author)

Thank you for your response. I have tested this, and it's not a version issue, as I successfully ran the 7B model directly.

Here's the comparison between the new versions (transformers 4.41.2 and torch 2.3.1) and the old versions (transformers 4.28 and torch 2.0):

New versions:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 136.00 MiB. GPU

Old versions:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 15.99 GiB total capacity; 5.22 GiB already allocated; 9.48 GiB free; 5.22 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The 13B model is about 40 GB, so on anything smaller than an A100 large parts of the model have to stay in CPU memory and be loaded onto the GPU in chunks. I suspect this step might be where the problem occurs.

Previously, I was able to run the 13B model on a 16GB GPU. Could it have just been luck?
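For completeness, this is how I understand the max_split_size_mb hint from the old error message would be applied; the value 128 is just a guess, and the variable has to be set before anything allocates CUDA memory:

import os

# Must be set before the first CUDA allocation, i.e. before load() runs
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

from automatikz.infer import TikzGenerator, load
generate = TikzGenerator(*load("nllg/tikz-clima-7b"), stream=True)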

@potamides (Owner)

> Previously, I was able to run the 13B model on a 16GB GPU. Could it have just been luck?

Did you maybe load the model with device_map="auto"? That's the way we load it in the web ui.
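Roughly, loading with device_map="auto" looks like the snippet below when using plain transformers. Whether load() in this repo forwards such keywords is something you would have to check, and the CLIMA checkpoints may additionally need the repo's own loading code, so treat this only as a sketch of the mechanism:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" lets accelerate place weights on GPU, CPU and disk as needed,
# which is what makes a large checkpoint fit on a 16 GiB card (at the cost of speed).
model = AutoModelForCausalLM.from_pretrained(
    "nllg/tikz-clima-7b",  # checkpoint name taken from the snippet above
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained("nllg/tikz-clima-7b")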

@GreatGBL (Author)

Thank you for your reply. I would like to ask how to apply device_map="auto". Specifically, should this argument be added directly to the methods of class TextGenerationPipeline(Pipeline) in myenv\lib\python3.10\site-packages\transformers\pipelines\text_generation.py?
