CUDA out of memory error. Device: NVIDIA A100 #3

Closed

Little-rookie-ee opened this issue Aug 1, 2023 · 1 comment
Error log:
```
Traceback (most recent call last):
  File "pred.py", line 68, in <module>
    preds = get_pred(model, tokenizer, data, max_length, max_gen, prompt_format, dataset, device)
  File "pred.py", line 27, in get_pred
    output = model.generate(
  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/generation/utils.py", line 1522, in generate
    return self.greedy_search(
  File "/opt/conda/lib/python3.8/site-packages/transformers/generation/utils.py", line 2339, in greedy_search
    outputs = self(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 932, in forward
    transformer_outputs = self.transformer(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 828, in forward
    hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 638, in forward
    layer_ret = layer(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 542, in forward
    attention_output, kv_cache = self.self_attention(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 439, in forward
    context_layer = self.core_attention(query_layer, key_layer, value_layer, attention_mask)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 278, in forward
    attention_scores = attention_scores.masked_fill(attention_mask, float("-inf"))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 18.79 GiB (GPU 0; 79.20 GiB total capacity; 49.97 GiB already allocated; 9.89 GiB free; 68.81 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
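For scale, the failing 18.79 GiB allocation is consistent with materializing the full attention-score matrix at modeling_chatglm.py:278. A rough back-of-envelope (assuming fp32 scores and ChatGLM2-6B's 32 attention heads; the sequence length is a guess) lands in the same range:

```python
# Rough estimate of the attention_scores tensor, shape [batch, heads, seq, seq].
# All values below are assumptions, not read from the log.
batch, num_heads, seq_len, bytes_per_el = 1, 32, 12000, 4  # fp32
gib = batch * num_heads * seq_len**2 * bytes_per_el / 2**30
print(f"{gib:.1f} GiB")  # ~17.2 GiB -- same order as the 18.79 GiB in the error
```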

The only thing I changed is the data-loading part of pred.py: I downloaded the data from Hugging Face and unzipped it.
Device: NVIDIA A100, memory: 80 GB.
Is this a problem with the device?
The data-loading code and the data path are shown below:
[screenshots: data-loading code and data path]
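As for the max_split_size_mb hint in the error message: for reference, it is set through an environment variable before torch first touches CUDA. It only mitigates fragmentation; it will not rescue an allocation that genuinely exceeds free memory:

```python
import os
# Must be set before the first CUDA allocation; caps the block size the
# caching allocator will split, which reduces fragmentation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
import torch  # imported after the env var on purpose
```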

Little-rookie-ee (Author) commented Aug 1, 2023

I solved the problem by setting max_length = 10240 in pred.py; the original value was too large for my device.
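For anyone hitting the same thing, a sketch of the kind of truncation pred.py applies around max_length. The exact code may differ; the middle-truncation below is the pattern LongBench-style prediction scripts use, and the variable names here are illustrative:

```python
max_length = 10240  # reduced from the default; this fits on my 80 GB A100

# Truncate from the middle so both the instruction head and the question
# tail survive (illustrative; check your own pred.py for the exact logic).
input_ids = tokenizer(prompt, truncation=False, return_tensors="pt").input_ids[0]
if len(input_ids) > max_length:
    half = max_length // 2
    prompt = (tokenizer.decode(input_ids[:half], skip_special_tokens=True)
              + tokenizer.decode(input_ids[-half:], skip_special_tokens=True))
```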
