single 3090 OOM #8

Open · ljdavns opened this issue Feb 7, 2023 · 1 comment
ljdavns commented Feb 7, 2023

Running the original CodeGeeX with this script fails (out of memory on a 3900X (24 cores) + 32 GB RAM + a single 3090):

# With quantization (with more than 15GB RAM)
bash ./scripts/test_inference_quantized.sh <GPU_ID> ./tests/test_prompt.txt

So I switched to codegeex-fastertransformer, but it still seems to OOM:

Traceback (most recent call last):
  File "api.py", line 105, in <module>
    if not codegeex.load(ckpt_path=args.ckpt_path):
  File "/workspace/codegeex-fastertransformer/examples/pytorch/codegeex/utils/codegeex.py", line 413, in load
    self.cuda()
  File "/workspace/codegeex-fastertransformer/examples/pytorch/codegeex/utils/codegeex.py", line 430, in cuda
    self.weights._map(lambda w: w.contiguous().cuda(self.device))
  File "/workspace/codegeex-fastertransformer/examples/pytorch/codegeex/utils/codegeex.py", line 177, in _map
    w[i] = func(w[i])
  File "/workspace/codegeex-fastertransformer/examples/pytorch/codegeex/utils/codegeex.py", line 430, in <lambda>
    self.weights._map(lambda w: w.contiguous().cuda(self.device))
RuntimeError: CUDA out of memory. Tried to allocate 200.00 MiB (GPU 0; 24.00 GiB total capacity; 23.11 GiB already allocated; 0 bytes free; 23.11 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
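
For context, a quick back-of-envelope check (assuming the published CodeGeeX-13B parameter count) shows why the fp16 weights alone roughly fill a 24 GiB card, and why the max_split_size_mb hint in the error message is unlikely to help here:

params = 13e9                    # assumed: CodeGeeX-13B parameter count
fp16_gib = params * 2 / 1024**3  # 2 bytes per parameter in fp16
int8_gib = params * 1 / 1024**3  # 1 byte per parameter in int8
print(f"fp16 weights: {fp16_gib:.1f} GiB")  # ~24.2 GiB, does not fit a 24 GiB 3090
print(f"int8 weights: {int8_gib:.1f} GiB")  # ~12.1 GiB, fits with headroom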

ljdavns changed the title from "single 3090 seems OOM" to "single 3090 OOM" on Feb 7, 2023
gramster commented

The original one fails due to system memory; it needs more than 64 GB of RAM. For this one you have to quantize to int8. It would be nice if there were instructions for that.
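
For illustration, here is a minimal sketch of the idea behind int8 weight quantization (symmetric, one scale per output row). This is a generic PyTorch sketch, not the FasterTransformer or CodeGeeX implementation, and the function names are made up for the example:

import torch

def quantize_int8(w: torch.Tensor):
    # Keep int8 weights plus one fp16 scale per output row,
    # halving memory relative to fp16 weights.
    scale = w.float().abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 127.0
    q = torch.clamp(torch.round(w.float() / scale), -127, 127).to(torch.int8)
    return q, scale.half()

def dequantize_int8(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Recover an approximate fp16 weight; real int8 kernels operate on
    # the quantized weights directly instead of dequantizing up front.
    return q.to(torch.float16) * scale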
