Status: Open
Labels: bug (Something isn't working)
Description
System Info
- CPU architecture: x86_64
- Host memory: 256 GB
- GPU
  - Name: NVIDIA A30
  - Memory: 24 GB
- Libraries
  - TensorRT-LLM: v0.11.0
  - TensorRT: 10.1.0
  - CUDA: 12.6
- NVIDIA driver: 560.28.03
- Linux: Ubuntu 22.04
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
- Check out the v0.11.0 tag
- Install the Python requirements
- Build the GPT-J 6B engine, following the example
Expected behavior
A successful build
Actual behavior
```
ubuntu$ python examples/gptj/convert_checkpoint.py --model_dir=gpt-j-6b --output_dir=gpt-j-6b/trt
[TensorRT-LLM] TensorRT-LLM version: 0.11.0
0.11.0
Weights loaded. Total time: 00:00:12
Traceback (most recent call last):
  File "/h/pcoppock/data/mlos/apps/triton/../../third-party/tensorrtllm_backend/tensorrt_llm/examples/gptj/convert_checkpoint.py", line 382, in <module>
    main()
  File "/h/pcoppock/data/mlos/apps/triton/../../third-party/tensorrtllm_backend/tensorrt_llm/examples/gptj/convert_checkpoint.py", line 358, in main
    covert_and_save(rank)
  File "/h/pcoppock/data/mlos/apps/triton/../../third-party/tensorrtllm_backend/tensorrt_llm/examples/gptj/convert_checkpoint.py", line 353, in covert_and_save
    safetensors.torch.save_file(
  File "/data/pcoppock/mlos/.venv/lib/python3.10/site-packages/safetensors/torch.py", line 286, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
  File "/data/pcoppock/mlos/.venv/lib/python3.10/site-packages/safetensors/torch.py", line 496, in _flatten
    return {
  File "/data/pcoppock/mlos/.venv/lib/python3.10/site-packages/safetensors/torch.py", line 500, in <dictcomp>
    "data": _tobytes(v, k),
  File "/data/pcoppock/mlos/.venv/lib/python3.10/site-packages/safetensors/torch.py", line 414, in _tobytes
    raise ValueError(
ValueError: You are trying to save a non contiguous tensor: `lm_head.weight` which is not allowed. It either means you are trying to save tensors which are reference of each other in which case it's recommended to save only the full tensors, and reslice at load time, or simply call `.contiguous()` on your tensor to pack it before saving.
(.venv) ubuntu$
```
Checkpoint conversion fails with the error "You are trying to save a non contiguous tensor: `lm_head.weight` ...".
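For context on the failure mode: `safetensors` refuses to serialize tensors whose storage is not packed (e.g. a transposed or sliced view of another tensor), and the error message itself suggests calling `.contiguous()` on the tensor before saving. The sketch below illustrates the same situation with NumPy instead of torch (a transpose is a non-contiguous view; `np.ascontiguousarray` plays the role of torch's `.contiguous()`); it is an analogy, not the actual `convert_checkpoint.py` code path.

```python
import numpy as np

# A transposed view shares storage with the original array and is not
# C-contiguous -- the same property safetensors rejects for `lm_head.weight`.
w = np.arange(6, dtype=np.float32).reshape(2, 3)
view = w.T
print(view.flags["C_CONTIGUOUS"])  # False

# Packing the view into fresh, contiguous storage (analogous to calling
# tensor.contiguous() in torch before safetensors.torch.save_file) fixes it.
packed = np.ascontiguousarray(view)
print(packed.flags["C_CONTIGUOUS"])  # True
```

A plausible workaround, then, is to make `lm_head.weight` contiguous in the checkpoint dict before the `save_file` call, though whether that masks an upstream bug in the GPT-J conversion is for the maintainers to judge.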
Additional notes
Conversion of Llama weights succeeds without error.