Open
Labels
bug: Something isn't working
Description
System Info
8 x 40 GB A100s
Llama-3 70B Instruct, bf16 TP-8
TensorRT-LLM 0.9.0 + Triton 24.04
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Set min_length to a high value (~512) and ask for a short answer in the prompt.
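A minimal sketch of such a request, for anyone trying to reproduce this. It assumes the model is served through Triton's HTTP generate endpoint and that the deployed model exposes `text_input`, `max_tokens`, and `min_length` inputs. The URL, the model name `ensemble`, the prompt, and the helper names `build_payload` and `send` are illustrative assumptions, not details taken from this report.

```python
import json
import urllib.request

# Assumed endpoint: host, port, and model name depend on your deployment.
TRITON_URL = "http://localhost:8000/v2/models/ensemble/generate"


def build_payload(prompt: str, min_length: int, max_tokens: int) -> dict:
    """Build a JSON body for the generate endpoint (field names are assumptions)."""
    if max_tokens < min_length:
        # A cap below min_length could never satisfy the minimum.
        raise ValueError("max_tokens must be >= min_length")
    return {
        "text_input": prompt,
        "max_tokens": max_tokens,
        "min_length": min_length,
    }


def send(payload: dict, url: str = TRITON_URL) -> dict:
    """POST the payload as JSON and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)


# A prompt that invites a short answer, combined with a high min_length.
payload = build_payload(
    "Answer in one word: what is the capital of France?",
    min_length=512,
    max_tokens=1024,
)
# result = send(payload)  # requires a running server; uncomment to try it
```

Counting the returned tokens needs the model's tokenizer, since the response carries text rather than token IDs, so that step is left out here.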
Expected behavior
At least 512 tokens returned.
actual behavior
Few tokens returned.
additional notes
N/A