
Key 'lora_config' not found #506

@LanceB57

Description


System Info

I'm using Ubuntu 22.04 and 8x NVIDIA H100s.

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Follow this blog, but with Meta's Llama 3 70B Instruct model and adjusting for 8 GPUs.

Expected behavior

I went through the blog about two weeks ago with Meta's Llama 3 70B model (again, adjusting for the different model and 8 GPUs), and it finished fine; I was able to host and query the server. I'd expect the same to happen with the Instruct model.

Actual behavior

Instead, I'm getting the following message when I try to run launch_triton_server.py:

backend_model.cc:691] ERROR: Failed to create instance: unexpected error when creating modelInstanceState: [json.exception.out_of_range.403] key 'lora_config' not found

I'm not really sure why I'm getting this error message now. Where is this lora_config supposed to be located, and why am I seeing different behavior than with the regular non-Instruct model?
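For what it's worth, `json.exception.out_of_range.403` is the error the nlohmann/json library throws when `.at()` is called with a key that doesn't exist, so the backend appears to be reading a JSON config that simply has no `lora_config` entry. A minimal sketch for checking whether an engine's `config.json` contains the key before launching the server (the candidate key locations here are assumptions, not confirmed from the backend source):

```python
import json

def has_lora_config(config_path):
    """Return True if the engine config JSON contains a 'lora_config' key,
    checking the top level and a nested 'build_config' section (both
    locations are guesses based on the error message)."""
    with open(config_path) as f:
        cfg = json.load(f)
    return "lora_config" in cfg or "lora_config" in cfg.get("build_config", {})
```

If the key is absent, that would point to the engine having been built by a TensorRT-LLM version older (or newer) than what the deployed backend expects.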

Additional notes

I've gone in circles trying to get this working for the Instruct model (I haven't retried the non-Instruct model since two weeks ago; maybe it won't work now either). I first tried on my own without following the blog, but kept hitting errors, only to realize that the current TensorRT-LLM and tensorrtllm_backend versions are incompatible. Even after pinning the versions suggested by the blog, I still can't get things to work.
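Since the recurring problem seems to be version skew between TensorRT-LLM and tensorrtllm_backend, it may help to record the installed TensorRT-LLM version before each rebuild and compare it against the tag the backend checkout expects. A small diagnostic sketch (it assumes a `python3` environment; the fallback message is just illustrative):

```shell
# Print the installed TensorRT-LLM Python package version, if importable
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)" \
  || echo "tensorrt_llm not importable in this environment"
```

Matching this output against the git tag checked out in the tensorrtllm_backend repo would confirm (or rule out) a mismatch before rebuilding the engine.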

Metadata

Labels: bug (Something isn't working)