
Key 'lora_config' not found #506

@LanceB57

Description


System Info

I'm using Ubuntu 22.04 and 8x NVIDIA H100s.

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Follow this blog, but with Meta's Llama 3 70B Instruct model and adjusting for 8 GPUs.

Expected behavior

I went through the blog about two weeks ago with Meta's Llama 3 70B model (again, adjusting for the different model and 8 GPUs), and it finished fine; I was able to host and query the server. I'd expect the same to happen with the Instruct model.

Actual behavior

Instead, I'm getting the following message when I try to run launch_triton_server.py:

backend_model.cc:691] ERROR: Failed to create instance: unexpected error when creating modelInstanceState: [json.exception.out_of_range.403] key 'lora_config' not found

I'm not really sure why I'm getting this error message now. Where is this lora_config supposed to be located, and why am I seeing different behavior than with the regular non-Instruct model?
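For what it's worth, `json.exception.out_of_range.403` is the error the nlohmann/json library throws when `.at()` is called with a key that doesn't exist, so the backend appears to be reading a JSON config that simply has no `lora_config` entry. A minimal sketch for checking whether an engine's `config.json` contains the key before launching the server (the candidate key locations here are assumptions, not confirmed from the backend source):

```python
import json

def has_lora_config(config_path):
    """Return True if the engine config JSON contains a 'lora_config' key,
    checking the top level and a nested 'build_config' section (both
    locations are guesses based on the error message)."""
    with open(config_path) as f:
        cfg = json.load(f)
    return "lora_config" in cfg or "lora_config" in cfg.get("build_config", {})
```

If the key is absent, that would point to the engine having been built by a TensorRT-LLM version older (or newer) than what the deployed backend expects.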

Additional notes

I've gone in circles trying to get this working for the Instruct model (I haven't retried the non-Instruct model since two weeks ago; maybe it won't work now either). I first tried on my own without following the blog, but kept hitting errors, only to realize that the current TensorRT-LLM and tensorrtllm_backend versions are incompatible. Even after pinning the versions suggested by the blog, I still can't get things to work.
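Since the recurring problem seems to be version skew between TensorRT-LLM and tensorrtllm_backend, it may help to record the installed TensorRT-LLM version before each rebuild and compare it against the tag the backend checkout expects. A small diagnostic sketch (it assumes a `python3` environment; the fallback message is just illustrative):

```shell
# Print the installed TensorRT-LLM Python package version, if importable
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)" \
  || echo "tensorrt_llm not importable in this environment"
```

Matching this output against the git tag checked out in the tensorrtllm_backend repo would confirm (or rule out) a mismatch before rebuilding the engine.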

Metadata

Labels: bug (Something isn't working)