10s latency of lora inference caused by None base_model_name_or_path in adapter_config #430

Closed

4 tasks

thincal opened this issue Apr 21, 2024 · 0 comments · Fixed by #431

Contributor

thincal commented Apr 21, 2024

System Info

lorax: main or v0.9.0

Information

Docker
The CLI directly

Tasks

An officially supported command
My own modifications

Reproduction

1. Launch LoRAX.
2. Prepare a set of LoRA weights whose adapter_config has a null base_model_name_or_path.
3. Run inference with this adapter. There is a delay of about 10 seconds before the first token is returned.
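For anyone trying to reproduce this, the adapter from the second step can be approximated with a small script. This is only a sketch: the directory name, the helper name `write_adapter_config_with_null_base`, and the LoRA hyperparameters are assumptions for illustration and are not taken from the original report. The only detail that matters is that `base_model_name_or_path` is serialized as JSON `null`.

```python
import json
import tempfile
from pathlib import Path


def write_adapter_config_with_null_base(adapter_dir: Path) -> Path:
    """Write a minimal adapter_config.json whose base_model_name_or_path is null.

    The other fields are illustrative placeholder values (assumptions),
    not the configuration used in the original report.
    """
    adapter_dir.mkdir(parents=True, exist_ok=True)
    config = {
        "peft_type": "LORA",
        "base_model_name_or_path": None,  # Python None is written as JSON null
        "r": 8,
        "lora_alpha": 16,
        "target_modules": ["q_proj", "v_proj"],
    }
    path = adapter_dir / "adapter_config.json"
    path.write_text(json.dumps(config, indent=2))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config_path = write_adapter_config_with_null_base(Path(tmp) / "my-adapter")
        print(config_path.read_text())
```

A real adapter directory would also need the weight files; this only covers the config that triggers the slow path.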

Expected behavior

The first token should arrive within 1 second instead of 10 seconds.

thincal mentioned this issue

fix: checking the base_model_name_or_path of adapter_config and early return if null #431

Merged

3 tasks

tgaddair closed this as completed in #431
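The title of #431 describes the fix as checking `base_model_name_or_path` and returning early when it is null. A rough sketch of that pattern follows. It is not the code from the PR: the function `check_base_model`, its parameters, and the `resolve` callable (standing in for whatever slow lookup would otherwise run) are all hypothetical names chosen for illustration.

```python
from typing import Callable, Optional


def check_base_model(
    adapter_config: dict,
    served_model_id: str,
    resolve: Callable[[str], str],
) -> Optional[bool]:
    """Compare the adapter's declared base model with the served model.

    Returns None when the adapter declares no base model, so the
    potentially slow `resolve` step is never attempted in that case.
    All names here are hypothetical, not LoRAX APIs.
    """
    base = adapter_config.get("base_model_name_or_path")
    if base is None:
        # Early return: there is nothing to look up or compare.
        return None
    return resolve(base) == served_model_id


if __name__ == "__main__":
    def slow_resolve(name: str) -> str:
        # Placeholder for an expensive lookup (assumption).
        return name

    print(check_base_model({"base_model_name_or_path": None}, "some-model", slow_resolve))
```

The point of the sketch is only the ordering: the null check happens before any work that depends on the field having a value.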
