feat: add dynamic model registration support to TGI inference #3417
Pull Request Overview
This PR adds dynamic model registration support to the Text Generation Inference (TGI) provider by implementing OpenAI-compatible endpoints and introducing a new feature to handle TGI's empty response IDs.
Key changes:
- Enhanced the TGI provider with OpenAI compatibility for dynamic model registration
- Added an `overwrite_completion_id` feature to handle providers that return empty IDs
- Integrated TGI support into the test infrastructure with a new test setup and recording files
Reviewed Changes
Copilot reviewed 12 out of 14 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `tests/integration/suites.py` | Added "tgi" test setup configuration for integration testing |
| `tests/integration/recordings/responses/*.json` | Test recording files capturing TGI API responses for various scenarios |
| `tests/integration/inference/test_openai_completion.py` | Removed TGI from exclusion lists, allowing OpenAI completion tests |
| `llama_stack/providers/utils/inference/openai_mixin.py` | Added `overwrite_completion_id` feature and ID generation logic |
| `llama_stack/providers/remote/inference/tgi/tgi.py` | Refactored to inherit from `OpenAIMixin` and support dynamic model registration |
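Dynamic model registration against an OpenAI-compatible provider typically works by querying its `GET /v1/models` endpoint and registering whatever IDs come back. A minimal sketch of the parsing step is below; the function name and the exact payload handling are assumptions, though the payload shape follows the OpenAI models-list format:

```python
import json


def parse_model_list(payload: str) -> list[str]:
    # Parse an OpenAI-compatible GET /v1/models response body and return
    # the model IDs that a provider could register dynamically.
    # Expected shape: {"object": "list", "data": [{"id": "...", ...}, ...]}
    body = json.loads(payload)
    return [model["id"] for model in body.get("data", [])]
```

With TGI serving a single model, the list would contain just that model's ID (e.g. the `--model-id` passed to the container).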
## What does this PR do?

Adds dynamic model registration support to TGI.

Adds a new `overwrite_completion_id` feature to `OpenAIMixin` to deal with TGI always returning `id=""`.

## Test Plan

Start TGI:

```shell
docker run --gpus all --shm-size 1g -p 8080:80 -v /data:/data ghcr.io/huggingface/text-generation-inference --model-id Qwen/Qwen3-0.6B
```

Start the stack:

```shell
TGI_URL=http://localhost:8080 uv run llama stack build --image-type venv --distro ci-tests --run
```

Run the tests:

```shell
./scripts/integration-tests.sh --stack-config http://localhost:8321 --setup tgi --subdirs inference --pattern openai
```