Fix test isolation for test_watsonx_gpt_oss_prompt_transformation#20474
Conversation
Set the cached tokenizer config directly and mock both the sync and async tokenizer functions to avoid race conditions when running with parallel test execution (`-n 16`).

The issue was that parallel tests could populate the `litellm.known_tokenizer_config` cache between clearing it and the point where the code checked it. This caused the sync code path to be used instead of the async path, bypassing the mocked async functions.

Fix:
1. Set the cache entry directly instead of clearing it.
2. Also mock the sync versions `_get_tokenizer_config` and `_get_chat_template_file`.

This ensures the test is deterministic regardless of test execution order.
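The setup described above can be sketched as follows. This is a hedged illustration, not the actual test: `litellm` and the template handler are replaced by `SimpleNamespace` stand-ins (the real module is not imported here), while the attribute names (`known_tokenizer_config`, `_get_tokenizer_config`, `_aget_tokenizer_config`) follow the PR description.

```python
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock, patch

# Stand-ins for the real litellm module and HF template handler (assumptions,
# not the real objects); attribute names follow the PR description.
litellm = SimpleNamespace(known_tokenizer_config={})
handler = SimpleNamespace(_get_tokenizer_config=None, _aget_tokenizer_config=None)

mock_tokenizer_config = {"tokenizer": {"chat_template": "{{ messages }}"}}
hf_model = "openai/gpt-oss-120b"

# 1. Pre-populate the cache entry directly, so there is no clear-then-repopulate
#    window for a parallel test to race into.
litellm.known_tokenizer_config[hf_model] = mock_tokenizer_config

# 2. Mock both the sync and async fetchers so the test behaves the same
#    whichever code path is taken.
with patch.object(handler, "_get_tokenizer_config",
                  MagicMock(return_value=mock_tokenizer_config)), \
     patch.object(handler, "_aget_tokenizer_config",
                  AsyncMock(return_value=mock_tokenizer_config)):
    assert handler._get_tokenizer_config(hf_model) == mock_tokenizer_config
```

`patch.object` restores the original attributes when the context exits, so only the cache entry itself outlives the block (which is the point the review below raises).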
Greptile Overview

Greptile Summary

This PR makes the test deterministic under parallel execution by pre-populating `litellm.known_tokenizer_config` and mocking both the sync and async template fetchers. One issue to address before merge: the test writes into the global cache without restoring it afterwards.

Confidence Score: 4/5
| Filename | Overview |
|---|---|
| tests/test_litellm/llms/watsonx/test_watsonx.py | Fixes intermittent async test by pre-populating litellm.known_tokenizer_config and mocking sync/async HF template fetch; however, it now mutates global cache without restoring, which can leak into later tests. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant T as pytest (xdist worker)
    participant Test as test_watsonx_gpt_oss_prompt_transformation
    participant L as litellm.acompletion
    participant HF as huggingface_template_handler
    participant C as litellm.known_tokenizer_config
    participant W as AsyncHTTPHandler.post (mock)
    T->>Test: run test
    Test->>C: set C["openai/gpt-oss-120b"] = mock_tokenizer_config
    Note over Test,HF: Patch async and sync HF helpers
    Test->>L: acompletion(model=watsonx_text/openai/gpt-oss-120b, messages)
    L->>HF: resolve chat template
    alt cache hit / sync path
        HF->>C: read C[hf_model]
        HF-->>L: tokenizer_config with chat_template
    else async path
        HF->>HF: _aget_tokenizer_config (patched)
        HF->>HF: _aget_chat_template_file (patched -> failure)
        HF-->>L: tokenizer_config with chat_template
    end
    L->>W: POST completion (mock)
    W-->>L: mock_completion_response
    L-->>Test: response
    Test->>Test: assert request input contains template tags
```
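The cache-first branching in the diagram can be sketched roughly like this. This is assumed control flow for illustration only, not the actual litellm implementation; `resolve_chat_template` and `fake_fetch` are hypothetical names.

```python
import asyncio

def resolve_chat_template(hf_model, cache, aget_tokenizer_config):
    """Return the tokenizer config, preferring the shared cache (assumed flow)."""
    if hf_model in cache:                       # cache hit -> sync path
        return cache[hf_model]
    # cache miss -> async path; this is what the test's async mocks intercept
    return asyncio.run(aget_tokenizer_config(hf_model))

async def fake_fetch(hf_model):                 # stand-in for _aget_tokenizer_config
    return {"tokenizer": {"chat_template": "fetched"}}

cached = {"openai/gpt-oss-120b": {"tokenizer": {"chat_template": "cached"}}}
print(resolve_chat_template("openai/gpt-oss-120b", cached, fake_fetch))
print(resolve_chat_template("other/model", cached, fake_fetch))
```

Under this flow, pre-populating the cache pins the test to the top branch, which is why the original async-only mocks were bypassed whenever the cache was already populated.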
```python
# Set cached tokenizer config directly to avoid race conditions with parallel tests.
# When running with pytest-xdist (-n 16), another test might populate the cache between
# clearing it and the actual usage. By setting the cache directly, we ensure the correct
# template is always used regardless of test execution order.
hf_model = "openai/gpt-oss-120b"
if hf_model in litellm.known_tokenizer_config:
    del litellm.known_tokenizer_config[hf_model]
litellm.known_tokenizer_config[hf_model] = mock_tokenizer_config
```
Global cache not restored
This test now writes to the module-level litellm.known_tokenizer_config (litellm.known_tokenizer_config[hf_model] = ...) but never restores the previous value. That makes the test order-dependent: subsequent tests in the same worker will see the mocked tokenizer config and may skip their intended code paths. Consider snapshotting the previous entry (or absence) and restoring it in a try/finally (or using a fixture) so the global cache is returned to its prior state when the test completes.
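One way to implement the reviewer's snapshot-and-restore suggestion is a small try/finally helper. This is a hedged sketch: `with_cache_entry` and `_SENTINEL` are hypothetical names, and a `SimpleNamespace` stands in for the real `litellm` module (not imported here); in the actual test a pytest fixture would serve the same purpose.

```python
from types import SimpleNamespace

# Stand-in for the real litellm module, seeded with a pre-existing entry.
litellm = SimpleNamespace(
    known_tokenizer_config={"openai/gpt-oss-120b": {"real": True}}
)

_SENTINEL = object()  # distinguishes "entry absent" from "entry is None"

def with_cache_entry(hf_model, value, body):
    """Snapshot the cache entry (or its absence), install `value`, run
    `body`, then restore the cache to its prior state."""
    previous = litellm.known_tokenizer_config.get(hf_model, _SENTINEL)
    litellm.known_tokenizer_config[hf_model] = value
    try:
        return body()
    finally:
        if previous is _SENTINEL:
            litellm.known_tokenizer_config.pop(hf_model, None)
        else:
            litellm.known_tokenizer_config[hf_model] = previous

result = with_cache_entry(
    "openai/gpt-oss-120b",
    {"tokenizer": {"chat_template": "mock"}},
    lambda: litellm.known_tokenizer_config["openai/gpt-oss-120b"],
)
```

After the call, `result` holds the mocked config the body saw, while the global cache is back to its original `{"real": True}` entry, so later tests in the same worker are unaffected.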
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/test_litellm/llms/watsonx/test_watsonx.py
Line: 286:291
Regression Fix

Failing Job: litellm_mapped_tests_llms
Caused By: Test isolation issue introduced in 1017c3a
Author: @ishaan-jaff

What Broke

Test `test_watsonx_gpt_oss_prompt_transformation` fails intermittently when running with parallel test execution (`-n 16`).

Error:

Original Commit

The test originally:
- cleared the `litellm.known_tokenizer_config` cache
- mocked only the async functions (`_aget_tokenizer_config`, `_aget_chat_template_file`)

This Fix

The issue was a race condition with parallel tests: another test could populate the cache between clearing it and the later lookup, sending the code down the unmocked sync path.

Fix:
1. Set the cache entry directly instead of clearing it.
2. Also mock the sync versions `_get_tokenizer_config` and `_get_chat_template_file`.

This ensures the test is deterministic regardless of test execution order.
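The before/after difference can be shown with a plain dict standing in for `litellm.known_tokenizer_config` (an illustration of the pattern, not the real test code):

```python
cache = {}  # stand-in for litellm.known_tokenizer_config
model = "openai/gpt-oss-120b"

# Before (racy under -n 16): clear the entry and rely on later code to
# repopulate it. Another test running in the same worker can insert its own
# entry in the gap, so the later lookup takes the unmocked sync path.
cache.pop(model, None)

# After (deterministic): write the desired entry directly, so the later
# lookup always sees the mocked config regardless of test ordering.
cache[model] = {"tokenizer": {"chat_template": "mock"}}
```

The fix removes the window between "clear" and "check" entirely rather than trying to win the race.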