fix(tests): add CI failure tolerance and fix 4 embedding tests (Section 5/5) #623
Signed-off-by: Yehudit Kerido <ykerido@ykerido-thinkpadp1gen7.raanaii.csb>
run: |
  make build-e2e
- name: Free up disk space
This is a good start. I think the /mnt directory can be used if possible (e.g., move the models there and symlink to them).
    model_type
};

// Validate model availability and fall back if necessary
Falling back to a different model may not be the best option. Would you mind disabling the Gemma test and model download in CI?
Merging it to unblock other PRs.

Disable Gemma embedding model in CI tests since it's a gated model
requiring HF_TOKEN. Tests now use Qwen3-Embedding-0.6B exclusively.
This approach was discussed and approved by maintainers who decided
to focus on Qwen3 (non-gated) for CI tests.
See: https://github.com/vllm-project/semantic-router/issues/573#issuecomment-3607352121
Changes in candle-binding/semantic-router_test.go:
Changes in tools/make/models.mk:
Qwen3-Embedding-0.6B is fully open (no gating) and supports
Matryoshka dimension truncation (768/512/256/128) from its
native 1024 dimensions.
Resolves #573 (Section 5: Embedding Model Tests)