Describe the bug
Gemini's OpenAI-compatible API does not serve endpoints under /v1/... (e.g. /v1/models), and guidellm provides no way to override the endpoint paths it requests.
Expected behavior
Is it possible to add the ability to override endpoint paths for certain services/models?
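As a purely hypothetical illustration of the request (these setting names do not exist in guidellm today; they follow the `GUIDELLM__` environment-variable convention already used above), a per-backend endpoint override could look like:

```shell
# Hypothetical settings -- NOT implemented in guidellm today.
# They sketch what a per-backend endpoint path override could look like,
# so requests resolve under Gemini's /v1beta/openai/ prefix instead of /v1/.
export GUIDELLM__OPENAI__MODELS_ENDPOINT="/v1beta/openai/models"
export GUIDELLM__OPENAI__CHAT_COMPLETIONS_ENDPOINT="/v1beta/openai/chat/completions"
```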
Environment
Include all relevant environment information:
- OS [e.g. Ubuntu 20.04]: -
- Python version [e.g. 3.12.2]: 3.10
To Reproduce
I followed the Gemini documentation for OpenAI compatibility (https://ai.google.dev/gemini-api/docs/openai) and tried load testing with:
```shell
export GUIDELLM__PREFERRED_ROUTE="chat_completions" &&
export GUIDELLM__OPENAI__MAX_OUTPUT_TOKENS=512 &&
export GUIDELLM__MAX_CONCURRENCY=233 &&
export GUIDELLM__REQUEST_TIMEOUT=300 &&
export GUIDELLM__PREFERRED_PROMPT_TOKENS_SOURCE=local &&
export GUIDELLM__PREFERRED_OUTPUT_TOKENS_SOURCE=local &&
guidellm benchmark \
  --target https://generativelanguage.googleapis.com/v1beta/openai/ \
  --rate-type sweep --rate 5 \
  --model gemini-2.5-flash --processor ${processor} \
  --random-seed 1234 \
  --max-requests=100 \
  --data "prompt_tokens=4096,output_tokens=512" \
  --backend-args '{"extra_body":{"chat_template_kwargs":{"enable_thinking":false}}, "headers":{"Authorization": "Bearer GEMINI_API_KEY"}}' \
  --output-path "${output_dir}/benchmarks.json"
```
Errors
The resulting error is that guidellm could not fetch the list of available models from the /v1/models endpoint.
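The failure is consistent with how absolute vs. relative paths resolve against the `--target` base URL: an absolute path like `/v1/models` discards the `/v1beta/openai/` prefix that Gemini's OpenAI-compatible API requires, while a relative `models` preserves it. A minimal sketch with the Python standard library (the hard-coded `/v1/models` path is my assumption about guidellm's behavior, inferred from the error):

```python
from urllib.parse import urljoin

base = "https://generativelanguage.googleapis.com/v1beta/openai/"

# An absolute path replaces the entire path component of the base URL,
# dropping the /v1beta/openai/ prefix Gemini expects:
print(urljoin(base, "/v1/models"))
# -> https://generativelanguage.googleapis.com/v1/models

# A relative path resolves under the configured base and keeps the prefix:
print(urljoin(base, "models"))
# -> https://generativelanguage.googleapis.com/v1beta/openai/models
```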