feat(models): Add realistic model profiles from models.dev #12
Merged
Change endpoint structure from /v1/... to /{provider}/v1/...:
- /openai/v1/chat/completions
- /openai/v1/models
- /openai/v1/models/:model_id
- /openai/v1/responses
- /openresponses/v1/responses
Both /openai/v1/responses and /openresponses/v1/responses use the
same handler since the APIs are compatible.
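The routing change above can be sketched as a small helper. This is a minimal illustration, not code from the PR; the host and port (`localhost:8000`) are assumptions about a local deployment.

```python
# Hypothetical base URL for a locally running emulator (assumption).
BASE = "http://localhost:8000"

def endpoint(provider: str, resource: str) -> str:
    """Build a provider-prefixed path, e.g. /openai/v1/models."""
    return f"{BASE}/{provider}/v1/{resource}"

# Old /v1/chat/completions becomes /openai/v1/chat/completions:
print(endpoint("openai", "chat/completions"))
# These two routes are served by the same handler:
print(endpoint("openai", "responses"))
print(endpoint("openresponses", "responses"))
```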
- Update OpenAI examples to use /openai/v1/ base path
- Add OpenResponses Python example (openresponses_client.py)
- Update examples/README.md with comprehensive API documentation
- Include quick API test examples with curl commands
Add comprehensive model profiles with context windows and capabilities sourced from models.dev. The /openai/v1/models endpoint now returns realistic model data including:
- context_window: Maximum input token limit (e.g., 400K for GPT-5)
- max_output_tokens: Maximum response length (e.g., 128K for GPT-5)
- Accurate created timestamps and owned_by values
- Model capabilities (function_calling, vision, json_mode, reasoning)

This prepares for future context window limit emulation. Models covered:
- GPT-5 family (gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-codex, etc.)
- O-series reasoning models (o3, o3-mini, o4-mini)
- GPT-4 family (gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-4.1)
- Claude family (claude-3.5-sonnet through claude-opus-4.5)
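A client can consume the extended fields directly from the list response. The sketch below parses an illustrative payload shaped like the endpoint's output; the gpt-5 numbers (400K in / 128K out) come from this PR, while the o3-mini entry is a placeholder assumption.

```python
import json

# Illustrative payload mirroring the extended /openai/v1/models response.
# gpt-5 values are from the PR; o3-mini values are assumptions.
payload = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "gpt-5", "owned_by": "openai",
     "context_window": 400000, "max_output_tokens": 128000},
    {"id": "o3-mini", "owned_by": "openai",
     "context_window": 200000, "max_output_tokens": 100000}
  ]
}
""")

# A client preparing for context-window emulation might index limits by model id:
limits = {m["id"]: m["context_window"] for m in payload["data"]}
print(limits["gpt-5"])  # 400000
```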
The examples README should only contain examples, not API specs. Model profile documentation is already in specs/architecture.md.
Demonstrates the /openai/v1/models endpoint which returns realistic model profiles from models.dev, including context_window and max_output_tokens fields. Added to CI examples job.
Document the extended model fields (context_window, max_output_tokens) returned by the /openai/v1/models endpoint, sourced from models.dev.
Summary
- `/openai/v1/models` endpoint now returns `context_window` and `max_output_tokens` for each model
- `list_models.py` example demonstrating the new endpoint

Changes
- `src/openai/models.rs`: Static model registry with profiles baked in at compile time
- `Model` struct: Added `context_window` and `max_output_tokens` fields
- `list_models` and `get_model` now return realistic profiles
- `examples/list_models.py` showing model profiles grouped by owner

Example Response
```json
{
  "id": "gpt-5",
  "object": "model",
  "created": 1754524800,
  "owned_by": "openai",
  "context_window": 400000,
  "max_output_tokens": 128000
}
```

Test Plan
- `list_models.py` example runs successfully
- `cargo clippy` and `cargo fmt` pass