Conversation
Pull request overview
This PR implements an instruct model backup feature that allows reasoning-capable models to fall back to a designated instruct model when `enable_thinking` is set to false. The implementation adds database schema changes, API request/response fields, and the necessary logic to switch between reasoning and instruct models.
Key Changes:
- Added `supports_instruct` capability flag to model catalogs and `instruct_model_id` reference to provider models
- Introduced `enable_thinking` parameter in chat completion requests to control reasoning behavior
- Implemented model switching logic that uses the instruct model when `enable_thinking=false`
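The switching behavior described above can be sketched as follows. This is a hedged illustration, not the PR's actual code: the type and function names (`ProviderModel`, `resolveModel`, the `lookup` map) are illustrative stand-ins for the domain model and repository lookups the PR adds.

```go
package main

import "fmt"

// ProviderModel loosely mirrors the PR's domain model (field names are illustrative).
type ProviderModel struct {
	ID              string
	Name            string
	InstructModelID *string // set when a reasoning model has a designated instruct fallback
}

// resolveModel returns the instruct fallback when thinking is disabled,
// and the original model otherwise. A dangling or missing reference
// falls back to the original model rather than failing the request.
func resolveModel(m ProviderModel, lookup map[string]ProviderModel, enableThinking bool) ProviderModel {
	if enableThinking || m.InstructModelID == nil {
		return m // keep the reasoning model
	}
	if instruct, ok := lookup[*m.InstructModelID]; ok {
		return instruct // switch to the designated instruct model
	}
	return m // reference not found: keep the original model
}

func main() {
	id := "instruct-1"
	reasoning := ProviderModel{ID: "reason-1", Name: "reasoner", InstructModelID: &id}
	catalog := map[string]ProviderModel{"instruct-1": {ID: "instruct-1", Name: "instruct"}}

	fmt.Println(resolveModel(reasoning, catalog, false).Name) // switches to the instruct model
	fmt.Println(resolveModel(reasoning, catalog, true).Name)  // keeps the reasoning model
}
```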
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| services/llm-api/migrations/000014_add_supports_instruct.up.sql | Database migration adding supports_instruct column and instruct_model_id foreign key with appropriate indexes |
| services/llm-api/migrations/000014_add_supports_instruct.down.sql | Rollback migration removing the new columns and indexes |
| services/llm-api/internal/interfaces/httpserver/responses/model/model.go | Added supports_instruct and instruct_model_public_id fields to API responses |
| services/llm-api/internal/interfaces/httpserver/requests/models/model.go | Added supports_instruct and instruct_model_public_id fields to API requests |
| services/llm-api/internal/interfaces/httpserver/requests/chat/chat.go | Added enable_thinking parameter to chat completion requests |
| services/llm-api/internal/interfaces/httpserver/handlers/modelhandler/provider_model_handler.go | Implemented batch fetching and resolution of instruct model public IDs |
| services/llm-api/internal/interfaces/httpserver/handlers/modelhandler/provider_handler.go | Added GetProviderModelByID method for instruct model lookup |
| services/llm-api/internal/interfaces/httpserver/handlers/modelhandler/model_catalog_handler.go | Added handling for supports_instruct field updates |
| services/llm-api/internal/interfaces/httpserver/handlers/chathandler/chat_handler.go | Implemented model switching logic and message logging functionality |
| services/llm-api/internal/infrastructure/database/repository/modelrepo/model_catalog_repository.go | Added filtering support for supports_instruct field |
| services/llm-api/internal/infrastructure/database/gormgen/model_catalogs.gen.go | Generated code for supports_instruct field |
| services/llm-api/internal/infrastructure/database/dbschema/provider_model.go | Added InstructModelID field to database schema |
| services/llm-api/internal/infrastructure/database/dbschema/model_catalog.go | Added SupportsInstruct field to database schema |
| services/llm-api/internal/domain/model/provider_model_service.go | Added FindByID and FindByIDs methods for instruct model lookup |
| services/llm-api/internal/domain/model/provider_model.go | Added InstructModelID field to domain model |
| services/llm-api/internal/domain/model/model_catalog.go | Added SupportsInstruct field to domain model and filter |
```go
}

// Log all messages before calling the inference engine
h.logMessagesBeforeInference(ctx, llmRequest.Messages, request.Model)
```
The `logMessagesBeforeInference` function is called unconditionally for every request. This could generate excessive logs in production. Consider gating it on a debug flag or a log-level check to avoid the performance impact and log-volume issues.
```go
// logMessagesBeforeInference logs all messages before calling the inference engine
// This is useful for debugging and auditing what gets sent to the LLM
func (h *ChatHandler) logMessagesBeforeInference(ctx context.Context, messages []openai.ChatCompletionMessage, model string) {
	log.Info().
```
This Info-level log will be generated for every chat request. Consider using Debug level instead to reduce log volume in production environments.
Suggested change:

```diff
- log.Info().
+ log.Debug().
```