feat(scheduler): add endpoint to retrieve model configurations #520
Conversation
Code Review
This pull request introduces a new endpoint, POST /engines/_configs, to retrieve active model configurations, which is a valuable addition for observability. The implementation is clear and follows existing patterns. My review includes a few suggestions to enhance robustness and maintainability. Specifically, I've pointed out a potential issue in JSON unmarshaling that could silently accept invalid data, a minor performance optimization for slice filtering, and a recommendation to add explicit JSON tags to a struct for better long-term maintenance. Overall, the changes are well-implemented.
Hey there - I've reviewed your changes - here's some feedback:
- The comment on `GetModelConfigs` mentions reading the model from the request body, but the implementation uses a `model` query parameter; updating the comment to match the actual API will avoid confusion for future maintainers.
- The custom JSON methods on `BackendMode` could be made more robust and conventional by using `json.Marshal(m.String())` and `strconv.Unquote` (or `json.Unmarshal` into a string) instead of manual string concatenation and `strings.Trim`, which will handle escaping and edge cases correctly.
- In `getAllRunnerConfigs`, entries are silently skipped when `GetLocal` fails; consider at least logging the error or otherwise surfacing it so that misconfigured or missing models are easier to diagnose when the new endpoint returns fewer configs than expected.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The comment on `GetModelConfigs` mentions reading the model from the request body, but the implementation uses a `model` query parameter; updating the comment to match the actual API will avoid confusion for future maintainers.
- The custom JSON methods on `BackendMode` could be made more robust and conventional by using `json.Marshal(m.String())` and `strconv.Unquote` (or `json.Unmarshal` into a string) instead of manual string concatenation and `strings.Trim`, which will handle escaping and edge cases correctly.
- In `getAllRunnerConfigs`, entries are silently skipped when `GetLocal` fails; consider at least logging the error or otherwise surfacing it so that misconfigured or missing models are easier to diagnose when the new endpoint returns fewer configs than expected.
## Individual Comments
### Comment 1
<location> `pkg/inference/scheduling/http_handler.go:354-356` </location>
<code_context>
w.WriteHeader(http.StatusAccepted)
}
+// GetModelConfigs returns model configurations. If a model is specified in the request body,
+// returns only configs for that model; otherwise returns all configs.
+func (h *HTTPHandler) GetModelConfigs(w http.ResponseWriter, r *http.Request) {
+ model := r.URL.Query().Get("model")
+
</code_context>
<issue_to_address>
**nitpick:** Comment mentions request body while implementation uses a query parameter
The comment refers to the request body, but the code reads `model` from `r.URL.Query().Get("model")`. Please update the comment to match the actual behavior, e.g. `If a model is specified via the "model" query parameter`.
</issue_to_address>
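The fix is just the doc comment, but a minimal sketch of the query-parameter behavior the corrected comment describes may help. Everything below except the comment wording is a hypothetical stand-in (the in-memory config map in particular is not the real scheduler code):

```go
package main

import (
	"encoding/json"
	"net/http"
)

// filterConfigs returns only the entry for model when model is non-empty,
// otherwise all configs. Extracted as a pure helper for this sketch.
func filterConfigs(all map[string]map[string]any, model string) map[string]map[string]any {
	if model == "" {
		return all
	}
	out := map[string]map[string]any{}
	if cfg, ok := all[model]; ok {
		out[model] = cfg
	}
	return out
}

// GetModelConfigs returns model configurations. If a model is specified via
// the "model" query parameter, returns only configs for that model;
// otherwise returns all configs.
func GetModelConfigs(w http.ResponseWriter, r *http.Request) {
	// Hypothetical in-memory config store standing in for the scheduler's
	// real lookup.
	all := map[string]map[string]any{
		"ai/smollm2": {"context-size": 4096},
		"ai/qwen3":   {"context-size": 8192},
	}
	configs := filterConfigs(all, r.URL.Query().Get("model"))
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(configs)
}

func main() {
	http.HandleFunc("/engines/_configure", GetModelConfigs)
}
```

The comment now matches the code path that actually runs: the model name comes from `r.URL.Query().Get("model")`, and the request body is never read.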
### Comment 2
<location> `pkg/inference/backend.go:51-52` </location>
<code_context>
+}
+
+// UnmarshalJSON implements json.Unmarshaler for BackendMode.
+func (m *BackendMode) UnmarshalJSON(data []byte) error {
+ s := strings.Trim(string(data), "\"")
+ mode, ok := ParseBackendMode(s)
+ if !ok {
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Unmarshal logic should parse JSON properly instead of trimming quotes manually
Using `strings.Trim` on the raw JSON assumes a simple quoted string and will misbehave with escaping, invalid JSON, or non-string values. Prefer delegating parsing to the JSON package by unmarshaling into a temporary string first, then converting:
```go
func (m *BackendMode) UnmarshalJSON(data []byte) error {
var s string
if err := json.Unmarshal(data, &s); err != nil {
return err
}
mode, ok := ParseBackendMode(s)
if !ok {
return fmt.Errorf("unknown backend mode: %q", s)
}
*m = mode
return nil
}
```
Suggested implementation:
```golang
// UnmarshalJSON implements json.Unmarshaler for BackendMode.
func (m *BackendMode) UnmarshalJSON(data []byte) error {
var s string
if err := json.Unmarshal(data, &s); err != nil {
return err
}
mode, ok := ParseBackendMode(s)
if !ok {
return fmt.Errorf("unknown backend mode: %q", s)
}
*m = mode
return nil
}
```
1. In `pkg/inference/backend.go`, ensure the imports include:
```go
import (
"encoding/json"
"fmt"
// ...
)
```
If `strings` is no longer used anywhere in this file, remove it from the imports to avoid an unused import error.
</issue_to_address>
Add GET /engines/_configure endpoint that returns active model configurations, similar to Ollama's /api/show endpoint. Accepts optional "model" query parameter to filter configs for a specific model; returns all configs if no model specified. Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Add `docker model configure show [MODEL]` to display active model configurations. Also alias `configure` to `config` for convenience. Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Add `GET /engines/_configure` endpoint that returns active model configurations, similar to Ollama's `/api/show` endpoint. Accepts optional "model" query parameter to filter configs for a specific model; returns all configs if no model specified.

E.g.,
Run:
Install the CLI:
Test: