-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend api/show and ollama show to return more model info #4881
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should look at the comments in the linked PR
server/routes.go
Outdated
var keys []string | ||
for k := range kv { | ||
keys = append(keys, k) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var keys []string | |
for k := range kv { | |
keys = append(keys, k) | |
} | |
keys := maps.Keys(kv) |
server/routes.go
Outdated
keys = append(keys, k) | ||
} | ||
|
||
kvMap := make(map[string]any) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a reason to copy the KVs
server/routes.go
Outdated
ggmlMap["kv"] = kvMap | ||
ggmlMap["tensors"] = ggml.Tensors() | ||
|
||
ggmlJson, err := json.Marshal(ggmlMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ggmlJson, err := json.Marshal(ggmlMap) | |
return json.Marshal(ggmlMap) |
Co-Authored-By: Patrick Devine <pdevine@sonic.net>
My bad I just pushed a copy of Patrick's code as a draft for reference, didn't mean to put it up for review yet |
Tokens and token_type are empty slices right now due to the < 5 condition, since llama3 overpopulates these like crazy. We can decide what direction we wanna go with this, maybe could add a flag to add them and remove them from ModelInfo for instance |
server/routes.go
Outdated
return nil, err | ||
} | ||
|
||
f.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The best way to close a file is to defer right after open succeeds, e.g.
f, err := os.Open(p)
if err != nil {
return nil, err
}
defer f.Close()
This ensure the file is closed even when the function returns early, e.g. DecodeGGML errors
server/routes.go
Outdated
} | ||
} | ||
|
||
kv["embedding_model"] = model.IsEmbedding() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about adding this as a KV since it's not part of the model itself. Ideally this can be captured as a top-level capabilities
server/routes.go
Outdated
switch v := kv[k].(type) { | ||
case []interface{}: | ||
if len(v) > 5 { | ||
kv[k] = []string{} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mentioned this in the other PR but showing an empty list is very confusing. Instead, this should show how many items are in the list. TBH this doesn't even need a length check, just always output the count
if t, ok := kv[k].([]any); ok {
kv[k] = fmt.Sprintf("(%d items)", len(t))
}
server/routes.go
Outdated
return resp, nil | ||
} | ||
|
||
func getGGMLData(model *Model) ([]byte, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of returning []byte, this should return llm.KV
cmd/cmd.go
Outdated
@@ -623,32 +625,144 @@ func ShowHandler(cmd *cobra.Command, args []string) error { | |||
showType = "template" | |||
} | |||
|
|||
if flagsSet > 1 { | |||
switch flagsSet { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of a switch
here, we can early return:
if len(flagsSet) > 1 {
return errors.New("only one of '--license', '--modelfile', '--parameters', '--system', or '--template' can be specified")
}
if len(flagsSet) == 1 {
... do single flag stuff
return nil
}
.. do default stuff
return nil
cmd/cmd.go
Outdated
@@ -579,8 +579,10 @@ func ShowHandler(cmd *cobra.Command, args []string) error { | |||
return err | |||
} | |||
|
|||
if len(args) != 1 { | |||
if len(args) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already handled by https://github.com/ollama/ollama/blob/main/cmd/cmd.go#L1119 so we can just remove this check
server/routes.go
Outdated
@@ -720,9 +720,50 @@ func GetModelInfo(req api.ShowRequest) (*api.ShowResponse, error) { | |||
fmt.Fprint(&sb, model.String()) | |||
resp.Modelfile = sb.String() | |||
|
|||
ggmlData, err := getGGMLData(model.ModelPath, req.Verbose) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you can use llm.LoadModel
:
ggml, err := llm.LoadModel(pending.model.ModelPath)
kv := ggml.KV()
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see we do some post processing for verbose below – above might still be useful for getGGMLData
cmd/cmd.go
Outdated
return res | ||
} | ||
|
||
func handleParams(s string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
formatParams
is a better name here
This is overall good! Let's add some tests to the handler, including:
|
@@ -212,6 +213,7 @@ func Test_Routes(t *testing.T) { | |||
"top_p 0.9", | |||
} | |||
assert.Equal(t, expectedParams, params) | |||
assert.InDelta(t, 0, showResp.ModelInfo["general.parameter_count"], 1e-9, "Parameter count should be 0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this field gets set to 0 by default in gguf.go, but might not be a robust test
server/routes_show_test.go
Outdated
"github.com/stretchr/testify/assert" | ||
) | ||
|
||
func TestShow(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not put this in routes_test.go
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noting routes_create_test.go, routes_delete_test.go, routes_list_test.go
, I thought show could be nice to separate out as well given the projector test. If we want to get rid of this file and just stick everything in routes_test, I'd change the create model handler to include model and projector KVs that can be retrieved in a show handler, but we might want to keep this stuff isolated. Lmk what you think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah and if you mean literally just sticking this function in routes_test.go, yeah its just to be consistent with the existence of those three other files Mike created
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, sorry I missed those others ones. I'd start in routes.go
and move it out when it gets large enough, just to avoid sprawl.
server/routes_show_test.go
Outdated
t.Fatal(err) | ||
} | ||
|
||
assert.Equal(t, "test", resp.ModelInfo["general.architecture"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use t.Fatal
instead of assert
to stick to stdlib test functions
cmd/cmd.go
Outdated
{renderSubTable(modelData, false)}, | ||
} | ||
|
||
if slices.Contains(resp.Details.Families, "clip") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's check for resp.ProjectorInfo
instead of looking for clip
in resp.Details.Families
README.md
Outdated
@@ -182,6 +182,12 @@ $ ollama run llama3 "Summarize this file: $(cat README.md)" | |||
Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications. | |||
``` | |||
|
|||
### Show model information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible to make a separate PR for the docs change? We can merge it once this is released
docs/api.md
Outdated
"model_info": { | ||
"general.architecture": "llama", | ||
"general.file_type": 2, | ||
"general.name": "Meta-Llama-3-8B-Instruct", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this field as well for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small items otherwise LGTM!
* API Show Extended * Initial Draft of Information Co-Authored-By: Patrick Devine <pdevine@sonic.net> * Clean Up * Descriptive arg error messages and other fixes * Second Draft of Show with Projectors Included * Remove Chat Template * Touches * Prevent wrapping from files * Verbose functionality * Docs * Address Feedback * Lint * Resolve Conflicts * Function Name * Tests for api/show model info * Show Test File * Add Projector Test * Clean routes * Projector Check * Move Show Test * Touches * Doc update --------- Co-authored-by: Patrick Devine <pdevine@sonic.net>
* API Show Extended * Initial Draft of Information Co-Authored-By: Patrick Devine <pdevine@sonic.net> * Clean Up * Descriptive arg error messages and other fixes * Second Draft of Show with Projectors Included * Remove Chat Template * Touches * Prevent wrapping from files * Verbose functionality * Docs * Address Feedback * Lint * Resolve Conflicts * Function Name * Tests for api/show model info * Show Test File * Add Projector Test * Clean routes * Projector Check * Move Show Test * Touches * Doc update --------- Co-authored-by: Patrick Devine <pdevine@sonic.net>
* API Show Extended * Initial Draft of Information Co-Authored-By: Patrick Devine <pdevine@sonic.net> * Clean Up * Descriptive arg error messages and other fixes * Second Draft of Show with Projectors Included * Remove Chat Template * Touches * Prevent wrapping from files * Verbose functionality * Docs * Address Feedback * Lint * Resolve Conflicts * Function Name * Tests for api/show model info * Show Test File * Add Projector Test * Clean routes * Projector Check * Move Show Test * Touches * Doc update --------- Co-authored-by: Patrick Devine <pdevine@sonic.net>
Building off of #3899
Resolves #3570, #2732, #3899