Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend api/show and ollama show to return more model info #4881

Merged
merged 23 commits into from
Jun 19, 2024
Merged

Conversation

royjhan
Copy link
Contributor

@royjhan royjhan commented Jun 6, 2024

Building off of #3899
Resolves #3570, #2732, #3899

Copy link
Contributor

@mxyng mxyng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should look at the comments in the linked PR

server/routes.go Outdated
Comment on lines 744 to 747
var keys []string
for k := range kv {
keys = append(keys, k)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var keys []string
for k := range kv {
keys = append(keys, k)
}
keys := maps.Keys(kv)

server/routes.go Outdated
keys = append(keys, k)
}

kvMap := make(map[string]any)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a reason to copy the KVs

server/routes.go Outdated
ggmlMap["kv"] = kvMap
ggmlMap["tensors"] = ggml.Tensors()

ggmlJson, err := json.Marshal(ggmlMap)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ggmlJson, err := json.Marshal(ggmlMap)
return json.Marshal(ggmlMap)

Co-Authored-By: Patrick Devine <pdevine@sonic.net>
@royjhan royjhan changed the title API Show Extended (Resolves #3570) API Show Extended Jun 6, 2024
@royjhan
Copy link
Contributor Author

royjhan commented Jun 6, 2024

You should look at the comments in the linked PR

My bad I just pushed a copy of Patrick's code as a draft for reference, didn't mean to put it up for review yet

@royjhan royjhan marked this pull request as ready for review June 6, 2024 23:21
@royjhan royjhan requested a review from pdevine June 6, 2024 23:22
@royjhan
Copy link
Contributor Author

royjhan commented Jun 6, 2024

Tokens and token_type are empty slices right now due to the < 5 condition, since llama3 overpopulates these like crazy. We can decide what direction we wanna go with this, maybe could add a flag to add them and remove them from ModelInfo for instance

server/routes.go Outdated
return nil, err
}

f.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The best way to close a file is to defer right after open succeeds, e.g.

f, err := os.Open(p)
if err != nil {
  return nil, err
}
defer f.Close()

This ensure the file is closed even when the function returns early, e.g. DecodeGGML errors

server/routes.go Outdated
}
}

kv["embedding_model"] = model.IsEmbedding()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about adding this as a KV since it's not part of the model itself. Ideally this can be captured as a top-level capabilities

server/routes.go Outdated
Comment on lines 748 to 753
switch v := kv[k].(type) {
case []interface{}:
if len(v) > 5 {
kv[k] = []string{}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this in the other PR but showing an empty list is very confusing. Instead, this should show how many items are in the list. TBH this doesn't even need a length check, just always output the count

if t, ok := kv[k].([]any); ok {
    kv[k] = fmt.Sprintf("(%d items)", len(t))
}

server/routes.go Outdated
return resp, nil
}

func getGGMLData(model *Model) ([]byte, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of returning []byte, this should return llm.KV

@royjhan royjhan marked this pull request as draft June 7, 2024 00:00
@royjhan royjhan changed the title API Show Extended Extend api/show and ollama show to return more model info Jun 10, 2024
@royjhan royjhan self-assigned this Jun 10, 2024
cmd/cmd.go Outdated
@@ -623,32 +625,144 @@ func ShowHandler(cmd *cobra.Command, args []string) error {
showType = "template"
}

if flagsSet > 1 {
switch flagsSet {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of a switch here, we can early return:

if len(flagsSet) > 1 {
  return errors.New("only one of '--license', '--modelfile', '--parameters', '--system', or '--template' can be specified")
}

if len(flagsSet) == 1 {
  ... do single flag stuff
  return nil
}

.. do default stuff
return nil

cmd/cmd.go Outdated
@@ -579,8 +579,10 @@ func ShowHandler(cmd *cobra.Command, args []string) error {
return err
}

if len(args) != 1 {
if len(args) == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already handled by https://github.com/ollama/ollama/blob/main/cmd/cmd.go#L1119 so we can just remove this check

README.md Outdated Show resolved Hide resolved
server/routes.go Outdated
@@ -720,9 +720,50 @@ func GetModelInfo(req api.ShowRequest) (*api.ShowResponse, error) {
fmt.Fprint(&sb, model.String())
resp.Modelfile = sb.String()

ggmlData, err := getGGMLData(model.ModelPath, req.Verbose)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you can use llm.LoadModel:

ggml, err := llm.LoadModel(pending.model.ModelPath)
kv := ggml.KV()
...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see we do some post processing for verbose below – above might still be useful for getGGMLData

@royjhan royjhan marked this pull request as ready for review June 16, 2024 05:31
cmd/cmd.go Outdated
return res
}

func handleParams(s string) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatParams is a better name here

@jmorganca
Copy link
Member

This is overall good! Let's add some tests to the handler, including:

  1. basic test for model info
  2. test projector case

@@ -212,6 +213,7 @@ func Test_Routes(t *testing.T) {
"top_p 0.9",
}
assert.Equal(t, expectedParams, params)
assert.InDelta(t, 0, showResp.ModelInfo["general.parameter_count"], 1e-9, "Parameter count should be 0")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this field gets set to 0 by default in gguf.go, but might not be a robust test

"github.com/stretchr/testify/assert"
)

func TestShow(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not put this in routes_test.go?

Copy link
Contributor Author

@royjhan royjhan Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting routes_create_test.go, routes_delete_test.go, routes_list_test.go, I thought show could be nice to separate out as well given the projector test. If we want to get rid of this file and just stick everything in routes_test, I'd change the create model handler to include model and projector KVs that can be retrieved in a show handler, but we might want to keep this stuff isolated. Lmk what you think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah and if you mean literally just sticking this function in routes_test.go, yeah its just to be consistent with the existence of those three other files Mike created

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, sorry I missed those others ones. I'd start in routes.go and move it out when it gets large enough, just to avoid sprawl.

t.Fatal(err)
}

assert.Equal(t, "test", resp.ModelInfo["general.architecture"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use t.Fatal instead of assert to stick to stdlib test functions

cmd/cmd.go Outdated
{renderSubTable(modelData, false)},
}

if slices.Contains(resp.Details.Families, "clip") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check for resp.ProjectorInfo instead of looking for clip in resp.Details.Families

README.md Outdated
@@ -182,6 +182,12 @@ $ ollama run llama3 "Summarize this file: $(cat README.md)"
Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
```

### Show model information
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible to make a separate PR for the docs change? We can merge it once this is released

docs/api.md Outdated
"model_info": {
"general.architecture": "llama",
"general.file_type": 2,
"general.name": "Meta-Llama-3-8B-Instruct",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this field as well for now

Copy link
Member

@jmorganca jmorganca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small items otherwise LGTM!

@royjhan royjhan merged commit fedf716 into main Jun 19, 2024
12 checks passed
@royjhan royjhan deleted the royh-show branch June 19, 2024 21:19
TensorTemplar pushed a commit to TensorTemplar/ollama that referenced this pull request Jun 28, 2024
* API Show Extended

* Initial Draft of Information

Co-Authored-By: Patrick Devine <pdevine@sonic.net>

* Clean Up

* Descriptive arg error messages and other fixes

* Second Draft of Show with Projectors Included

* Remove Chat Template

* Touches

* Prevent wrapping from files

* Verbose functionality

* Docs

* Address Feedback

* Lint

* Resolve Conflicts

* Function Name

* Tests for api/show model info

* Show Test File

* Add Projector Test

* Clean routes

* Projector Check

* Move Show Test

* Touches

* Doc update

---------

Co-authored-by: Patrick Devine <pdevine@sonic.net>
afterthought325 pushed a commit to afterthought325/ollama that referenced this pull request Jun 30, 2024
* API Show Extended

* Initial Draft of Information

Co-Authored-By: Patrick Devine <pdevine@sonic.net>

* Clean Up

* Descriptive arg error messages and other fixes

* Second Draft of Show with Projectors Included

* Remove Chat Template

* Touches

* Prevent wrapping from files

* Verbose functionality

* Docs

* Address Feedback

* Lint

* Resolve Conflicts

* Function Name

* Tests for api/show model info

* Show Test File

* Add Projector Test

* Clean routes

* Projector Check

* Move Show Test

* Touches

* Doc update

---------

Co-authored-by: Patrick Devine <pdevine@sonic.net>
afterthought325 pushed a commit to afterthought325/ollama that referenced this pull request Jul 13, 2024
* API Show Extended

* Initial Draft of Information

Co-Authored-By: Patrick Devine <pdevine@sonic.net>

* Clean Up

* Descriptive arg error messages and other fixes

* Second Draft of Show with Projectors Included

* Remove Chat Template

* Touches

* Prevent wrapping from files

* Verbose functionality

* Docs

* Address Feedback

* Lint

* Resolve Conflicts

* Function Name

* Tests for api/show model info

* Show Test File

* Add Projector Test

* Clean routes

* Projector Check

* Move Show Test

* Touches

* Doc update

---------

Co-authored-by: Patrick Devine <pdevine@sonic.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

More details in Model Info
3 participants