update message processing #5126

Merged
merged 4 commits into from
Jul 9, 2024

Conversation

mxyng
Contributor

@mxyng mxyng commented Jun 19, 2024

this change updates the way messages are processed before they are handed off to the llm. a few areas are worth mentioning:

  1. messages are now a first-class component of the template. template rendering will only fall back to the previous iterative template if the template does not support messages. however, new models should still implement the previous prompt/response template for compatibility with older ollama versions

  2. the generate endpoint has been updated to use messages for prompt templating but the end result should be the same

  3. the chat endpoint has been updated to preprocess incoming messages

    • consecutive messages of the same role are joined into a single message, separated with two newlines
    • content and image data can be interleaved by sending messages with alternating fields, e.g.
      [
          {"role": "user", "content": "Consider the following images:"},
          {"role": "user", "images": ["<base64 image data>", "<base64 image data>"]},
          {"role": "user", "content": "What is the difference between the two images?"}
      ]
      
    • system messages are aggregated and prepended to the last user message
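
The merging of consecutive same-role messages described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Message` type and the `collapse` function are hypothetical stand-ins, and the sketch only collects images in order without tracking where they fall relative to the text. System message handling is not shown.

```go
package main

import "fmt"

// Message is a simplified, hypothetical stand-in for a chat message.
// The fields mirror the JSON example above.
type Message struct {
	Role    string
	Content string
	Images  []string
}

// collapse joins consecutive messages that share a role into one message.
// Non-empty contents are separated by two newlines; images are appended in order.
func collapse(msgs []Message) []Message {
	var out []Message
	for _, m := range msgs {
		if n := len(out); n > 0 && out[n-1].Role == m.Role {
			last := &out[n-1]
			if m.Content != "" {
				if last.Content != "" {
					last.Content += "\n\n"
				}
				last.Content += m.Content
			}
			last.Images = append(last.Images, m.Images...)
			continue
		}
		// Copy the images slice so the result never aliases the caller's data.
		m.Images = append([]string(nil), m.Images...)
		out = append(out, m)
	}
	return out
}

func main() {
	msgs := []Message{
		{Role: "user", Content: "Consider the following images:"},
		{Role: "user", Images: []string{"<base64 image data>", "<base64 image data>"}},
		{Role: "user", Content: "What is the difference between the two images?"},
	}
	for _, m := range collapse(msgs) {
		fmt.Printf("%s: %q (%d images)\n", m.Role, m.Content, len(m.Images))
	}
}
```

With the three user messages from the example, the sketch yields a single user message whose content holds both sentences separated by a blank line, plus the two images.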

@mxyng mxyng force-pushed the mxyng/messages branch 4 times, most recently from bcffa81 to 8087a37 Compare June 19, 2024 20:59
@mxyng mxyng marked this pull request as ready for review June 19, 2024 21:17
@mxyng mxyng force-pushed the mxyng/messages branch 3 times, most recently from 78a13b7 to 8473b76 Compare June 20, 2024 18:46
break
}
}
// chatPrompt accepts a list of messages and returns the prompt and images that should be used for the next chat turn.
Contributor

Very nice! Thank you.

@mxyng mxyng force-pushed the mxyng/messages branch 3 times, most recently from 31d734e to 68c3ec8 Compare June 20, 2024 21:16
@mxyng mxyng changed the title draft: update message processing update message processing Jun 20, 2024
Contributor

@BruceMacD BruceMacD left a comment

Last comment about the response code in an error case, but otherwise this is looking ready to me.

@mxyng mxyng force-pushed the mxyng/messages branch 2 times, most recently from dbfc9da to 0d82aa0 Compare June 21, 2024 21:34
Contributor

@BruceMacD BruceMacD left a comment

This looks good, excited for this one

server/routes.go Outdated
case err != nil:
c.AbortWithStatusJSON(http.StatusBadRequest, gin.H{"error": err.Error()})
return
func (s *Server) scheduleRunner(ctx context.Context, name string, caps []Capability, requestOpts map[string]any, keepAlive *api.Duration) (*runnerRef, error) {
Member

Passing in caps is weird here – shouldn't the caller do that before scheduling the runner?

Contributor Author

the caller doesn't have access to the model so it can't check capabilities

Member

It did before this change

Could we load the model and check the capabilities before deciding to schedule the model here, and then pass in the Model? It seems like a precondition the handler should check (ideally this function goes away and we do something like scheduler.Schedule(model, opts) from the handler, which would be a lot clearer)

Contributor Author

maybe the function is misnamed. it doesn't schedule anything until some checks, including capabilities, have passed. it aggregates the various model- and options-related lines that chat, generate, and embeddings all have to call before scheduling a runner
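
For readers following this thread, the precondition being discussed (check a model's capabilities before any runner is scheduled) could look roughly like the sketch below. Everything here is an assumption for illustration: `Capability`, `Model`, `ErrCapabilityMissing`, and `checkCapabilities` are hypothetical stand-ins and do not reflect the project's actual types or the code in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Capability and Model are hypothetical stand-ins, not the project's real types.
type Capability string

type Model struct {
	Name string
	Caps []Capability
}

// ErrCapabilityMissing is a hypothetical sentinel error for a failed check.
var ErrCapabilityMissing = errors.New("capability not supported")

// checkCapabilities returns an error naming the first capability in want
// that the model lacks, or nil if the model has all of them.
func checkCapabilities(m Model, want ...Capability) error {
	have := make(map[Capability]bool, len(m.Caps))
	for _, c := range m.Caps {
		have[c] = true
	}
	for _, c := range want {
		if !have[c] {
			return fmt.Errorf("%w: %s does not support %s", ErrCapabilityMissing, m.Name, c)
		}
	}
	return nil
}

func main() {
	m := Model{Name: "example", Caps: []Capability{"completion"}}
	// The check runs first; only a model that passes would be handed to a scheduler.
	if err := checkCapabilities(m, "completion", "vision"); err != nil {
		fmt.Println("rejecting request:", err)
		return
	}
	fmt.Println("scheduling", m.Name)
}
```

Whether this check lives in each handler or in a shared helper called by chat, generate, and embeddings is exactly the design question raised in the comments above; the sketch works either way.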

@jmorganca
Member

This is looking good. Love that messages are becoming a first-class part of the templating.
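
To illustrate what treating messages as a first-class template value can mean, here is a minimal sketch using Go's standard `text/template` package. The template text, the `<|role|>` markers, the `Message` type, and the `render` function are all made up for this example and are not taken from any real model template or from this PR.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Message is a hypothetical, simplified chat message.
type Message struct {
	Role    string
	Content string
}

// chatTemplate ranges over .Messages directly instead of rendering one
// prompt/response pair at a time. The markers are invented for illustration.
const chatTemplate = `{{ range .Messages }}<|{{ .Role }}|>
{{ .Content }}
{{ end }}<|assistant|>
`

// render executes the template over the whole message list.
func render(msgs []Message) (string, error) {
	tmpl, err := template.New("chat").Parse(chatTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, map[string]any{"Messages": msgs}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render([]Message{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: "Hello!"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

A template written this way sees the full conversation at once, which is what lets it decide how roles and turns are laid out.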

@mxyng mxyng force-pushed the mxyng/messages branch 2 times, most recently from 056ba91 to eed3f96 Compare July 1, 2024 18:14
Base automatically changed from mxyng/capabilities to main July 2, 2024 21:26
Member

@jmorganca jmorganca left a comment

Overall looks great. Small comment on the templating so we don't overcommit to helpers yet unless we really have to

@jmorganca
Member

LGTM. Will test this a bunch this evening (no need to block merging) but it looks great

mxyng added 4 commits July 5, 2024 13:16
ensure runtime model changes (template, system prompt, messages,
options) are captured on model updates without needing to reload the
server
@mxyng mxyng merged commit 9bbddc3 into main Jul 9, 2024
12 checks passed
@mxyng mxyng deleted the mxyng/messages branch July 9, 2024 16:20

5 participants