update message processing #5126
Conversation
Force-pushed bcffa81 to 8087a37
Force-pushed 78a13b7 to 8473b76
	break
	}
}
// chatPrompt accepts a list of messages and returns the prompt and images that should be used for the next chat turn.
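The doc comment above describes the contract: the full conversation goes in, and a single prompt string plus any attached images come out. A minimal sketch of that shape, with illustrative types (the real `api.Message` and prompt rendering in the PR differ):

```go
package main

import "fmt"

// ChatMessage is an illustrative stand-in for a chat message carrying
// a role, text content, and optional image payloads.
type ChatMessage struct {
	Role    string
	Content string
	Images  [][]byte
}

// chatPromptSketch flattens the conversation into one prompt string and
// collects the images the runner should receive for the next turn.
// This is a sketch of the contract, not the PR's implementation.
func chatPromptSketch(msgs []ChatMessage) (string, [][]byte) {
	var prompt string
	var images [][]byte
	for _, m := range msgs {
		prompt += fmt.Sprintf("<|%s|>%s\n", m.Role, m.Content)
		images = append(images, m.Images...)
	}
	return prompt + "<|assistant|>", images
}

func main() {
	p, imgs := chatPromptSketch([]ChatMessage{
		{Role: "user", Content: "describe this", Images: [][]byte{{0xFF}}},
	})
	fmt.Println(p, len(imgs))
}
```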
Very nice! Thank you.
Force-pushed 31d734e to 68c3ec8
Force-pushed 8844279 to d982a4c
Last comment about the response code in an error case, but otherwise this is looking ready to me.
Force-pushed d982a4c to 3d8f174
Force-pushed dbfc9da to 0d82aa0
This looks good, excited for this one
server/routes.go (Outdated)
case err != nil:
	c.AbortWithStatusJSON(http.StatusBadRequest, gin.H{"error": err.Error()})
	return
func (s *Server) scheduleRunner(ctx context.Context, name string, caps []Capability, requestOpts map[string]any, keepAlive *api.Duration) (*runnerRef, error) {
Passing in caps is weird here – shouldn't the caller do that before scheduling the runner?
the caller doesn't have access to the model so it can't check capabilities
It did before this change.
Could we load the model and check the capabilities before deciding to schedule the model here, and then pass in the Model? It seems like a precondition the handler should check. Ideally this function goes away and we do something like scheduler.Schedule(model, opts) from the handler, which would be a lot clearer.
maybe the function is misnamed. this doesn't schedule anything until some checks pass, including capabilities. it aggregates the model- and options-related setup that chat, generate, and embeddings all have to perform before scheduling a runner
This is looking good. Love that messages are becoming a first-class part of the templating.
Force-pushed 056ba91 to eed3f96
Overall looks great – small comment on the templating so we don't overcommit to helpers yet unless we really have to
LGTM. Will test this a bunch this evening (no need to block merging) but it looks great
ensure runtime model changes (template, system prompt, messages, options) are captured on model updates without needing to reload the server
this change modifies the way messages are processed before handing off to the llm. there are a few areas worth mentioning:
messages are now a first-class component of the template. template rendering will only fall back to the previous iterative template if messages are unsupported by the template. however, new models should still implement the previous prompt/response template for compatibility with older ollama versions
the generate endpoint has been updated to use messages for prompt templating but the end result should be the same
the chat endpoint has been updated to preprocess incoming messages
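The fallback behavior described above can be sketched with Go's `text/template`, which Ollama's templates build on. Detecting `.Messages` with a substring check is a crude stand-in for walking the parse tree, and the field names are illustrative:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Message is an illustrative chat message with a role and content.
type Message struct {
	Role    string
	Content string
}

// renderPrompt sketches the fallback: if the template references
// .Messages, render the whole conversation in one pass; otherwise fall
// back to the older iterative style that fills .Prompt per user turn.
func renderPrompt(tmplText string, msgs []Message) (string, error) {
	tmpl, err := template.New("p").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if strings.Contains(tmplText, ".Messages") {
		// messages-aware template: one render over the full history
		err = tmpl.Execute(&sb, map[string]any{"Messages": msgs})
		return sb.String(), err
	}
	// legacy template: render once per user turn
	for _, m := range msgs {
		if m.Role != "user" {
			continue
		}
		if err := tmpl.Execute(&sb, map[string]any{"Prompt": m.Content}); err != nil {
			return "", err
		}
	}
	return sb.String(), nil
}

func main() {
	out, _ := renderPrompt("{{ range .Messages }}[{{ .Role }}] {{ .Content }}\n{{ end }}",
		[]Message{{Role: "user", Content: "Hi"}})
	fmt.Print(out)
}
```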