Support prefill #463
Open · simonw opened this issue Apr 18, 2024 · 8 comments · May be fixed by #473
Labels: enhancement (New feature or request)

simonw (Owner) commented Apr 18, 2024

Claude 3 and other models (like Reka) support prefill, where you can construct a chat but set the first tokens of the model's reply. I use that in datasette-query-assistant here: https://github.com/datasette/datasette-query-assistant/blob/a777a80bcb3b42933b2933de895f4f2eb9376e9d/datasette_query_assistant/__init__.py#L52-L62

LLM should offer this at both the CLI level and the Python API level.

simonw added the enhancement label on Apr 18, 2024
simonw (Owner, Author) commented Apr 18, 2024

I like the term "prefill" for this. I think it's a CLI option:

llm -m claude-3-opus 'JSON list of US state names' --prefill '["'

And a Python argument:

model = llm.get_model("claude-3-opus")
response = model.prompt("JSON list of US state names", prefill='["')
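
For context, Claude's Messages API implements prefill by accepting a trailing assistant message and continuing from it. A minimal sketch using the anthropic Python SDK (the model ID and max_tokens are illustrative):

import anthropic

client = anthropic.Anthropic()
# Ending the messages list with a partial assistant turn makes the
# model continue from exactly those tokens.
message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "JSON list of US state names"},
        {"role": "assistant", "content": '["'},
    ],
)
# Claude does not echo the prefill back, so prepend it if the full
# text is wanted:
print('["' + message.content[0].text)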

simonw (Owner, Author) commented Apr 18, 2024

Probably needs a supports_prefill = True option on the class here too:

llm/llm/models.py, lines 243 to 248 in 9ad9ac6:

class Model(ABC, _get_key_mixin):
model_id: str
key: Optional[str] = None
needs_key: Optional[str] = None
key_env_var: Optional[str] = None
can_stream: bool = False

Or should I call that can_prefill? Need a consistent naming convention here.

The image branch is currently using supports_images:

llm/llm/models.py, lines 255 to 261 in eaf50d8:

class Model(ABC, _get_key_mixin):
model_id: str
key: Optional[str] = None
needs_key: Optional[str] = None
key_env_var: Optional[str] = None
can_stream: bool = False
supports_images: bool = False
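
If the supports_images convention wins, the change is one more class attribute next to can_stream. A minimal sketch (the mixin and unrelated attributes are elided; the name is exactly what's under discussion):

from abc import ABC

class Model(ABC):
    can_stream: bool = False
    supports_images: bool = False
    supports_prefill: bool = False  # or can_prefill -- naming TBD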

simonw (Owner, Author) commented Apr 18, 2024

I'm tempted to switch can_stream to supports_streaming for consistency with new options. can_images doesn't sound as good as supports_images.

I also want a supports_system option, since some models don't support system prompts.

jph00 commented Apr 19, 2024

A couple of things to consider:

  • Some APIs might include the prefill in the response, whereas some might not (currently, Claude does not)
  • It's possible to mock prefill through prompting, e.g. with OpenAI: [screenshot: prompting OpenAI's Chat API to continue from a given assistant prefix]

In the latter case, you don't actually know whether the model will include the prefill in the response or not (OpenAI picks one or the other somewhat randomly).

Therefore, perhaps the design of this lib should be that the assistant response always includes the prefill -- and if it's not included in the API response, then it's added. And for APIs that don't support prefill (which I guess is anything that doesn't have a completion-based API -- for now is that just OpenAI?), we modify the prompt to mock it.

Does that seem reasonable?
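
For concreteness, that "response always includes the prefill" rule reduces to a small normalization step applied to whatever text the API returns. A hypothetical sketch (the helper name is illustrative, not part of LLM):

def ensure_prefill(prefill: str, text: str) -> str:
    # If the API (or a mocked prompt) already included the prefill,
    # leave the text alone; otherwise prepend it so callers always
    # see a response that starts with the prefill.
    if prefill and not text.startswith(prefill):
        return prefill + text
    return text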

jph00 commented Apr 20, 2024

Another thought @simonw -- I think there's 4 possibilities for API support:

  1. Supported by the API, and it eats the prefill so you have to add it back
  2. Supported by the API, and it includes the prefill in the response
  3. Not supported by API, but reliably adds the prefill when asked (but may or may not include it in the response)
  4. Not supported by API, and we haven't found any prompt that results in it reliably adding a prefill

So instead of a supports_prefill bool, how about an enum with these 4 options? For 1 it adds back the prefill, for 2 it doesn't, for 3 it checks whether it's there and adds it if not, and for 4 it raises an exception if a prefill is requested.
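
Sketched as Python, that enum might look like this (hypothetical names; nothing here is in LLM yet):

from enum import Enum

class PrefillSupport(Enum):
    EATS_PREFILL = 1    # API supports it, strips it: add it back
    ECHOES_PREFILL = 2  # API supports it, includes it: nothing to do
    MOCKED = 3          # no API support, prompting works: add if missing
    UNSUPPORTED = 4     # no reliable mock: raise if a prefill is requested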

simonw (Owner, Author) commented Apr 21, 2024

In an interesting twist... some of the OpenAI models apparently support this too! https://twitter.com/HamelHusain/status/1782149471624888512

> I've found that many OpenAI users do not know about pre-fill with the Chat API
>
> [screenshot from the tweet]

But... it looks like they are a little bit inconsistent about whether they continue the prompt without the prefill or if they answer with the prefill included:

https://twitter.com/HamelHusain/status/1782154898102211053

> I found inconsistent behavior with the newest gpt-4-turbo that doesn't conform though (this is consistent across many runs)
>
> [screenshot from the tweet]

simonw (Owner, Author) commented Apr 21, 2024

> Another thought @simonw -- I think there's 4 possibilities for API support:
>
>   1. Supported by the API, and it eats the prefill so you have to add it back
>   2. Supported by the API, and it includes the prefill in the response
>   3. Not supported by API, but reliably adds the prefill when asked (but may or may not include it in the response)
>   4. Not supported by API, and we haven't found any prompt that results in it reliably adding a prefill
>
> So instead of a supports_prefill bool, how about an enum with these 4 options? For 1 it adds back the prefill, for 2 it doesn't, for 3 it checks whether it's there and adds it if not, and for 4 it raises an exception if a prefill is requested.

The supports_prefill boolean will actually just let the CLI tool know if it should throw an error if the user passes --prefill "something" - I'll leave it to custom Python code in each model implementation to handle whether or not that prefill needs to be added to the response. I expect this will be a bit fiddly for the GPT ones - might even need to say "if the response starts with an exact match for the prefill then don't prepend the prefill again".
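
That CLI-level guard could be as simple as this (a hypothetical sketch; LLM's CLI is click-based, but the exact wiring may differ):

import click

def check_prefill(model, prefill):
    # Refuse --prefill up front for models whose class declares
    # supports_prefill = False.
    if prefill and not getattr(model, "supports_prefill", False):
        raise click.ClickException(
            f"Model {model.model_id} does not support --prefill"
        )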

jph00 commented Apr 25, 2024

> I expect this will be a bit fiddly for the GPT ones - might even need to say "if the response starts with an exact match for the prefill then don't prepend the prefill again".

This is what we decided to do for OpenAI. I think it is a nice approach actually.
