Support prefill #463
Open · simonw opened this issue Apr 18, 2024 · 8 comments · May be fixed by #473
Labels: enhancement (New feature or request)

simonw (Owner) commented Apr 18, 2024

Claude 3 and other models (like Reka) support prefill, where you can construct a chat but set the first tokens of the model's reply. I use that in datasette-query-assistant here: https://github.com/datasette/datasette-query-assistant/blob/a777a80bcb3b42933b2933de895f4f2eb9376e9d/datasette_query_assistant/__init__.py#L52-L62

LLM should offer this at both the CLI level and the Python API level.

simonw added the enhancement label on Apr 18, 2024
simonw (Owner, Author) commented Apr 18, 2024

I like the term "prefill" for this. I think it's a CLI option:

llm -m claude-3-opus 'JSON list of US state names' --prefill '["'

And a Python argument:

model = llm.get_model("claude-3-opus")
response = model.prompt("JSON list of US state names", prefill='["')
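
For context, Claude's Messages API implements prefill by accepting a trailing assistant message and continuing from it. A minimal sketch using the anthropic Python SDK (the model ID and max_tokens are illustrative):

import anthropic

client = anthropic.Anthropic()
# Ending the messages list with a partial assistant turn makes the
# model continue from exactly those tokens.
message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "JSON list of US state names"},
        {"role": "assistant", "content": '["'},
    ],
)
# Claude does not echo the prefill back, so prepend it if the full
# text is wanted:
print('["' + message.content[0].text)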

simonw (Owner, Author) commented Apr 18, 2024

Probably needs a supports_prefill = True option on the class here too:

llm/llm/models.py, lines 243 to 248 in 9ad9ac6:

class Model(ABC, _get_key_mixin):
model_id: str
key: Optional[str] = None
needs_key: Optional[str] = None
key_env_var: Optional[str] = None
can_stream: bool = False

Or should I call that can_prefill? Need a consistent naming convention here.

The image branch is currently using supports_images:

llm/llm/models.py, lines 255 to 261 in eaf50d8:

class Model(ABC, _get_key_mixin):
model_id: str
key: Optional[str] = None
needs_key: Optional[str] = None
key_env_var: Optional[str] = None
can_stream: bool = False
supports_images: bool = False
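
If the supports_images convention wins, the change is one more class attribute next to can_stream. A minimal sketch (the mixin and unrelated attributes are elided; the name is exactly what's under discussion):

from abc import ABC

class Model(ABC):
    can_stream: bool = False
    supports_images: bool = False
    supports_prefill: bool = False  # or can_prefill -- naming TBD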

simonw (Owner, Author) commented Apr 18, 2024

I'm tempted to switch can_stream to supports_streaming for consistency with new options. can_images doesn't sound as good as supports_images.

I also want a supports_system option, since some models don't support system prompts.

jph00 commented Apr 19, 2024

A couple of things to consider:

  • Some APIs might include the prefill in the response, whereas some might not (currently, Claude does not)
  • It's possible to mock prefill through prompting, e.g. with OpenAI: [screenshot: prompting OpenAI's Chat API to continue from a given assistant prefix]

In the latter case, you don't actually know whether the model will include the prefill in the response or not (OpenAI picks one or the other somewhat randomly).

Therefore, perhaps the design of this lib should be that the assistant response always includes the prefill -- and if it's not included in the API response, then it's added. And for APIs that don't support prefill (which I guess is anything that doesn't have a completion-based API -- for now is that just OpenAI?), we modify the prompt to mock it.

Does that seem reasonable?
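
For concreteness, that "response always includes the prefill" rule reduces to a small normalization step applied to whatever text the API returns. A hypothetical sketch (the helper name is illustrative, not part of LLM):

def ensure_prefill(prefill: str, text: str) -> str:
    # If the API (or a mocked prompt) already included the prefill,
    # leave the text alone; otherwise prepend it so callers always
    # see a response that starts with the prefill.
    if prefill and not text.startswith(prefill):
        return prefill + text
    return text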

jph00 commented Apr 20, 2024

Another thought @simonw -- I think there's 4 possibilities for API support:

  1. Supported by the API, and it eats the prefill so you have to add it back
  2. Supported by the API, and it includes the prefill in the response
  3. Not supported by API, but reliably adds the prefill when asked (but may or may not include it in the response)
  4. Not supported by API, and we haven't found any prompt that results in it reliably adding a prefill

So instead of a supports_prefill bool, how about an enum with these 4 options? For 1 it adds back the prefill, for 2 it doesn't, for 3 it checks whether it's there and adds it if not, and for 4 it raises an exception if a prefill is requested.
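
Sketched as Python, that enum might look like this (hypothetical names; nothing here is in LLM yet):

from enum import Enum

class PrefillSupport(Enum):
    EATS_PREFILL = 1    # API supports it, strips it: add it back
    ECHOES_PREFILL = 2  # API supports it, includes it: nothing to do
    MOCKED = 3          # no API support, prompting works: add if missing
    UNSUPPORTED = 4     # no reliable mock: raise if a prefill is requested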

simonw (Owner, Author) commented Apr 21, 2024

In an interesting twist... some of the OpenAI models apparently support this too! https://twitter.com/HamelHusain/status/1782149471624888512

> I've found that many OpenAI users do not know about pre-fill with the Chat API
>
> [screenshot from the tweet]

But... it looks like they are a little bit inconsistent about whether they continue the prompt without the prefill or if they answer with the prefill included:

https://twitter.com/HamelHusain/status/1782154898102211053

> I found inconsistent behavior with the newest gpt-4-turbo that doesn't conform though (this is consistent across many runs)
>
> [screenshot from the tweet]

simonw (Owner, Author) commented Apr 21, 2024

> Another thought @simonw -- I think there's 4 possibilities for API support:
>
>   1. Supported by the API, and it eats the prefill so you have to add it back
>   2. Supported by the API, and it includes the prefill in the response
>   3. Not supported by API, but reliably adds the prefill when asked (but may or may not include it in the response)
>   4. Not supported by API, and we haven't found any prompt that results in it reliably adding a prefill
>
> So instead of a supports_prefill bool, how about an enum with these 4 options? For 1 it adds back the prefill, for 2 it doesn't, for 3 it checks whether it's there and adds it if not, and for 4 it raises an exception if a prefill is requested.

The supports_prefill boolean will actually just let the CLI tool know if it should throw an error if the user passes --prefill "something" - I'll leave it to custom Python code in each model implementation to handle whether or not that prefill needs to be added to the response. I expect this will be a bit fiddly for the GPT ones - might even need to say "if the response starts with an exact match for the prefill then don't prepend the prefill again".
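
That CLI-level guard could be as simple as this (a hypothetical sketch; LLM's CLI is click-based, but the exact wiring may differ):

import click

def check_prefill(model, prefill):
    # Refuse --prefill up front for models whose class declares
    # supports_prefill = False.
    if prefill and not getattr(model, "supports_prefill", False):
        raise click.ClickException(
            f"Model {model.model_id} does not support --prefill"
        )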

jph00 commented Apr 25, 2024

> I expect this will be a bit fiddly for the GPT ones - might even need to say "if the response starts with an exact match for the prefill then don't prepend the prefill again".

This is what we decided to do for OpenAI. I think it is a nice approach actually.
