A model that picks the right sized model #32

Open
simonw opened this issue Jun 15, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@simonw
Owner

simonw commented Jun 15, 2023

Count tokens with tiktoken and switch to the 16k or 32k models if necessary.
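A minimal sketch of the selection logic: count the prompt's tokens (e.g. with `tiktoken.encoding_for_model("gpt-3.5-turbo").encode(prompt)`) and pick the smallest model whose context window fits. The model IDs, context limits, and `reserve` figure below are illustrative assumptions, not a definitive implementation:

```python
# Candidate models ordered smallest-first; limits are assumed values.
MODELS = [
    ("gpt-3.5-turbo", 4_096),
    ("gpt-3.5-turbo-16k", 16_384),
    ("gpt-4-32k", 32_768),
]

def pick_model(n_tokens: int, reserve: int = 512) -> str:
    """Return the first model whose context fits the prompt plus a
    reserved budget for the reply."""
    for name, limit in MODELS:
        if n_tokens + reserve <= limit:
            return name
    raise ValueError("prompt too long for any available model")
```

Ordering the list smallest-first means the cheapest model that fits always wins.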

@simonw simonw added the enhancement New feature or request label Jun 15, 2023
@simonw simonw added this to the 0.4 milestone Jun 15, 2023
@simonw
Owner Author

simonw commented Jun 15, 2023

This may be a template and not a model - perhaps `llm -t auto`

Not sure what the YAML would look like.

A model might be better though, since then you could combine a template with the `-m auto` option.

@simonw
Owner Author

simonw commented Jun 15, 2023

I think it's a special model called with `-m auto`

How should it handle some users not having GPT-4 32k access?

I think it should try anyway and error if they don't have the model - it would have errored anyway since they were over 32k tokens.

@benjamin-kirkbride
Sponsor Contributor

Also need to consider 3.5's 4k vs 16k. I'm guessing this is going to be a pattern that continues as well: models that are "the same" but differ in context length (and pricing).

I think there needs to be some concept of "flavors" of models, and in llm you should be able to select the base "flavor" you want and have the model be selected based on a number of other factors (including context length).
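One way to model "flavors" could be a family name that maps to variants sharing behaviour but differing in context length. Everything here - the variant names, limits, and the `resolve` helper - is a hypothetical sketch of the idea, not an existing `llm` API:

```python
from dataclasses import dataclass

@dataclass
class Variant:
    model_id: str
    context_tokens: int

# A "flavor" groups interchangeable variants, smallest context first.
FLAVORS = {
    "gpt-3.5-turbo": [
        Variant("gpt-3.5-turbo", 4_096),
        Variant("gpt-3.5-turbo-16k", 16_384),
    ],
}

def resolve(flavor: str, n_tokens: int) -> str:
    """Pick the cheapest variant of a flavor that fits the token count."""
    for v in FLAVORS[flavor]:
        if n_tokens <= v.context_tokens:
            return v.model_id
    raise ValueError(f"no {flavor} variant fits {n_tokens} tokens")
```

The user selects the flavor; pricing or other factors could become extra fields on `Variant` that the resolver weighs.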

@benjamin-kirkbride
Sponsor Contributor

Worth noting this is a problem that other tools are facing right now as well. I'm not aware of any consensus on how to handle it as of yet, but it's probably worth looking into.

@benjamin-kirkbride
Sponsor Contributor

This is relevant to the new `-c` flag as well: a conversation that fits in the context of one model may outgrow it, and ideally you can continue the conversation without interruption.

@simonw simonw removed this from the 0.5 milestone Jul 1, 2023
@simonw
Owner Author

simonw commented Jul 1, 2023

Dropped from the 0.5 milestone, it's not critical for that.

I'm actually thinking this might make more sense as a llm-auto plugin. It could be expanded to cover all kinds of other heuristics, not just the length of the context.
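Such a plugin's routing could be sketched as an ordered rule table, where each rule inspects the prompt and the first match names the model. The rules and model names below are illustrative assumptions (the length check uses a crude characters-divided-by-four token estimate rather than real tokenization):

```python
def route(prompt: str) -> str:
    # Each (predicate, model) rule is checked in order; first match wins.
    rules = [
        (lambda p: len(p) // 4 > 4_000, "gpt-3.5-turbo-16k"),   # crude token estimate
        (lambda p: "def " in p or "class " in p, "gpt-4"),       # looks like code
    ]
    for matches, model in rules:
        if matches(prompt):
            return model
    return "gpt-3.5-turbo"  # default for short, non-code prompts
```

New heuristics (pricing, user access, task type) would just be more rules in the table.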
