
trim chat prompt based on llm context size #1963

Merged: 8 commits into main on Jan 30, 2024

Conversation

@BruceMacD (Contributor) commented on Jan 12, 2024

When trimming the input chat prompt we need to make sure we keep the prompt template in the expected format. Without this, once the maximum context length is reached the prompt is trimmed without accounting for the model template, which can result in unexpected behavior from the model.

  • update the ChatPrompt function to return a list of prompt variables, allowing the calling function to assemble them into the final prompt
  • create the final prompt based on the loaded LLM's context window size, while preserving the prompt template formatting and the system message in the first message of the new context window (see the sketch after this list)
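The idea can be illustrated with a small, self-contained sketch. This is not the PR's actual implementation: the Message type, the trimMessages function, and the word-count tokenizer below are hypothetical stand-ins for the project's real message type, prompt-building code, and tokenizer. The sketch drops whole messages from the oldest end of the history until the remainder fits the model's context window, and always keeps the leading system message so the templated prompt stays well formed.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a simplified stand-in for a chat message: a role ("system",
// "user", "assistant") and its content.
type Message struct {
	Role    string
	Content string
}

// trimMessages keeps the most recent messages that fit within numCtx tokens,
// always preserving the first system message so the templated prompt keeps
// its expected structure. countTokens is a placeholder for a real tokenizer.
func trimMessages(msgs []Message, numCtx int, countTokens func(string) int) []Message {
	if len(msgs) == 0 {
		return msgs
	}

	var system *Message
	rest := msgs
	if msgs[0].Role == "system" {
		system = &msgs[0]
		rest = msgs[1:]
	}

	// Reserve room for the system message before fitting the rest.
	budget := numCtx
	if system != nil {
		budget -= countTokens(system.Content)
	}

	// Walk backwards so the newest messages are kept first; stop at the
	// first message that no longer fits, dropping it and everything older.
	kept := []Message{}
	for i := len(rest) - 1; i >= 0; i-- {
		n := countTokens(rest[i].Content)
		if n > budget {
			break
		}
		budget -= n
		kept = append([]Message{rest[i]}, kept...)
	}

	if system != nil {
		kept = append([]Message{*system}, kept...)
	}
	return kept
}

func main() {
	// Crude tokenizer approximation: whitespace-separated words.
	countTokens := func(s string) int { return len(strings.Fields(s)) }

	history := []Message{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: "Tell me a very long story about boats."},
		{Role: "assistant", Content: "Once upon a time there was a boat that sailed far away."},
		{Role: "user", Content: "Summarize that story."},
	}

	// With a tiny context budget, only the system message and the most
	// recent user turn survive.
	for _, m := range trimMessages(history, 12, countTokens) {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```

Trimming whole messages rather than cutting text mid-message is what keeps the template markers paired up, so the final rendered prompt still matches the format the model was trained on.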

Resolved review threads on server/routes.go (outdated) and server/images.go.
Follow-up commits:
- only encode each prompt once
- reduce nested functions
@BruceMacD merged commit 0632dff into main on Jan 30, 2024
13 checks passed
@BruceMacD deleted the brucemacd/template-token-smart branch on January 30, 2024