Allow users to govern the ollama token context size #121803
Conversation
Hey there @synesthesiam, mind taking a look at this pull request, as it has been labeled with an integration you are a code owner of?
Companion to home-assistant/core#121803
The default 2048 is useless with even a modest smart home — it causes the system prompt to be completely ignored.
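To get a feel for why 2048 tokens run out so quickly, here is a rough back-of-the-envelope estimate. The ~4 characters per token rule and the sample entity listing are illustrative assumptions, not Ollama's real tokenizer or Home Assistant's actual prompt format:

```python
# Rough token estimate using the common ~4 characters/token rule of thumb.
# This is a heuristic, not the model's actual tokenizer.
def rough_token_count(text: str) -> int:
    return max(1, len(text) // 4)

# A hypothetical system prompt listing a couple hundred exposed entities
# overflows a 2048-token window before the user even says anything.
entity_line = "light.kitchen_ceiling: Kitchen Ceiling Light, state off\n"
system_prompt = "You are a voice assistant for a smart home.\n" + entity_line * 200

print(rough_token_count(system_prompt) > 2048)  # True: the prompt alone exceeds the window
```

When the prompt exceeds the context window, the oldest tokens (the system prompt) are the first to be truncated, which is why the model appears to ignore its instructions.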
@@ -81,6 +81,11 @@
 CONF_MAX_HISTORY = "max_history"
 DEFAULT_MAX_HISTORY = 20

+CONF_NUM_CTX = "num_ctx"
+DEFAULT_NUM_CTX = 2048
+MAX_NUM_CTX = 65536
Newer models seem to support up to 128K; it might be a good idea to raise the max to an even higher number (unless I'm misunderstanding something).
Perhaps. Mind suggesting a diff / patch in this PR? I would happily increase it.
Suggested change:
-MAX_NUM_CTX = 65536
+MAX_NUM_CTX = 131072
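Whatever ceiling is chosen, out-of-range user input still has to be handled somewhere. A minimal illustrative sketch of clamping a user-supplied value to the bounds (the constant names follow the diff; the `clamp_num_ctx` helper is hypothetical, not part of the PR):

```python
DEFAULT_NUM_CTX = 2048    # Ollama's own default context size
MAX_NUM_CTX = 131072      # raised ceiling suggested in the review

def clamp_num_ctx(value: int) -> int:
    """Keep a user-supplied context size within the allowed range."""
    return max(DEFAULT_NUM_CTX, min(MAX_NUM_CTX, value))

print(clamp_num_ctx(200_000))  # 131072: capped at the ceiling
print(clamp_num_ctx(512))      # 2048: floored at the default
```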
@@ -251,6 +253,11 @@ async def async_process(
     {"messages": message_history.messages},
 )

+options: ollama.Options | None = None
+num_ctx = settings.get(CONF_NUM_CTX, DEFAULT_NUM_CTX)
+if num_ctx != DEFAULT_NUM_CTX:
The default size in Ollama is 2048. So when DEFAULT_NUM_CTX is set to 4096 here, it doesn't actually set a num_ctx; the 4096 is ignored and it still uses 2048.
One option is:
- Don't set DEFAULT_NUM_CTX at all, so that the default is None, and only pass the context size when it's not None.
- If you're trying to also increase the default context size, then this needs a different approach.
 },
 "data_description": {
     "prompt": "Instruct how the LLM should respond. This can be a template.",
-    "keep_alive": "Duration in seconds for Ollama to keep model in memory. -1 = indefinite, 0 = never."
+    "keep_alive": "Duration in seconds for Ollama to keep model in memory. -1 = indefinite, 0 = never.",
+    "num_ctx": "Increase this if you have a complex smart home, or the LLM seems to ignore knowingly exposed devices."
Suggested change:
-"num_ctx": "Increase this if you have a complex smart home, or the LLM seems to ignore knowingly exposed devices."
+"num_ctx": "Number of tokens a model can process. Higher values allow handling a larger number of devices."
I've marked this PR, as changes are requested that need to be processed. Thanks! 👍 ../Frenck
@Rudd-O will you be able to implement the suggested changes to the PR? Thanks
Sent #124555, given this seems to have gone unresponsive, and I think it needs a prompt fix given there is somewhat of a regression here. Respectfully closing to handle in the other PR.
Proposed change
Allows the user of Ollama to specify the context token size, to ensure that models with a small default num_ctx can still be made to work with Home Assistant Assist.

Type of change
Additional information

Checklist
- The code has been formatted using Ruff (ruff format homeassistant tests)

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:
- Updated and included derived files by running: python3 -m script.hassfest.
- New or updated dependencies have been added to requirements_all.txt. Updated by running python3 -m script.gen_requirements_all.

To help with the load of incoming pull requests: