@elaminm2003

Closes #176 ("Deal with message history that exceeds token limit")
This PR prevents the message history from growing indefinitely, which previously led to high operational costs and frequent context-length errors in LLM calls.

We now enforce a configurable token limit, ensuring better stability and cost control.

Key Changes

Token Limit Enforcement:

Added a new setting, `MAX_TOKEN_LIMIT` (default: 100,000 tokens), in `src/ansari/config.py`.
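A minimal sketch of the new setting, assuming the config module uses a pydantic-style settings class (the actual class name and neighboring fields in `src/ansari/config.py` may differ):

```python
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    # Hard ceiling on tokens in a conversation's message history; requests
    # whose history exceeds this are refused rather than sent to the LLM.
    MAX_TOKEN_LIMIT: int = 100000
```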

Integrated the `tiktoken` library to accurately count tokens in the message history (`src/ansari/agents/ansari.py`).
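Token counting with `tiktoken` might look roughly like this; the encoding name (`cl100k_base`) and the message schema are assumptions, not necessarily what `ansari.py` uses:

```python
import tiktoken


def count_message_history_tokens(messages: list[dict]) -> int:
    """Sum token counts over the text content of each message."""
    encoding = tiktoken.get_encoding("cl100k_base")
    total = 0
    for message in messages:
        content = message.get("content")
        if isinstance(content, str):  # skip None and non-text payloads
            total += len(encoding.encode(content))
    return total
```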

The `process_message_history` function now checks the token count. If the limit is exceeded, the agent gracefully refuses to process the request and prompts the user to start a new conversation.
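Illustratively, the check at the top of `process_message_history` could look like the following; the refusal wording, the `self.settings` attribute, and the helper name are hypothetical:

```python
TOKEN_LIMIT_MESSAGE = (
    "This conversation has become too long for me to process. "
    "Please start a new conversation."
)


def process_message_history(self, messages: list[dict]):
    # Refuse up front rather than sending an oversized request to the LLM.
    if count_message_history_tokens(messages) > self.settings.MAX_TOKEN_LIMIT:
        return TOKEN_LIMIT_MESSAGE
    # ... normal history processing and the LLM call continue here ...
```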

Bug Fix:

Resolved a circular import issue in `src/ansari/app/main_api.py` by reordering imports. This ensures the server starts correctly when `whatsapp_router` is included.
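The fix amounts to reordering imports in `main_api.py`. The sketch below is illustrative only, assuming a FastAPI app; the module path for the router and the surrounding code are assumptions:

```python
from fastapi import FastAPI

app = FastAPI()

# Imported only after `app` exists: if the whatsapp module imports names
# from main_api, importing it at the top of the file creates a cycle.
from ansari.app.whatsapp_router import router as whatsapp_router  # noqa: E402

app.include_router(whatsapp_router)
```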

Testing
Added unit tests in `tests/unit/test_token_limit.py` verifying that the token limit is enforced correctly (long histories are blocked while short ones pass).
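A rough shape of those tests, reusing the hypothetical `count_message_history_tokens` helper sketched above; the real tests in `tests/unit/test_token_limit.py` will differ in detail:

```python
def test_long_history_exceeds_limit():
    # 200,000 repeated words comfortably exceed the 100,000-token default.
    history = [{"role": "user", "content": "word " * 200_000}]
    assert count_message_history_tokens(history) > 100_000


def test_short_history_within_limit():
    history = [{"role": "user", "content": "Hello"}]
    assert count_message_history_tokens(history) <= 100_000
```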
