Description
When using Anton with a local model (e.g. `qwen3.5:35b` via llama.cpp) configured with a 65K token context window, longer chat sessions eventually hit the context limit and produce an unhandled error instead of recovering gracefully.
Error message
❌ What went wrong
The chat context was too large (over 69,000 tokens). This happens when many queries and results are stored.
Expected behavior
Anton's code in `chat.py` has an automatic history summarization mechanism (`_summarize_history()`) that is supposed to trigger when `context_pressure` exceeds 70% (`_CONTEXT_PRESSURE_THRESHOLD = 0.7`). It also catches `ContextOverflowError` to compress older turns reactively.
In practice, with a local model at 65K context, this automatic summarization does not kick in before the limit is exceeded, causing the session to fail rather than recovering transparently.
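For reference, the expected proactive path can be sketched roughly as follows. Only the names `_CONTEXT_PRESSURE_THRESHOLD`, `context_pressure`, and `_summarize_history()` come from the issue; everything else (signatures, the exact pressure formula) is an assumption for illustration, not Anton's actual implementation:

```python
# Hypothetical sketch of the proactive summarization path.
# Only the constant name and threshold value are from chat.py;
# the function signatures here are assumptions.
_CONTEXT_PRESSURE_THRESHOLD = 0.7

def context_pressure(history_tokens: int, context_window: int) -> float:
    """Fraction of the context window already consumed by history."""
    return history_tokens / context_window

def should_summarize(history_tokens: int, context_window: int) -> bool:
    """Return True once pressure exceeds the 70% threshold,
    i.e. the point where _summarize_history() is expected to run."""
    return context_pressure(history_tokens, context_window) > _CONTEXT_PRESSURE_THRESHOLD
```

With a 65K window, this check should fire at roughly 45,500 tokens of history, well before the ~69,000 tokens seen in the error message above.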
Environment
- Model: `qwen3.5:35b` (local, via llama.cpp)
- Context window: ~65,000 tokens
- Trigger: Long chat session with many tool calls and query results accumulated in history
Steps to reproduce
- Configure Anton with a local model (`qwen3.5:35b`) via `/setup`
- Run a long, multi-step session with several data queries and tool calls
- After enough turns, the session fails with the context-too-large error
Possible causes
- The `context_pressure` calculation may not be accurate for local models, so the 70% threshold is never triggered proactively
- `ContextOverflowError` may not be raised or mapped correctly for local backends, so the reactive summarization path is also skipped
- Token counting for local models may differ from what Anton expects
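If the second cause is in play, one possible direction is to translate raw backend errors into Anton's exception type so the existing reactive path fires. This is a hypothetical sketch, not Anton's code: the `ContextOverflowError` name is from the issue, but the mapping function and the loose string match are assumptions (llama.cpp-style servers word overflow errors differently across versions):

```python
class ContextOverflowError(Exception):
    """Raised when the prompt exceeds the model's context window."""

def map_backend_error(message: str) -> Exception:
    """Translate a raw local-backend error message into Anton's
    exception type (hypothetical helper for illustration).

    The substring match is deliberately loose because the exact
    wording varies across llama.cpp server versions.
    """
    lowered = message.lower()
    overflow_markers = ("exceed", "too large", "overflow")
    if "context" in lowered and any(m in lowered for m in overflow_markers):
        return ContextOverflowError(message)
    return RuntimeError(message)
```

Routing local-backend failures through a mapping like this would let the existing `except ContextOverflowError` handler compress older turns instead of surfacing a hard failure.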
Suggested fix
- Ensure token counting works correctly for local models
- Fall back to a character-based estimate if token counts are unavailable from the local model API
- Show a user-friendly recovery message (e.g. "Context too large — summarizing history and continuing…") instead of a hard failure
- Consider adding a `/compact` or `/reset` slash command to let users manually trim history mid-session
Current workarounds
- Start a new Anton session (history is cleared, learned memory is preserved)
- Switch to a model with a larger context window via `/setup`
- Split long tasks into shorter sessions