
Feature/configurable memory compression#454

Open
seanturner83 wants to merge 4 commits into usestrix:main from seanturner83:feature/configurable-memory-compression

Conversation

@seanturner83
Contributor

Summary

Adds three environment variables to tune the memory compressor for large-scale scanning campaigns, plus an additional prompt cache breakpoint for improved cache hit rates on Anthropic models.

New environment variables

| Variable | Default | Purpose |
| --- | --- | --- |
| `STRIX_MAX_CONTEXT_TOKENS` | 100000 | Token threshold before compression triggers |
| `STRIX_MIN_RECENT_MESSAGES` | 15 | Messages preserved from compression |
| `STRIX_MAX_TOOL_OUTPUT_CHARS` | 0 (off) | Truncate oversized tool outputs at ingestion, keeping the first 60% + last 40% with a notice |

Prompt caching improvement

Adds a second cache_control breakpoint on the agent identity message (<agent_identity> tag), which is stable for the lifetime of each agent. This complements the existing system prompt breakpoint. Related to #279.
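A minimal sketch of how such a breakpoint might be attached, assuming Anthropic-style message dicts where `cache_control` must live on block-form content; the function name and message shapes are illustrative, not the PR's exact code:

```python
def add_identity_cache_breakpoint(messages: list[dict]) -> list[dict]:
    """Mark the <agent_identity> message as a prompt-cache breakpoint."""
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, str) and "<agent_identity>" in content:
            # Convert to block form so cache_control metadata can be attached.
            msg["content"] = [
                {
                    "type": "text",
                    "text": content,
                    "cache_control": {"type": "ephemeral"},
                }
            ]
            break
    return messages
```

Since the identity message never changes during an agent's lifetime, everything up to this breakpoint can be served from cache on every subsequent call.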

Motivation

Strix's agentic architecture (6 agents, ~600 LLM calls per scan, full history resent every call) can produce 1.5M–26M input tokens per scan on large repos. The memory compressor currently only triggers at 90% of 100K tokens — by which point the cost damage is already done. Oversized tool outputs (nmap scans, large file reads) accumulate in history and get resent hundreds of times.

These changes make compression tunable and add tool output truncation at ingestion to prevent context bloat at the source.

A/B test results

Tested on a production corpus of 800 repositories at a crypto infrastructure company. Settings: MAX_CONTEXT_TOKENS=40000, MIN_RECENT_MESSAGES=10, MAX_TOOL_OUTPUT_CHARS=8000.

Per-scan comparison (large TypeScript repo with confirmed critical findings):

| Metric | Stock | Optimized | Delta |
| --- | --- | --- | --- |
| Findings | 15 (10C/5H) | 16 (9C/7H) | +1 finding |
| Cost | $20.62 | $9.55 | −54% |
| Input tokens | 8.7M | 5.5M | −37% |
| Cache hit rate | 34% | 56% | +22pp |

At scale (15-scan sample): Average cost dropped from $8.66 to $4.52 per scan.

No findings were lost. The optimised configuration actually found one additional vulnerability that the stock configuration missed (likely because the stock run hit context limits and lost relevant earlier context).

Changes

  • strix/config/config.py — 3 new config variables
  • strix/llm/memory_compressor.py — Configurable thresholds, truncate_tool_outputs() method called before compression in compress_history(), _truncate_tool_output() helper preserving head + tail
  • strix/llm/llm.py — Second cache breakpoint on agent identity message

All changes are backwards compatible — default behaviour is unchanged when env vars are not set.

Test plan

  • A/B tested on production corpus (800 repos, multiple repo sizes/languages)
  • Verified no finding quality regression
  • Confirmed defaults match current behaviour (no env vars = no change)
  • Unit tests for _truncate_tool_output() edge cases (happy to add if wanted)

@greptile-apps
Contributor

greptile-apps Bot commented Apr 16, 2026

Greptile Summary

This PR adds three environment variables (STRIX_MAX_CONTEXT_TOKENS, STRIX_MIN_RECENT_MESSAGES, STRIX_MAX_TOOL_OUTPUT_CHARS) to make the memory compressor tunable, plus a second cache_control breakpoint on the agent identity message. All defaults preserve existing behaviour.

Confidence Score: 5/5

Safe to merge; only P2 findings, defaults are unchanged, and the previous-thread blocking issues have all been addressed.

No P0 or P1 findings. The double-truncation original_len inaccuracy and the persisted-config note are both P2 quality/observability concerns that don't affect correctness or security. The earlier concerns about role-guard mutations and missing list-branch caching are resolved in this revision.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| `strix/config/config.py` | Adds three new nullable config class attributes for memory/context tuning; correctly picked up by `_tracked_names` and thus persisted/loaded. Not added to `_LLM_CANONICAL_NAMES`, which is intentional. |
| `strix/llm/llm.py` | Adds a second `cache_control` breakpoint for the agent identity message; previous-thread issues (list-content path not caching, in-place mutation) are now addressed with an `elif` branch and idempotent mutation. |
| `strix/llm/memory_compressor.py` | Configurable thresholds and tool-output truncation are implemented correctly with role/type guards; truncation is called pre-threshold-check so it always runs, causing `original_len` in the notice to go stale on repeated `compress_history` calls. |
Prompt To Fix All With AI
This is a comment left during a code review.
Path: strix/llm/memory_compressor.py
Line: 271

Comment:
**Tool output `original_len` becomes stale after first truncation**

`compress_history` is called on every LLM iteration with the full (already-mutated) history. Since `truncate_tool_outputs` runs before the early-return token check, a message that was truncated on a previous call (resulting in `head + notice + tail ≈ max_chars + len(notice)` chars) will be slightly above `max_tool_output_chars` on the next call and will be re-processed. The data bytes stabilise (same head/tail are re-selected), but `original_len` in the notice will reflect the already-truncated size rather than the true original length from the second call onwards — e.g., "of 8140-character output" instead of "of 10000-character output".

A simple guard avoids re-processing content that already carries the marker:

```python
# Direct tool-role messages (string content)
if (
    role == "tool"
    and isinstance(content, str)
    and len(content) > self.max_tool_output_chars
    and "[Output truncated:" not in content
):
    msg["content"] = _truncate_tool_output(content, self.max_tool_output_chars)
```

The same guard should be applied in the `tool_result` branches inside the `elif isinstance(content, list)` block.

How can I resolve this? If you propose a fix, please make it concise.
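A hedged sketch of how that same marker guard could carry over to the list-content branch; the block shapes and the inline truncation stand-in are assumptions, not the PR's exact code:

```python
def truncate_tool_result_blocks(blocks: list[dict], max_chars: int) -> None:
    """Apply the already-truncated marker guard to list-form tool results.

    Illustrative only: real code would call the PR's _truncate_tool_output
    helper instead of the inline stand-in below.
    """
    for block in blocks:
        text = block.get("content")
        if (
            block.get("type") == "tool_result"
            and isinstance(text, str)
            and len(text) > max_chars
            and "[Output truncated:" not in text
        ):
            # Stand-in truncation; the marker makes repeat passes no-ops.
            block["content"] = text[:max_chars] + "\n[Output truncated: tail omitted]"
```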

---

This is a comment left during a code review.
Path: strix/config/config.py
Line: 24-26

Comment:
**New vars missing from `_LLM_CANONICAL_NAMES` persisted-config clearing logic**

The three new variables are correctly picked up by `_tracked_names()` (lowercase, `None` default) so they are saved and loaded through `Config.save_current()` / `Config.apply_saved()`. However, `_LLM_CANONICAL_NAMES` drives the stale-config reset in `_llm_env_changed()`: when a saved config file is detected alongside different live LLM env vars the whole LLM block is cleared. The memory-tuning variables are **not** LLM-auth config, so excluding them here is correct.

Worth noting though: a user who has `STRIX_MAX_CONTEXT_TOKENS` persisted in `~/.strix/cli-config.json` and then unsets it by clearing the env var will have it silently re-applied on the next run (the `cleared_vars` logic only strips vars that are set to `""` in the environment, not vars that are simply absent). This is existing behaviour for all nullable config vars and not unique to this PR, but may surprise operators tuning these settings interactively.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (2): Last reviewed commit: "chore: retrigger review"

@bearsyankees
Collaborator

@greptile

seanturner83 added a commit to seanturner83/strix that referenced this pull request Apr 29, 2026
Pulls usestrix#467 (feat: add resume
session feature) ahead of upstream merge. Enables unattended scans
to survive crashes or forced termination (e.g. AWS STS session
expiry mid-scan) by appending every LLM message to
strix_runs/<run>/conversation.jsonl and replaying on --resume.

New CLI:
  strix --list-sessions         # tabulate past scans
  strix --continue (or -c)      # resume most recent
  strix --resume <run_name>     # resume by name
  strix --resume                # open interactive picker

Motivates:
  https://github.com/seedcx/strix-scan-workflow/actions/runs/25116247499
  — trade-api flag-aware scan exited at 71min with
    APIConnectionError ("security token included in the request is
    expired"). Session had produced 10 findings; all lost once the
    report upload also hit ExpiredToken. With this commit merged,
    the CI composite action can wrap strix in a re-assume + --continue
    loop so each sub-session fits inside the 60min STS budget and
    total scan duration is bounded only by the overall workflow
    timeout.

Upstream integration:
  - Once usestrix#467 merges, drop from the seedcx-build recipe
  - Until then, include alongside usestrix#454 (memory compression), usestrix#460
    (truncate retry), usestrix#468 (thinking retry), this one, and the
    adaptive-thinking commit on this branch

Clean 3-way merge, no conflicts. Smoke tests:
  - uv sync succeeds
  - strix --help surfaces --resume / --continue / --list-sessions
  - imports (strix.sessions.resume, strix.telemetry.conversation_log) OK

Follow-up (separate): strix-scan-workflow README recipe + composite
action consumer changes to invoke resume on re-dispatch.
