cllama feed injection budget can drop channel-awareness despite higher claw-wall caps

## Problem

Tiverton now asks `claw-wall` for a much larger 24h channel-awareness window, but `cllama` still silently constrains injected feed context with hard-coded byte budgets:

- `cllama/internal/feeds/manifest.go`: `MaxFeedResponseBytes = 32 * 1024`
- `cllama/internal/feeds/manifest.go`: `MaxTotalFeedBytes = 64 * 1024`
- `cllama/internal/feeds/fetcher.go` truncates every feed body to `MaxFeedResponseBytes` before formatting.
- `cllama/internal/feeds/inject.go` skips any later feed block that would push the aggregate feed block over `MaxTotalFeedBytes`, emitting `--- FEED: <name> skipped (total feed size cap reached) ---`.
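
To make the interaction of the two caps concrete, here is a minimal sketch of the behavior described above. It is not cllama's actual code: the `feed` type, the `injectFeeds` function, and the `--- FEED: <name> ---` header format are assumptions for illustration. Only the two constants and the skip marker text come from the files listed above.

```go
package main

import (
	"fmt"
	"strings"
)

// Values mirror the constants quoted from manifest.go.
const (
	MaxFeedResponseBytes = 32 * 1024
	MaxTotalFeedBytes    = 64 * 1024
)

// feed is a hypothetical stand-in for a fetched feed, not cllama's real type.
type feed struct {
	Name string
	Body string
}

// injectFeeds approximates the described behavior: truncate each body to
// perFeed bytes, then skip any block that would push the aggregate over
// total. It returns the rendered block and the names of skipped feeds.
func injectFeeds(feeds []feed, perFeed, total int) (string, []string) {
	var out strings.Builder
	var skipped []string
	for _, f := range feeds {
		body := f.Body
		if len(body) > perFeed {
			// Byte-level cut; a real implementation may need to respect
			// UTF-8 boundaries.
			body = body[:perFeed]
		}
		// Header format is assumed for this sketch.
		block := fmt.Sprintf("--- FEED: %s ---\n%s\n", f.Name, body)
		if out.Len()+len(block) > total {
			fmt.Fprintf(&out, "--- FEED: %s skipped (total feed size cap reached) ---\n", f.Name)
			skipped = append(skipped, f.Name)
			continue
		}
		out.WriteString(block)
	}
	return out.String(), skipped
}

func main() {
	// Illustrative sizes only: both bodies exceed the per-feed cap.
	feeds := []feed{
		{"market-context", strings.Repeat("m", 40*1024)},
		{"channel-awareness", strings.Repeat("c", 107*1024)},
	}
	_, skipped := injectFeeds(feeds, MaxFeedResponseBytes, MaxTotalFeedBytes)
	fmt.Println("skipped:", skipped)
}
```

In this sketch, two feeds that each fill the per-feed cap cannot both fit, because the block headers push the second one just past the aggregate cap. That makes the outcome depend entirely on manifest order.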

As a result, the pod-level `x-claw.context.channel.max-chars` knob added or covered by #242 can generate a large `channel-awareness` / `channel-context` URL, yet the provider-visible prompt is still truncated or skipped inside cllama.

## Live Tiverton symptom

Date: 2026-05-18
Repo/pod: `mostlydev/tiverton-house`
Live revision observed: `a925fc4`
Runtime observed after deploy: `claw-wall v0.17.2`, `cllama v0.6.6`

Morning Discord report from Weston at 09:23 ET:

> One gap: channel-context skipped this turn due to total feed size cap. The channel-awareness feed is present but the full context buffer got truncated. Non-blocking — I can read channel-awareness for the last 24h, just can't see the full enriched context this cycle.

Current generated Tiverton feed config is correctly asking claw-wall for the larger window:

```json
{
  "name": "channel-awareness",
  "source": "claw-wall",
  "path": "/channel-awareness?channels=1464509330731696213&since=24h&limit=200&max_chars=262144&context_kind=raw_window"
}
```

The matching `channel-context` feed is likewise generated with `mode=tail&since=24h&limit=200&max_chars=262144`.

However, the live raw `channel-awareness` response for the trading-floor channel was ~107 KB, while cllama accepts at most 32 KB from any single feed and 64 KB across all injected feeds. A typical trader turn already includes:

- `market-context` (~12-13 KB observed)
- style context such as `momentum-context` (~14 KB observed)
- `agent-scaffold` (~3 KB)
- `desk-chronicle` (~3 KB)
- `agent-memory`
- `channel-awareness`
- `channel-context`

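The four feeds with observed sizes above already add up to roughly 32-33 KB, about half of the 64 KB aggregate budget, before `agent-memory` or either channel feed is counted. So on active market days the most important human-floor context competes with market/style feeds in an all-or-nothing aggregate budget. Depending on manifest order and exact sizes, `channel-awareness` or `channel-context` can be truncated or skipped even though the operator configured a larger claw-wall window.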

## Why this matters

The product promise after #232/#242 is that agents have bounded 24h floor awareness independent of cursors and restarts. A downstream operator can now tune claw-wall up to a meaningful 24h window, but the model may still not see it because cllama enforces a lower, non-configurable envelope.

This breaks the mental model of a context feed: the source feed exists and returns data, yet the provider-visible context may omit it because earlier feeds consumed the hidden cllama budget.

Retrieval tools are useful, but they are not a full substitute. The model has to know that something is missing before it decides to search; in the failure mode, the model can simply conclude that the injected feed had no relevant context.

## Related issues

- #232: introduced bounded 24h channel awareness independent of cursors/restarts.
- #242: raised/exposed claw-wall `channel-awareness` `max_chars`; explicitly left token-budget guard rails out of scope.

This issue is the cllama-side continuation of that work.

## Proposed fix

1. Make cllama feed budgets configurable, at least:
   - per-feed max bytes;
   - aggregate injected-feed max bytes.

2. Allow pod/operator configuration to flow into cllama, probably via generated cllama env from `x-claw.cllama-defaults.env` or an explicit cllama config block.

3. Avoid all-or-nothing loss of critical feeds:
   - reserve space for high-priority feeds such as `channel-awareness` and `channel-context`; or
   - allow feed priority/order configuration; or
   - degrade lower-priority long-lived feeds (`desk-chronicle`, `agent-scaffold`) before dropping live floor context.

4. Make overflow visible in structured context metadata/logs:
   - which feed was truncated;
   - which feed was skipped;
   - effective byte budget;
   - source response size when known;
   - whether the provider-visible payload includes or omits each feed.
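
As a rough illustration of items 1 and 2, the sketch below loads the two budgets from environment variables while keeping today's values as defaults. The variable names `CLLAMA_FEED_MAX_RESPONSE_BYTES` and `CLLAMA_FEED_MAX_TOTAL_BYTES`, the `Budgets` type, and the 1 MiB ceiling are all hypothetical; none of them exist in cllama today as far as this issue knows.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Budgets holds the two limits that are currently hard-coded.
type Budgets struct {
	PerFeedBytes int
	TotalBytes   int
}

// hardCeiling is an assumed upper bound so that configuration cannot turn
// into unbounded prompt stuffing.
const hardCeiling = 1024 * 1024

// envBytes reads a positive byte count via lookup, falls back to def when
// the variable is unset or invalid, and clamps the result to hardCeiling.
func envBytes(lookup func(string) (string, bool), key string, def int) int {
	raw, ok := lookup(key)
	if !ok {
		return def
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return def
	}
	if n > hardCeiling {
		return hardCeiling
	}
	return n
}

// LoadBudgets resolves both budgets. The env var names are hypothetical.
func LoadBudgets(lookup func(string) (string, bool)) Budgets {
	b := Budgets{
		PerFeedBytes: envBytes(lookup, "CLLAMA_FEED_MAX_RESPONSE_BYTES", 32*1024),
		TotalBytes:   envBytes(lookup, "CLLAMA_FEED_MAX_TOTAL_BYTES", 64*1024),
	}
	// The aggregate budget should never be smaller than a single feed's.
	if b.TotalBytes < b.PerFeedBytes {
		b.TotalBytes = b.PerFeedBytes
	}
	return b
}

func main() {
	fmt.Printf("%+v\n", LoadBudgets(os.LookupEnv))
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the loader testable without mutating the process environment. Falling back to the default on an invalid value is one option; failing fast at startup would also be defensible.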

## Acceptance criteria

- A pod can configure cllama to accept a larger `channel-awareness` feed than 32 KB and a larger aggregate feed block than 64 KB.
- With Tiverton-like feed sizes, a 100 KB `channel-awareness` response does not cause `channel-context` or `channel-awareness` to disappear silently.
- Tests cover both per-feed truncation and aggregate overflow behavior with `channel-awareness` present.
- Provider-visible context snapshots and logs clearly distinguish:
  - source feed returned data but was truncated;
  - source feed returned data but was skipped due to aggregate budget;
  - feed was included in the actual provider-visible payload.
- Defaults remain bounded for small pods; this should not become unbounded prompt stuffing by default.

