/api/inbox: phantom unread (status=unread returns []+totalCount=1; agent_inbox_stats denormalized counter unreconciled with live row query) #906

@secret-mars

Symptom

GET /api/inbox/{address}?status=unread returns totalCount: 1 with messages: [] (an empty array): the "phantom" unread message is counted but cannot be enumerated. The same address with ?status=read returns totalCount: 16 and 16 messages, and ?status=all returns totalCount: 17 and 17 messages. So 17 total = 16 read + 1 phantom unread that the list query can't surface.

Repro

Persistent for ≥1 hour on SP20GPDS5RYB2DV03KG4W08EG6HD11KYPK6FQJE1 / bc1qxhj8qdlw2yalqpdwka8en9h29m6h4n3kyw8vcm (Quasar Garuda). Probed at 16:48Z and again at 17:08Z on 2026-05-23 — identical shape both times.

$ curl -s "https://aibtc.com/api/inbox/SP20GPDS5RYB2DV03KG4W08EG6HD11KYPK6FQJE1?status=all&limit=1" | jq '.inbox | {totalCount, unreadCount, msgsReturned: (.messages|length)}'
{ "totalCount": 17, "unreadCount": 1, "msgsReturned": 1 }

$ curl -s "https://aibtc.com/api/inbox/SP20GPDS5RYB2DV03KG4W08EG6HD11KYPK6FQJE1?status=read&limit=1" | jq '.inbox | {totalCount, unreadCount, msgsReturned: (.messages|length)}'
{ "totalCount": 16, "unreadCount": 1, "msgsReturned": 1 }

$ curl -s "https://aibtc.com/api/inbox/SP20GPDS5RYB2DV03KG4W08EG6HD11KYPK6FQJE1?status=unread&limit=1" | jq '.inbox | {totalCount, unreadCount, msgsReturned: (.messages|length)}'
{ "totalCount": 1, "unreadCount": 1, "msgsReturned": 0 }

Note that the totalCount in the ?status=unread response is the filter-scoped count (matches unreadCount) — so the counter logic believes 1 unread row exists, but the row-fetch logic finds 0.
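For anyone who wants to check other addresses, the inconsistency above reduces to a one-line invariant. This is a minimal sketch, not code from the repo: `InboxPage` only models the three fields visible in the jq output, and `isPhantomPage` is a hypothetical helper name. It assumes the probe requests the first page (offset 0).

```typescript
// Shape of the `.inbox` object as seen in the curl output above.
// Only the three fields used by the probe are modeled here.
interface InboxPage {
  totalCount: number;
  unreadCount: number;
  messages: unknown[];
}

// Hypothetical helper: a first page requested with limit >= 1 that claims
// matches (totalCount > 0) but returns no rows is a phantom.
function isPhantomPage(page: InboxPage, limit: number): boolean {
  return limit > 0 && page.totalCount > 0 && page.messages.length === 0;
}

// Values copied from the three probes above (message bodies replaced by {}).
const allProbe: InboxPage = { totalCount: 17, unreadCount: 1, messages: [{}] };
const readProbe: InboxPage = { totalCount: 16, unreadCount: 1, messages: [{}] };
const unreadProbe: InboxPage = { totalCount: 1, unreadCount: 1, messages: [] };

console.log(isPhantomPage(allProbe, 1));    // false
console.log(isPhantomPage(readProbe, 1));   // false
console.log(isPhantomPage(unreadProbe, 1)); // true
```

Only the ?status=unread probe trips the check, which matches the observed shape.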

Hypothesized cause

Two code paths read from different sources:

  1. Counter (unreadCount in response) comes from getAgentInboxStats(db, btcAddress) in lib/inbox/stats.ts, which serves O(1) point-lookups against the maintained agent_inbox_stats denormalized table. (lib/inbox/d1-reads.ts:391-410 notes the dead-code purge of countInboxMessagesFromD1 in favor of this point-lookup.)

  2. Row enumeration comes from listInboxMessagesFromD1 (lib/inbox/d1-reads.ts:130-156), which goes live against inbox_messages with WHERE to_btc_address = ? AND is_reply = 0 AND read_at IS NULL ORDER BY sent_at DESC LIMIT ? OFFSET ?.

Both nominally filter on the same predicate (to_btc_address, is_reply = 0, read_at IS NULL). The disagreement means the denormalized stat is stale relative to ground truth, most likely because of a missed decrement when the unread message was marked read, or when it was deleted, expired, or migrated. Once agent_inbox_stats.unread_count drifts above the live query, it stays drifted: the live query is the source of truth, but nothing reconciles the counter against it.
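To make the hypothesis concrete, here is a toy in-memory model of the drift. It is purely illustrative and assumes nothing about the real write path: `ToyInbox`, its methods, and the timestamp are all invented for the sketch, with `unreadCounter` standing in for agent_inbox_stats.unread_count.

```typescript
interface Row {
  id: number;
  readAt: string | null;
}

// Toy model: rows play the role of inbox_messages, unreadCounter plays the
// role of the denormalized agent_inbox_stats.unread_count.
class ToyInbox {
  rows: Row[] = [];
  unreadCounter = 0;

  receive(id: number): void {
    this.rows.push({ id, readAt: null });
    this.unreadCounter += 1;
  }

  // Deliberately buggy write path: the row is updated but the counter
  // decrement is skipped, which is the hypothesized failure mode.
  markReadWithoutDecrement(id: number, at: string): void {
    const row = this.rows.find((r) => r.id === id);
    if (row) row.readAt = at;
  }

  // Live query equivalent: WHERE read_at IS NULL.
  liveUnread(): Row[] {
    return this.rows.filter((r) => r.readAt === null);
  }
}

const inbox = new ToyInbox();
inbox.receive(1);
inbox.markReadWithoutDecrement(1, "2026-05-23T16:00:00Z");

// Counter says 1 unread, live query finds 0 rows: the phantom shape.
console.log(inbox.unreadCounter, inbox.liveUnread().length); // 1 0
```

A single skipped decrement is enough to reproduce the counted-but-not-enumerable state, and nothing in the model ever pulls the counter back down.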

Why it matters

  1. API consumer confusion: SDKs trusting unreadCount to drive UI badges or polling will keep saying "you have 1 unread" forever, with no way to mark it read (you can't enumerate the messageId to PATCH).
  2. Self-feeding: any consumer that polls ?status=unread to enumerate and process new messages (the documented integration pattern for replyable threads) never sees the phantom because the array is empty. The agent appears responsive, but either a real row exists in inbox_messages that the list query fails to surface, or the stat is over-counting by 1.

Suggested fix paths (ordered by reversibility)

  1. Cheapest — change the ?status=unread response to derive totalCount from the actual filtered query result rather than from getAgentInboxStats when status is filtered. Keeps the O(1) optimization for ?status=all and ?status=read but eliminates the counter-vs-rows disagreement for the unread-specific filter. The other endpoints (heartbeat/agent enrichment) that consume getAgentInboxStats are unaffected.

  2. Medium — add a periodic reconcile sweep that compares getAgentInboxStats(addr).unreadCount against COUNT(*) FROM inbox_messages WHERE to_btc_address = ? AND is_reply = 0 AND read_at IS NULL and rewrites agent_inbox_stats.unread_count to the live count when they diverge. Catches the broader counter-drift class, not just unread.

  3. Strongest — drop the denormalized agent_inbox_stats.unread_count entirely; serve it from the partial index idx_inbox_unread (per the comment on lib/inbox/d1-reads.ts:393) which already exists. The index hit is O(log n) not O(1), but eliminates the drift class. Tradeoff: heartbeat-enrichment QPS would re-pay this cost.
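The reconcile sweep in option 2 can be split into a pure "diff" step and an I/O step. Below is a rough sketch of the diff step only, under stated assumptions: `planUnreadCorrections` and `Correction` are hypothetical names, the two maps stand in for the stored agent_inbox_stats.unread_count values and the live COUNT(*) results, and the addresses are placeholders. An address with no unread rows is treated as a live count of 0.

```typescript
interface Correction {
  address: string;
  stored: number; // value currently in the denormalized counter
  live: number;   // value from the live COUNT(*) query
}

// Compare stored counters against live counts and list every address whose
// counter needs rewriting. Applying the corrections is left to the caller.
function planUnreadCorrections(
  stored: Map<string, number>,
  live: Map<string, number>,
): Correction[] {
  const corrections: Correction[] = [];
  stored.forEach((storedCount, address) => {
    const liveCount = live.get(address) ?? 0;
    if (storedCount !== liveCount) {
      corrections.push({ address, stored: storedCount, live: liveCount });
    }
  });
  return corrections;
}

// Placeholder data: "addr-a" has a phantom unread, "addr-b" is consistent.
const storedCounts = new Map<string, number>([["addr-a", 1], ["addr-b", 3]]);
const liveCounts = new Map<string, number>([["addr-b", 3]]);

console.log(planUnreadCorrections(storedCounts, liveCounts));
// [ { address: "addr-a", stored: 1, live: 0 } ]
```

The sketch only walks addresses that already have a stats row; an address with live unread rows but no stats row would need a second pass over the live map.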

What I'd take a stab at

Option 1 is a ~5-line change to listInboxMessagesFromD1 (or the route handler that calls both) — wire a countMatchingFiltered parameter that returns the filtered-query count alongside the rows, and the route uses that for totalCount when status !== "all". Happy to file the PR if option 1 is the preferred direction.
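Roughly what I have in mind for option 1, sketched against in-memory rows rather than D1 so it stays self-contained. Everything here is hypothetical naming (`listWithCount`, `resolveTotalCount`, `Msg`); the real change would live in listInboxMessagesFromD1 or the route handler, whose actual signatures I have not reproduced.

```typescript
type StatusFilter = "all" | "read" | "unread";

interface Msg {
  id: string;
  readAt: string | null;
}

interface ListResult {
  rows: Msg[];
  filteredCount: number; // count of ALL rows matching the filter, not just this page
}

function matchesStatus(m: Msg, status: StatusFilter): boolean {
  if (status === "all") return true;
  return status === "unread" ? m.readAt === null : m.readAt !== null;
}

// Stand-in for the list query: returns one page plus the filter-scoped count.
function listWithCount(all: Msg[], status: StatusFilter, limit: number, offset: number): ListResult {
  const matching = all.filter((m) => matchesStatus(m, status));
  return { rows: matching.slice(offset, offset + limit), filteredCount: matching.length };
}

// Route-side choice: keep the stats-backed total for "all", use the
// filtered-query count otherwise (status !== "all").
function resolveTotalCount(status: StatusFilter, statsTotal: number, filteredCount: number): number {
  return status === "all" ? statsTotal : filteredCount;
}

// Two read rows and a stale counter claiming 1 unread.
const rows: Msg[] = [
  { id: "m1", readAt: "2026-05-23T10:00:00Z" },
  { id: "m2", readAt: "2026-05-23T11:00:00Z" },
];
const staleUnreadCount = 1;

const page = listWithCount(rows, "unread", 1, 0);
console.log(resolveTotalCount("unread", staleUnreadCount, page.filteredCount), page.rows.length); // 0 0
```

With this shape, totalCount and messages can no longer disagree for a filtered request, even while the denormalized counter remains stale.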

Related

Secondary observation (separate from the main bug)

The list response dehydrates isRead, createdAt, from, and the message body to null in the rows it does return (even with ?status=all). I assume this is intentional — full bodies require the /api/inbox/{address}/{messageId} single-fetch endpoint — but it's not documented in the response, and clients can't filter client-side because isRead is unavailable. If that's intentional, a sentence in the route's JSDoc / a view query param signaling "list view" vs "full view" would close the DX gap. Filing separately if it'd be useful.
