ponymail-mcp: add limit/offset pagination to search_list #12

Open
andreahlert wants to merge 1 commit into apache:main from andreahlert:ponymail-mcp/add-pagination


@andreahlert

Problem

search_list currently formats the first 30 emails of the backend response and prints "... and N more emails" for the rest. Because there is no limit or offset parameter, the remaining N entries are unreachable: re-querying the backend just returns the same first 30. For LLM clients doing list-wide analysis (e.g. enumerating every thread on a list, walking a month of archive), this hides most of the data behind a hard wall.

Concrete example I hit while writing this PR: dev-magpie@airflow.apache.org has 93 emails / 46 threads. Calling search_list returns 30; the other 63 emails are invisible. Working around it requires chaining many narrow from: / subject: queries hoping to cover the full set, with no guarantee of completeness.

Solution

Two optional parameters on search_list:

  • limit (int, 1..200, default 30) — caps the rendered window
  • offset (int, >= 0, default 0) — skips the first N entries

Defaults preserve current behaviour. The backend call is unchanged; both parameters only affect which slice of the already-fetched result set is formatted into the response. No new API hits, no new auth surface, no new env vars.
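The slicing described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `sliceWindow`, `clampInt`, `DEFAULT_LIMIT`, and `MAX_LIMIT` are hypothetical names, and clamping out-of-range values (rather than rejecting them) is an assumption.

```javascript
// Hypothetical constants mirroring the bounds stated in the description.
const DEFAULT_LIMIT = 30;
const MAX_LIMIT = 200;

// Clamp an integer into [min, max]; fall back when the value is not an integer.
function clampInt(value, min, max, fallback) {
  if (!Number.isInteger(value)) return fallback;
  return Math.min(Math.max(value, min), max);
}

// Return the window of `emails` selected by limit/offset.
// Assumption: invalid or out-of-range values are clamped, not rejected.
function sliceWindow(emails, { limit = DEFAULT_LIMIT, offset = 0 } = {}) {
  const safeLimit = clampInt(limit, 1, MAX_LIMIT, DEFAULT_LIMIT);
  const safeOffset = clampInt(offset, 0, Number.MAX_SAFE_INTEGER, 0);
  return emails.slice(safeOffset, safeOffset + safeLimit);
}

// Stand-in for the already-fetched backend result (93 entries, like the example list).
const emails = Array.from({ length: 93 }, (_, i) => ({ id: i + 1 }));

console.log(sliceWindow(emails).length);                            // 30
console.log(sliceWindow(emails, { limit: 100 }).length);            // 93
console.log(sliceWindow(emails, { limit: 30, offset: 30 })[0].id);  // 31
console.log(sliceWindow(emails, { offset: 999 }).length);           // 0
```

Because the window is cut from the result set that was already fetched, asking for a different slice costs no extra backend request in this sketch.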

The trailing summary is upgraded:

  • Before: "... and N more emails" (no way to reach them)
  • After: "Showing X-Y of Z." plus "... K more emails. Re-query with offset=N to continue." (deterministic pagination)
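For illustration, the upgraded footer could be built along these lines. `formatFooter` is a hypothetical helper, and the exact wording of the past-the-end case is an assumption; the description only states that such a footer exists.

```javascript
// Hypothetical helper: build the trailing summary line for one page.
//   total  = size of the full result set
//   offset = number of entries skipped
//   shown  = number of entries actually rendered on this page
function formatFooter(total, offset, shown) {
  if (shown === 0) {
    // Assumed wording. Note that an empty result set also lands here in this sketch.
    return `Showing 0 of ${total}. offset is past the end.`;
  }
  const first = offset + 1;
  const last = offset + shown;
  const remaining = total - last;
  let footer = `Showing ${first}-${last} of ${total}.`;
  if (remaining > 0) {
    // The next page starts right after the last entry shown.
    footer += ` ... ${remaining} more emails. Re-query with offset=${last} to continue.`;
  }
  return footer;
}

console.log(formatFooter(93, 0, 30));
// Showing 1-30 of 93. ... 63 more emails. Re-query with offset=30 to continue.
console.log(formatFooter(93, 30, 30));
// Showing 31-60 of 93. ... 33 more emails. Re-query with offset=60 to continue.
console.log(formatFooter(93, 90, 3));
// Showing 91-93 of 93.
```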

Why now

This is the first thing a long-running LLM session needs that the MCP doesn't have. The original 30-cap is the right default for browse-style use, but offers no escape hatch for analysis-style use. The 200 ceiling on limit keeps the worst-case response size sane (~40KB of summaries) while removing the "you can only ever see 30" wall.

Backwards compatibility

  • No schema field renamed or removed.
  • Calling without limit/offset produces the same first 30 as before; only the trailing line text changes ("Showing 1-30 of 93." vs "... and 63 more emails").
  • No change to get_email, get_thread, get_mbox, or any auth/lifecycle tool.

Test plan

  • npm test (43 existing tests pass)
  • node --check index.js (syntax clean)
  • Manual smoke test against lists.apache.org:
    • search_list(list=dev-magpie, domain=airflow.apache.org) — same first 30 as before, new footer line
    • search_list(... limit=100) — renders up to 93 (the full hit set)
    • search_list(... limit=30, offset=30) — renders emails 31-60, footer hints offset=60
    • search_list(... offset=999) — renders 0 emails, "offset is past the end" footer
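The smoke test above amounts to walking the list page by page. A caller-side loop might look like the sketch below. `fakeSearchList` is a local stand-in that returns structured data for a fixed 93-entry set; the real tool is invoked through an MCP client and returns formatted text, so the return shape here is an assumption made purely for illustration.

```javascript
// Stand-in for the tool: returns the requested window of a fixed 93-entry set.
const TOTAL = 93;
function fakeSearchList({ limit = 30, offset = 0 } = {}) {
  const ids = [];
  for (let i = offset; i < Math.min(offset + limit, TOTAL); i++) {
    ids.push(i + 1);
  }
  return { ids, total: TOTAL };
}

// Walk every page, advancing offset by the number of entries received.
function collectAll(pageSize) {
  const all = [];
  let offset = 0;
  while (true) {
    const { ids, total } = fakeSearchList({ limit: pageSize, offset });
    all.push(...ids);
    offset += ids.length;
    // Stop on an empty page or once the whole set has been covered.
    if (ids.length === 0 || offset >= total) break;
  }
  return all;
}

const collected = collectAll(30);
console.log(collected.length);          // 93
console.log(new Set(collected).size);   // 93 (no duplicates across pages)
```

With a page size of 30, the loop requests offsets 0, 30, 60, and 90, and the last page carries the remaining 3 entries.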

Notes for reviewers

  • I considered cursor-based pagination but the backend /api/stats.lua returns the full set in one response, so an opaque cursor would just be a re-encoded numeric offset. Plain offset is simpler and equivalent.
  • The same cap pattern exists on parts.slice(0, 15) (participants) and truncate(text, 8000/4000/10000) (bodies / mbox). Not touched here — scope of this PR is just search_list. Happy to follow up with a second PR if the design lands well.
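To make the cursor point concrete: when the full result set arrives in one response, an opaque cursor would carry no more information than the offset itself. The sketch below uses a hypothetical base64 scheme (not something this PR or the backend implements) to show the round trip.

```javascript
// Hypothetical cursor scheme: the cursor is just the offset, base64-encoded.
function encodeCursor(offset) {
  return Buffer.from(String(offset), "utf8").toString("base64");
}

// Decoding recovers exactly the numeric offset that was encoded.
function decodeCursor(cursor) {
  return Number.parseInt(Buffer.from(cursor, "base64").toString("utf8"), 10);
}

const cursor = encodeCursor(60);
console.log(decodeCursor(cursor)); // 60
```

The cursor adds an encode/decode step but no new capability, which is why a plain numeric offset is the simpler choice here.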

The search_list tool previously rendered the first 30 email summaries
unconditionally, then printed "... and N more emails" with no way to
reach those N entries without re-querying the backend (and getting the
same first 30 back). For LLM clients doing list-wide analysis, the
remainder was effectively invisible.

This change adds two optional parameters:

  - limit  (int, 1..200, default 30) — caps the rendered window
  - offset (int, >= 0,  default 0)   — skips the first N entries

The backend call is unchanged; both parameters only affect which slice
of the already-fetched result set is formatted into the response. The
default behaviour (no params) is identical to before.

The trailing summary is upgraded from a static "... and N more" to an
explicit "Showing X-Y of Z" plus the exact next-page offset when more
results remain, so the caller can page deterministically.

Signed-off-by: André Ahlert <andre@aex.partners>
@andreahlert
Author

@rbowen @potiuk context on why I opened this

I'm using ponymail-mcp from Claude inside the airflow-steward / Magpie skills (mailing list triage, governance analysis, finding historical threads). The skills want to enumerate everything on a list, not just browse the top of it.

ran a test asking Claude (Opus 4.7) to list every thread on dev-magpie@airflow.apache.org. 93 emails, 46 threads. it got back 30 and then tried to work around the cap with narrow from/subject filters. missed 15+ threads that didn't match the guesses, and worse, had no way to know the result was incomplete

the 30 default is fine for a human skimming. for an agent walking a list it's a wall with no escape hatch. patch keeps the default at 30 so nothing changes for existing callers, just lets you ask for more when you actually need it

happy to follow up with the same treatment for participants / get_mbox if this lands, kept the scope tight on purpose

@rbowen
Contributor

rbowen commented May 28, 2026

Yes, the explanation is clear and I expect I wasn't even aware that I was missing results. I'll test and merge asap. Thanks!

@potiuk
Member

potiuk commented May 28, 2026

Nice catch indeed!
