Skip to content

feat: migrate service logs to cursor-based pagination#143

Merged
nathanjcochran merged 6 commits intomainfrom
adrian/logs-cursor-pagination
Apr 9, 2026
Merged

feat: migrate service logs to cursor-based pagination#143
nathanjcochran merged 6 commits intomainfrom
adrian/logs-cursor-pagination

Conversation

@AdrianLC
Copy link
Copy Markdown
Contributor

@AdrianLC AdrianLC commented Apr 6, 2026

Summary

  • Rewrites FetchServiceLogs to use cursor-based pagination — lastCursor from each response is passed back as cursor on the next request, removing the Elasticsearch 10,000-hit window limit that silently capped offset pagination
  • FetchServiceLogs now returns []api.ServiceLogEntry (structured) instead of []string; each entry carries timestamp, message, and severity from Elasticsearch
  • Adds --since flag (lower-bound timestamp) to tiger service logs alongside the existing --until
  • Adds Since field to ServiceLogsInput in the MCP service_logs tool; MCP output returns structured entries with timestamp, message, and severity (no redundant flat logs array)
  • openapi.yaml updated: adds since, cursor params; ServiceLogEntry schema; entries field on ServiceLogs; page param marked deprecated; regenerated types.go and client.go

Merge order

Depends on savannah-public adding entries and lastCursor to the response (timescale/savannah-public#88), which depends on timescale/timescale-graphql-client#18. Merge those first.

Replace the page-index loop in FetchServiceLogs with cursor pagination
using the lastCursor field returned by the API. This removes the
Elasticsearch 10,000-hit window limit that capped offset pagination.

- Add cursor param and lastCursor field to openapi.yaml
- Regenerate internal/tiger/api/types.go and client.go
- Rewrite FetchServiceLogs to pass lastCursor as cursor on each
  subsequent request, stopping when lastCursor is nil or tail is reached
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Apr 6, 2026

CLA assistant check
All committers have signed the CLA.

AdrianLC added 3 commits April 6, 2026 15:48
Add lower-bound time filtering so callers can request logs within a
specific window rather than only capping at an upper bound.

- openapi.yaml: add startTime query param to logs endpoint
- Regenerate internal/tiger/api/types.go and client.go
- FetchServiceLogs: add from *time.Time param, forwarded as StartTime
- CLI: add --from flag (RFC3339) alongside existing --until
- MCP: add From field to ServiceLogsInput and forward to FetchServiceLogs
- openapi.yaml: rename startTime query param to from
- Regenerate internal/tiger/api/types.go and client.go
- FetchServiceLogs: set params.From instead of params.StartTime
FetchServiceLogs now returns []api.ServiceLogEntry instead of []string,
carrying timestamp, message, and severity from the new entries field in
savannah-public. A toEntries() helper maintains backwards compatibility
with servers that return only the legacy logs []string field. The MCP
tool exposes both a flat logs []string (for backwards compat) and a
structured entries field with full metadata.
@AdrianLC AdrianLC self-assigned this Apr 6, 2026
@AdrianLC AdrianLC marked this pull request as ready for review April 6, 2026 14:37
@AdrianLC AdrianLC requested a review from nathanjcochran April 6, 2026 14:37
The server now strips the PostgreSQL timestamp from Message and provides
it as a separate Timestamp field. Prepend it when available so that CLI
output retains the same appearance as before.
Copy link
Copy Markdown
Contributor

@Askir Askir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll approve already, but I think there is a few things that we should fix before doing the release. Mainly some MCP cleanup,

I didn't add a comment for this but I think there are some mocks for the logs api responses that have the old format still, I think we want to update those to the new one.

I also think you can remove all the fallback logic here, and just cleanly switch to the new format.

Comment on lines 598 to 603
- name: page
in: query
required: false
schema:
type: integer
description: Page number for pagination (0-based)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe remove this one here already? That should help with avoiding anyone using it in the future since this repo is public, it also means some of the old generated code is deleted.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree in theory, but I feel like it's easier to keep the files in sync. If we remove page here, but not upstream, then people can't just copy the files to sync them in the future (and Claude will constantly be trying to put the page parameter back). I mentioned elsewhere that I think it's possible to mark a parameter/field as deprecated with deprecated: true, so that might be the best middle-ground here.

Comment on lines +1188 to +1201
logs := make([]string, len(entries))
structured := make([]serviceLogEntryOutput, len(entries))
for i, e := range entries {
logs[i] = e.Message
structured[i] = serviceLogEntryOutput{
Timestamp: e.Timestamp,
Message: e.Message,
Severity: e.Severity,
}
}

output := ServiceLogsOutput{
Logs: logs,
Logs: logs,
Entries: structured,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a bit duplicate to return both the logs string arary as well as the structured entries. I think it is okay to just just remove the logs here, otherwise this bloats the context window a lot if the LLM calls this function even once.

Also we do the type mapping to the MCP type usually in utils.go I think, rather than inline.

Copy link
Copy Markdown
Member

@nathanjcochran nathanjcochran Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we don't want duplicate content in the MCP response. But on that note, do we need to return the structured entries at all? Doesn't the message field already contain the severity and timestamp (or am I misunderstanding)? In other words, aren't timestamp and severity already kind of duplicative with the message? I think LLMs are pretty good at understanding logs in their normal text format (they've seen lots of them in the training data), so I'm not sure it's really helpful for us to split it out into separate fields anyways. And just returning the log messages as a string array massively cuts down on the size of the response (no JSON objects with nested keys for each log message, etc.)

Also, I think it's fine for some of the conversions to happen inline. Especially if the conversion functions aren't re-used (I think most of the ones in utils.go are there because they're reused across multiple MCP tool implementations).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message won't contain the Timestamp which is a separate field now. It still contains the severity but it's likely to change if we finish the move to full structured json logs.

I'll keep only the new structure then.

Copy link
Copy Markdown
Member

@nathanjcochran nathanjcochran Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message won't contain the Timestamp which is a separate field now.

Hmm. Okay. I think keeping just the new structure is definitely better than keeping both, but I do still think this might be sub-optimal for LLMs.

I really think the "best" format for LLMs is probably closer to the native text format. Partly because that's just what LLMs have seen most often in the training data, but also because it's significantly fewer tokens, due to the fact that the content is just the log message text instead of a bunch of JSON objects, each with a timestamp, message, and severity field (which takes up a lot of tokens when duplicated for each log message, and probably doesn't really help the LLM understand the content in any meaningful way).

So imo, it might be better to just prepend the timestamp (like you're doing for the CLI output) and return an array of strings.

It still contains the severity but it's likely to change if we finish the move to full structured json logs.

Wait, wouldn't this be a backwards-compatibility issue? If you removed the severity from the message field, the way the logs are currently being formatted for the sake of the CLI output would break for existing versions of Tiger CLI, in that sense that the severity would no longer be output in each message (since we aren't prepending that field to the message before outputting it, like we do with the timestamp). We should probably figure out our plan for this now, if that's actually something you think you're going to do in the near future...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the MCP tool call output format can be changed after the fact without affecting backwards-compatibility (we don't need the MCP output to be backwards-compatible, since it's just read by an LLM), I'll go ahead and merge this now, and we can continue to think about it and potentially change it later.

Copy link
Copy Markdown
Contributor Author

@AdrianLC AdrianLC Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we are not planning to strip the severity out of messages. If we did we'd prepend it here first for compatibility, like we did with the Timestamp. But it's not planned yet

I can easily open another PR to use the single phrase output for MCP to save tokens. Let's do that along with the colorizing change in #144 before creating a new release

if !entry.Timestamp.IsZero() {
line = entry.Timestamp.UTC().Format("2006-01-02 15:04:05 UTC") + " " + line
}
colorizedLog := colorizeLogLine(line, shouldColorize)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this to use the passed down severity field instead of doing string comparison? That should get it consistent with the frontend too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes but IMO let's do a separate PR

if !entry.Timestamp.IsZero() {
line = entry.Timestamp.UTC().Format("2006-01-02 15:04:05 UTC") + " " + line
}
colorizedLog := colorizeLogLine(line, shouldColorize)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes but IMO let's do a separate PR

Comment on lines +1188 to +1201
logs := make([]string, len(entries))
structured := make([]serviceLogEntryOutput, len(entries))
for i, e := range entries {
logs[i] = e.Message
structured[i] = serviceLogEntryOutput{
Timestamp: e.Timestamp,
Message: e.Message,
Severity: e.Severity,
}
}

output := ServiceLogsOutput{
Logs: logs,
Logs: logs,
Entries: structured,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message won't contain the Timestamp which is a separate field now. It still contains the severity but it's likely to change if we finish the move to full structured json logs.

I'll keep only the new structure then.

- rename from to since
- remove dupllicate logs array
- update help descriptions and examples
- update test stub
@nathanjcochran nathanjcochran merged commit 61ddc7c into main Apr 9, 2026
1 of 3 checks passed
@nathanjcochran nathanjcochran deleted the adrian/logs-cursor-pagination branch April 9, 2026 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants