feat: fetch partition routing and on-demand metadata refresh by klaudworks · Pull Request #126 · KafScale/platform

klaudworks · 2026-03-01T12:56:33Z

Merge #125 first (group coordination routing). This PR is stacked on top.
Closes #115

Summary

Fetch requests are now routed to the broker that owns the requested partitions, similar to how produce requests already work.
The proxy no longer polls for metadata every 3 seconds. Instead, it fetches metadata only when it encounters a broker or topic it doesn't recognize. Multiple simultaneous cache misses share a single metadata request.
The readiness probe no longer makes a network call on every check. It returns healthy immediately if recent metadata is cached, and only reaches out to the metadata store if the cache has gone stale.
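
The readiness change can be pictured with a small sketch. This is not the PR's code: the `readiness` type, its fields, and the 30-second TTL are hypothetical and only illustrate the "answer from cache while fresh, fetch when stale" idea.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// readiness is a hypothetical type for illustration; it is not the PR's code.
type readiness struct {
	mu          sync.Mutex
	ttl         time.Duration // how long cached metadata counts as fresh (assumed value)
	lastRefresh time.Time     // zero until the first successful fetch
	fetch       func() error  // live metadata fetch, the slow path
}

// Ready returns nil without any network call while the cached metadata is
// younger than ttl. Once the cache is stale (or was never filled) it calls
// fetch and records the refresh time only if fetch succeeds.
func (r *readiness) Ready() error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if !r.lastRefresh.IsZero() && time.Since(r.lastRefresh) < r.ttl {
		return nil // fast path: cache is fresh
	}
	if err := r.fetch(); err != nil {
		return err
	}
	r.lastRefresh = time.Now()
	return nil
}

func main() {
	fetches := 0
	r := &readiness{ttl: 30 * time.Second, fetch: func() error { fetches++; return nil }}
	for i := 0; i < 3; i++ {
		if err := r.Ready(); err != nil {
			fmt.Println("not ready:", err)
		}
	}
	fmt.Println("metadata fetches:", fetches) // prints "metadata fetches: 1"
}
```

Holding the mutex across `fetch` serializes probes that arrive while the cache is stale, which keeps the sketch short; a real implementation might prefer to coalesce them instead.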

Outlook

Consumers now fetch directly from brokers that already hold the data in their segment cache. Slow S3 requests are then needed only to reprocess older data after partition reassignments or similar events.

novatechflow · 2026-03-02T09:05:09Z

@klaudworks - can you please resolve the conflicts?

Split fetch requests by owning broker, forward concurrently, and merge responses. Retry partitions rejected with NOT_LEADER_OR_FOLLOWER up to 3 times. For v12+ requests that use topic IDs instead of names, resolve IDs via a metadata-refreshed cache and use a collision-safe key to prevent unresolved topics from merging silently. Adds EncodeFetchRequest and ParseFetchResponse codecs with round-trip and kmsg validation tests.
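
A minimal sketch of the split step described above, under stated assumptions: `topicKey`, `partitionRef`, `splitByOwner`, and the `owner` lookup are hypothetical names, not the PR's API. The key carries both the topic name and the topic ID, so two topics whose names are still unresolved do not collapse into one map entry.

```go
package main

import "fmt"

// topicKey identifies a topic by name or, for requests that use topic IDs,
// by ID. Keeping both fields in the key means two unresolved topics (empty
// name, different IDs) never share a key. Hypothetical type.
type topicKey struct {
	Name string
	ID   [16]byte
}

// partitionRef names one partition of one topic. Hypothetical type.
type partitionRef struct {
	Topic     topicKey
	Partition int32
}

// splitByOwner groups partitions by the broker that owns them. The owner
// callback stands in for a metadata cache lookup; ok=false means the owner
// is unknown, and those partitions are returned separately so the caller
// can refresh metadata or report an error for them.
func splitByOwner(parts []partitionRef, owner func(partitionRef) (int32, bool)) (map[int32][]partitionRef, []partitionRef) {
	byBroker := make(map[int32][]partitionRef)
	var unknown []partitionRef
	for _, p := range parts {
		id, ok := owner(p)
		if !ok {
			unknown = append(unknown, p)
			continue
		}
		byBroker[id] = append(byBroker[id], p)
	}
	return byBroker, unknown
}

func main() {
	orders := topicKey{Name: "orders"}
	byID := topicKey{ID: [16]byte{1}} // name not resolved yet
	owners := map[partitionRef]int32{
		{orders, 0}: 1,
		{orders, 1}: 2,
		{byID, 0}:   1,
	}
	lookup := func(p partitionRef) (int32, bool) { id, ok := owners[p]; return id, ok }
	parts := []partitionRef{{orders, 0}, {orders, 1}, {byID, 0}, {topicKey{ID: [16]byte{2}}, 0}}

	byBroker, unknown := splitByOwner(parts, lookup)
	fmt.Println(len(byBroker[1]), len(byBroker[2]), len(unknown)) // prints "2 1 1"
}
```

Each per-broker group would then be encoded as its own fetch request and forwarded concurrently; the forwarding, retry, and response merging are left out of this sketch.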

Remove the 3-second polling ticker and refresh broker/topic caches on demand when a lookup misses. Concurrent misses are coalesced via singleflight to avoid thundering herd metadata fetches. Readiness probe now checks cached state first (fast path), falling back to a live metadata fetch only when the cache TTL expires. Static backends are always ready. Clean up comments per AGENTS.md: remove low-value comments, condense verbose doc comments.
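
The coalescing of concurrent cache misses can be sketched as follows. The commit message names singleflight; to stay within the standard library, this example swaps in a hand-rolled stand-in with the same idea. The `coalescer` type and its `Waiters` method are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// call tracks one in-flight refresh and the callers waiting on it.
type call struct {
	done    chan struct{} // closed when the refresh has finished
	err     error         // result shared with every waiter
	waiters int           // callers that joined this refresh
}

// coalescer is a hypothetical stand-in for a singleflight group with a
// single key: at most one refresh runs at a time.
type coalescer struct {
	mu       sync.Mutex
	inflight *call
}

// Do runs refresh unless one is already in flight, in which case the caller
// waits for that refresh and shares its result. A panic inside refresh is
// not handled in this sketch.
func (c *coalescer) Do(refresh func() error) error {
	c.mu.Lock()
	if cl := c.inflight; cl != nil {
		cl.waiters++
		c.mu.Unlock()
		<-cl.done
		return cl.err
	}
	cl := &call{done: make(chan struct{})}
	c.inflight = cl
	c.mu.Unlock()

	cl.err = refresh()

	c.mu.Lock()
	c.inflight = nil
	c.mu.Unlock()
	close(cl.done)
	return cl.err
}

// Waiters reports how many callers are waiting on the in-flight refresh.
// It is used below only to make the demo deterministic.
func (c *coalescer) Waiters() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.inflight == nil {
		return 0
	}
	return c.inflight.waiters
}

func main() {
	c := &coalescer{}
	refreshes := 0
	refresh := func() error {
		refreshes++
		for c.Waiters() < 4 { // simulate a slow fetch: hold until 4 others have joined
		}
		return nil
	}
	results := make(chan error, 5)
	for i := 0; i < 5; i++ {
		go func() { results <- c.Do(refresh) }()
	}
	for i := 0; i < 5; i++ {
		<-results
	}
	fmt.Println("refreshes:", refreshes) // prints "refreshes: 1"
}
```

Five concurrent misses result in one metadata refresh, which is the thundering-herd protection the commit message describes.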

klaudworks · 2026-03-02T12:28:25Z

@novatechflow Unfortunately, GitHub has no support for stacked PRs, so I have to rebase each PR after the prior one is merged. Once this is approved and merged, I'll rebase the next one.

klaudworks changed the title ~~feat: partition-aware fetch routing and on-demand metadata caching~~ feat: fetch partition routing and on-demand metadata refresh Mar 1, 2026

klaudworks requested a review from novatechflow March 1, 2026 13:02

klaudworks mentioned this pull request Mar 1, 2026

fix: move produce tagged fields inside partition loop #127

Open

novatechflow previously approved these changes Mar 2, 2026

klaudworks dismissed novatechflow’s stale review via 4acb9c5 March 2, 2026 11:36

klaudworks added 2 commits March 2, 2026 12:37

klaudworks force-pushed the feat/fetch-partition-routing branch from 4acb9c5 to fa6938f Compare March 2, 2026 11:37

klaudworks requested a review from novatechflow March 2, 2026 12:28

novatechflow approved these changes Mar 2, 2026

novatechflow merged commit 5162304 into KafScale:main Mar 2, 2026
9 checks passed

novatechflow merged 2 commits into KafScale:main from klaudworks:feat/fetch-partition-routing
