Skip to content

add legacy tenant attribution in router#274

Open
yuchen-db wants to merge 1 commit intodb_mainfrom
yuchen-db/legacy-tenant-attribution
Open

add legacy tenant attribution in router#274
yuchen-db wants to merge 1 commit intodb_mainfrom
yuchen-db/legacy-tenant-attribution

Conversation

@yuchen-db
Copy link
Collaborator

@yuchen-db yuchen-db commented Jan 16, 2026

(generated by Claude)
New Files Created

  1. pkg/receive/tenant_attribution.go (~400 lines)
    - Ported M3's filter package (glob-style pattern matching)
    - TenantAttributor struct with config loading and tenant matching
    - Supports patterns: , prefix, *suffix, contains, ?, [abc], [a-z], {a,b,c}, !negation
    - Verification mode metrics (attributionMatches, attributionMismatches)
  2. pkg/receive/tenant_attribution_test.go (~460 lines)
    - Unit tests for filter parsing, pattern matching, TenantAttributor, verification mode

Modified Files

  1. cmd/thanos/receive.go
    - Added --receive.tenant-rules flag (path to YAML config)
    - Added --receive.verify-tenant-attribution flag (verification mode)
    - TenantAttributor initialization before handler creation
  2. pkg/receive/handler.go
    - Added TenantAttributor to Options struct
    - Integrated tenant attribution in distributeTimeseriesToReplicas
    - In verification mode: computes attribution for every series, records match/mismatch metrics, but keeps using HTTP tenant for routing
    - In normal mode: only attributes when no HTTP header is present

Behavior Matrix

tenant-rules verify-attribution HTTP header exists Behavior
Not set N/A Any Current behavior (use HTTP header)
Set false Yes Use HTTP header tenant (skip attribution)
Set false No Do attribution from labels
Set true Any Attribution for metrics only, router uses HTTP tenant
  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification

@yuchen-db yuchen-db force-pushed the yuchen-db/legacy-tenant-attribution branch from e1c1011 to 81abd84 Compare January 16, 2026 02:08
Copy link
Collaborator

@jnyi jnyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want to do an integration test in a dev region

type tenantRequestStats map[string]requestStats

func (h *Handler) handleRequest(ctx context.Context, rep uint64, tenantHTTP string, wreq *prompb.WriteRequest) (tenantRequestStats, error) {
func (h *Handler) handleRequest(ctx context.Context, rep uint64, tenantHTTP string, httpHeaderPresent bool, wreq *prompb.WriteRequest) (tenantRequestStats, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why introduce new variable, can tenantHTTP string tell if a header is present? I think if it isn't, it will be empty?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is obtained in tenantHTTP, err := tenancy.GetTenantFromHTTP(r, h.options.TenantHeader, h.options.DefaultTenantID, h.options.TenantField)

a few places might be broken for example,

rate limit logic: if !requestLimiter.AllowSizeBytes(tenantHTTP, int64(len(reqBuf))) {
e2e latency calculation: h.writeE2eLatency.WithLabelValues(strconv.Itoa(responseStatusCode), tenantHTTP, strconv.FormatBool(isPreAgged)).Observe(lat)

with new tenant attribution, we might need to fix how those work too, (especially for rate limit which is request scoped, it might be not possible to reject 1 remote write directly anymore)

}

// Tenant attribution logic
if h.options.TenantAttributor != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also note this is exclusive to h.splitTenantLabelName != ""

@@ -914,6 +920,7 @@ func (h *Handler) fanoutForward(ctx context.Context, params remoteWriteParams) (
// series that should be written to remote nodes.
func (h *Handler) distributeTimeseriesToReplicas(
tenantHTTP string,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder if we can use tenantHTTP to tell the diff, we should set it to empty string if HTTP headers didn't set it and set to default TenantId within this function

Copy link
Collaborator

@jnyi jnyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would suggest to update GetTenantFromHTTPfunction, default would be unset if we enable these flags:

--receive.split-tenant-label-name
or
--receive.tenant-rules + --receive.no-verify-tenant-attribution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants