Describe the bug
The default values specified under overrides.defaults are not applied to tenant configurations when per_tenant_override_config is used. When a tenant is defined in the override file but does not specify certain limits, those limits appear to default to 0 instead of inheriting the values from overrides.defaults in the main configuration.
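For reference, the per-tenant override file is not reproduced in full here; based on the ?mode=diff output further down, it presumably looks roughly like this minimal sketch (the tenant only overrides block_retention, so the ingestion limits should fall back to overrides.defaults):

# Assumed content of /etc/runtime-config/tempo-0.yaml (inferred from the
# ?mode=diff output below, not copied verbatim from the file).
overrides:
  dev:
    compaction:
      block_retention: 1w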
To Reproduce
Steps to reproduce the behavior:
- Start Tempo (2.7.1)
- Write traces for a tenant whose entry in the per-tenant override file does not set a rate limit, while a default rate limit is set in overrides.defaults
Expected behavior
When a tenant is defined in the override file but doesn't specify certain limits (like ingestion.rate_limit_bytes), it should inherit the default values from the main configuration (as is the case for Mimir and Loki). In this case, tenant "b4af8459-3937-462b-9886-fd749be4f6ad" should have (see the following config and the sketch after this list):
- ingestion.rate_limit_bytes = 15000000
- the other default values as specified in the main configuration
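Concretely, using the "dev" tenant from the logs below as an example, the effective per-tenant configuration should presumably look like this (a hand-written sketch of overrides.defaults merged with the tenant's own override, not actual Tempo output):

# Expected effective per-tenant limits (sketch): anything not set in the
# per-tenant override file inherits from overrides.defaults.
overrides:
  dev:
    ingestion:
      rate_limit_bytes: 15000000              # inherited from defaults
      burst_size_bytes: 20000000              # inherited from defaults
      max_traces_per_user: 10000              # inherited from defaults
    read:
      max_bytes_per_tag_values_query: 1000000 # inherited from defaults
    compaction:
      block_retention: 1w                     # set explicitly in the override file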
Environment:
- Infrastructure: Kubernetes
- Deployment tool: helm
Additional Context
Here is my config:
tempo.yaml: |2
  cache:
    caches:
    - memcached:
        consistent_hash: true
        host: 'tempo-0-memcached'
        service: memcached-client
        timeout: 500ms
      roles:
      - parquet-footer
      - bloom
      - frontend-search
  compactor:
    compaction:
      block_retention: 768h
      compacted_block_retention: 1h
      compaction_cycle: 30s
      compaction_window: 1h
      max_block_bytes: 107374182400
      max_compaction_objects: 6000000
      max_time_per_tenant: 5m
      retention_concurrency: 10
      v2_in_buffer_bytes: 5242880
      v2_out_buffer_bytes: 20971520
      v2_prefetch_traces_count: 1000
    ring:
      kvstore:
        store: memberlist
  distributor:
    receivers:
      jaeger:
        protocols:
          thrift_http:
            endpoint: 0.0.0.0:14268
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
      zipkin:
        endpoint: 0.0.0.0:9411
    ring:
      kvstore:
        store: memberlist
  ingester:
    lifecycler:
      ring:
        kvstore:
          store: memberlist
        replication_factor: 3
      tokens_file_path: /var/tempo/tokens.json
  memberlist:
    abort_if_cluster_join_fails: false
    bind_addr: []
    bind_port: 7946
    cluster_label: 'tempo-0.tempo-0'
    gossip_interval: 1s
    gossip_nodes: 2
    gossip_to_dead_nodes_time: 30s
    join_members:
    - dns+tempo-0-gossip-ring:7946
    leave_timeout: 5s
    left_ingesters_timeout: 5m
    max_join_backoff: 1m
    max_join_retries: 10
    min_join_backoff: 1s
    node_name: ""
    packet_dial_timeout: 5s
    packet_write_timeout: 5s
    pull_push_interval: 30s
    randomize_node_name: true
    rejoin_interval: 0s
    retransmit_factor: 2
    stream_timeout: 10s
  multitenancy_enabled: true
  overrides:
    defaults:
      ingestion:
        burst_size_bytes: 20000000
        max_traces_per_user: 10000
        rate_limit_bytes: 15000000
      read:
        max_bytes_per_tag_values_query: 1000000
    per_tenant_override_config: /etc/runtime-config/tempo-0.yaml
  querier:
    frontend_worker:
      frontend_address: tempo-0-query-frontend-discovery:9095
    max_concurrent_queries: 20
    search:
      query_timeout: 30s
    trace_by_id:
      query_timeout: 10s
  query_frontend:
    max_outstanding_per_tenant: 2000
    max_retries: 2
    metrics:
      concurrent_jobs: 1000
      duration_slo: 0s
      interval: 5m
      max_duration: 3h
      query_backend_after: 30m
      target_bytes_per_job: 104857600
      throughput_bytes_slo: 0
    search:
      concurrent_jobs: 1000
      target_bytes_per_job: 104857600
    trace_by_id:
      query_shards: 50
  server:
    grpc_server_max_recv_msg_size: 4194304
    grpc_server_max_send_msg_size: 4194304
    http_listen_port: 3100
    http_server_read_timeout: 30s
    http_server_write_timeout: 30s
    log_format: logfmt
    log_level: info
  storage:
    trace:
      backend: s3
      blocklist_poll: 5m
      local:
        path: /var/tempo/traces
      pool:
        max_workers: 400
        queue_depth: 20000
      s3:
        access_key: <secret>
        bucket: <secret>
        endpoint: <secret>
        secret_key: <secret>
      wal:
        path: /var/tempo/wal
  usage_report:
    reporting_enabled: false
Here are the logs when I push traces to Tempo:
level=error ts=2025-04-14T16:39:08.7461777Z caller=rate_limited_logger.go:38 msg="pusher failed to consume trace data" err="rpc error: code = ResourceExhausted desc = RATE_LIMITED: ingestion rate limit (local: 0 bytes, global: 0 bytes) exceeded while adding 1611 bytes for user dev"
Here are the results of a GET /status/runtime_config?mode=default:
GET /status/runtime_config
defaults:
  ingestion:
    rate_strategy: local
    rate_limit_bytes: 15000000
    burst_size_bytes: 20000000
    max_traces_per_user: 10000
  read:
    max_bytes_per_tag_values_query: 1000000
  metrics_generator:
    generate_native_histograms: classic
    ingestion_time_range_slack: 0s
  global:
    max_bytes_per_trace: 5000000
overrides:
  dev:
    compaction:
      block_retention: 1w
And here are the results with GET /status/runtime_config?mode=diff:
GET /status/runtime_config
overrides:
  dev:
    compaction:
      block_retention: 1w
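For now, explicitly repeating the defaults inside the tenant's entry in the per-tenant override file presumably avoids the 0 limits; a sketch of that workaround (not what I would expect to need):

# Hypothetical workaround: duplicate the default ingestion limits into the
# tenant's override so they are no longer treated as 0.
overrides:
  dev:
    compaction:
      block_retention: 1w
    ingestion:
      rate_limit_bytes: 15000000
      burst_size_bytes: 20000000
      max_traces_per_user: 10000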
Thank you all for your time and the great product you provide!