Only the second replica uses max CPU #12186

yongho-chang · 2026-02-21T01:37:36Z

yongho-chang
Feb 21, 2026

Describe your question

I am putting a load of about 1000 rps on Langfuse,
but only the 2nd ClickHouse replica runs close to its maximum CPU.
Resources are as follows:

cpu: "6"
memory: "12Gi"
replica: 3

Langfuse Cloud or Self-Hosted?

Self-Hosted

If Self-Hosted

3.148.0

If Langfuse Cloud

No response

SDK and integration versions

No response

Pre-Submission Checklist

I have checked for existing issues/discussions and consulted Langfuse AI.

yongho-chang · 2026-02-21T01:38:09Z

inkeep-ai-assistant[bot]
Bot Feb 21, 2026

Based on your ClickHouse deployment with 3 replicas where only the second replica is experiencing high CPU usage under 1000 rps load, this appears to be a load distribution issue within your ClickHouse cluster.

For Kubernetes deployments using the Bitnami ClickHouse Helm Chart, Langfuse recommends specific replica configurations(1). The documentation states that "ClickHouse counts all instances towards the number of replicas, i.e. a replica count of 1 means no redundancy at all"(1), and recommends "a minimum of 3 replicas for production setups"(1).

However, the uneven load distribution you're experiencing suggests the queries may not be properly balanced across replicas. This could be related to how ClickHouse handles distributed queries in your cluster configuration.

To address scaling issues at high throughput, consider these options:

Worker Container Scaling: The documentation recommends scaling worker containers by CPU load, with "a load above 50% for a 2 CPU container" indicating saturation(2). You can also monitor the langfuse.queue.ingestion.length metric for scaling decisions(2).

Reduce ClickHouse Read Load: If your projects weren't migrated from a previous Langfuse version, you can set LANGFUSE_SKIP_INGESTION_CLICKHOUSE_READ_MIN_PROJECT_CREATE_DATE to a date before your first project was created (e.g., 2025-01-01)(2). This reduces ClickHouse load during ingestion processing by relying on S3/blob storage for event history(2).

Separate Ingestion Traffic: For high ingestion loads, you can split your langfuse-web deployment to route traffic to /api/public/ingestion*, /api/public/media*, and /api/public/otel* to a separate deployment(2).
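
As a rough illustration of the ingestion split described above, the routing rule amounts to a prefix match on the request path. The following is a minimal sketch, not Langfuse code: the function name and the deployment labels `langfuse-web-ingestion` and `langfuse-web` are hypothetical, and a real setup would express this rule in an ingress or load balancer rather than in Python.

```python
# Path prefixes taken from the option above; all other paths stay on the main web deployment.
INGESTION_PREFIXES = (
    "/api/public/ingestion",
    "/api/public/media",
    "/api/public/otel",
)


def pick_deployment(path: str) -> str:
    """Return a (hypothetical) deployment label for a request path."""
    # str.startswith accepts a tuple and matches if any prefix matches.
    if path.startswith(INGESTION_PREFIXES):
        return "langfuse-web-ingestion"
    return "langfuse-web"


if __name__ == "__main__":
    for p in ("/api/public/ingestion", "/api/public/otel/v1/traces", "/api/public/traces"):
        print(p, "->", pick_deployment(p))
```

Keeping the prefixes in one tuple makes it easy to compare the rule against whatever the ingress configuration actually contains.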

Your current resources (6 CPU, 12Gi memory per replica) appear adequate, but the uneven distribution suggests investigating your ClickHouse cluster configuration and query routing rather than just increasing resources.

📚 Sources:

Have another question? Just tag @inkeep.


yongho-chang Feb 21, 2026
Author

@inkeep
How about creating a Distributed table in ClickHouse?
Is this the method recommended by Langfuse?

yongho-chang · 2026-02-21T01:49:13Z

inkeep-ai-assistant[bot]
Bot Feb 21, 2026

Langfuse does not support multi-shard ClickHouse clusters, which means distributed tables across shards are not supported(1).

The documentation explicitly states that "shards" must be set to 1 in the Bitnami Helm chart configuration(1). The note clarifies: "Shards are used for horizontally scaling ClickHouse. A single ClickHouse shard can handle multiple Terabytes of data. Today, Langfuse does not support a multi-shard cluster, i.e. this value must be set to 1"(1).

However, Langfuse does support replicated tables within a single shard. The configuration requires CLICKHOUSE_CLUSTER_ENABLED to be set to true (the default) for clustered deployments(1). When enabled, Langfuse runs ClickHouse commands ON CLUSTER(1).

For your 3-replica setup, Langfuse expects certain ClickHouse parameters to be configured for replicated tables(1):

<!--
    Substitutions for parameters of replicated tables.
    Optional. If you don't use replicated tables, you could omit that.
    See https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables
-->
<!--
<macros>
    <shard>01</shard>
    <replica>example01-01-1</replica>
</macros>
-->
<!--
<default_replica_path>/clickhouse/tables/{database}/{table}</default_replica_path>
<default_replica_name>{replica}</default_replica_name>
-->

(1)

The Bitnami Helm chart should handle these configurations automatically(1).

So to directly answer your question: No, creating distributed tables across multiple shards is not recommended by Langfuse because multi-shard clusters are not currently supported(1). The recommended approach is using a single-shard cluster with multiple replicas (which you already have with your 3-replica setup)(1).

📚 Sources:

ClickHouse



yongho-chang Feb 21, 2026
Author

@inkeep
Hi, I think there might be a small misunderstanding. When I mentioned using a Distributed table, I did not mean running a multi-shard setup.

My setup is 1 shard + multiple replicas (e.g. 3 replicas).
Each node has a ReplicatedMergeTree local table, and on top of that I add a Distributed table as a thin routing/abstraction layer.

So:

- The data still lives in a single shard.
- The Distributed table is only used for:
  - query routing / read load balancing
  - providing a single logical endpoint for the application

From my understanding, when Langfuse says it does not support multi-shard, it refers to a setup where data is horizontally partitioned across multiple shards.
That is different from my case, where the shard count is still 1 and only replicas are used for HA.

So my question is:

In a single-shard + multi-replica setup, is using a Distributed table on top of ReplicatedMergeTree also considered unsupported by Langfuse,
or does “multi-shard not supported” only refer to setups with multiple data shards?

2026-02-21T01:54:36Z

inkeep-ai-assistant[bot]
Bot Feb 21, 2026

Based on the available documentation, when Langfuse states that multi-shard clusters are not supported, this specifically refers to setups where data is horizontally partitioned across multiple shards(1). The documentation explicitly states: "Shards are used for horizontally scaling ClickHouse. A single ClickHouse shard can handle multiple Terabytes of data. Today, Langfuse does not support a multi-shard cluster, i.e. this value must be set to 1"(1).

Your proposed setup—using a Distributed table as a query routing layer on top of ReplicatedMergeTree tables within a single shard with multiple replicas—falls into a gray area that isn't explicitly documented as supported or unsupported.

However, there's relevant evidence from community discussions. One user reported a related setup (multi-shard rather than single-shard, but with the same pattern of Distributed tables in front of local tables): "I've implemented a tricky multi-shard clickhouse cluster with langfuse. First, setup clickhouse cluster with sharding and replication and overwrite default cluster with this cluster. (because langfuse DDLs are 'on cluster default') Second, initialize langfuse tables by running langfuse web. Third, rename tables like... - traces -> traces_local - observations -> observations_local Finally, create distributed tables with original name, traces and observations"(2). They noted: "This approach is easy by modifying clickhouse configuration, without any langfuse applicaiton code changes. I've simply verified viewing on UI and ingestion works properly"(2).

That said, the same user warned: "Take care when you try to upgrade langfuse"(2), indicating potential compatibility issues with future versions.

The official recommendation remains using ReplicatedReplacingMergeTree tables within a single-shard, multi-replica cluster(1). Note that replication by itself only keeps the replicas' data in sync; it does not spread queries. Each query runs on the replica that receives the connection, so how reads are balanced across replicas depends on how client connections reach the replicas.

If you proceed with the Distributed table approach, be aware it's not officially documented or tested by Langfuse, and you may encounter issues during upgrades or with certain features.
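
For reference, the rename-and-wrap steps from the community report can be sketched as a small helper that only builds the ClickHouse statements. This is an unverified illustration, not a Langfuse-supported procedure: the function name is hypothetical, the cluster and database names `default` are assumptions (the quoted user mentions that Langfuse DDLs run `on cluster default`), and nothing is executed against a server here.

```python
def distributed_wrapper_ddl(table: str, cluster: str = "default", database: str = "default") -> list[str]:
    """Build statements that move `table` to `<table>_local` and put a
    Distributed table with the original name in front of it."""
    local = f"{table}_local"
    return [
        # Step 1: keep the replicated table, but under a *_local name.
        f"RENAME TABLE {database}.{table} TO {database}.{local} ON CLUSTER {cluster}",
        # Step 2: a Distributed table with the original name that forwards to the local tables.
        f"CREATE TABLE {database}.{table} ON CLUSTER {cluster} AS {database}.{local} "
        f"ENGINE = Distributed('{cluster}', '{database}', '{local}')",
    ]


if __name__ == "__main__":
    # Table names taken from the community report quoted above.
    for name in ("traces", "observations"):
        for statement in distributed_wrapper_ddl(name):
            print(statement + ";")
```

With a single shard no sharding key is needed. Which replica serves a read through the Distributed table is governed by ClickHouse's `load_balancing` setting, and `prefer_localhost_replica` defaults to 1, so reads may still stay on the node that received the query unless that setting is changed.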



tylerlittlefield · 2026-04-20T22:01:55Z

tylerlittlefield
Apr 20, 2026

Experiencing this as well: the second replica is the only one that maxes out. Chart version is 1.2.18, and the values are:

clickhouse:
  logLevel: fatal
  extraOverrides: |
      <yandex>
        <trace_log remove="1"/>
        <text_log remove="1"/>
        <opentelemetry_span_log remove="1"/>
        <query_log remove="1"/>
        <processors_profile_log remove="1"/>
        <metric_log remove="1"/>
        <part_log remove="1"/>
        <asynchronous_metric_log remove="1"/>
        <query_views_log remove="1"/>
        <asynchronous_insert_log remove="1"/>
        <latency_log remove="1"/>
        <query_metric_log remove="1"/>
        <error_log remove="1"/>
      </yandex>
  # since patched to 256Gi
  persistence:
    size: 100Gi
  resources:
    limits:
      cpu: "6"
      memory: "24Gi"
    requests:
      cpu: "6"
      memory: "24Gi"
  podAntiAffinityPreset: hard
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app.kubernetes.io/instance: langfuse
          app.kubernetes.io/name: clickhouse
  zookeeper:
    persistence:
      size: 100Gi
    resources:
      limits:
        cpu: "2"
        memory: "3Gi"
      requests:
        cpu: "1"
        memory: "3Gi"
  replicaCount: 3
  clusterEnabled: true
  # https://github.com/bitnami/charts/blob/main/bitnami/common/templates/_resources.tpl#L15
  # In production, we should use larger preset, at least "large" per langfuse documentation: https://langfuse.com/self-hosting/infrastructure/clickhouse#example-configuration-1
  resourcesPreset: large
  auth:
    existingSecret: langfuse-additional-secrets
    existingSecretKey: CLICKHOUSE_PASSWORD

Is this because ClickHouse in Langfuse is fixed to 1 shard, so we cannot scale horizontally?

