Mutex contention in `servicegraphs` processor

In some circumstances, the `servicegraphs` processor in the metrics generator can spend a significant share of its time locking and unlocking the mutex in [UpsertEdge](https://github.com/grafana/tempo/blob/2a869259e458b13f3af87a35189cfb95899f5cdd/modules/generator/processor/servicegraphs/store/store.go#L79).

The screenshot below compares two execution traces captured from metrics generators ingesting from Kafka. Both show one of the goroutines running [readCh](https://github.com/grafana/tempo/blob/2a869259e458b13f3af87a35189cfb95899f5cdd/modules/generator/generator_kafka.go#L108). The bottom trace shows the generator keeping up with its backlog, while the top trace shows it falling behind.

<details><summary>Execution trace goroutine flamegraph</summary>

![Image](https://github.com/user-attachments/assets/aa6be2f6-1000-4bc9-9145-fcf652eecd7e)

</details>

If we need to improve throughput, reducing contention on this mutex could be an option.

Mutex contention in `servicegraphs` processor #4908
