[client] Metadata is not thread safe #9594

Closed
TylerHelmuth opened this issue Feb 16, 2024 · 3 comments · Fixed by #9595
Labels
bug Something isn't working

Comments

@TylerHelmuth
Member

TylerHelmuth commented Feb 16, 2024

Describe the bug
Related to open-telemetry/opentelemetry-collector-contrib#31289

I believe there is a bug in Metadata.Get when accessed by multiple threads.

If the Metadata struct is added to a context that is then shared between multiple exporters, and those exporters call Get at the same time, the collector panics with fatal error: concurrent map writes.

We are seeing this happen right now with the headerssetter extension. The extension tells the exporters to grab details out of the context, but all exporters in the pipeline share the same context.

Code that I suspect is not thread safe:

func (m Metadata) Get(key string) []string {
	vals := m.data[key]
	if len(vals) == 0 {
		// we didn't find the key, but perhaps it just has different cases?
		for k, v := range m.data {
			if strings.EqualFold(key, k) {
				vals = v
				// we optimize for the next lookup
				m.data[key] = v
			}
		}
		// if it's still not found, it's really not here
		if len(vals) == 0 {
			return nil
		}
	}
	ret := make([]string, len(vals))
	copy(ret, vals)
	return ret
}
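
For context, the race can be reproduced directly against the client package without a full collector pipeline. A minimal sketch, assuming client.NewMetadata(map[string][]string) is available in the installed collector version and that the program is run with the race detector (go run -race); depending on timing the failure shows up either as the runtime's concurrent-map fault or as a race report:

package main

import (
	"sync"

	"go.opentelemetry.io/collector/client"
)

func main() {
	// One Metadata value, as a receiver would put into the request context.
	md := client.NewMetadata(map[string][]string{"Private-Key": {"abc123"}})

	// Several goroutines stand in for the exporters that all call Get on
	// the same shared Metadata at the same time.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Looking the key up with different casing drives Get into the
			// EqualFold fallback, which writes back into the shared map.
			_ = md.Get("private-key")
		}()
	}
	wg.Wait()
}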

Steps to reproduce

  1. Set up a collector to receive OTLP metrics.
  2. Configure a headerssetter extension to extract a value from the context.
  3. Use the headerssetter extension as auth for multiple exporters (the more exporters, the faster you get crashes).
  4. Send data through the pipeline for several minutes.
  5. Watch for the collector to crash.

What did you expect to see?
No crashes; multiple exporters should be able to extract from the same context.

What did you see instead?
Panic with fatal error: concurrent map writes

What version did you use?
Latest

@TylerHelmuth added the bug (Something isn't working) label on Feb 16, 2024
@TylerHelmuth
Member Author

Is it expected that this feature NOT be thread-safe and that users (like the extension) should implement protection around Metadata.Get to make it thread safe?

@jpkrohling
Member

I believe it should be made thread safe. The initial assumption I had back then was that receivers would write and other components would read, meaning that only one party would write at any given time. I think the optimization that causes the write was added later, and a lock should have been added there, given that the initial assumption changed.
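
As an illustration of that idea (only a sketch of the locking approach, not necessarily what #9595 merged; the mu field is an assumption added for this sketch), the case-insensitive fallback and its write-back could be guarded like this:

package client

import (
	"strings"
	"sync"
)

type Metadata struct {
	// mu is a pointer so that every copy of Metadata (the methods use value
	// receivers) shares the same lock. Hypothetical field for this sketch.
	mu   *sync.Mutex
	data map[string][]string
}

func (m Metadata) Get(key string) []string {
	m.mu.Lock()
	defer m.mu.Unlock()

	vals := m.data[key]
	if len(vals) == 0 {
		// we didn't find the key, but perhaps it just has different cases?
		for k, v := range m.data {
			if strings.EqualFold(key, k) {
				vals = v
				// the write-back is now protected by the mutex
				m.data[key] = v
			}
		}
		// if it's still not found, it's really not here
		if len(vals) == 0 {
			return nil
		}
	}
	ret := make([]string, len(vals))
	copy(ret, vals)
	return ret
}

Any other method that mutates m.data would need to take the same lock for this to be sufficient.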

@TylerHelmuth
Member Author

As @bogdandrutu pointed out, the issue is happening specifically because of `m.data[key] = v`. If the request headers match the configured value exactly, this issue does not happen.
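
Put differently, concurrent lookups on their own would be fine if Get never wrote back into the map, since read-only access to a Go map is safe from multiple goroutines. A sketch of that alternative (again just an illustration, not necessarily the change in #9595):

func (m Metadata) Get(key string) []string {
	vals := m.data[key]
	if len(vals) == 0 {
		// case-insensitive fallback, but without caching the result back
		// into the shared map, so Get never mutates shared state
		for k, v := range m.data {
			if strings.EqualFold(key, k) {
				vals = v
				break
			}
		}
		if len(vals) == 0 {
			return nil
		}
	}
	ret := make([]string, len(vals))
	copy(ret, vals)
	return ret
}

The trade-off is that the case-insensitive scan runs on every miss instead of only the first one.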

bogdandrutu pushed a commit that referenced this issue Feb 21, 2024
**Description:** 
Fixes a bug where `Get` was not thread safe.

**Link to tracking Issue:** Closes #9594

**Testing:** Tested locally