Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

contrib: fix span start option races #2418

Merged
merged 23 commits into from Dec 12, 2023

Conversation

eliottness
Copy link
Contributor

@eliottness eliottness commented Dec 7, 2023

What does this PR do?

Fix data race from closure on the span options used when starting spans in multiple contribs.

This PR introduces a new package contrib/internal/options with a method called Copy which provides a newly allocated options slice with a copy of the provides options inside it.

Note: This PR does not fix the scenario where a customer keeps a reference to the options they pass to their contrib setup and then alters them later on, which would have the side effect of altering the options in the generated config. That could theoretically create a race condition, but it's not supported behavior. This PR just focused on race conditions caused by our library.

Motivation

This PR originally fixes a data race found on github.com/go-chi.chi.v5 contrib while running some -race tests on dd-source. It was then discovered that a lot of contrib's had the same issue. I noticed 2 mechanisms on every contrib when buildings options to start a new span: either we create a new array and append the global options into it (no issue with this mechanism), or we append options directly into the cfg.spanOpts slice. This second approach could lead to data races in case the array has enough capacity to append in-place the new element appended.

Reviewer's Checklist

  • Changed code has unit tests for its functionality at or near 100% coverage.
  • System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
  • There is a benchmark for any new code, or changes to existing code.
  • If this interacts with the agent in a new way, a system test has been added.
  • Add an appropriate team label so this PR gets put in the right place for the release notes.

For Datadog employees:

  • If this PR touches code that handles credentials of any kind, such as Datadog API keys, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.

Unsure? Have a question? Request a review!

@github-actions github-actions bot added the apm:ecosystem contrib/* related feature requests or bugs label Dec 7, 2023
@eliottness eliottness force-pushed the eliott.bouhana/fix-span-opts-race branch from 90c2665 to b99b705 Compare December 7, 2023 11:07
Signed-off-by: Eliott Bouhana <eliott.bouhana@datadoghq.com>
@eliottness eliottness force-pushed the eliott.bouhana/fix-span-opts-race branch from b99b705 to 8f24d7b Compare December 7, 2023 11:08
@pr-commenter
Copy link

pr-commenter bot commented Dec 7, 2023

Benchmarks

Benchmark execution time: 2023-12-12 18:29:22

Comparing candidate commit 3c089f2 in PR branch eliott.bouhana/fix-span-opts-race with baseline commit aada6eb in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 40 metrics, 1 unstable metrics.

@Julio-Guerra Julio-Guerra changed the title contrib/*: fix multiple spanOpts potential append() data race contrib/*: fix span start option races Dec 7, 2023
Comment on lines 47 to 59

// Make sure opts is a copy of the span options, scoped to this request,
// to avoid races and leaks to spanOpts or cfg.spanOpts
opts := append([]tracer.StartSpanOption{
tracer.ResourceName(cfg.resourceNamer(c)),
tracer.Tag(ext.HTTPRoute, c.FullPath()),
httptrace.HeaderTagsFromRequest(c.Request, cfg.headerTags),
}, spanOpts...)

if !math.IsNaN(cfg.analyticsRate) {
opts = append(opts, tracer.Tag(ext.EventSampleRate, cfg.analyticsRate))
}
opts = append(opts, tracer.Tag(ext.HTTPRoute, c.FullPath()))
opts = append(opts, httptrace.HeaderTagsFromRequest(c.Request, cfg.headerTags))
span, ctx := httptrace.StartRequestSpan(c.Request, opts...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most likely actually not an issue here, but I'd want to point out that this change affectes the order of the options, which could affect the end result...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is why a "builder" would be preferable to enforce this too, whereas every contrib is doing its own thing at the moment, and some of them have ordering tests while others don't.

This is a known limitation of the current design that we discussed in the past here #1352

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was difficult to use a builder here, since the builder just ended up being shared as well, which led to the same problem. At least for now, I made a helper method to make those copies more explicit, so I'm hoping that'll help a bit with readability. It doesn't fully prevent this in the future, but it does make it more explicit.

@Julio-Guerra Julio-Guerra marked this pull request as ready for review December 7, 2023 13:18
@Julio-Guerra Julio-Guerra requested a review from a team December 7, 2023 13:18
@Julio-Guerra Julio-Guerra changed the title contrib/*: fix span start option races contrib: fix span start option races Dec 7, 2023
@Julio-Guerra Julio-Guerra added the bug unintended behavior that has to be fixed label Dec 7, 2023
contrib/urfave/negroni/negroni.go Outdated Show resolved Hide resolved
contrib/urfave/negroni/negroni.go Outdated Show resolved Hide resolved
contrib/internal/options/options.go Outdated Show resolved Hide resolved
@katiehockman
Copy link
Contributor

I've gone ahead and made several changes, and fixed additional races. Given how much of the code I've authored, I shouldn't be the one to approve this, so will seek an additional review from someone else on the team.

Copy link
Contributor

@knusbaum knusbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few comments about some unnecessary copies.

Only the GRPC one looks like it could actually have a performance impact on customers. The rest are fine either way.

contrib/database/sql/sql.go Show resolved Hide resolved
contrib/gofiber/fiber.v2/fiber.go Outdated Show resolved Hide resolved
contrib/google.golang.org/grpc.v12/grpc.go Show resolved Hide resolved
contrib/gorilla/mux/mux.go Show resolved Hide resolved
contrib/net/http/roundtripper.go Outdated Show resolved Hide resolved
Copy link
Contributor

@knusbaum knusbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found some changes that we need to make. Not all of them need to go in here, but some of these need to be modified.

// This product includes software developed at Datadog (https://www.datadoghq.com/).
// Copyright 2023 Datadog, Inc.

package options
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I was reading some contribs changed in this PR, I think we could go further to help thanks to some ImmutableOptions wrapper type.

For example avoiding:

func Middleware(opts ...Options) Handler {
    opts = options.Copy(opts) // problem avoided here
    opts = append(opts, foo)

    return func(w http.ResponseWriter, r *http.Request) {
        opts = append(opts, headersTags(r)) // problem not avoided here
    }
}

And doing instead:

func Middleware(opts ...Options) Handler {
    localOpts = options.Copy(opts) // problem avoided here
    localOpts = append(localOpts, foo)
    opts := options.ImmutableOptions(localOpts)
    return func(w http.ResponseWriter, r *http.Request) {
        opts = append(opts, headersTags(r)) // problem not avoided here
    }
}

Or having a full wrapper type Options with an Append method so it can do more safety checks?

My point is: how can we further improve this API to avoid future mistakes again and maybe add this ability to block any further uses of append once we know we are done and anything coming later is in a request handler, from concurrent goroutines.

cc @knusbaum

Copy link
Contributor

@knusbaum knusbaum Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abstracting the mutations is potentially a good idea, but we would have to be careful to do it in an efficient way so that every Append doesn't cause a copy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider it, but let's do so in a follow-up PR. I'd like to keep this PR as minimal as possible given the already large blast radius of changes.

@katiehockman katiehockman enabled auto-merge (squash) December 11, 2023 18:26
@@ -38,9 +38,7 @@ func Middleware(opts ...Option) func(*web.C, http.Handler) http.Handler {
warnonce sync.Once
)
defaults(&cfg)
localOpts := make([]Option, len(opts))
copy(localOpts, opts) // make a copy to avoid changes to opts having side effects to the returned router
for _, fn := range localOpts {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this one anymore.

Copy link
Contributor

@knusbaum knusbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved.

contrib/go-chi/chi.v5/chi.go Show resolved Hide resolved
contrib/go-chi/chi/chi.go Outdated Show resolved Hide resolved
@katiehockman katiehockman enabled auto-merge (squash) December 12, 2023 20:49
@katiehockman katiehockman merged commit 92773e7 into main Dec 12, 2023
52 of 53 checks passed
@katiehockman katiehockman deleted the eliott.bouhana/fix-span-opts-race branch December 12, 2023 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
apm:ecosystem contrib/* related feature requests or bugs bug unintended behavior that has to be fixed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants