Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

otelsarama: Fix WrapAsyncProducer race condition #881

Merged
merged 2 commits into from
Jul 15, 2021
Merged

otelsarama: Fix WrapAsyncProducer race condition #881

merged 2 commits into from
Jul 15, 2021

Conversation

pellared
Copy link
Member

@pellared pellared commented Jul 12, 2021

Why

I have spotted that otelsarama tests are sometimes failing because not all spans are recorded in oteltest.SpanRecorder.

Before the fix:

$ go test -race -count=1000
--- FAIL: TestWrapAsyncProducer (0.00s)
    --- FAIL: TestWrapAsyncProducer/with_successes_config (0.00s)
        producer_test.go:289: 
                Error Trace:    producer_test.go:289
                Error:          Not equal: 
                                expected: trace.SpanID{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2}
                                actual  : trace.SpanID{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
                            
                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1,3 +1,3 @@
                                 (trace.SpanID) (len=8) {
                                - 00000000  00 00 00 00 00 00 00 02                           |........|
                                + 00000000  00 00 00 00 00 00 00 00                           |........|
                                 }
                Test:           TestWrapAsyncProducer/with_successes_config
        producer_test.go:293: 
                Error Trace:    producer_test.go:293
                Error:          Not equal: 
                                expected: attribute.Value{vtype:4, numeric:0x0, stringly:"1", array:interface {}(nil)}
                                actual  : attribute.Value{vtype:4, numeric:0x0, stringly:"2", array:interface {}(nil)}
                            
                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -3,3 +3,3 @@
                                  numeric: (uint64) 0,
                                - stringly: (string) (len=1) "1",
                                + stringly: (string) (len=1) "2",
                                  array: (interface {}) <nil>
                Test:           TestWrapAsyncProducer/with_successes_config
                Messages:       messaging.message_id
        producer_test.go:297: 
                Error Trace:    producer_test.go:297
                Error:          Not equal: 
                                expected: int(0)
                                actual  : trace.SpanID(trace.SpanID{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3})
                Test:           TestWrapAsyncProducer/with_successes_config
        producer_test.go:289: 
                Error Trace:    producer_test.go:289
                Error:          Not equal: 
                                expected: trace.SpanID{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
                                actual  : trace.SpanID{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2}
                            
                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1,3 +1,3 @@
                                 (trace.SpanID) (len=8) {
                                - 00000000  00 00 00 00 00 00 00 00                           |........|
                                + 00000000  00 00 00 00 00 00 00 02                           |........|
                                 }
                Test:           TestWrapAsyncProducer/with_successes_config
        producer_test.go:293: 
                Error Trace:    producer_test.go:293
                Error:          Not equal: 
                                expected: attribute.Value{vtype:4, numeric:0x0, stringly:"2", array:interface {}(nil)}
                                actual  : attribute.Value{vtype:0, numeric:0x0, stringly:"", array:interface {}(nil)}
                            
                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1,5 +1,5 @@
                                 (attribute.Value) {
                                - vtype: (attribute.Type) 4,
                                + vtype: (attribute.Type) 0,
                                  numeric: (uint64) 0,
                                - stringly: (string) (len=1) "2",
                                + stringly: (string) "",
                                  array: (interface {}) <nil>
                Test:           TestWrapAsyncProducer/with_successes_config
                Messages:       messaging.message_id
        producer_test.go:293: 
                Error Trace:    producer_test.go:293
                Error:          Not equal: 
                                expected: attribute.Value{vtype:2, numeric:0x0, stringly:"", array:interface {}(nil)}
                                actual  : attribute.Value{vtype:0, numeric:0x0, stringly:"", array:interface {}(nil)}
                            
                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1,3 +1,3 @@
                                 (attribute.Value) {
                                - vtype: (attribute.Type) 2,
                                + vtype: (attribute.Type) 0,
                                  numeric: (uint64) 0,
                Test:           TestWrapAsyncProducer/with_successes_config
                Messages:       messaging.kafka.partition
--- FAIL: TestWrapAsyncProducer_DrainsSuccessesAndErrorsChannels (0.00s)
    producer_test.go:389: 
                Error Trace:    producer_test.go:389
                Error:          "[%!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6360 {[0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 2] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684140728220 1067214001 0x98dc20} {13849452684140757820 1067243601 0x98dc20} 0  map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 1 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6360 {[0 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 7] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684140828420 1067314101 0x98dc20} {13849452684140909420 1067395101 0x98dc20} 1 test map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 0 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6360 {[0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 3] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684140764220 1067250001 0x98dc20} {13849452684140920220 1067405901 0x98dc20} 0  map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 2 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6360 {[0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 4] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684140786620 1067272301 0x98dc20} {13849452684140930020 1067415701 0x98dc20} 0  map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 3 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6360 {[0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 5] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684140801020 1067286701 0x98dc20} {13849452684140937120 1067422801 0x98dc20} 0  map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 4 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6360 {[0 0 0 0 0 0 0 6 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 6] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684140811420 1067297101 0x98dc20} {13849452684140944320 1067430101 0x98dc20} 0  map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 5 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6360 {[0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 8] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684140839120 1067324801 0x98dc20} {13849452684140965820 1067451501 0x98dc20} 1 test map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 0 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4})]" should have 8 item(s), but has 7
                Test:           TestWrapAsyncProducer_DrainsSuccessesAndErrorsChannels
                Messages:       should record all spans
--- FAIL: TestWrapAsyncProducer_DrainsSuccessesAndErrorsChannels (0.00s)
    producer_test.go:389: 
                Error Trace:    producer_test.go:389
                Error:          "[%!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6780 {[0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 2] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684576586644 1429330501 0x98dc20} {13849452684576611244 1429355101 0x98dc20} 0  map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 1 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6780 {[0 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 7] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684576808244 1429552101 0x98dc20} {13849452684576877244 1429621101 0x98dc20} 1 test map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 0 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6780 {[0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 8] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684576821244 1429565201 0x98dc20} {13849452684576915544 1429659501 0x98dc20} 1 test map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 0 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6780 {[0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 3] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684576624644 1429368501 0x98dc20} {13849452684576957844 1429701701 0x98dc20} 0  map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 2 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6780 {[0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 4] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684576695544 1429439401 0x98dc20} {13849452684576967144 1429711101 0x98dc20} 0  map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 3 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6780 {[0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 5] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684576723444 1429467301 0x98dc20} {13849452684576989644 1429733601 0x98dc20} 0  map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 4 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4}) %!s(*oteltest.Span=&{{{0 0} 0 0 0 0} 0xc0004a6780 {[0 0 0 0 0 0 0 6 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 6] 0 {[]} false} [0 0 0 0 0 0 0 0] true kafka.produce {13849452684576774344 1429518301 0x98dc20} {13849452684576998144 1429742001 0x98dc20} 0  map[messaging.destination:{4 0 test-topic <nil>} messaging.destination_kind:{4 0 topic <nil>} messaging.kafka.partition:{2 0  <nil>} messaging.message_id:{4 0 5 <nil>} messaging.system:{4 0 kafka <nil>}] [] [] 4})]" should have 8 item(s), but has 7
                Test:           TestWrapAsyncProducer_DrainsSuccessesAndErrorsChannels
                Messages:       should record all spans
--- FAIL: TestWrapAsyncProducerError (0.00s)
    producer_test.go:325: 
                Error Trace:    producer_test.go:325
                Error:          Not equal: 
                                expected: string("test metadata")
                                actual  : trace.SpanID(trace.SpanID{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2})
                Test:           TestWrapAsyncProducerError
                Messages:       should preseve metadata
    producer_test.go:330: 
                Error Trace:    producer_test.go:330
                Error:          "[]" should have 1 item(s), but has 0
                Test:           TestWrapAsyncProducerError
panic: runtime error: index out of range [0] with length 0 [recovered]
        panic: runtime error: index out of range [0] with length 0

goroutine 20163 [running]:
testing.tRunner.func1.2(0x77bc40, 0xc0004c4738)
        /usr/local/go/src/testing/testing.go:1143 +0x49f
testing.tRunner.func1(0xc0006d3c80)
        /usr/local/go/src/testing/testing.go:1146 +0x695
panic(0x77bc40, 0xc0004c4738)
        /usr/local/go/src/runtime/panic.go:971 +0x499
go.opentelemetry.io/contrib/instrumentation/github.com/Shopify/sarama/otelsarama.TestWrapAsyncProducerError(0xc0006d3c80)
        /home/rpajak/repos/opentelemetry-go-contrib/instrumentation/github.com/Shopify/sarama/otelsarama/producer_test.go:332 +0x9c5
testing.tRunner(0xc0006d3c80, 0x7b4630)
        /usr/local/go/src/testing/testing.go:1193 +0x203
created by testing.(*T).Run
        /usr/local/go/src/testing/testing.go:1238 +0x5d8
exit status 2
FAIL    go.opentelemetry.io/contrib/instrumentation/github.com/Shopify/sarama/otelsarama        1.925s

The bug was introduced here: #754

What

  • Properly synchronize the span's metadata cache (remember the value BEFORE sending to input channel)

After the fix:

$ go test -race -count=1000
PASS
ok      go.opentelemetry.io/contrib/instrumentation/github.com/Shopify/sarama/otelsarama        15.329s

PS. Hopefully, this is the last bug related to concurrency in otelsarama.

@codecov
Copy link

codecov bot commented Jul 12, 2021

Codecov Report

Merging #881 (cabb771) into main (cfed0ec) will not change coverage.
The diff coverage is 100.0%.

Impacted file tree graph

@@          Coverage Diff          @@
##            main    #881   +/-   ##
=====================================
  Coverage   79.0%   79.0%           
=====================================
  Files         62      62           
  Lines       2731    2731           
=====================================
  Hits        2159    2159           
  Misses       440     440           
  Partials     132     132           
Impacted Files Coverage Δ
...n/github.com/Shopify/sarama/otelsarama/producer.go 95.7% <100.0%> (ø)

@MrAlias MrAlias merged commit 4c90cea into open-telemetry:main Jul 15, 2021
@pellared pellared deleted the otelsarama-race-condition branch July 15, 2021 21:01
@Aneurysm9 Aneurysm9 mentioned this pull request Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants