BUG: Data race in TestApplication_StartUnified #43

Closed
pjanotti opened this issue Jun 21, 2019 · 3 comments

pjanotti commented Jun 21, 2019

From this Travis run: https://travis-ci.org/open-telemetry/opentelemetry-service/builds/548531356?utm_source=github_status&utm_medium=notification

{"level":"info","ts":1561097714.798292,"caller":"collector/collector.go:174","msg":"Starting...","NumCPU":2}
{"level":"info","ts":1561097714.7983959,"caller":"collector/collector.go:93","msg":"Setting up profiler..."}
{"level":"info","ts":1561097714.7984593,"caller":"collector/collector.go:101","msg":"Setting up health checks..."}
{"level":"info","ts":1561097714.798795,"caller":"healthcheck/handler.go:99","msg":"Health Check server started","http-port":43771,"status":"unavailable"}
{"level":"info","ts":1561097714.7989807,"caller":"collector/collector.go:110","msg":"Setting up zPages..."}
{"level":"info","ts":1561097714.7992342,"caller":"collector/collector.go:118","msg":"Running zPages","port":42581}
{"level":"info","ts":1561097715.8005574,"caller":"opencensus/receiver.go:51","msg":"OpenCensus receiver is running.","port":45320}
{"level":"info","ts":1561097715.8007278,"caller":"collector/collector.go:127","msg":"Setting up own telemetry..."}
{"level":"info","ts":1561097715.8020036,"caller":"collector/telemetry.go:93","msg":"Serving Prometheus metrics","port":35317}
{"level":"info","ts":1561097715.8023326,"caller":"collector/collector.go:137","msg":"Everything is ready. Begin running and processing data."}
{"level":"info","ts":1561097715.8031561,"caller":"healthcheck/handler.go:133","msg":"Health Check state change","status":"ready"}
{"level":"info","ts":1561097715.8068607,"caller":"collector/collector.go:157","msg":"Received stop test request"}
{"level":"info","ts":1561097715.8069491,"caller":"healthcheck/handler.go:133","msg":"Health Check state change","status":"unavailable"}
{"level":"info","ts":1561097715.8070078,"caller":"collector/collector.go:194","msg":"Starting shutdown..."}
{"level":"info","ts":1561097715.807472,"caller":"collector/collector.go:205","msg":"Shutdown complete."}
{"level":"info","ts":1561097715.8124511,"caller":"collector/collector.go:285","msg":"Starting...","NumCPU":2}
{"level":"info","ts":1561097715.8126116,"caller":"collector/collector.go:93","msg":"Setting up profiler..."}
{"level":"info","ts":1561097715.8127234,"caller":"collector/collector.go:101","msg":"Setting up health checks..."}
{"level":"info","ts":1561097715.8131378,"caller":"healthcheck/handler.go:99","msg":"Health Check server started","http-port":37679,"status":"unavailable"}
{"level":"info","ts":1561097715.8132315,"caller":"collector/collector.go:110","msg":"Setting up zPages..."}
{"level":"info","ts":1561097715.8135908,"caller":"collector/collector.go:118","msg":"Running zPages","port":45743}
{"level":"info","ts":1561097715.8136768,"caller":"collector/collector.go:127","msg":"Setting up own telemetry..."}
{"level":"info","ts":1561097715.8148317,"caller":"collector/telemetry.go:93","msg":"Serving Prometheus metrics","port":46092}
{"level":"info","ts":1561097715.8150716,"caller":"collector/collector.go:232","msg":"Loading configuration..."}
{"level":"info","ts":1561097715.8159492,"caller":"collector/collector.go:240","msg":"Applying configuration..."}
{"level":"info","ts":1561097715.8763263,"caller":"builder/exporters_builder.go:199","msg":"Exporter is enabled.","exporter":"opencensus"}
{"level":"info","ts":1561097715.8764515,"caller":"builder/pipelines_builder.go:118","msg":"Pipeline is enabled.","pipelines":"traces"}
{"level":"info","ts":1561097715.8765345,"caller":"builder/receivers_builder.go:210","msg":"Receiver is enabled.","receiver":"jaeger","datatype":"traces"}
{"level":"info","ts":1561097715.876596,"caller":"collector/collector.go:264","msg":"Starting receivers..."}
{"level":"info","ts":1561097715.8766484,"caller":"builder/receivers_builder.go:91","msg":"Receiver is starting...","receiver":"jaeger"}
{"level":"info","ts":1561097715.878872,"caller":"builder/receivers_builder.go:96","msg":"Receiver is started.","receiver":"jaeger"}
{"level":"info","ts":1561097715.8789592,"caller":"collector/collector.go:137","msg":"Everything is ready. Begin running and processing data."}
{"level":"info","ts":1561097715.8790317,"caller":"healthcheck/handler.go:133","msg":"Health Check state change","status":"ready"}
{"level":"info","ts":1561097715.8865044,"caller":"collector/collector.go:157","msg":"Received stop test request"}
{"level":"info","ts":1561097715.8866715,"caller":"healthcheck/handler.go:133","msg":"Health Check state change","status":"unavailable"}
{"level":"info","ts":1561097715.8867686,"caller":"collector/collector.go:304","msg":"Starting shutdown..."}
{"level":"info","ts":1561097715.886851,"caller":"collector/collector.go:276","msg":"Stopping receivers..."}
==================
WARNING: DATA RACE
Write at 0x00c000139ae8 by goroutine 54:
  sync.(*WaitGroup).Wait()
      /home/travis/.gimme/versions/go1.12.6.linux.amd64/src/internal/race/race.go:41 +0xef
  github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).Stop()
      /home/travis/gopath/pkg/mod/github.com/jaegertracing/jaeger@v1.9.0/cmd/agent/app/processors/thrift_processor.go:102 +0x11d
Previous read at 0x00c000139ae8 by goroutine 107:
  sync.(*WaitGroup).Add()
      /home/travis/.gimme/versions/go1.12.6.linux.amd64/src/internal/race/race.go:37 +0x169
  github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).Serve()
      /home/travis/gopath/pkg/mod/github.com/jaegertracing/jaeger@v1.9.0/cmd/agent/app/processors/thrift_processor.go:84 +0x62
Goroutine 54 (running) created at:
  github.com/jaegertracing/jaeger/cmd/agent/app.(*Agent).Stop()
      /home/travis/gopath/pkg/mod/github.com/jaegertracing/jaeger@v1.9.0/cmd/agent/app/agent.go:88 +0xa1
  github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver.(*jReceiver).stopTraceReceptionLocked.func1()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver/trace_receiver.go:208 +0x5ad
  sync.(*Once).Do()
      /home/travis/.gimme/versions/go1.12.6.linux.amd64/src/sync/once.go:44 +0xde
  github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver.(*jReceiver).stopTraceReceptionLocked()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver/trace_receiver.go:204 +0xa0
  github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver.(*jReceiver).StopTraceReception()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver/trace_receiver.go:199 +0x8e
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/builder.(*builtReceiver).Stop()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/builder/receivers_builder.go:42 +0x2f8
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/builder.Receivers.StopAll()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/builder/receivers_builder.go:84 +0xc7
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector.(*Application).shutdownPipelines()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector/collector.go:277 +0xb6
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector.(*Application).executeUnified()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector/collector.go:306 +0x30a
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector.(*Application).StartUnified.func1()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector/collector.go:322 +0x71
  github.com/spf13/cobra.(*Command).execute()
      /home/travis/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:766 +0x8eb
  github.com/spf13/cobra.(*Command).ExecuteC()
      /home/travis/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:852 +0x418
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector.(*Application).StartUnified()
      /home/travis/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:800 +0x298
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector.TestApplication_StartUnified.func1()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector/collector_test.go:88 +0x78
Goroutine 107 (running) created at:
  github.com/jaegertracing/jaeger/cmd/agent/app.(*Agent).Run()
      /home/travis/gopath/pkg/mod/github.com/jaegertracing/jaeger@v1.9.0/cmd/agent/app/agent.go:75 +0x2bf
  github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver.(*jReceiver).startAgent()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver/trace_receiver.go:335 +0x3cc
  github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver.(*jReceiver).StartTraceReception.func1()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver/trace_receiver.go:180 +0x53
  sync.(*Once).Do()
      /home/travis/.gimme/versions/go1.12.6.linux.amd64/src/sync/once.go:44 +0xde
  github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver.(*jReceiver).StartTraceReception()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/receiver/jaegerreceiver/trace_receiver.go:179 +0xe0
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/builder.(*builtReceiver).Start()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/builder/receivers_builder.go:62 +0x312
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/builder.Receivers.StartAll()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/builder/receivers_builder.go:93 +0x2c7
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector.(*Application).setupPipelines()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector/collector.go:265 +0x485
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector.(*Application).executeUnified()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector/collector.go:295 +0x26c
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector.(*Application).StartUnified.func1()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector/collector.go:322 +0x71
  github.com/spf13/cobra.(*Command).execute()
      /home/travis/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:766 +0x8eb
  github.com/spf13/cobra.(*Command).ExecuteC()
      /home/travis/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:852 +0x418
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector.(*Application).StartUnified()
      /home/travis/gopath/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:800 +0x298
  github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector.TestApplication_StartUnified.func1()
      /home/travis/gopath/src/github.com/open-telemetry/opentelemetry-service/cmd/occollector/app/collector/collector_test.go:88 +0x78
==================
{"level":"info","ts":1561097715.897572,"caller":"collector/collector.go:311","msg":"Shutdown complete."}
--- FAIL: TestApplication_StartUnified (0.10s)
    testing.go:809: race detected during execution of test
FAIL
coverage: 57.1% of statements
@pjanotti pjanotti changed the title from "BUG: Data race in" to "BUG: Data race in TestApplication_StartUnified" Jun 21, 2019
@tigrannajaryan tigrannajaryan self-assigned this Jun 21, 2019
tigrannajaryan (Member) commented

The root cause is in the Jaeger Agent implementation. I filed a bug with Jaeger: jaegertracing/jaeger#1624

I will see if there is a good workaround.
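
For readers following the trace: it shows `sync.(*WaitGroup).Add()` being called from inside the goroutine that `Agent.Run()` spawned, while `Stop()` is already in `Wait()` on another goroutine. The `sync` package requires that an `Add` with a positive delta made while the counter is zero happens before `Wait`. The sketch below shows one common way to satisfy that rule, by calling `Add` before the `go` statement. The `processor` type and its methods are hypothetical stand-ins, not Jaeger's code, and this is not necessarily how the Jaeger fix is implemented.

```go
package main

import (
	"fmt"
	"sync"
)

// processor is a hypothetical stand-in for a server-like component.
// It only mirrors the start/stop shape seen in the trace.
type processor struct {
	wg   sync.WaitGroup
	quit chan struct{}
}

func newProcessor() *processor {
	return &processor{quit: make(chan struct{})}
}

// serve is the worker loop. In the racy variant, wg.Add(1) would be the
// first statement here, so it could run concurrently with Stop's wg.Wait.
func (p *processor) serve() {
	defer p.wg.Done()
	<-p.quit // block until Stop is called
}

// Start registers the worker with the WaitGroup before spawning it, so the
// Add is ordered before any Wait issued after Start returns.
func (p *processor) Start() {
	p.wg.Add(1)
	go p.serve()
}

// Stop signals the worker to exit and waits for it to finish.
func (p *processor) Stop() {
	close(p.quit)
	p.wg.Wait()
}

// startStopCycles stops each processor immediately after starting it and
// returns the number of completed cycles.
func startStopCycles(n int) int {
	completed := 0
	for i := 0; i < n; i++ {
		p := newProcessor()
		p.Start()
		p.Stop()
		completed++
	}
	return completed
}

func main() {
	fmt.Println("completed cycles:", startStopCycles(1000))
}
```

Running this with `go run -race` should not report a race, because the counter is incremented on the same goroutine that later calls `Stop()`.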

tigrannajaryan (Member) commented

I submitted a PR to fix the bug in Jaeger: jaegertracing/jaeger#1625

There is no good workaround except adding a pause in our tests. I will do that if the PR does not get merged quickly.

pjanotti referenced this issue in pjanotti/opentelemetry-service Jun 27, 2019
Add multiplexing/"pass through" mode where spans from
sources other than the initiating node can be sent
and they should be properly proxied to their final
destination. If the currently received node is nil,
use the previously received and non-nil node, and after
processing spans, memoize the last received node.
Updated the tests to lock-in this behavior.

This change also adds in tests to ensure this behavior
and to avoid regressions in the future.

Fixes #43
@flands flands added this to To do in Triaged via automation Jun 28, 2019
@flands flands added this to the 0.1.0 milestone Jun 28, 2019
tigrannajaryan (Member) commented

The Jaeger bug is now fixed by that PR. However, updating to the latest Jaeger is not straightforward because it has breaking changes.

@tigrannajaryan tigrannajaryan removed their assignment Jul 2, 2019
@tigrannajaryan tigrannajaryan added the "help wanted" label Jul 2, 2019
tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-collector that referenced this issue Jul 3, 2019
Added a Sleep to work around a data race bug in Jaeger
(jaegertracing/jaeger#1625) caused
by stopping immediately after starting.

Without this Sleep we were observing this bug on our side:
open-telemetry#43
The Sleep ensures that Jaeger Start() is fully completed before
we call Jaeger Stop().

TODO: Jaeger bug is already fixed, remove this once we update Jaeger
to latest version.

Testing done: make test
@tigrannajaryan tigrannajaryan self-assigned this Jul 3, 2019
@tigrannajaryan tigrannajaryan removed the "help wanted" label Jul 3, 2019
songy23 pushed a commit that referenced this issue Jul 3, 2019
Triaged automation moved this from To do to Done Jul 3, 2019
bogdandrutu added a commit that referenced this issue Oct 28, 2021
* Initial commit

* Add CODEOWNERS file (#2)

* Add CODEOWNERS file

* Update CODEOWNERS

* Moved from github.com/observatorium/opentelemetry-collector-builder (#3)

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* fixed panics (#6)

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Replace master with main in CI and mergify files (#8)

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Bump to OpenTelemetry Collector 0.20.0 (#10)

Closes #9

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Explicitly enable Go modules in quickstart instructions (#13)

* Update to collector v0.21.0 (#17)

Fixes #16

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Update to collector v0.22.0 (#19)

* Download go modules before building (#20)

Fixes #14

* Add version command (#25)

Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>

* Pass errors from cobra Execute back to main for correct exit code (#28)

* pass errors from cobra execute back to main

* print the error

* Update to collector v0.23.0 (#27)

* Generate a warning if the builder and collector base version mismatch (#30)

* Generate a warning if the builder and collector base version mismatch

* Show current default version in the warning message

* Update to OpenTelemetry Collector 0.24.0

* Don't use %w formatting with log.Fatal (#35)

* Update to OpenTelemetry Collector 0.25.0 (#36)

Signed-off-by: Serge Catudal <serge.catudal@gmail.com>

* Update to 0.26.0 and update BuildInfo (#39)

* Sync build and CI Go versions at latest 1.16 (#34)

* Sync build and CI Go versions at latest 1.16

* Run go mod tidy

* Set go binary to use in the compilation phase in tests

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

Co-authored-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Add option to generate go code only (no compile) (#40)

* Issue#24 Add option to generate go code only (no compile)

* Update cmd/root.go logging

Suggested by @jpkkrohling

Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>

* remove verbose help .. created by corba

* suggestion by jpkrohling to keep generateandcompile

* lint error: remove unused var

* reword cmd option and add back help message for default

Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>

* Don't reuse exec.Cmd (#42)

* Update to OpenTelemetry Collector 0.27.0 (#43)

* Add CI Badge (#47)

* Update to Collector v0.28.0 (#49)

* Update to Collector v0.28.0

Closes #48

Addresses the breaking API change in
#3163,
besides the usual version number changes.

Signed-off-by: Fangyi Zhou <me@fangyi.io>

* Use `go mod tidy` instead of `go mod download`

It appears that this magically resolves the go.mod file issue.
https://stackoverflow.com/questions/67203641/missing-go-sum-entry-for-module-providing-package-package-name

Signed-off-by: Fangyi Zhou <me@fangyi.io>

* Account for go mod download in go1.17 not updating go.sum (#50)

* Update to collector v0.29.0 (#54)

* Update replaces.builder.yaml

* Update nocore.builder.yaml

* Update config.go

* Update README.md

* Update main.go

* Update to collector v0.30.0 (#57)

* cmd: fix module flag default value to github.com/open-telemetry (#58)

Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>

* Update to collector v0.31.0 (#60)

* Update to v0.33.0 (#62)

Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>

* Add excludes support to generated go.mod (#63)

Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>

Co-authored-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Small cleanup for the builder files (#64)

Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>

* Support building with Go 1.17 (#66)

* Support building with Go 1.17
Fixes #65

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Update workflows to use Go 1.17

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Add gosec exceptions for exec.Command

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Update to OpenTelemetry core 0.34.0 (#68)

Fixes #67

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Upgrade to OpenTelemetry Collector 0.35.0 (#70)

Signed-off-by: Fangyi Zhou <me@fangyi.io>

* Upgrade to OpenTelemetry Collector 0.36.0 (#76)

* Generate custom service code for Windows (#75)

* update main to include windows service code

* use main version from tag 0.35.0

* update main function

* align with upstream v0.36.0 tag

* dummy change to trigger build

* Revert "dummy change to trigger build"

This reverts commit 629d499461da2d2c240bf1e495b5fe0558e3547f.

* Remove Core from Module type (#77)

Fixes #15

Signed-off-by: yugo-horie <u5.horie@gmail.com>

* release 0.37.0 (#78)

* release 0.37.0

* update use of NewCommand

* Move builder to subdirectory

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

Co-authored-by: Bogdan Drutu <lazy@splunk.com>
Co-authored-by: Bogdan Drutu <bogdandrutu@gmail.com>
Co-authored-by: Joe Elliott <joe.elliott@grafana.com>
Co-authored-by: Eric Yang <jiwen624@gmail.com>
Co-authored-by: Brian Gibbins <eroteme@supernought.co.uk>
Co-authored-by: Ashmita <ashmita.bohara152@gmail.com>
Co-authored-by: Fangyi Zhou <me@fangyi.io>
Co-authored-by: Shaun Creary <65406540+crearys@users.noreply.github.com>
Co-authored-by: Patryk Małek <69143962+pmalek-sumo@users.noreply.github.com>
Co-authored-by: Serge Catudal <serge.catudal@gmail.com>
Co-authored-by: Aaron Stone <aaron@serendipity.cx>
Co-authored-by: Patryk Małek <pmalek@sumologic.com>
Co-authored-by: Aaron Stone <aaron.stone@udacity.com>
Co-authored-by: Kelvin Lo <kello@live.ca>
Co-authored-by: Himanshu <addyjeridiq@gmail.com>
Co-authored-by: Y.Horie <u5.horie@gmail.com>
Co-authored-by: Koichi Shiraishi <zchee.io@gmail.com>
Co-authored-by: Anthony Mirabella <a9@aneurysm9.com>
Co-authored-by: Cal Loomis <68860480+loomis-relativity@users.noreply.github.com>
Co-authored-by: alrex <aboten@lightstep.com>
MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this issue Nov 11, 2021