Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix client side race condition #1124

Merged
merged 1 commit into from
Aug 26, 2021

Conversation

demirkayaender
Copy link
Contributor

@demirkayaender demirkayaender commented Aug 26, 2021

What changed?
The problem is described here: #1117

That PR fixes leaking go routines but also introduces a race condition, this PR fixes the race.

Why?

How did you test it?

go test -race -test.count 1 -test.run TestContext_RaceRegression ./internal
go test -timeout 30s -run ^TestWorkflowUnitTest$ -testify.m ^(Test_StaleGoroutinesAreShutDown)$ go.uber.org/cadence/internal

Output of the first test without the fix:

==================
WARNING: DATA RACE
Read at 0x00c0003feaa8 by goroutine 13:
  go.uber.org/cadence/internal.(*cancelCtx).cancel()
      /Users/enderd/cadence-client/internal/context.go:312 +0x71
  go.uber.org/cadence/internal.WithCancel.func1()
      /Users/enderd/cadence-client/internal/context.go:194 +0x78
  runtime.call16()
      /usr/local/Cellar/go/1.16.6/libexec/src/runtime/asm_amd64.s:550 +0x3d
  go.uber.org/cadence/internal.(*coroutineState).initialYield()
      /Users/enderd/cadence-client/internal/internal_workflow.go:796 +0xb9
  go.uber.org/cadence/internal.(*coroutineState).yield()
      /Users/enderd/cadence-client/internal/internal_workflow.go:805 +0x45c
  go.uber.org/cadence/internal.(*channelImpl).Receive()
      /Users/enderd/cadence-client/internal/internal_workflow.go:620 +0x35a
  go.uber.org/cadence/internal.(*futureImpl).Get()
      /Users/enderd/cadence-client/internal/internal_workflow.go:294 +0x92
  go.uber.org/cadence/internal.(*workflowEnvironmentInterceptor).Sleep()
      /Users/enderd/cadence-client/internal/workflow.go:906 +0x81
  go.uber.org/cadence/internal.Sleep()
      /Users/enderd/cadence-client/internal/workflow.go:901 +0xc1
  go.uber.org/cadence/internal.TestContext_RaceRegression.func1.1()
      /Users/enderd/cadence-client/internal/context_test.go:49 +0x99
  go.uber.org/cadence/internal.(*dispatcherImpl).newNamedCoroutine.func1()
      /Users/enderd/cadence-client/internal/internal_workflow.go:883 +0x11c

Previous write at 0x00c0003feaa8 by goroutine 16:
  go.uber.org/cadence/internal.(*cancelCtx).cancel()
      /Users/enderd/cadence-client/internal/context.go:315 +0x93
  go.uber.org/cadence/internal.WithCancel.func1()
      /Users/enderd/cadence-client/internal/context.go:194 +0x78
  runtime.call16()
      /usr/local/Cellar/go/1.16.6/libexec/src/runtime/asm_amd64.s:550 +0x3d
  go.uber.org/cadence/internal.(*coroutineState).initialYield()
      /Users/enderd/cadence-client/internal/internal_workflow.go:796 +0xb9
  go.uber.org/cadence/internal.(*coroutineState).yield()
      /Users/enderd/cadence-client/internal/internal_workflow.go:805 +0x45c
  go.uber.org/cadence/internal.(*channelImpl).Receive()
      /Users/enderd/cadence-client/internal/internal_workflow.go:620 +0x35a
  go.uber.org/cadence/internal.(*futureImpl).Get()
      /Users/enderd/cadence-client/internal/internal_workflow.go:294 +0x92
  go.uber.org/cadence/internal.(*workflowEnvironmentInterceptor).Sleep()
      /Users/enderd/cadence-client/internal/workflow.go:906 +0x81
  go.uber.org/cadence/internal.Sleep()
      /Users/enderd/cadence-client/internal/workflow.go:901 +0xc1
  go.uber.org/cadence/internal.TestContext_RaceRegression.func1.1()
      /Users/enderd/cadence-client/internal/context_test.go:49 +0x99
  go.uber.org/cadence/internal.(*dispatcherImpl).newNamedCoroutine.func1()
      /Users/enderd/cadence-client/internal/internal_workflow.go:883 +0x11c

Goroutine 13 (running) created at:
  go.uber.org/cadence/internal.(*dispatcherImpl).newNamedCoroutine()
      /Users/enderd/cadence-client/internal/internal_workflow.go:874 +0x3d1
  go.uber.org/cadence/internal.(*dispatcherImpl).newCoroutine()
      /Users/enderd/cadence-client/internal/internal_workflow.go:868 +0xeb
  go.uber.org/cadence/internal.Go()
      /Users/enderd/cadence-client/internal/workflow.go:394 +0x94
  go.uber.org/cadence/internal.TestContext_RaceRegression.func1()
      /Users/enderd/cadence-client/internal/context_test.go:53 +0xe9
  runtime.call32()
      /usr/local/Cellar/go/1.16.6/libexec/src/runtime/asm_amd64.s:551 +0x3d
  reflect.Value.Call()
      /usr/local/Cellar/go/1.16.6/libexec/src/reflect/value.go:337 +0xd8
  go.uber.org/cadence/internal.(*workflowEnvironmentInterceptor).ExecuteWorkflow()
      /Users/enderd/cadence-client/internal/workflow.go:423 +0x2e4
  go.uber.org/cadence/internal.(*workflowExecutor).Execute()
      /Users/enderd/cadence-client/internal/internal_worker.go:700 +0x41a
  go.uber.org/cadence/internal.(*workflowExecutorWrapper).Execute()
      /Users/enderd/cadence-client/internal/internal_workflow_testsuite.go:1459 +0x8f9
  go.uber.org/cadence/internal.(*syncWorkflowDefinition).Execute.func1()
      /Users/enderd/cadence-client/internal/internal_workflow.go:466 +0x243
  go.uber.org/cadence/internal.(*dispatcherImpl).newNamedCoroutine.func1()
      /Users/enderd/cadence-client/internal/internal_workflow.go:883 +0x11c

Goroutine 16 (running) created at:
  go.uber.org/cadence/internal.(*dispatcherImpl).newNamedCoroutine()
      /Users/enderd/cadence-client/internal/internal_workflow.go:874 +0x3d1
  go.uber.org/cadence/internal.(*dispatcherImpl).newCoroutine()
      /Users/enderd/cadence-client/internal/internal_workflow.go:868 +0xeb
  go.uber.org/cadence/internal.Go()
      /Users/enderd/cadence-client/internal/workflow.go:394 +0x94
  go.uber.org/cadence/internal.TestContext_RaceRegression.func1()
      /Users/enderd/cadence-client/internal/context_test.go:53 +0xe9
  runtime.call32()
      /usr/local/Cellar/go/1.16.6/libexec/src/runtime/asm_amd64.s:551 +0x3d
  reflect.Value.Call()
      /usr/local/Cellar/go/1.16.6/libexec/src/reflect/value.go:337 +0xd8
  go.uber.org/cadence/internal.(*workflowEnvironmentInterceptor).ExecuteWorkflow()
      /Users/enderd/cadence-client/internal/workflow.go:423 +0x2e4
  go.uber.org/cadence/internal.(*workflowExecutor).Execute()
      /Users/enderd/cadence-client/internal/internal_worker.go:700 +0x41a
  go.uber.org/cadence/internal.(*workflowExecutorWrapper).Execute()
      /Users/enderd/cadence-client/internal/internal_workflow_testsuite.go:1459 +0x8f9
  go.uber.org/cadence/internal.(*syncWorkflowDefinition).Execute.func1()
      /Users/enderd/cadence-client/internal/internal_workflow.go:466 +0x243
  go.uber.org/cadence/internal.(*dispatcherImpl).newNamedCoroutine.func1()
      /Users/enderd/cadence-client/internal/internal_workflow.go:883 +0x11c
==================
FAIL
FAIL    go.uber.org/cadence/internal    0.646s
FAIL

Potential risks

@demirkayaender demirkayaender requested review from Groxx and a team August 26, 2021 20:55
"go.uber.org/zap/zaptest"
)

func TestContext_RaceRegression(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test taken from Groxx@1e5a037

@coveralls
Copy link

coveralls commented Aug 26, 2021

Pull Request Test Coverage Report for Build a606b89c-de6f-41f7-a1d2-e8199e40e732

  • 10 of 10 (100.0%) changed or added relevant lines in 1 file are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.02%) to 71.543%

Files with Coverage Reduction New Missed Lines %
internal/context.go 2 78.29%
Totals Coverage Status
Change from base Build 00905a6a-bf4b-4d84-b4f3-f5ea4534cfdc: 0.02%
Covered Lines: 12158
Relevant Lines: 16994

💛 - Coveralls

internal/context.go Outdated Show resolved Hide resolved
Copy link
Contributor

@Groxx Groxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, with a minimal lock like that, it may be sufficient. Worth a try to reduce crashes while we fix the underlying cause.

@demirkayaender demirkayaender merged commit 999748d into uber-go:master Aug 26, 2021
@demirkayaender demirkayaender deleted the fix_race branch August 26, 2021 21:29
Groxx pushed a commit that referenced this pull request Oct 6, 2021
@Groxx Groxx mentioned this pull request Nov 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants