Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix read/write race condition for edge case #1038

Merged
merged 2 commits into from Sep 18, 2023

Conversation

ankur22
Copy link
Collaborator

@ankur22 ankur22 commented Sep 18, 2023

What?

The mutex that is added in this PR is only needed in an edge case where we have multiple instances of k6 connecting to the same chrome instance.

Data race stack trace
WARNING: DATA RACE
Write at 0x00c001b2ac40 by goroutine 164:
  github.com/grafana/xk6-browser/common.(*Browser).NewContext()
      /home/runner/work/xk6-browser/xk6-browser/common/browser.go:515 +0xa76
  github.com/grafana/xk6-browser/tests.TestMultiConnectToSingleBrowser()
      /home/runner/work/xk6-browser/xk6-browser/tests/browser_test.go:334 +0x2fe
  testing.tRunner()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.19.13/x64/src/testing/testing.go:14[93](https://github.com/grafana/xk6-browser/actions/runs/6222592947/job/16886846565?pr=1034#step:5:94) +0x47

Previous read at 0x00c001b2ac40 by goroutine 4402:
  github.com/grafana/xk6-browser/common.(*Browser).getDefaultBrowserContextOrMatchedID()
      /home/runner/work/xk6-browser/xk6-browser/common/browser.go:142 +0x236
  github.com/grafana/xk6-browser/common.(*Browser).onAttachedToTarget()
      /home/runner/work/xk6-browser/xk6-browser/common/browser.go:220 +0x1f0
  github.com/grafana/xk6-browser/common.(*Browser).initEvents.func1()
      /home/runner/work/xk6-browser/xk6-browser/common/browser.go:185 +0x424

Why?

In this case when a page is created by the first k6 instance, the second instance of k6 will also receive an onAttachedToTarget event, causing it to try to create a new page instance in the second k6 instance too. At this point the second instance of k6 tries to recover a browserContext with the given ID from the CDP async message. If there is no browserContext with the given id (which there shouldn't be in the second k6 instance) it will fall back to using the defaultBrowserContext. This is where the read part of the race condition occurs.

When this occurs there's a small chance that at the same time a NewContext is being actioned on by the second k6 instance, and this is where the write part of the race condition occurs.

This mutex solely protects the read/write race condition for this one case.

Checklist

  • I have performed a self-review of my code
  • I have added tests for my changes
  • I have commented on my code, particularly in hard-to-understand areas

Related PR(s)/Issue(s)

Linked CI test failure where this issue was discovered: https://github.com/grafana/xk6-browser/actions/runs/6222592947/job/16886846565?pr=1034, and the test that failed was TestMultiConnectToSingleBrowser. I was able to reproduce the issue when stress testing this one integration test locally. After the fix the problem didn't surface again.

This mutex is only needed in an edge case where we have multiple
instances of k6 connecting to the same chrome instance. In this
case when a page is created by the first k6 instance, the second
instance of k6 will also receive an onAttachedToTarget event. When
this occurs there's a small chance that at the same time a new
context is being created by the second k6 instance. So the read
occurs in getDefaultBrowserContextOrMatchedID which is called by
onAttachedToTarget, and the write in NewContext. This mutex protects
the read/write race condition for this one case.
Copy link
Member

@inancgumus inancgumus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👏

Copy link
Collaborator

@ka3de ka3de left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! LGTM.

@ankur22 ankur22 merged commit 202dd3a into main Sep 18, 2023
13 checks passed
@ankur22 ankur22 deleted the fix/data-race-browser-context branch September 18, 2023 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants