Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: testing: ability to tell if test is running in parallel #70017

Open
jim-minter opened this issue Oct 23, 2024 · 12 comments
Open

proposal: testing: ability to tell if test is running in parallel #70017

jim-minter opened this issue Oct 23, 2024 · 12 comments
Labels
Milestone

Comments

@jim-minter
Copy link

Proposal Details

Unit tests can indicate their ability to be run in parallel by making the opt-in call t.Parallel().

However sometimes a test might opt in but then mistakenly change some global state which could race with something else. In this case, the parallel test should actually be a serial one. In a large codebase it can be tricky to protect against this happening.

For example,

func TestShouldNotBeParallel(t *testing.T) {
    t.Parallel()
    // ...
    changeGlobalState(t) // a mistake, given above t.Parallel()
    // ...
}

func changeGlobalState(t *testing.T) {
    // proposal: could we check `t` here to determine that we're running in parallel when we shouldn't be, and error out?
    if t.IsRunningInParallel() {
        t.Fatal("changeGlobalState called from test with t.Parallel()")
    }

    // ...
}

The proposal here is a function like t.IsRunningInParallel() that could return true in the case that the current test is running in a parallel context. I don't think there's a way to check this currently?

@gopherbot gopherbot added this to the Proposal milestone Oct 23, 2024
@seankhliao
Copy link
Member

This feels too niche, nobody is going to think to add something like this until they actually run into a problem, and at that point, this doesn't help you track down the issue.

@abhinav
Copy link
Contributor

abhinav commented Oct 23, 2024

I've definitely felt a need for this when writing test utilities that modify global state (e.g. changing a global logger temporarily to capture output, or stubbling a global value). Any such utility will misbehave if called from parallel tests, and there's no way for the utility to prevent this misuse.
Right now, such misuse is caught by go test -race if the global state being manipulated happens to hit that. But if the API for changing the global state is safe for concurrent use (e.g. Zap's ReplaceGlobals), then you're out of luck and the tests that are misusing the non-parallel test helper will just have a hard-to-debug intermittent test failure.

As an stdlib data point in favor of exposing this information to external libraries:
testing.T.Setenv also does not want to be called from parallel tests, and it's able to guard against it by inspecting the internal state of testing.T.

Naming-wise: I suggest IsParallel since Parallel is already taken.

@seankhliao
Copy link
Member

if you can modify a global value, then you can also hold a lock or some other token protecting the global value, which you can protect properly rather than doing as hoc checks of if a test is running in parallel

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Oct 24, 2024
@Jorropo
Copy link
Member

Jorropo commented Oct 24, 2024

As a workaround you can do:

func TestShouldNotBeParallel(t *testing.T) {
    t.Parallel()
    // ...
    changeGlobalState(t) // a mistake, given above t.Parallel()
    // ...
}

var noParallel bool

func changeGlobalState(t *testing.T) {
    noParallel = !noParallel // trips the race detector

    // ...
}

@abhinav
Copy link
Contributor

abhinav commented Oct 24, 2024

That's true, @seankhliao, except that's not always an option.
TB.Setenv and the incoming TB.Chdir are examples of this.
They both rely on being able to check whether the test is being run with t.Parallel to prevent misuse.
It's currently not possible for a helper that lives outside the standard library to implement similar functionality.

Edit:
That's a good workaround, @Jorropo.

@adonovan
Copy link
Member

adonovan commented Oct 24, 2024

The workaround is a good one, but it still relies on the race detector. I don't think the request to publish t.CheckParallel is unreasonable as it would give earlier notification when a (known) concurrency-unsafe test helper is accidentally used by a parallel test.

@seankhliao
Copy link
Member

I was thinking more along the lines of:

var globalResource sync.Mutex

func TestMyThing(t *testing.T) {
    // or Lock() if you want to just sequence the tests
    gotLock := globalResource.TryLock()
    if !gotLock {
        t.Fatal("...")
    }
    defer globalResource.Unlock()

    // ...
}

It's only wrong for tests that mutate the same set of resources to run in parallel, not for anything else to run in parallel.

@adonovan
Copy link
Member

TryLock only probabilistically reports the problem, with low probability because the critical section is short. By contrast, T.checkParallel reliably reports the problem.

The analogy with t.Chdir is instructive: Chdir is a short-lived operation, but it has long-lasting global effects; the risk is not that two tests call Chdir at the same time, but that any test calls it while another test is running.

@seankhliao
Copy link
Member

A successful TryLock ends with you holding the lock for the entire duration of the test, covering any change and restore operation you make on the global resource. Any other test using the same lock would fail if run in parallel.

@Jorropo
Copy link
Member

Jorropo commented Oct 24, 2024

This could happen:

  • Test A start
    • Test A calls parallel
  • Test B start
    • Test B calls parallel
    • Test B TryLock
    • Test B do it's thing
    • Test B release lock
  • Test A TryLock
    • ...

Which should have been caught but was not.

@adonovan
Copy link
Member

A successful TryLock ends with you holding the lock for the entire duration of the test

Ah, I see; has I misread the defer as a shorthand for a critical section that updates some resource, rather than the entire test.

Yes, assuming the effects can be undone, as with os.Chdir and restore, or replace os.Stdout and restore, then the TryLock trick lets you put the test in the critical section. Whereas an ordinary mutex would ensure the test runs correctly but potentially slowly, the TryLock approach says "fast or die".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

7 participants