sync: TestWaitGroupMisuse2 takes 45-90 seconds on netbsd, AIX #22944

Open
bradfitz opened this issue Nov 30, 2017 · 10 comments

@bradfitz
Member

commented Nov 30, 2017

TestWaitGroupMisuse2 on Linux with 8 cores takes 0.22s.

On NetBSD it does pass but takes 45-90 seconds. Why?

/cc @bsiegert

@bradfitz

Member Author

commented Nov 30, 2017

/cc @dvyukov for any theories.

@dvyukov

Member

commented Dec 1, 2017

This test requires physical parallelism, so my first bet would be on a problem in the NetBSD scheduler.
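
As a rough illustration of what "physical parallelism" means for a Go program, the sketch below checks whether the machine and the runtime can actually run several goroutines at the same time. The helper name hasParallelism and the threshold passed to it are assumptions made for this example; they are not taken from the sync test itself.

```go
package main

import (
	"fmt"
	"runtime"
)

// hasParallelism reports whether at least min goroutines can run
// simultaneously: the machine must expose that many logical CPUs and
// the Go scheduler must be allowed to use them.
// (Hypothetical helper, for illustration only.)
func hasParallelism(min int) bool {
	return runtime.NumCPU() >= min && runtime.GOMAXPROCS(0) >= min
}

func main() {
	fmt.Println("NumCPU:    ", runtime.NumCPU())
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
	fmt.Println("can run 2 goroutines in parallel:", hasParallelism(2))
}
```

If this reports enough CPUs and the test is still slow, the delay is more likely in how the OS schedules the runtime's threads than in a lack of cores.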

@dvyukov

Member

commented Dec 1, 2017

Perhaps an execution trace will shed some light.

@coypoop

Contributor

commented Dec 6, 2017

How can I run this specific test?

@bsiegert

Contributor

commented Dec 6, 2017

go test sync should work.

@coypoop

Contributor

commented Dec 6, 2017

NetBSD's nanosleep always schedules another process, even for really short sleeps. Making it spin on really short sleeps fixes this.

@jdolecek

commented Dec 9, 2017

NetBSD nanosleep() always sleeps for at least one scheduler time slice when the requested time is below the scheduler resolution. With the default HZ value of 100 on i386 and amd64, that is always at least 10ms, even when the requested time is smaller. I think other systems might work in a similar way.

@bradfitz

Member Author

commented Dec 9, 2017

@coypoop, or:

$ go test -v -run=TestWaitGroupMisuse2 sync

Or:

$ go test -v -run=TestWaitGroupMisuse2 -count=20 sync

@bradfitz bradfitz changed the title sync: TestWaitGroupMisuse2 takes 45-90 seconds on netbsd w/ 8 cores sync: TestWaitGroupMisuse2 takes 45-90 seconds on netbsd, AIX May 29, 2019

@bradfitz bradfitz modified the milestones: Unplanned, Go1.13 May 29, 2019

@bradfitz

Member Author

commented May 29, 2019

It takes 45 seconds on AIX too, and never panics as it expects to:

https://build.golang.org/log/4521fa0230a42f6f3b70e8f108c3ab63d962d567

/cc @Helflym

@Helflym

Contributor

commented Jun 5, 2019

I was already aware of this failure, but I didn't find anything relevant.
As the test says, "The detection is opportunistic", and in some cases I don't get a panic until iteration 500000. So a few runs might not trigger it at all, but it's random.
Note that this also explains the slowness of the test. On a local Linux machine, almost all panics occur within the first 1000 iterations. On the AIX builder it's far more random: it can happen at the 10th iteration as easily as at the 100000th.

Edit: as with NetBSD, it might be related to the AIX scheduler. The AIX documentation says:

The suspension time may be longer than requested due to the scheduling of other activity by the system.
