Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time: NewTicker will not emit ticks at a frequency greater than 1/sec on qemu user mode ppc64le #36592

Closed
patrickmeiring opened this issue Jan 16, 2020 · 16 comments

Comments

@patrickmeiring
Copy link

@patrickmeiring patrickmeiring commented Jan 16, 2020

What version of Go are you using (go version)?

Found on go version 1.12.15 and 1.13.6 but not 1.11.13.

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

ppc64le (run via Qemu user mode emulation in a docker container on an AMD64 machine provided by travis-ci.org).
This issue does not occur on amd64, 386, arm or arm64 architectures.

go env Output (for host machine)
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/travis/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/travis/gopath"
GOPROXY=""
GORACE=""
GOROOT="/home/travis/.gimme/versions/go1.12.15.linux.amd64"
GOTMPDIR=""
GOTOOLDIR="/home/travis/.gimme/versions/go1.12.15.linux.amd64/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build725852664=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Ran the following test:

func TestTickerPerformance(t *testing.T) {
	testTicker(t, time.Microsecond*100)
	testTicker(t, time.Microsecond*200)
	testTicker(t, time.Microsecond*500)
	testTicker(t, time.Millisecond)
	testTicker(t, time.Millisecond*10)
	testTicker(t, time.Millisecond*100)
	testTicker(t, time.Millisecond*1000)
}

func testTicker(t *testing.T, d time.Duration) {
	count := 0

	ctx, cancel := context.WithTimeout(context.Background(), time.Second*4)
	defer cancel()

	ticker := time.NewTicker(d)
	defer ticker.Stop()

loop:
	for {
		select {
		case <-ticker.C:
			count++
		case <-ctx.Done():
			break loop
		}
	}

	t.Logf("%v ticker emitted %v ticks/sec", d, float64(count)/4.0)
}

Available here also as a playground link (code slightly modified to run from main() instead of as a test): https://play.golang.org/p/-zk7GR62pse (please note this produces the 'expected' output on playground rather than the broken output obtained from the ppc64le emulator).

What did you expect to see?

The playground output:

100µs ticker emitted 10000 ticks/sec
200µs ticker emitted 5000 ticks/sec
500µs ticker emitted 2000 ticks/sec
1ms ticker emitted 1000 ticks/sec
10ms ticker emitted 100 ticks/sec
100ms ticker emitted 10 ticks/sec
1s ticker emitted 1 ticks/sec

Or performance figures close to it (noting that this test isn't perfect and there are good reasons why the 100µs ticker may be observed as emitting somewhat fewer than 10000 ticks/sec).

What did you see instead?

On go 1.12.x on ppc64le (see this travis CI run, scroll down to "Running test suite under ppc64le", then to "TestTickerPerformance", line ~1077): https://travis-ci.org/godbus/dbus/jobs/637768665)

100µs ticker emitted 1 ticks/sec
200µs ticker emitted 1 ticks/sec
500µs ticker emitted 1 ticks/sec
1ms ticker emitted 1 ticks/sec
10ms ticker emitted 1 ticks/sec
100ms ticker emitted 1 ticks/sec
1s ticker emitted 0.75 ticks/sec

This go 1.13.x run has the same result: https://travis-ci.org/godbus/dbus/jobs/637768666

The go 1.11.x run does not have this issue:

100µs ticker emitted 9702.75 ticks/sec
200µs ticker emitted 4995.25 ticks/sec
500µs ticker emitted 1999 ticks/sec
1ms ticker emitted 1000 ticks/sec
10ms ticker emitted 100 ticks/sec
100ms ticker emitted 10 ticks/sec
1s ticker emitted 0.75 ticks/sec

This suggests to me that this may be a regression in golang and not due to a flawed emulator.

Comments

I wanted to independently verify this issue on an actual PowerPC device (rather only test on an emulator), but this is difficult for me to come across, so I have not done this.

@patrickmeiring
Copy link
Author

@patrickmeiring patrickmeiring commented Jan 16, 2020

Further investigation reveals a related problem: all time.Sleep(duration) calls with a duration of less than 1 second seem to be having the same effect as time.Sleep(time.Second).

@ceseo
Copy link
Contributor

@ceseo ceseo commented Jan 16, 2020

I don't see this running on a POWER9 (go1.13.5):

2020/01/16 09:17:46 100µs ticker emitted 9999.75 ticks/sec
2020/01/16 09:17:50 200µs ticker emitted 5000 ticks/sec
2020/01/16 09:17:54 500µs ticker emitted 2000 ticks/sec
2020/01/16 09:17:58 1ms ticker emitted 1000 ticks/sec
2020/01/16 09:18:02 10ms ticker emitted 99.75 ticks/sec
2020/01/16 09:18:06 100ms ticker emitted 9.75 ticks/sec
2020/01/16 09:18:10 1s ticker emitted 0.75 ticks/sec

Looks like an emulator problem. By the go versions you posted, it seems that the problem there started when I changed the time functions to use the VDSO rather than the syscall (commit dbd8af7).

Btw, if you need access to POWER8/9 systems, please go to the OSU Open Source Lab website. You can request a VM there for development. We host our builders at OSU.

@ceseo
Copy link
Contributor

@ceseo ceseo commented Jan 16, 2020

@cagedmantis cagedmantis changed the title time.NewTicker() will not emit ticks at a frequency greater than 1/sec on ppc64le time: time.NewTicker() will not emit ticks at a frequency greater than 1/sec on ppc64le Jan 16, 2020
@bradfitz bradfitz changed the title time: time.NewTicker() will not emit ticks at a frequency greater than 1/sec on ppc64le time: NewTicker will not emit ticks at a frequency greater than 1/sec on qemu user mode ppc64le Jan 16, 2020
@bradfitz bradfitz added this to the Backlog milestone Jan 16, 2020
@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Jan 16, 2020

It's much easier for users to set some Travis flags than request & configure an OSU OSL VM, so a workaround here might be nice.

First I'd file a qemu bug (please do so and link it here), but if that proves too slow or hard to fix, perhaps we could detect that we're running under qemu-user and switch to using a syscall rather than the VDSO?

@laboger
Copy link
Contributor

@laboger laboger commented Jan 16, 2020

This issue described a QEMU bug for ppc64le which could be related. Check the QEMU versions to see if you have one that includes the QEMU fix.

#34139

@patrickmeiring
Copy link
Author

@patrickmeiring patrickmeiring commented Jan 16, 2020

The failing Travis CI build is pulling in the latest version of "multiarch/qemu-user-static", which is 4.2.0-2 as of writing. This packages version 4.2.0 of QEMU (the latest).

The fix in the other thread appears to be in the v4.2.0 release:
https://git.qemu.org/?p=qemu.git&a=search&h=b0ca999a43a22b38158a222233d3f5881648bb4f&st=commit&s=A+number+of+TCG+fixes+in+FPU+emulation

@ceseo
Copy link
Contributor

@ceseo ceseo commented Jan 17, 2020

I think I found the bug. QEMU doesn't implement VDSO, so it falls back to the syscall. And there was a wrong offset when getting the syscall return. Could you please test this patch?

ceseo@cce371c

If that works, I'll submit a CL.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Jan 17, 2020

@ceseo, good find! That looks correct to me at least. Send the CL? It could make Go 1.14. (And we can backport it.)

@bradfitz bradfitz modified the milestones: Backlog, Go1.14 Jan 17, 2020
@gopherbot
Copy link

@gopherbot gopherbot commented Jan 18, 2020

Change https://golang.org/cl/215397 mentions this issue: runtime: fix wrong offset when calling syscall in nanotime

@tklauser
Copy link
Member

@tklauser tklauser commented Jan 20, 2020

Verified with the provided TestTickerPerformance that https://golang.org/cl/215397 fixes the issue on qemu-ppc64le version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.21).

@laboger
Copy link
Contributor

@laboger laboger commented Apr 3, 2020

Doesn't this CL need to be backported?

@tklauser
Copy link
Member

@tklauser tklauser commented Apr 3, 2020

@gopherbot please consider this for backport to 1.13, it's a regression.

/cc @ianlancetaylor

@gopherbot
Copy link

@gopherbot gopherbot commented Apr 3, 2020

Backport issue(s) opened: #38236 (for 1.13).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@gopherbot
Copy link

@gopherbot gopherbot commented Apr 3, 2020

Change https://golang.org/cl/227179 mentions this issue: [release-branch.go1.13] runtime: fix wrong offset when calling ppc64x nanotime syscall

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Apr 3, 2020

@laboger It's already fixed in 1.14, send CL 227179 for 1.13. Thanks.

gopherbot pushed a commit that referenced this issue Apr 3, 2020
… nanotime syscall

There is a wrong offset when getting the results of a clock_gettime
syscall. Although the syscall will never be called in native ppc64x,
QEMU doesn't implement VDSO, so it will return wrong values.

For #36592
Fixes #38236

Change-Id: Icf838075228dcdd62cf2c1279aa983e5993d66ee
Reviewed-on: https://go-review.googlesource.com/c/go/+/215397
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
(cherry picked from commit 71239b4)
Reviewed-on: https://go-review.googlesource.com/c/go/+/227179
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

7 participants
You can’t perform that action at this time.