runtime: cputicks has low precision on arm #9725
ARMv6 and ARMv7 have a cycle counter, but it is privileged: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0211i/Bihcgfcf.html What about using the perf_event_open syscall on Linux? |
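For readers unfamiliar with the suggestion, here is a rough sketch of what reading the cycle counter through perf_event_open could look like from Go on Linux. It is not code from the runtime: the struct `perfEventAttr` and the helpers `openCycleCounter`, `readCycles`, and `closeCounter` are hypothetical names, the struct covers only the first 64 bytes of the kernel's `perf_event_attr`, and a little-endian host is assumed. The call can also fail outright when the kernel or a sandbox restricts perf events.

```go
package main

import (
	"encoding/binary"
	"syscall"
	"unsafe"
)

// perfEventAttr mirrors the first 64 bytes (PERF_ATTR_SIZE_VER0) of the
// kernel's struct perf_event_attr. Later fields are not needed here.
type perfEventAttr struct {
	Type         uint32 // event class
	Size         uint32 // size of this struct, used for ABI versioning
	Config       uint64 // event within the class
	SamplePeriod uint64
	SampleType   uint64
	ReadFormat   uint64
	Flags        uint64 // bit field: disabled, inherit, ..., exclude_kernel, exclude_hv
	WakeupEvents uint32
	BpType       uint32
	BpAddr       uint64
}

const (
	perfTypeHardware     = 0      // PERF_TYPE_HARDWARE
	perfCountHWCPUCycles = 0      // PERF_COUNT_HW_CPU_CYCLES
	flagExcludeKernel    = 1 << 5 // exclude_kernel bit
	flagExcludeHV        = 1 << 6 // exclude_hv bit
)

// openCycleCounter (hypothetical helper) opens a counter of user-space CPU
// cycles for the calling thread and returns its file descriptor.
func openCycleCounter() (int, error) {
	attr := perfEventAttr{
		Type:   perfTypeHardware,
		Config: perfCountHWCPUCycles,
		Flags:  flagExcludeKernel | flagExcludeHV,
	}
	attr.Size = uint32(unsafe.Sizeof(attr))
	// Arguments: attr, pid=0 (calling thread), cpu=-1 (any CPU),
	// group_fd=-1 (no group), flags=0.
	fd, _, errno := syscall.Syscall6(syscall.SYS_PERF_EVENT_OPEN,
		uintptr(unsafe.Pointer(&attr)), 0, ^uintptr(0), ^uintptr(0), 0, 0)
	if errno != 0 {
		return -1, errno
	}
	return int(fd), nil
}

// readCycles returns the current counter value. Every sample costs one
// read(2) system call.
func readCycles(fd int) (uint64, error) {
	var buf [8]byte
	n, err := syscall.Read(fd, buf[:])
	if err != nil {
		return 0, err
	}
	if n != len(buf) {
		return 0, syscall.EIO
	}
	// Assumption: little-endian host.
	return binary.LittleEndian.Uint64(buf[:]), nil
}

// closeCounter releases the perf event file descriptor.
func closeCounter(fd int) error {
	return syscall.Close(fd)
}
```

Note that with pid=0 the event is attached to the thread that opened it, while goroutines move between OS threads, so a real implementation would need one counter per thread or some other scheme. The sketch only illustrates the per-sample syscall cost under discussion.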
As far as I can see, it requires a read syscall to access the CPU cycle counter. This may be too expensive for the tracer. However, we don't have a better solution for now.
var prevTicks int64
func cputicks() int64 {
This can also be slow and will obviously give locally skewed traces. |
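The code fragment in the comment above is cut off, so its body is unknown. One plausible reading, offered purely as an assumption and not as a reconstruction of the author's code, is a wrapper that forces every returned tick to be strictly greater than the previous one. In the sketch below, `uniqueTicks` and `allDistinct` are hypothetical names, and `time.Since` stands in for the runtime's `nanotime()`.

```go
package main

import (
	"sync"
	"sync/atomic"
	"time"
)

var (
	start     = time.Now()
	prevTicks int64 // last value handed out, accessed atomically
)

// uniqueTicks returns a strictly increasing value derived from a coarse
// clock. When the clock has not advanced since the last call, the value is
// nudged forward by one, which is where the local skew comes from.
func uniqueTicks() int64 {
	for {
		prev := atomic.LoadInt64(&prevTicks)
		now := int64(time.Since(start)) // stand-in for nanotime()
		if now <= prev {
			now = prev + 1
		}
		// Retry if another goroutine published a value in the meantime.
		if atomic.CompareAndSwapInt64(&prevTicks, prev, now) {
			return now
		}
	}
}

// allDistinct calls uniqueTicks n times from each of g goroutines and
// reports whether every returned value was different.
func allDistinct(g, n int) bool {
	results := make([][]int64, g)
	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			out := make([]int64, n)
			for j := range out {
				out[j] = uniqueTicks()
			}
			results[i] = out
		}(i)
	}
	wg.Wait()
	seen := make(map[int64]bool, g*n)
	for _, out := range results {
		for _, v := range out {
			if seen[v] {
				return false
			}
			seen[v] = true
		}
	}
	return true
}
```

The shared variable makes every call contend on one cache line, and timestamps taken in quick succession are spaced by an artificial one-unit step, which matches the two drawbacks the comment names: it can be slow, and traces come out locally skewed.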
Right now nanotime is a syscall on linux/arm, so a read from perf_event_open shouldn't be any more expensive. Your way might be better if we switch to the vDSO for time on linux/arm, but I cannot work out when it was introduced into Linux for ARM. |
I don't think linux/arm has a VDSO function for time. There are some kernel patches floating around but nothing in Linus' tree that I can see. (I've had the reasons this is hard on ARM explained to me a few times but never really understood it...) |
Dmitry, you mentioned that your trace infrastructure is based on chrome. If
|
Another weird thing is the test reliably fails on my dual core Pandaboard,
|
@davecheney Here is how chromium does that: |
@mwhudson clock_gettime on arm/arm64 is already documented in the man pages: |
Did anyone follow up on this? |
Even if the clock_gettime exported in the vDSO is faster,
we can't always use it, as it has only been exported for ARM
since v4.1 of the kernel.
|
Wow, that's fresh. Most SoCs are lucky if they can do better than some moldy version of 3.0 that seems to ship with all iMX6 platforms. |
Wait, v4.1? My take is v3.10 |
It's the 32-bit ARM kernel that only exposes clock_gettime starting
from 4.1, and your link clearly states it's about arm64.
|
Yes, my bad, I read arm as ARM in general. 32-bit ARM is 4.1; arm64 is much earlier. |
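The version threshold discussed above could be turned into a coarse capability check along these lines. This is a sketch only: `arm32HasVDSOClock` and `leadingInt` are hypothetical helpers, the 4.1 cutoff is taken from this thread, and the input is assumed to be a release string in the form printed by `uname -r`. Vendor kernels may backport or disable the feature, so a real implementation would more likely probe the vDSO for the symbol itself.

```go
package main

import (
	"strconv"
	"strings"
)

// leadingInt parses the decimal digits at the start of s, so that a
// component such as "1-rc3" yields 1.
func leadingInt(s string) (int, bool) {
	end := 0
	for end < len(s) && s[end] >= '0' && s[end] <= '9' {
		end++
	}
	if end == 0 {
		return 0, false
	}
	n, err := strconv.Atoi(s[:end])
	return n, err == nil
}

// arm32HasVDSOClock reports whether a 32-bit ARM kernel with the given
// release string is at least v4.1, the version named in the thread as the
// first to export clock_gettime through the vDSO.
func arm32HasVDSOClock(release string) bool {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return false
	}
	major, okMajor := leadingInt(parts[0])
	minor, okMinor := leadingInt(parts[1])
	if !okMajor || !okMinor {
		return false
	}
	return major > 4 || (major == 4 && minor >= 1)
}
```

With this check, a 3.0-era board kernel of the kind mentioned earlier would fall back to the syscall path, while 4.1 and later could try the vDSO.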
It sounds like this is not easy to fix (because of hardware privilege issues). Also the new trace sequence numbers may mean we don't need to fix it, since these things don't all compare equal anymore. I'm going to mark this as closed unless there's some reason to keep it open. |
runtime/os_linux_arm.go:
func cputicks() int64 {
	// Currently cputicks() is used in blocking profiler and to seed fastrand1().
	// nanotime() is a poor approximation of CPU ticks that is enough for the profiler.
	// randomNumber provides better seeding of fastrand1.
	return nanotime() + int64(randomNumber)
}
This causes runtime/pprof trace test failures:
--- FAIL: TestTrace (0.00s)
trace_test.go:56: failed to parse trace: g 3 is not runnable before traceEvGoWaiting (offset 48, time 30528)
--- FAIL: TestTraceStress (1.32s)
trace_test.go:151: failed to parse trace: g 3 is not runnable before traceEvGoWaiting (offset 75185, time 30528)
--- FAIL: TestTraceSymbolize (0.04s)
trace_test.go:199: failed to parse trace: g 3 is not runnable before traceEvGoWaiting (offset 48, time 30481)
http://build.golang.org/log/413c7741597ae53221e88070d1b5a70c3a365b03
http://build.golang.org/log/46208c3a96add5a5e5b15ed36ca1d5aa09f7173b
The tracer expects cputicks to be globally monotonic and to have cycle precision (subsequent events have different timestamps). The former is probably true for this implementation, but the latter is not.
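The two properties can be stated as a small check over a sequence of timestamps. The helper below is a hypothetical illustration, not part of the trace parser, and the inputs in the usage note are made up rather than taken from a real trace.

```go
package main

// tickProperties reports whether ticks never decrease (monotonic) and
// whether consecutive values always differ (distinct).
func tickProperties(ticks []int64) (monotonic, distinct bool) {
	monotonic, distinct = true, true
	for i := 1; i < len(ticks); i++ {
		if ticks[i] < ticks[i-1] {
			monotonic = false
		}
		if ticks[i] == ticks[i-1] {
			distinct = false
		}
	}
	return monotonic, distinct
}
```

For an illustrative input such as `[]int64{10, 20, 20, 30}`, the result is monotonic but not distinct, which is the situation described here: a clock with coarse resolution hands the same timestamp to two consecutive events, and the parser can no longer tell which came first.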