proposal: expose the internal/runtime/sys.Prefetch intrinsic #68769
Comments
This signature has fewer pitfalls: func Prefetch[E any, P ~*E | ~unsafe.Pointer](P) (a rough sketch follows below). Does this have any memory model interactions?
Lastly, I'm not sure how easily portable to other architectures that would be.
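A rough sketch of what that generic shape could look like. This is an assumption about the suggested API, not an existing function; the explicit type arguments in the example calls are only there because E cannot be inferred from the union constraint.

```go
package prefetchsketch

import "unsafe"

// Prefetch is a sketch of the suggested generic signature; it is not a
// real runtime API. The union constraint accepts defined pointer types
// as well as unsafe.Pointer.
func Prefetch[E any, P ~*E | ~unsafe.Pointer](p P) {
	// A real version would be a compiler intrinsic. As a plain function
	// it is a no-op, which also matches the intended behaviour on
	// architectures without a prefetch instruction.
	_ = p
}

// Example calls with a typed pointer and with unsafe.Pointer.
func example(bucket *uint64, raw unsafe.Pointer) {
	Prefetch[uint64](bucket)
	Prefetch[byte, unsafe.Pointer](raw)
}
```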
There are no memory model interactions here. A prefetch by definition does not change the meaning of a program; it only affects the execution time. On an architecture that doesn't have a prefetch instruction, the proposed function would be a no-op. It's worth noting that amd64 has at least six different kinds of prefetch instructions. The GCC compiler exposes the choice between them through the rw and locality arguments of its __builtin_prefetch builtin.
I thought it would be able to generate page faults, which could be turned into callbacks with signals, but after going over the docs, at least on amd64 it does not.
I'm more worried about architectures that don't implement it the same way.
This is easily solved by only using it where it actually helps. General-purpose data structures are unlikely to need this. I only found one place in the Go project that actually uses the intrinsic.
In my use case, the bucket pointers I'm prefetching are, under certain rare conditions, nil.
The current situation is that there's only a small set of uses inside the runtime itself.
Bordering on off-topic, but apparently some mutex implementations use PREFETCHW when spinning. The AMD64 manual does say that PREFETCHW loads the cache line with the intent to modify it, which one could read as stalling until the other CPU core is done with the cache line, possibly yielding the CPU time to a virtual core when hyper-threading is enabled. It also sounds useful in my use case, considering that all of my buckets happen to carry a lock.
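A toy illustration of that pattern, as a sketch only: prefetchW stands in for a hypothetical write-prefetch intrinsic (something that would emit PREFETCHW); no such function exists in the runtime today.

```go
package prefetchspin

import (
	"runtime"
	"sync/atomic"
	"unsafe"
)

// prefetchW is a placeholder for a hypothetical write-prefetch intrinsic.
// As a plain function it does nothing.
func prefetchW(addr unsafe.Pointer) { _ = addr }

// spinLock is a toy spin lock used only to show where the write-prefetch
// hint would go.
type spinLock struct{ state atomic.Uint32 }

func (l *spinLock) Lock() {
	for {
		if l.state.CompareAndSwap(0, 1) {
			return
		}
		// Hint that we intend to write the cache line holding the lock
		// word before the next CAS attempt, then yield.
		prefetchW(unsafe.Pointer(&l.state))
		runtime.Gosched()
	}
}

func (l *spinLock) Unlock() { l.state.Store(0) }
```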
Prefetching would also be nice for radix sorting. There should be an intrinsic that takes a slice and an index into it and prefetches that element while omitting the bounds check. Prefetching unmapped storage or memory beyond the end of an object is harmless, and it's often very annoying to avoid failing the bounds check in code that iterates through arrays, like this hypothetical example:
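(A minimal sketch of such a loop, assuming a hypothetical runtime.Prefetch with the same uintptr-based shape as the internal intrinsic; the helper below only stands in for it.)

```go
package prefetchdemo

import "unsafe"

// Prefetch stands in for the proposed prefetch intrinsic (hypothetical
// here); the internal intrinsic takes the address as a uintptr.
func Prefetch(addr uintptr) { _ = addr }

// sumWithLookahead processes keys[i] while prefetching keys[i+1].
func sumWithLookahead(keys []uint64) uint64 {
	var sum uint64
	for i := range keys {
		// On the final iteration i+1 == len(keys), so this index
		// expression fails the bounds check and panics, even though
		// prefetching the out-of-range address itself would be harmless.
		Prefetch(uintptr(unsafe.Pointer(&keys[i+1])))
		sum += keys[i]
	}
	return sum
}
```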
This code would crash on the final iteration, as the index one past the end fails the bounds check. This could of course also be achieved with an explicit bounds check before the prefetch.
Here's a mini-survey of prefetch instructions from the architectures Go supports right now:

386 / amd64: These have prefetch instructions for reading: PREFETCHT0, PREFETCHT1, PREFETCHT2 and PREFETCHNTA. There is also a prefetch for writing, PREFETCHW.

arm: ARM has PLD for data and PLI for instructions, plus PLDW for prefetching with intent to write.

arm64: ARM64 has PRFM. The semantics seem to be the same as on ARM.

MIPS: MIPS has a PREF instruction that takes a hint operand.

PowerPC: The PowerPC term for prefetching is “touch”; the data touches are dcbt and dcbtst (touch for store). All touch instructions support prefetching into L1 and L2 cache. There are touch instructions for instruction memory (icbt) as well.

S390: S390 supports prefetching through the PREFETCH DATA (PFD) instruction.

RISC-V: Prefetch instructions for loading, storing, and instructions are part of the Zicbop extension.
Proposal Details
I'm working on a concurrent hash map, and from experience using such maps in highly concurrent, high-throughput contexts I knew to add batch Put and Get methods to it. Such an API design exposes a very obvious opportunity for prefetching the data.
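For concreteness, here is a rough sketch of the batch-lookup pattern being described. All names (Map, bucket, prefetch, GetBatch) are made up for illustration, with prefetch standing in for the proposed intrinsic; the real map is concurrent and far more involved.

```go
package hashmapsketch

import "unsafe"

// prefetch stands in for the proposed prefetch intrinsic; as a plain
// function it is a no-op.
func prefetch(addr unsafe.Pointer) { _ = addr }

type bucket struct {
	key, value uint64
}

// Map is a deliberately tiny stand-in for a hash map with batch lookups.
type Map struct {
	buckets []bucket
}

func (m *Map) bucketFor(key uint64) *bucket {
	return &m.buckets[key%uint64(len(m.buckets))]
}

// GetBatch looks up many keys at once. The first pass computes bucket
// addresses and prefetches them so the cache misses overlap; by the time
// the second pass reads a bucket, it is hopefully already in cache.
func (m *Map) GetBatch(keys, out []uint64) {
	ptrs := make([]*bucket, len(keys))
	for i, k := range keys {
		ptrs[i] = m.bucketFor(k)
		prefetch(unsafe.Pointer(ptrs[i]))
	}
	for i, k := range keys {
		if b := ptrs[i]; b.key == k {
			out[i] = b.value
		}
	}
}
```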
To experiment with the impact of internal/runtime/sys.Prefetch on the batch Get call, I modified runtime/panic.go to expose the intrinsic. Using go tool objdump, I then confirmed that a PREFETCHT0 instruction is emitted into the binary as expected. Looking up 100k distinct entries from a hash map with 5000k entries became significantly faster.
This looks like a worthwhile optimization to me.
I personally like how this runtime.Prefetch Just Works™ for me, so I propose adding just that. I also propose not mentioning its addition in the change log, so people don't use it willy-nilly. Just imagine the damage that all the blog spam would cause in the overall ecosystem, possibly leaking into other language communities too.