removed can_fail=True attribute and updated documentation to reflect recommended usage pattern for prefetchByteArray and prefetchAddr
commit 691fcd997ab5815ccc1d448f8e30e0ac07a9faee 1 parent 23fb7f3
@cartazio authored
Showing with 16 additions and 23 deletions.
  1. +16 −23 compiler/prelude/primops.txt.pp
39 compiler/prelude/primops.txt.pp
@@ -2608,13 +2608,6 @@
convention follows the naming convention of the prefetch intrinsic found
in the GCC and Clang C compilers.
- The prefetch primops are all marked with the can_fail=True attribute, but
- they will never fail. The motivation for enabling the can_fail attribute is
- so that prefetches are not hoisted/let floated out. This is because prefetch
- is a tool for optimizing usage of system memory bandwidth, and preventing let
- hoisting makes *WHEN* the prefetch happens a bit more predictable.
-
-
On the LLVM backend, prefetch*N# uses the LLVM prefetch intrinsic
with locality level N. The code generated by LLVM is target architecture
dependent, but should agree with the GHC NCG on x86 systems.
@@ -2636,13 +2629,25 @@
The "Intel 64 and IA-32 Architectures Optimization Reference Manual" is
especially a helpful read, even if your software is meant for other CPU
- architectures or vendor hardware.
+ architectures or vendor hardware. The manual can be found at
+ http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html .
- http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html
+ The {\tt prefetchMutableByteArray} family of operations has the order of operations
+ determined by passing around the {\tt State#} token.
+ For the {\tt prefetchByteArray}
+ and {\tt prefetchAddr} families of operations, consider the following example:
+
+ {\tt let a1 = prefetchByteArray2# a n in ...a1... }
-
- }
+ In the above fragment, {\tt a} is the input variable for the prefetch
+ and {\tt a1 == a} will be true. To ensure that the prefetch is not treated as dead code,
+ the body of the let should only use {\tt a1} and NOT {\tt a}. The same principle
+ applies for uses of prefetch in a loop.
+
+ }
+
+
------------------------------------------------------------------------
@@ -2651,57 +2656,45 @@
---
primop PrefetchByteArrayOp3 "prefetchByteArray3#" GenPrimOp
ByteArray# -> Int# -> ByteArray#
- with can_fail = True
primop PrefetchMutableByteArrayOp3 "prefetchMutableByteArray3#" GenPrimOp
MutableByteArray# s -> Int# -> State# s -> State# s
- with can_fail = True
primop PrefetchAddrOp3 "prefetchAddr3#" GenPrimOp
Addr# -> Int# -> Addr#
- with can_fail = True
----
primop PrefetchByteArrayOp2 "prefetchByteArray2#" GenPrimOp
ByteArray# -> Int# -> ByteArray#
- with can_fail = True
primop PrefetchMutableByteArrayOp2 "prefetchMutableByteArray2#" GenPrimOp
MutableByteArray# s -> Int# -> State# s -> State# s
- with can_fail = True
primop PrefetchAddrOp2 "prefetchAddr2#" GenPrimOp
Addr# -> Int# -> Addr#
- with can_fail = True
----
primop PrefetchByteArrayOp1 "prefetchByteArray1#" GenPrimOp
ByteArray# -> Int# -> ByteArray#
- with can_fail = True
primop PrefetchMutableByteArrayOp1 "prefetchMutableByteArray1#" GenPrimOp
MutableByteArray# s -> Int# -> State# s -> State# s
- with can_fail = True
primop PrefetchAddrOp1 "prefetchAddr1#" GenPrimOp
Addr# -> Int# -> Addr#
- with can_fail = True
----
primop PrefetchByteArrayOp0 "prefetchByteArray0#" GenPrimOp
ByteArray# -> Int# -> ByteArray#
- with can_fail = True
primop PrefetchMutableByteArrayOp0 "prefetchMutableByteArray0#" GenPrimOp
MutableByteArray# s -> Int# -> State# s -> State# s
- with can_fail = True
primop PrefetchAddrOp0 "prefetchAddr0#" GenPrimOp
Addr# -> Int# -> Addr#
- with can_fail = True
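
The updated documentation notes that the prefetchMutableByteArray family is ordered by passing around the State# token. The following is a minimal sketch of that pattern, not code from the commit. The helper names (prefetchAhead, fillLoop, sumLoop, sumWithPrefetch), the prefetch distance of 8 elements, and the 8-byte Int size are all assumptions made for illustration. It relies only on the type `MutableByteArray# s -> Int# -> State# s -> State# s` shown in the diff and treats the Int# argument as a byte offset.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import GHC.Exts
  ( Int (I#), Int#, MutableByteArray#, State#
  , newByteArray#, readIntArray#, writeIntArray#
  , prefetchMutableByteArray2#
  , isTrue#, (+#), (*#), (<#), (>=#) )
import GHC.IO (IO (..))

-- Hypothetical helper: prefetch the element 8 slots ahead of index i, if it
-- is still in bounds. The prefetch offset is given in bytes, and this sketch
-- assumes a 64-bit platform where an Int occupies 8 bytes.
prefetchAhead :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
prefetchAhead mba n i s
  | isTrue# ((i +# 8#) <# n) = prefetchMutableByteArray2# mba ((i +# 8#) *# 8#) s
  | otherwise                = s

-- Write the values 0 .. n-1 into the array.
fillLoop :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
fillLoop mba n i s
  | isTrue# (i >=# n) = s
  | otherwise         = fillLoop mba n (i +# 1#) (writeIntArray# mba i i s)

-- Sum the array. Each read consumes the State# token returned by the
-- prefetch, so the prefetch is sequenced before the read and is not dropped.
sumLoop :: MutableByteArray# s -> Int# -> Int# -> Int# -> State# s
        -> (# State# s, Int# #)
sumLoop mba n i acc s
  | isTrue# (i >=# n) = (# s, acc #)
  | otherwise =
      case readIntArray# mba i (prefetchAhead mba n i s) of
        (# s', x #) -> sumLoop mba n (i +# 1#) (acc +# x) s'

sumWithPrefetch :: Int -> IO Int
sumWithPrefetch (I# n) = IO (\s0 ->
  case newByteArray# (n *# 8#) s0 of
    (# s1, mba #) ->
      case sumLoop mba n 0# 0# (fillLoop mba n 0# s1) of
        (# s2, total #) -> (# s2, I# total #))

main :: IO ()
main = sumWithPrefetch 1000 >>= print  -- prints 499500
```

Because the token threads through every step, no `let a1 = ... in ...a1...` trick is needed here. That trick applies only to the pure prefetchByteArray and prefetchAddr signatures shown in this diff. Whether a prefetch distance of 8 elements helps at all depends on the hardware and access pattern, so it should be measured rather than assumed.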