See also #4426
A value of type 'Proxy# a' can only be created through the new primitive witness 'proxy# :: Proxy# a'. A Proxy# has no runtime representation and is thus free. This lets us clean up the internals of TypeRep, and it is also useful for Adam's future work concerning records (by using a zero-width primitive type).

Authored-by: Edward Kmett <email@example.com>
Authored-by: Austin Seipp <firstname.lastname@example.org>
Signed-off-by: Austin Seipp <email@example.com>
- `primitive` is updated to upstream's HEAD, which is essentially `primitive-0.5.1.0` plus a core-lint-error workaround for #8355 and some minor cleanups.
- `vector` is updated to upstream's `vector-0.10.9.1` release.

Note: The upstream repo location has changed to GitHub, hence the update in the `packages` file.

Signed-off-by: Herbert Valerio Riedel <firstname.lastname@example.org>
*p is both read and written to by the cmpxchg instruction, and therefore should be given the '+' constraint modifier. (In GCC's extended ASM language, '+' means that the operand is both read and written to, whereas '=' means that it is only written to.) Otherwise, the compiler is allowed to rewrite something like

    SpinLock lock;
    initSpinLock(&lock);       /* sets lock = 1 */
    ACQUIRE_SPIN_LOCK(&lock);

into

    SpinLock lock;
    ACQUIRE_SPIN_LOCK(&lock);

because according to the asm statement, the previous value of 'lock' is not important.
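As an illustration of the constraint described above, here is a minimal sketch of a compare-and-swap built on cmpxchg with '+' on the memory operand. It is not the RTS code: the names `DemoSpinLock`, `demo_cas`, `demo_init_spin_lock`, `demo_try_acquire` and `demo_release_spin_lock` are hypothetical, the convention "1 = free, 0 = held" is assumed from the "sets lock = 1" comment, and on non-x86 targets the sketch falls back to GCC's `__sync_val_compare_and_swap` builtin.

```c
#include <stdint.h>

/* Hypothetical lock type for this sketch: 1 = free, 0 = held. */
typedef volatile uintptr_t DemoSpinLock;

/* If *p == expected, store desired into *p. Returns the old value of *p. */
static inline uintptr_t demo_cas(volatile uintptr_t *p,
                                 uintptr_t expected, uintptr_t desired)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__ (
        "lock; cmpxchg %2, %1"
        : "+a" (expected),  /* accumulator: compared on entry, old *p on exit */
          "+m" (*p)         /* '+': cmpxchg both reads and writes *p */
        : "r" (desired)
        : "cc", "memory");
    return expected;
#else
    /* Assumed fallback for other targets: GCC's atomic builtin. */
    return __sync_val_compare_and_swap(p, expected, desired);
#endif
}

static inline void demo_init_spin_lock(DemoSpinLock *l)
{
    *l = 1;  /* with "=m" above, the compiler could treat this store as dead */
}

/* Returns 1 if the lock was free and is now held by the caller, else 0. */
static inline int demo_try_acquire(DemoSpinLock *l)
{
    return demo_cas(l, 1, 0) == 1;
}

static inline void demo_release_spin_lock(DemoSpinLock *l)
{
    __sync_synchronize();  /* full barrier before publishing the release */
    *l = 1;
}
```

With "=m" in place of "+m", the asm statement would claim that the old contents of *p are never read, which is what would permit dropping the initialising store.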
…(#8291) See also #5435. Now we have to remember the StablePtrs that get created by the module initializer so that we can free them again in unloadObj().
The problem with unreachable code is that it might refer to undefined registers. This happens accidentally: a block can be orphaned by an optimisation, for example when the result of a comparison becomes known. The register allocator panics when it finds an undefined register, because one should never occur in generated code. So we also need to discard unreachable code to prevent this panic from being triggered by optimisations. The register allocator already does a strongly-connected component analysis, so it ought to be easy to make it discard unreachable code as part of that traversal. It turns out that we need a different variant of the SCC algorithm to do that (see Digraph); however, the new variant also generates slightly better code by putting the blocks within a loop in a better order for register allocation.
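The idea of dropping orphaned blocks can be sketched as a reachability pass from the entry block. This is only an illustration under stated assumptions: it uses a plain depth-first traversal rather than the SCC variant in Digraph, and the `Block` type, `MAX_SUCCS`, `MAX_BLOCKS`, `mark_reachable` and `discard_unreachable` are hypothetical names that do not come from the register allocator.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUCCS  2   /* hypothetical: at most two successors per block */
#define MAX_BLOCKS 64  /* hypothetical upper bound for this sketch */

typedef struct {
    int id;                /* block label */
    int succs[MAX_SUCCS];  /* indices of successor blocks, -1 = none */
} Block;

/* Mark every block reachable from `entry`, using an explicit stack.
 * Each block is pushed at most once, so `stack` needs room for n entries. */
static void mark_reachable(const Block *blocks, size_t n, size_t entry,
                           bool *seen, size_t *stack)
{
    size_t top = 0;
    for (size_t i = 0; i < n; i++)
        seen[i] = false;
    if (entry >= n)
        return;
    seen[entry] = true;
    stack[top++] = entry;
    while (top > 0) {
        const Block *b = &blocks[stack[--top]];
        for (int k = 0; k < MAX_SUCCS; k++) {
            int s = b->succs[k];
            if (s >= 0 && (size_t)s < n && !seen[s]) {
                seen[s] = true;
                stack[top++] = (size_t)s;
            }
        }
    }
}

/* Write the labels of the reachable blocks to kept_ids, in their original
 * order, and return how many were kept. Returns 0 if n exceeds MAX_BLOCKS. */
static size_t discard_unreachable(const Block *blocks, size_t n, size_t entry,
                                  int *kept_ids)
{
    bool seen[MAX_BLOCKS];
    size_t stack[MAX_BLOCKS];
    size_t kept = 0;
    if (n > MAX_BLOCKS)
        return 0;
    mark_reachable(blocks, n, entry, seen, stack);
    for (size_t i = 0; i < n; i++)
        if (seen[i])
            kept_ids[kept++] = blocks[i].id;
    return kept;
}
```

A block that nothing jumps to is never marked, so it is never handed to later passes, even if it sits inside what used to be a loop.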
This merge revises and extends the current SIMD support in GHC. Notable features:

* Support for AVX, AVX2, and AVX-512. Support for AVX-512 is untested.

* SIMD primops are currently LLVM-only and documented in compiler/prelude/primops.txt.pp.

* By default only 128-bit wide SIMD vectors are passed in registers, and then only on the X86_64 architecture. There is a "hidden" flag, -fllvm-pass-vectors-in-regs, that causes GHC to generate LLVM code that assumes all vectors are passed in registers by LLVM. This can be used with a suitably patched version of LLVM, and if we get LLVM 3.4 patched, we can consider turning it on by default for LLVM 3.4+. This would mean that we couldn't mix LLVM <3.4-compiled object files with LLVM >=3.4-compiled object files, but I don't see that as much of a problem.

* utils/genprimcode has been hacked up to allow us to write vector operations once and have them instantiated at multiple vector types. I'm not thrilled with this solution, but after discussing with Simon PJ, what I've implemented seems to be the minimal reasonable solution to the problem of exploding primop boilerplate. The changes are documented in compiler/prelude/primops.txt.pp.

* Error handling is sub-optimal. My patch checks to make sure that vector primops can be compiled efficiently based on the current set of dynamic flags. For example, if -mavx is not specified and the user tries to use a primop that adds together two 256-bit wide vectors of double-precision elements, the user will see an error message like:

    ghc-stage2: sorry! (unimplemented feature or known bug)
      (GHC version 7.7.20130916 for x86_64-unknown-linux):
        256-bit wide floating point SIMD vector instructions require at least -mavx.
…f dynamic flags. SIMD vector instructions currently require the LLVM back-end. The set of available instructions also depends on the set of architecture flags specified on the command line.
LLVM's GHC calling convention only allows 128-bit SIMD vectors to be passed in machine registers on X86-64. This may change in LLVM 3.4; the hidden flag -fllvm-pass-vectors-in-regs causes all SIMD vector widths to be passed in registers on both X86-64 and on X86-32.