Small performance improvement to STM: reduce the size of an atomically frame from 3 words to 2 words by combining the "waiting" boolean field with the info pointer, i.e. having two separate info tables/return addresses for an atomically frame, one for the normal case and one for the waiting case.
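The idea of folding the "waiting" flag into the info pointer can be sketched in plain C as follows. This is only a toy model under stated assumptions: the InfoTable type, the frame layouts, the "code" field and all names below are hypothetical and are not the RTS definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an info table; the real RTS type differs. */
typedef struct { const char *name; } InfoTable;

/* Two distinct info tables: which one a frame points at encodes the state. */
static const InfoTable atomically_frame_info         = { "atomically" };
static const InfoTable atomically_waiting_frame_info = { "atomically_waiting" };

/* Assumed old layout: three words, with an explicit flag word. */
typedef struct {
    const InfoTable *info;
    void            *code;     /* assumed: the transaction body */
    uintptr_t        waiting;  /* flag occupying a whole word */
} OldFrame;

/* Assumed new layout: two words, the flag lives in the choice of info pointer. */
typedef struct {
    const InfoTable *info;
    void            *code;
} NewFrame;

/* Read the state by comparing the info pointer. */
static bool frame_is_waiting(const NewFrame *f)
{
    assert(f != NULL);
    return f->info == &atomically_waiting_frame_info;
}

/* Change the state by swapping the info pointer. */
static void frame_set_waiting(NewFrame *f, bool waiting)
{
    assert(f != NULL);
    f->info = waiting ? &atomically_waiting_frame_info
                      : &atomically_frame_info;
}
```

In this sketch the state test becomes a pointer comparison and the state change a single pointer store, so no separate word is needed for the flag.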
In SMP mode it is still possible for an update frame on the stack to point to an indirection, when two threads evaluate the same thunk (see comment for details). So we use the following trick: when the GC discovers an update frame pointing to an indirection, it changes the indirection to be an IND_PERM, so it will be retained rather than discarded.
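A toy model of the trick, assuming a much simplified closure representation: the ToyClosure and ToyUpdateFrame types and the function name are hypothetical, and only the IND and IND_PERM names come from the message above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified closure kinds; only IND and IND_PERM matter here. */
typedef enum { THUNK, IND, IND_PERM, CONSTR } ClosureType;

typedef struct ToyClosure {
    ClosureType        type;
    struct ToyClosure *indirectee;  /* target when type is IND or IND_PERM */
} ToyClosure;

/* An update frame records the closure it will update on return. */
typedef struct {
    ToyClosure *updatee;
} ToyUpdateFrame;

/* Called when the collector encounters an update frame on a stack.
 * If another thread has already turned the updatee into an indirection,
 * mark it permanent so the collector keeps it instead of discarding it. */
static void gc_visit_update_frame(ToyUpdateFrame *frame)
{
    assert(frame != NULL && frame->updatee != NULL);
    if (frame->updatee->type == IND) {
        frame->updatee->type = IND_PERM;
    }
}
```

The indirectee is left untouched; only the closure type changes, which is what keeps the frame's pointer valid in this model.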
unlockClosure() requires a write barrier for the compiler - write barriers aren't required for the CPU, but gcc re-orders non-aliasing writes unless we use an explicit barrier. This only just showed up when we started compiling the RTS with -O2.
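A minimal sketch of a compiler-only barrier between two non-aliasing writes. The Closure struct, the macro and the function name are hypothetical stand-ins rather than the RTS code; the empty asm statement with a "memory" clobber is GCC/Clang syntax and emits no instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical closure: a header (info) word plus one payload word. */
typedef struct {
    const void *info;
    void       *payload;
} Closure;

/* Compiler barrier only: generates no CPU instruction, but the compiler
 * may not move memory accesses across it. */
#define COMPILER_WRITE_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Write the payload first, then publish the info pointer. In this sketch,
 * storing the info pointer is what "unlocks" the closure, so it must not
 * be reordered before the payload write. */
static void unlock_closure_sketch(Closure *c, void *new_payload,
                                  const void *info)
{
    assert(c != NULL);
    c->payload = new_payload;
    COMPILER_WRITE_BARRIER();
    c->info = info;
}
```

Without the barrier, an optimising compiler is free to emit the two stores in either order, because they do not alias each other.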
A patch to the already-somewhat-delicate machinery that deals with pattern-matching on unboxed tuples. It handles pattern matches that can fail, e.g. case f x of (# Just x, Nothing #) -> ... The fix is in the desugaring of HsCase (DsExpr.lhs). The test is dsrun013.