Commit
Interpreter:
Improve method cache printing to include hex hashes and totals of printed items.
When using Intel inlining syntax, use the correct one.

Cogit:
Fix a bad performance regression with open PICs and fix a slip in the generated perform primitive. The regression was introduced when the open PIC compilation code was refactored to eliminate duplication of the probe generation: the probes were misordered, so that when a new method was entered into the cache at a clashing line, which zeros the two entries following the first, the open PIC would search for the last entry, hence always missing. The perform code had two copies of the second probe and no third probe.

Remember to nil out the last uncoggable method variables on code compaction, become, and GC.

JIT blocks more aggressively in the value primitive; always try to JIT (unless noted in lastUncoggableInterpretedBlockMethod) if reached from the machine code primitive.

Change traceFlags so that traceFlag 2 prints only interpreted sends, and trace flags 258 print both interpreted and machine code sends.

Correct the prototype of genDoubleFailIfZeroArgRcvr:arg: (it is used as a function taking int, not #sqInt). See #genDoubleArithmetic:preOpCheck: and #genDoubleArithmetic:preOpCheck:boxed: for 64 bits.

Fixed a register spill for remote inst var access.
Improved error code returns.
Fixed a minor bug in the noMustBeBoolean flag and added a comment.

Fixed twoPath compilation both with and without immutability. I (Clement) changed the test from "isInstVarStore" to "is1ByteInstVarStore" because other instVarStores may happen on context objects, requiring a frame. The compilation of Context>>setSender:receiver:method:closure:startpc: requires a frame, for example. Without immutability, the two-path compilation is done for methods with multiple inst var stores and branches on the receiver's age. With immutability, all setters are compiled with two paths and branch on the receiver's mutability and age.
That's very funny because now the VM with immutability is clearly faster on binary tree than the VM before the two-path compilation (obviously we backported that to the non-immutability VM, so the VM without immutability is still faster). (Eliot: Woo hoo!!)

Added an extB 3rd-bit flag to mark whether the store requires the immutability check or not in multiple store instructions (LitVar, instVar, RemoteInstVar). Fixed a slip (flag was inverted for no store check).

Now that immutability is rock stable, I made two generic store methods, one for stores into maybe-context objects, one for stores into non-context objects. Both methods take multiple parameters such as "requireImmutabilityCheck" and "requireStoreCheck". In addition to the parameters, the generic methods test the IMMUTABILITY flag to know whether they need to generate the test. Every heap store calls one of these two methods, so there's no code duplication.

Fixed some methods to return error codes instead of 0.

Sista Cogit:
- Fixed inlined multiply primitive.
- Disabled jitting of full blocks temporarily (to keep the VM stable).

Spur:
Improve method cache locality in Spur, which uses class indices and tag patterns as class tags. Shift up the class tag by 2 bits so that the least significant two bits are included in the hash.

Make sure retryPrimitiveOnFailure is option: #SpurObjectMemory.

Fix declaration of tenuringProportion in SpurGenerationScavenger: it is expected to be a double (between 0.0 and 1.0), not a sqInt.

Correct, but do not fix, ceTraceStoreOf:into: for Spur (use isImmediate:, but it is not fixed because stores may not be into ReceiverResultReg any more).

MSVC has a problem with the macro-expansion trickery involving assert (viz. ||), GIV and macro indirections. Using the getter (and the translated code from that) works fine, though.
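As a rough illustration of the Spur method cache change above, here is a toy hash in C. It is a sketch under assumptions, not the VM's code: CACHE_MASK, the 4-word entry size, and both function names are hypothetical. The assumption is that the cache index is kept entry-aligned by a mask that drops the low two bits, in which case an unshifted class tag loses its two least significant bits and neighbouring class indices collide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache geometry: entries are assumed to span 4 words, so the
   mask clears the low 2 bits to keep the index entry-aligned.
   The value is an assumption, not taken from the VM sources. */
#define CACHE_MASK 0x3FCu

/* Hash without the shift: the low two bits of classTag are masked away. */
static uint32_t hash_unshifted(uint32_t selector, uint32_t classTag)
{
    return (selector ^ classTag) & CACHE_MASK;
}

/* Hash with the class tag shifted up by 2 bits, as the message describes:
   the tag's low two bits now land inside the masked range. */
static uint32_t hash_shifted(uint32_t selector, uint32_t classTag)
{
    return (selector ^ (classTag << 2)) & CACHE_MASK;
}

/* Self-check: tags 4 and 5 collide unshifted, but not once shifted. */
static void check_low_bits_matter(void)
{
    assert(hash_unshifted(0x1230u, 4u) == hash_unshifted(0x1230u, 5u));
    assert(hash_shifted(0x1230u, 4u) != hash_shifted(0x1230u, 5u));
}
```

Under these assumptions, class tags that differ only in their low two bits (for example consecutive class indices) map to different cache lines once shifted.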
64-bit Spur:
Remove undefined behavior that prevents correct SmallFloat handling.

The following symptoms were experienced with the Squeak stack Spur 64-bit VM and gcc 4.9.2 with -O2 optimization (mvm -f):

    2.0 = 3.0 -> true.
    2.0 * 3.0 -> 4.0.

To solve that, it is necessary to remove undefined behavior related to left-shifting a signed integer. Instead of generating something like:

    ((sqLong) rcvr) << arg

it is better to generate:

    (sqLong)(((usqLong) rcvr) << arg)

This way we preserve signedness and the behavior is well defined. Since the formulation is rather heavy, I've (Nicolas) also added tricks to avoid some casts if the variable is long enough and already unsigned. If we later switch longAt and some others to unsigned, as already discussed here, the generated C might almost be readable.

I've also replaced the pointer aliasing used to get/set the SmallFloat value, like:

    doubleResult = ((double *)(&rawBitsInteger))[0];

by memcpy:

    memcpy(&doubleResult, &rawBitsInteger, sizeof(doubleResult));

Why is memcpy less evil than pointer aliasing? With pointer aliasing, any other write into a long integer could modify doubleResult. This completely defeats optimization - the holy grail of C people, they can't bear that FORTRAN compilers are faster than theirs ;). With this greatly biased wisdom, they declared this construct undefined behavior, giving priority to optimization rather than backward compatibility or programmers' intentions... memcpy is less evil because it's localized (one shot). memcpy is heavily optimized (no function call generated, just about the same instructions as pointer aliasing), so there's no reason not to abide by the standard.

BitBlt/Squeak3D plugins:
BitBltSimulation>>primitivePixelValueAtX:y: needs to be modified to allow the optimization mentioned in VMMaker.oscog-eem.1888's message. This modification is similar to those in that same commit.
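The two 64-bit Spur changes described above (shifting through the unsigned type, and using memcpy instead of pointer aliasing) can be sketched in standalone C as follows. This is a minimal illustration, not the generated VM code: the typedefs stand in for the VM's sqLong/usqLong with assumed 64-bit widths, and the three function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the VM's typedefs; 64-bit widths are an assumption here. */
typedef int64_t  sqLong;
typedef uint64_t usqLong;

/* Left shift performed in the unsigned type, then converted back.
   The unsigned shift is well defined for arg < 64; the conversion back to
   the signed type is implementation-defined rather than undefined, and
   wraps on two's complement targets. */
static sqLong shift_left_signed(sqLong rcvr, unsigned arg)
{
    assert(arg < 64); /* shifting by the full width or more is undefined too */
    return (sqLong)(((usqLong) rcvr) << arg);
}

/* Reinterpret 64 raw bits as a double without pointer aliasing. */
static double bits_to_double(uint64_t rawBitsInteger)
{
    double doubleResult;
    memcpy(&doubleResult, &rawBitsInteger, sizeof(doubleResult));
    return doubleResult;
}

/* The reverse direction: fetch the raw bits of a double. */
static uint64_t double_to_bits(double value)
{
    uint64_t rawBitsInteger;
    memcpy(&rawBitsInteger, &value, sizeof(rawBitsInteger));
    return rawBitsInteger;
}
```

With these helpers, shift_left_signed(-1, 4) yields -16 on two's complement targets, and round-tripping 2.0 and 3.0 through double_to_bits and bits_to_double preserves the values, so their product is 6.0 rather than the 4.0 reported in the symptoms.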