188.8.131.52 introduced an actual check that the path for all source files is correctly externalized as an LPN at cold-init time. Due to a longstanding bug in MAKE-FILE-INFO-NAMESTRING, not fixed with 184.108.40.206, it is possible for the system to create a pathname such as "SYS:SRC;LISP;SBCL;SBCL-GIT;OUTPUT;STUFF-GROVELED-FROM-HEADERS.LISP". Once the SYS: logical pathname translations are set up, this path is not valid, causing a build failure. Fixed, at the cost of disallowing paths in SYS:SRC that have a final directory of OUTPUT, not likely to be an issue in practice.
* CHECK-FOR-RECURSIVE-READ signaled a READER-ERROR without supplying a stream initarg.
make genesis of identical fasls produce identical cold cores. 4 messages follow:
  documentation handling
    CLISP supports documentation for packages now, so remove the read-time conditional. However, don't try to use the documentation for the CL or KEYWORD packages (as they come from the host directly).
  LAYOUT clos hash values
    Set them in cold-init using the target's RANDOM rather than in genesis using the host's.
  hash table traversal in genesis
    MAPHASH will not give repeatable results in general, and certainly won't between distinct implementations of hash tables. Sort the contents of hash tables according to a predicate which completely orders the contents. (This is mildly tricky for FDEFN names: we have to assume that we are only dealing with names of the forms SYMBOL and (SETF SYMBOL)).
  give smallvecs an initial element
    Whoops. The smallvecs (representing the memory image of the core being constructed) were being constructed without an initial-element. For the most part this wouldn't matter, because it will (almost) all be overwritten by the genesis process itself. The crux is in that (almost), though; in some cases it matters, such as producing bogus values for symbol tls slots. Mostly implementations seem to zero-fill newly-constructed (unsigned-byte 8) arrays, but there seem to be some circumstances under which CLISP will produce one with random data in it...
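A minimal sketch of the sorted-traversal idea, assuming (as the message does) that keys are only of the forms SYMBOL and (SETF SYMBOL). The names FDEFN-NAME-KEY, KEY-LESSP and SORTED-HASH-TABLE-ALIST are hypothetical and are not taken from genesis itself.

```lisp
;; Hypothetical helpers illustrating deterministic hash-table traversal.
;; Assumption: every key is either a SYMBOL or a list (SETF SYMBOL).

(defun fdefn-name-key (name)
  "Map NAME to a list of three strings that totally orders such names."
  (let* ((setf-p (consp name))
         (symbol (if setf-p (second name) name))
         (package (symbol-package symbol)))
    (list (if package (package-name package) "")
          (symbol-name symbol)
          (if setf-p "1" "0"))))

(defun key-lessp (a b)
  "Lexicographic comparison of two equal-length lists of strings."
  (loop for x in a
        for y in b
        do (cond ((string< x y) (return t))
                 ((string< y x) (return nil)))
        finally (return nil)))

(defun sorted-hash-table-alist (table)
  "Return the entries of TABLE as an alist in an order that does not
depend on the host's hash-table implementation."
  (let ((entries '()))
    (maphash (lambda (k v) (push (cons k v) entries)) table)
    (sort entries #'key-lessp
          :key (lambda (entry) (fdefn-name-key (car entry))))))
```

Iterating over the result of SORTED-HASH-TABLE-ALIST instead of calling MAPHASH directly gives the same visiting order on any host, provided the predicate really is a total order on the keys present.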
Constant coalescing decisions, which legitimately differ between hosts, can propagate into the target if we are not very careful, often through vop-parse structures. Be explicit about which constants can be shared and which shouldn't. 5 messages follow:
  constant coalescing KLUDGE, part 1 [(any)]
    The constant initforms for the vop-parse structure are evaluated on the host; therefore, their coalescing is at the host's discretion. This wouldn't matter except that (why?) vop-parse structures get dumped at each vop definition. Make the coalescedness explicit.
  constant coalescing KLUDGE, part 2: [(fixnumize n)]
    The static function template for at least LENGTH (in subprim.lisp) contains two instances of (FIXNUMIZE 2), which are coalesced differently on different host lisps. We can KLUDGE around this problem (and gain a millimetric amount of efficiency, too!) by evaluating the FIXNUMIZE calls at expansion time.
  remove confusing code structure sharing from DEF-MOVE-IF
    I can't actually see exactly where the code structure sharing happens nor why it causes xc fasl contents to differ between hosts, but since it makes the code clearer to rewrite the macro...
  fix two separate issues in compiler/globaldb
    One is a hash-table traversal issue; the other is coalescing of constants. I *think* what's going on in the latter case is that there are two separate ways that shared constants can happen. One is in the dumping of objects which are EQUAL, where the compiler can dump a reference to a previous object instead; the other is the dumping of a single object with circularities, where a nil is dumped along with a later instruction to backpatch the circularity in. We need to ensure a deterministic cold-init-form, so that means we need to control the coalescing in the _host_ compiler (because the cold-init-form is generated from introspection), but of course we can't, so we COPY-TREE instead, which will allow the xc to coalesce and will prevent the form as compiled from sharing structure.
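As a small illustration of why COPY-TREE helps in the globaldb case: it copies every cons it reaches, so two references to one sublist become two separate sublists. The function names below are hypothetical and only demonstrate the standard behaviour of COPY-TREE; they are not the globaldb code.

```lisp
;; Hypothetical demonstration of COPY-TREE removing shared structure.

(defun make-form-with-sharing ()
  "Build a two-element form whose elements are the very same cons."
  (let ((shared (list 1 2)))
    (list shared shared)))

(defun elements-shared-p (form)
  "True if the first two elements of FORM are the identical object."
  (eq (first form) (second form)))

(defun unshare (form)
  "Return a copy of FORM in which no conses are shared with FORM or
between its own subtrees."
  (copy-tree form))
```

The copy is still EQUAL to the original, so a compiler remains free to coalesce the pieces by its own rules; what disappears is the dependence on whichever sharing the host happened to introduce.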
Static function template vop macro has a common subexpression, factored out as new-ebp-ea.
2 messages follow:
  stable-sort the time specifications
    Dunno if this is actually necessary for anything.
  make unpacking and repacking happen in a determined order
    The unpacked blocks were stuffed into a hash table and then maphashed over; as in other cases, this is host-dependent. Use a list and pushnew instead.
To get floating point stuff exactly right, we should build a complete IEEE float implementation to do calculations in for the cross-compiler. Since that's not going to happen this millennium, instead try to be careful when writing code that looks constant-foldable. Some other fixups on the way... 6 messages follow:
  fix load-time tests in src/code/pred
    It turns out that #c(1.1 0) is not portable: it's a REAL in clisp and a COMPLEX in sbcl.
  begin work on floats
    Floats Are Hard. The issue is that the host's float implementation, even if it agrees with SBCL that SINGLE-FLOAT is IEEE single and DOUBLE-FLOAT is IEEE double, may not match sbcl idiosyncrasy for idiosyncrasy. For example, clisp doesn't support denormals, so its LEAST-FOOATIVE-QUUXLE-FLOAT constants are very different from sbcl's: and sbcl's can't even be represented within the host. Ugh. Defining the print-related MIN-E constants is, however, easy enough.
  comment (well, #!+long-float) out some floating point constants
    The clauses in question were never taken absent #!+long-float anyway.
  -0.0 is not portable: many lisps don't respect negative zeros
    Use make-unportable-float instead, and hope that this doesn't matter during cross-compilation...
  host floating point differences
    Not all lisps think (log 2d0 10d0) is the same. Compute it accurately and use LOAD-TIME-VALUE.
  tentative attempt at smoothing over host floating point differences
    Compute all the necessary constants as double-float bit patterns using LOAD-TIME-VALUE.
It took a little time to get right, but here's (I hope) invariant constant string coalescing in the cross-file-compiler. 3 commit messages follow:
  more invariant constant string coalescing
    When dumping strings in cross-compilation, we always end up dumping as base-string. That means we need to compare all strings of whatever underlying array type for content equality, not just strings of the same type. This shows up when dumping in the same file "BLOCK" and the value of (symbol-name 'block) under CLISP, which dumps two separate values.
  dumping string constants, the other half
    Not only do we have to enter entries into the hash table with a known element-type, we also have to retrieve them... bogosity finally picked up by use of a CL symbol name (AND) in src/compiler/x86/insts.lisp...
  further refinement on constant coalescing
    Not only must we coalesce all kinds of strings at fasl dump time, we must coalesce the constants in our TN representation while we're compiling, otherwise we will get different lengths of constant vectors for the same function depending on how many different string representations there are in the host compiler.
- WITH-ACTIVE-PROCESSES-LOCK does not allow WITH-INTERRUPTS because that can lead to recursive lock attempts upon receiving a SIGCHLD.
- if fork() in RUN-PROGRAM fails, signal the error outside the lock.
- the SIGCHLD handler only reaps processes started by RUN-PROGRAM in order not to interfere with SB-POSIX:WAIT, SB-POSIX:WAITPID and their C equivalents (thanks to James Y Knight).
- the SIGCHLD handler is installed once at startup, because on Darwin sigaction() seems to do unexpected things to the current sigmask.
Previously, we constructed a printed version of the code and used that, but it seems remarkably hard to get identical printed contents from identical list structure in three different implementations: indentation, line breaks, QUOTE and FUNCTION, and so on all seem to vary. 2 previous commit messages follow:
  bind printer control variables in FAILED-AVER
    FAILED-AVER prints source code with ~A. If printer control variables are different in different implementations, then the error message will be different. Actually at the moment the binding (of *PRINT-PRETTY* to T) is probably a no-op. We tried binding *PRINT-PRETTY* to NIL to get the same output as XCL, but apparently, other implementations (CLISP, reportedly ECL) don't obey CLHS 220.127.116.11 for printing conses when the pretty printer is off.
  another attempt to tame AVER
    Binding printer control variables is all well and good, but linebreaks cause problems. We could probably deal with that with a suitable value for *print-right-margin*, but... instead, just save the form, not its printed representation.
Various ways in which a host constant can leak through the cross-compiler into the target are plugged. 5 commit messages follow:
  fix host most-positive-fixnum leak in declaration
    Found by comparing object code for SORT-VECTOR between clisp and sbcl xc hosts.
  Fix most-fooative-fixnum leak in number-psxhash
    Gah, floats. Most cases will be more complicated to fix than this one. (Fixing things absolutely properly would be hugely difficult; this fix should do for now...)
  more careful cross-compiler constant-form-value
    We need to take values from the xc info database in preference to using SYMBOL-VALUE, otherwise we'll leak from the host. (In particular, this one was for functions in debug.lisp with lambda lists of the form (&optional (n most-positive-fixnum)).)
  deal with another host fixnum-related constant leak
    This time it's in the definition of the integer constants which are both fixnums and exactly representable as floats. Amazingly, just above these definitions are the ones for SB!XC:MOST-POSITIVE-FIXNUM and friends; no alarm bells were ringing...
  fix a fixnum leak in unix-fd type
    This mistake [ (deftype foo () `(integer 0 ,most-positive-fixnum)) ] seems distressingly easy to make. Not easy to guard against, either. (Aside: is it sensible to define FDs as positive fixnums?)
Genesis already knew about the case of a symbol exported from the CL package with a different home package. For repeatable FASLs, the dumper and the xref internals detector also need to know. 2 commit messages follow:
  special case dumping of CL symbols with other home packages
    Just like in genesis, we need to deal with CL symbols which are permitted to have a home package that's not CL. SBCL doesn't do that, but other implementations legitimately can and do; nevertheless, dump as though it were a CL symbol.
  xref cross-compilation consistency fixes
    Treat as internal symbols (a) symbols with home package being "SB-XC", and (b) symbols which are external in the CL package but whose home package is elsewhere.
By having minimal debug names for toplevel forms and component names, we avoid having arbitrary gensyms or, horror of horrors, QUOTE: which is printed differently in different implementations... 2 commit messages follow:
  minimal debug names for cross-compiled top-level forms
    Otherwise we run the risk of getting arbitrary gensyms dumped as part of the debug name.
  bandage for ' vs QUOTE in two files
    Make FIND-COMPONENT-NAME in the XC (which names components, whose names are dumped in xc fasls) use only the first symbol in the context. That will be generally lame but avoids any current instances of QUOTE, which prints differently in different implementations when pretty-printing is off.
The only one that is potentially controversial is the use of READ-PRESERVING-WHITESPACE... 3 commit messages follow:
  don't print array SB!KERNEL:TYPE in internal error strings
    Use the specifier instead. (This is a long-standing bug; FIXME: try to find a test case).
  Use read-preserving-whitespace rather than just read in the compiler
    With just CL:READ, at least CLISP and SBCL differ on the source locations dumped in the fasls; with READ-PRESERVING-WHITESPACE, things are consistent.
  disassembler / printer names.
    The compiler wants to generate names based on all sorts of information, including byte specs, and attempts to make those names by printing all that information into one big string. Unfortunately, that allows the host to insert line breaks, which it will do with maximal perversity. Bind *PRINT-RIGHT-MARGIN* around the printing call in an attempt to minimize this problem.
Usually involves sorting the output of a hash-table loop or set operation. 3 commit messages follow:
  make the order of (setf cl:foo) defining forms deterministic
  alphabetize the automated out-of-line definitions of modular functions
    Otherwise we go in hash-table order, which is not noticeably the same between clisp and sbcl hosts.
  UNION can return entries in arbitrary order
    So SORT [a copy of: don't mutate the source code!] the UNION of signed-num and unsigned-num by symbol name.
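The UNION fix can be sketched as below. SORTED-UNION is a hypothetical name, and the argument lists in the usage note stand in for the signed-num and unsigned-num lists mentioned above, whose actual contents are not given here.

```lisp
;; Hypothetical helper: a host-independent ordering for a UNION of symbols.

(defun sorted-union (list-1 list-2)
  "Return the UNION of two lists of symbols, ordered by SYMBOL-NAME."
  ;; UNION is allowed to share structure with its arguments, and SORT is
  ;; destructive, so sort a fresh copy rather than the UNION result itself.
  (sort (copy-list (union list-1 list-2))
        #'string<
        :key #'symbol-name))
```

For instance, (sorted-union '(logxor + *) '(- + logand)) gives the same list on any host, whereas the bare UNION may return the same elements in any order.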
We need a gensym variant that doesn't share state with *GENSYM-COUNTER*, so that host macroexpansions don't affect us. (We also need to bind our counter variant in the INFO compiler macro, because compiler macros might or might not be expanded...) 11 individual commit messages follow:
  Implement SB!XC:GENSYM
    Host implementations can, even during cross-compilation, expand macros (including arbitrary host macros such as CL:DEFUN) in :compile-toplevel function definitions different numbers of times. This is a problem because some gensyms end up in function arglists (e.g. from MULTIPLE-VALUE-BIND as well as from explicit FLETs or LAMBDAs in macro expansions). Our own SB!XC:GENSYM allows us to control the gensym counter we use and hence the symbol names that are dumped.
  Use SB!XC:GENSYM in BLOCK-GENSYM
  remove a needless gensym
    Nothing wrong with a regular symbol here.
  Bind SB!XC:*GENSYM-COUNTER* in DEFINE-COMPILER-MACRO INFO
    The compiler-macro for INFO now uses SB!XC:GENSYM, which is OK except that the compiler macro gets used during cross-compilation; some implementations expand compiler macros, while others (e.g. clisp) interpret the relevant code and so don't. Binding the counter variable renders the effect of the compiler macro on the counter invariant.
  various reworks of macros to use SB!XC:GENSYM
    In some cases radically decrease vertical space use by judicious use of MAKE-GENSYM-LIST or WITH-UNIQUE-NAMES, both of which go through BLOCK-GENSYM.
  more reworks of macros to use SB!XC:GENSYM
    Nothing vastly interesting here.
  yet more reworks of macros to use SB!XC:GENSYM
    Nothing much of interest.
  even more reworks of macros to use SB!XC:GENSYM
  more reworks of macros for SB!XC:GENSYM goodness.
  one more SB!XC:GENSYM fix
    Use WITH-UNIQUE-NAMES in FD-FOO macros.
  One more gensym
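A rough sketch of a gensym with its own counter follows. This is not the SB!XC:GENSYM source: the names *MY-GENSYM-COUNTER*, MY-GENSYM and WITH-ISOLATED-GENSYM-COUNTER are hypothetical, chosen only to show the two ideas in the messages above (a private counter, and rebinding it around code that may or may not run).

```lisp
;; Hypothetical stand-ins for a private gensym and its counter.

(defvar *my-gensym-counter* 0
  "Private counter, deliberately independent of CL:*GENSYM-COUNTER*.")

(defun my-gensym (&optional (prefix "G"))
  "Return a fresh uninterned symbol named PREFIX followed by the value
of the private counter, then increment that counter."
  (prog1 (make-symbol (format nil "~A~D" prefix *my-gensym-counter*))
    (incf *my-gensym-counter*)))

(defmacro with-isolated-gensym-counter (&body body)
  "Run BODY with the private counter rebound, so that any MY-GENSYM
calls inside BODY leave the outer counter value unchanged."
  `(let ((*my-gensym-counter* *my-gensym-counter*))
     ,@body))
```

Calls to CL:GENSYM made by the host never touch the private counter, so the names produced by MY-GENSYM depend only on how often our own code calls it. Wrapping an optionally expanded piece of code in WITH-ISOLATED-GENSYM-COUNTER makes the outer counter end up the same whether or not that code ran.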
The fasl header is easy to deal with; writing "at cross-compile time" instead of something depending on the host is easy. The debug-source is harder; we set the structure slots to 0 in the cross-compiler, and arrange for cold-init to patch sensible values in (by inventing a new FOP to note the debug-source's arrival). Made up of 5 commits, whose individual messages follow:
  deal with trivial volatile contents of fasl files
    Don't try to preserve, even in the header, information about which implementation or machine was used for the compilation. Similarly, don't emit the timestamps in the debug-source into the fasls.
  comments: delete a FIXME and explain a bare `2'
  consistent source pathname for output/stuff-groveled-from-headers.lisp
    At the moment it's the only compiled file not in src/; code defensively around that fact.
  fix a longstanding KLUDGE
    Find the index of the source slot by introspection rather than using a baffling literal `2'. Unfortunately, in doing so we run into bug #117.
  patch in the source created/compiled information in cold-init
    We can't do it before without making our fasls or cold-sbcl.core dependent on filesystem timestamps or current time. The way we do it is perhaps overcomplicated, compared with simply assuming that the file timestamps are right, but has the advantage that it's demonstrably correct: we implement a new FOP specifically for noting our DEBUG-SOURCE, dumped only during cross-compilation, and in genesis we interpret that FOP to build a list of debug-sources, which we can frob in cold-init. Everything should now be restored to its previous functionality.
* For Slime and its ilk, thanks to Tobias Rittweiler.
* Sort NEWS with "more important" items on top.
* sprinkle type declarations around to avoid checking for general SEQUENCEs;
* use >= in loop exit tests to avoid checks for overflow.
* Index and bound were swapped around.
* Also fix the name in the type declaration for INVALID-ARRAY-INDEX-ERROR. Thanks to Stas Boukarev.
* Thanks to Sidney Markowitz.
* A few missing NEWS entries.
* Patch by Daniel Lowe.
* Use VALID-FUNCTION-NAME-P to check if we should store the docstring: previously we stored docstrings for anonymous functions under names like (LAMBDA (X)) -- Not Good.
* Contributed by Alex Plotnick <firstname.lastname@example.org>
* Bugfix: PEEK-CHAR always popped the unread-stuff, leading to spurious duplicate echoes in some cases.
* Minor incompatible change: UNREAD-CHAR on an ECHO-STREAM now unreads onto the echo-stream's input stream. This is unspecified in the CLHS, but makes SBCL compatible with most implementations (AFAICT, everybody but CMUCL).
* Minor incompatible change: echo-streams used to buffer arbitrarily many characters in UNREAD-CHAR. Conforming programs can't have relied on this, but non-conforming ones might have; users who need the old CMUCL/SBCL behavior can do it easily and de-facto-portably with Gray Streams.
* Possible bugfix that nobody cares about: ECHO-N-BIN (which implements a path through READ-SEQUENCE) can never have worked after an UNREAD-CHAR, because it tried to store characters into an octet buffer.