Commit

0.8.13.77.character.24:
	"You'll make plenty of new friends in no time at all"

	Make the cross-compiler dump all constant strings as
	BASE-STRINGs.
	... cross-compiler TYPE-OF, TYPEP and friends informed of this
		logic.  (So a host string -- of any type -- will be
		SB!XC:TYPEP BASE-STRING even if it's not CL:TYPEP
		BASE-STRING).
	... character-string dumping functions can be moved into
		target-dump.lisp.
	... genesis should never see CHARACTER-STRING-FOP.
csrhodes committed Sep 20, 2004
1 parent 15e1447 commit af580e8
Showing 6 changed files with 54 additions and 47 deletions.
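
The rule stated in the commit message (any host string, of whatever type, should look like a BASE-STRING to the cross-compiler) can be sketched in portable Common Lisp. This is only an illustration of the idea: XC-STRING-TYPE-OF is a hypothetical name, not a function in the SBCL sources, and the real logic lives in the cross-type.lisp changes in this commit.

```lisp
;;; Illustrative sketch only; XC-STRING-TYPE-OF is a hypothetical name,
;;; not a function from the SBCL sources.
(defun xc-string-type-of (object)
  "Return the type specifier that a cross-compiler following this rule
would report for the host string OBJECT."
  (check-type object string)
  (if (typep object 'simple-string)
      ;; Simple host strings keep their length in the reported type.
      `(simple-base-string ,(length object))
      ;; Any other host string is reported as a plain BASE-STRING.
      'base-string))
```

The point of the design is that the answer depends only on whether the host object is a string and whether it is simple, never on the host's own BASE-CHAR/CHARACTER split, so different hosts should produce the same answer for the same source text.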
39 changes: 13 additions & 26 deletions TODO.character
@@ -109,6 +109,19 @@ even current bloated sbcl.core look tiny.)
 (Note that the symbol printer has multiple bugs in its logic
 which have not been fixed by this branch.)

+** fix GENESIS (and the cross-compiler in general) to dump
+BASE-STRINGs always. (Rationale: SBCL aspires to portability, so
+should not use any non-STANDARD-CHAR in its source code. By
+definition, therefore, all strings and stringlike objects are dumpable
+as BASE-STRING, which allows for identical cold fasls and cores to be
+generated from lisps with different BASE-CHAR/CHARACTER distinctions.)
+-- done: the cross-compiler type system, the cross-compiler dumper
+and genesis cooperate to make every host string look like a
+target base-string.
+-- (Note that dubious uses of CL:TYPE-OF in portions of the
+compiler such as CONVERT-MEMBER-TYPE, TWO-ARG-DERIVE-TYPE
+remain.)
+
 ** implement an SB-ALIEN:UTF8-STRING parallel to SB-ALIEN:C-STRING.
 (Rationale: for calling out to Pango or similar. Actually, a valid
 use might be in Unix libc/kernel functions: at least under Linux, I
@@ -122,32 +135,6 @@ support people other than simply those living in non-Eurozone Western
 Europe or the United States of America.) This requires at minimum
 adjusting the dumper/fop code and the low-level memory accessors.

-** fix GENESIS to dump BASE-STRINGs always. (Rationale: SBCL aspires
-to portability, so should not use any non-STANDARD-CHAR in its source
-code. By definition, therefore, all strings and stringlike objects
-are dumpable as BASE-STRING, which allows for identical cold fasls and
-cores to be generated from lisps with different BASE-CHAR/CHARACTER
-distinctions.)
--- NOTE I: This is actually moderately tricky, because not only do
-GENESIS (src/compiler/generic/genesis) and the regular dumper
-(src/compiler/dump) need to be adjusted, in relatively simple
-ways, but also the cross-compiler's versions of various
-type-investigating predicates (CTYPE-OF, SB!XC:TYPEP) must be
-taught that objects which are any kind of string on the host will
-become BASE-STRINGs on the new lisp. Probably best to do this
-after the two string representations are separated, so that any
-breakage will be immediately obvious -- but it may be that this
-is too hard to implement properly at all.
--- NOTE II: (while investigating and writing NOTE I) there are a
-couple of extremely dubious uses of CL:TYPE-OF in bits of the
-compiler: see the functions CONVERT-MEMBER-TYPE,
-TWO-ARG-DERIVE-TYPE.
--- NOTE III: (while writing NOTE II) The current system is probably
-equally broken, as the cross-compiler version of CTYPE-OF will
-happily create types which need not exist in the target lisp (in
-the ARRAY clause). Using SPECIALIZE-ARRAY-TYPE would probably be
-better, if the above TODO item is not done in its entirety.
-
 ** possibly retain a CHAR-CODE-LIMIT = 256 build option, with
 character point encoding dependent on locale. This requires
 implementing, at a minimum, latin-X eternal formats, so that files
8 changes: 8 additions & 0 deletions src/code/cross-type.lisp
@@ -82,6 +82,9 @@
 'fixnum)
 (t
 'integer)))
+((subtypep raw-result 'simple-string)
+`(simple-base-string ,(length object)))
+((subtypep raw-result 'string) 'base-string)
 ((some (lambda (type) (subtypep raw-result type))
 '(array character list symbol))
 raw-result)
@@ -360,6 +363,11 @@
 (make-member-type :members (list x)))
 (number
 (ctype-of-number x))
+(string
+(make-array-type :dimensions (array-dimensions x)
+:complexp (not (typep x 'simple-array))
+:element-type (specifier-type 'base-char)
+:specialized-element-type (specifier-type 'base-char)))
 (array
 (let ((etype (specifier-type (array-element-type x))))
 (make-array-type :dimensions (array-dimensions x)
31 changes: 15 additions & 16 deletions src/compiler/dump.lisp
@@ -601,6 +601,11 @@
 (t
 (unless *cold-load-dump*
 (dump-fop 'fop-normal-load file))
+#+sb-xc-host
+(dump-simple-base-string
+(coerce (package-name pkg) 'simple-base-string)
+file)
+#-sb-xc-host
 (dump-simple-character-string
 (coerce (package-name pkg) '(simple-array character (*)))
 file)
@@ -734,10 +739,17 @@
 (*)))
 x)))
 (typecase simple-version
+#+sb-xc-host
+(simple-string
+(unless (string-check-table x file)
+(dump-simple-base-string simple-version file)
+(string-save-object x file)))
+#-sb-xc-host
 (simple-base-string
 (unless (string-check-table x file)
 (dump-simple-base-string simple-version file)
 (string-save-object x file)))
+#-sb-xc-host
 ((simple-array character (*))
 (unless (string-check-table x file)
 (dump-simple-character-string simple-version file)
@@ -900,7 +912,7 @@
 (defun dump-base-chars-of-string (s fasl-output)
 (declare (type base-string s) (type fasl-output fasl-output))
 (dovector (c s)
-(dump-byte (char-code c) fasl-output))
+(dump-byte (sb!xc:char-code c) fasl-output))
 (values))

 ;;; Dump a SIMPLE-BASE-STRING.
@@ -910,20 +922,6 @@
 (dump-base-chars-of-string s file)
 (values))

-;;; a helper function shared by DUMP-SIMPLE-CHARACTER-STRING and DUMP-SYMBOL
-(defun dump-characters-of-string (s fasl-output)
-(declare (type string s) (type fasl-output fasl-output))
-(dovector (c s)
-;; DUMP-UNSIGNED-32 soon
-(dump-byte (char-code c) fasl-output))
-(values))
-
-(defun dump-simple-character-string (s file)
-(declare (type (simple-array character (*)) s))
-(dump-fop* (length s) fop-small-character-string fop-character-string file)
-(dump-characters-of-string s file)
-(values))
-
 ;;; If we get here, it is assumed that the symbol isn't in the table,
 ;;; but we are responsible for putting it there when appropriate. To
 ;;; avoid too much special-casing, we always push the symbol in the
@@ -971,7 +969,8 @@
 file)
 (dump-unsigned-32 pname-length file)))

-(dump-characters-of-string pname file)
+#+sb-xc-host (dump-base-chars-of-string pname file)
+#-sb-xc-host (dump-characters-of-string pname file)

 (unless *cold-load-dump*
 (setf (gethash s (fasl-output-eq-table file))
5 changes: 1 addition & 4 deletions src/compiler/generic/genesis.lisp
@@ -2104,10 +2104,7 @@ core and return a descriptor to it."

 (clone-cold-fop (fop-character-string)
 (fop-small-character-string)
-(let* ((len (clone-arg))
-(string (make-string len)))
-(read-string-as-bytes *fasl-input-stream* string)
-(base-string-to-core string)))
+(bug "CHARACTER-STRING dumped by cross-compiler."))

 (clone-cold-fop (fop-vector)
 (fop-small-vector)
16 changes: 16 additions & 0 deletions src/compiler/target-dump.lisp
@@ -13,6 +13,22 @@

 (in-package "SB!FASL")

+;;; a helper function shared by DUMP-SIMPLE-CHARACTER-STRING and
+;;; DUMP-SYMBOL (in the target compiler: the cross-compiler uses the
+;;; portability knowledge and always dumps BASE-STRINGS).
+(defun dump-characters-of-string (s fasl-output)
+(declare (type string s) (type fasl-output fasl-output))
+(dovector (c s)
+;; DUMP-UNSIGNED-32 soon
+(dump-byte (char-code c) fasl-output))
+(values))
+
+(defun dump-simple-character-string (s file)
+(declare (type (simple-array character (*)) s))
+(dump-fop* (length s) fop-small-character-string fop-character-string file)
+(dump-characters-of-string s file)
+(values))
+
 ;;; Dump the first N bytes of VEC out to FILE. VEC is some sort of unboxed
 ;;; vector-like thing that we can BLT from.
 (defun dump-raw-bytes (vec n fasl-output)
2 changes: 1 addition & 1 deletion version.lisp-expr
@@ -17,4 +17,4 @@
 ;;; checkins which aren't released. (And occasionally for internal
 ;;; versions, especially for internal versions off the main CVS
 ;;; branch, it gets hairier, e.g. "0.pre7.14.flaky4.13".)
-"0.8.13.77.character.23"
+"0.8.13.77.character.24"

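In dump.lisp, DUMP-BASE-CHARS-OF-STRING now calls SB!XC:CHAR-CODE rather than the host's CHAR-CODE. One plausible reason, consistent with the TODO rationale about identical cold fasls, is that the standard does not require a host's CHAR-CODE to return ASCII values. The sketch below shows one way a host-independent code lookup for standard characters could work. The names PORTABLE-CHAR-CODE and STRING-TO-OCTETS-SKETCH are hypothetical, and nothing here is claimed to match how SB!XC:CHAR-CODE is actually implemented.

```lisp
;;; Illustrative sketch only; these names are hypothetical and are not
;;; taken from the SBCL sources.
(defparameter *ascii-graphic-chars*
  " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
  "The 95 graphic standard characters in ASCII order, starting at code 32.")

(defun portable-char-code (char)
  "Return the ASCII code of the standard character CHAR without relying
on the host implementation's CHAR-CODE."
  (if (char= char #\Newline)
      10
      (let ((index (position char *ascii-graphic-chars*)))
        (unless index
          (error "~S is not a standard character." char))
        ;; The table starts at ASCII code 32 (space).
        (+ 32 index))))

(defun string-to-octets-sketch (string)
  "Return a vector of octets, one per character of STRING."
  (map '(vector (unsigned-byte 8)) #'portable-char-code string))
```

With a lookup of this shape, the bytes written for a string depend only on its characters, and a non-standard character in the source signals an error instead of silently producing host-dependent output.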