1.0.32.15: update Unicode data files to Unicode 5.2

We still need to update a small bit of code as well, but at least the
explanatory comment now makes it obvious which bits.
commit 9b2b4bc76fc9026598e0cd1a9787a69392166f57 1 parent 3eae72c
@csrhodes csrhodes authored
6 NEWS
@@ -7,7 +7,11 @@ changes relative to sbcl-1.0.32:
* new feature: SB-INTROSPECT:WHO-SPECIALIZES-GENERALLY to get a list of
definitions for methods specializing on the passed class itself, or on
subclasses of it.
- * fixes and improvements related to external formats:
+ * fixes and improvements related to Unicode and external formats:
+ ** the Unicode character database has been upgraded to the
+ Unicode 5.2 standard, giving names and properties to a number of new
+ characters, and providing a few extra characters with case
+ transformations.
** fix a typo preventing conversion of strings into octet vectors
in the latin-2 encoding. (reported by Attila Lendvai; launchpad bug
#471689)
10 src/code/target-char.lisp
@@ -162,7 +162,7 @@
;;;; UCD accessor functions
-;;; The first (* 8 206) => 1648 entries in **CHARACTER-DATABASE**
+;;; The first (* 8 215) => 1720 entries in **CHARACTER-DATABASE**
;;; contain entries for the distinct character attributes:
;;; specifically, indexes into the GC kinds, Bidi kinds, CCC kinds,
;;; the decimal digit property, the digit property and the
@@ -189,12 +189,12 @@
;;;
;;; To look up information about a character, take the high 13 bits of
;;; its code point, and index the character database with that and a
-;;; base of 1648 (going past the miscellaneous information[*], so
+;;; base of 1720 (going past the miscellaneous information[*], so
;;; treating (a) as the start of the array). This, labelled A, gives
;;; us another index into the detailed pages[-], which we can use to
;;; look up the details for the character in question: we add the low
;;; 8 bits of the character, shifted twice (because we have four-byte
-;;; table entries) to 1024 times the `page' index, with a base of 6000
+;;; table entries) to 1024 times the `page' index, with a base of 6072
;;; to skip over everything else. This gets us to point B. If we're
;;; after a transformed code point (i.e. an upcase or downcase
;;; operation), we can simply read it off now, beginning with an
@@ -208,8 +208,8 @@
(defun ucd-index (char)
(let* ((cp (char-code char))
(cp-high (ash cp -8))
- (page (aref **character-database** (+ 1648 cp-high))))
- (+ 6000 (ash page 10) (ash (ldb (byte 8 0) cp) 2))))
+ (page (aref **character-database** (+ 1720 cp-high))))
+ (+ 6072 (ash page 10) (ash (ldb (byte 8 0) cp) 2))))
(declaim (ftype (sfunction (t) (unsigned-byte 8)) ucd-value-0))
(defun ucd-value-0 (char)
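
The offset arithmetic described in the comment above can be sketched as a standalone function over a toy octet vector. This is only an illustration under stated assumptions: every name below is hypothetical, and the figure of 4352 page-table entries is inferred from 6072 - 1720 matching #x110000 / 256; the commit itself only states the constants 215, 1720 and 6072.

```lisp
;;; Illustrative sketch; all names are hypothetical, and only the
;;; constants 215, 1720 and 6072 come from the commit.
(defparameter *misc-length* 215)                        ; :LENGTH in misc.lisp-expr
(defparameter *page-table-base* (* 8 *misc-length*))    ; => 1720
;; Assumption: one page-table entry per 256 code points up to #x10FFFF.
(defparameter *page-count* (ash #x110000 -8))           ; => 4352
(defparameter *details-base* (+ *page-table-base* *page-count*)) ; => 6072

(defun sketch-ucd-index (database code-point)
  "Offset in DATABASE of the four-byte entry for CODE-POINT."
  (let* ((cp-high (ash code-point -8))                  ; high 13 bits
         (page (aref database (+ *page-table-base* cp-high))))
    (+ *details-base*
       (ash page 10)                                    ; 1024 octets per page
       (ash (ldb (byte 8 0) code-point) 2))))           ; 4 octets per entry

(defun make-toy-database (cp-high page)
  "Zero-filled octet vector whose page table maps CP-HIGH to PAGE."
  (let ((db (make-array (+ *details-base* (* 1024 (1+ page)))
                        :element-type '(unsigned-byte 8)
                        :initial-element 0)))
    (setf (aref db (+ *page-table-base* cp-high)) page)
    db))
```

For example, with code points #x0300 to #x03FF mapped to detail page 2, `(sketch-ucd-index (make-toy-database #x03 2) #x03B1)` gives 6072 + 2048 + 708 = 8828. Deriving both bases from the misc length in this way would leave a single number to change on the next Unicode update, though the real code hard-codes them.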
10 tools-for-build/Jamo.txt
@@ -1,14 +1,14 @@
-# Jamo-5.1.0.txt
-# Date: 2008-03-20, 17:59:00 PDT [KW]
+# Jamo-5.2.0.txt
+# Date: 2009-05-22, 13:02:00 PDT [KW]
#
# Unicode Character Database
-# Copyright (c) 1991-2008 Unicode, Inc.
+# Copyright (c) 1991-2009 Unicode, Inc.
# For terms of use, see http://www.unicode.org/terms_of_use.html
-# For documentation, see UCD.html
+# For documentation, see http://www.unicode.org/reports/tr44/
#
# This file defines the Jamo Short Name property.
#
-# See Section 3.12 of The Unicode Standard, Version 5.0
+# See Section 3.12 of The Unicode Standard, Version 5.2
# for more information.
#
# Each line contains two fields, separated by a semicolon.
3,377 tools-for-build/UnicodeData.txt
2,935 additions, 442 deletions not shown
6 tools-for-build/ucd.lisp
@@ -456,17 +456,17 @@
(values))
;;; The stuff below is dependent on misc.lisp-expr being
-;;; (:LENGTH 206 :UPPERCASE (0 2) :LOWERCASE (1 3) :TITLECASE (4)).
+;;; (:LENGTH 215 :UPPERCASE (0 2) :LOWERCASE (1 3) :TITLECASE (4)).
;;;
;;; There are two entries for UPPERCASE and LOWERCASE because some
;;; characters have case (by Unicode standards) but are not
-;;; transformable character-by-character in a locale-independet way
+;;; transformable character-by-character in a locale-independent way
;;; (as CL requires for its standard operators).
;;;
;;; for more details on these debugging functions, see the description
;;; of the character database format in src/code/target-char.lisp
-(defparameter *length* 206)
+(defparameter *length* 215)
(defun cp-index (cp)
(let* ((cp-high (cp-high cp))
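
The remark above about characters that have case but cannot be transformed character by character can be illustrated with U+00DF (LATIN SMALL LETTER SHARP S), whose full Unicode uppercase mapping is the two-character string "SS". That character is my own example, not one named in the commit, and the helper below is hypothetical. It checks only the property that CL's string case operators work one character at a time and so preserve length.

```lisp
;;; Sketch only: CASE-PRESERVES-LENGTH-P is a hypothetical helper,
;;; not part of SBCL.
(defun case-preserves-length-p (string)
  "True if upcasing and downcasing STRING both keep its length."
  (and (= (length (string-upcase string)) (length string))
       (= (length (string-downcase string)) (length string))))

;; "Stra" + U+00DF + "e": six characters.  A full Unicode uppercase
;; mapping would yield the seven-character "STRASSE", which CL's
;; character-by-character operators cannot produce.
(defparameter *sharp-s-string*
  (concatenate 'string "Stra" (string (code-char #xDF)) "e"))
```

Evaluating `(case-preserves-length-p *sharp-s-string*)` should return true in a conforming implementation, since `string-upcase` applies `char-upcase` to each character and `char-upcase` always returns a single character.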
2 version.lisp-expr
@@ -17,4 +17,4 @@
;;; checkins which aren't released. (And occasionally for internal
;;; versions, especially for internal versions off the main CVS
;;; branch, it gets hairier, e.g. "0.pre7.14.flaky4.13".)
-"1.0.32.14"
+"1.0.32.15"