Library for operations on Unicode codepoints and UCD properties.
This package provides types that describe individual Unicode codepoints,
codepoint ranges, and character properties. The following
example queries the character properties that explain what happens when the letter #\a
is combined with the character U+0304.
(codepoint? #x0304)
; -> #t
(ucd-name #x0304)
; -> "COMBINING MACRON"
(ucd-general-category #x0304)
; -> 'Mn
(cdr (assoc (ucd-general-category #x0304) *general-categories*))
; -> "Non-spacing mark"
(ucd-canonical-combining-class #x0304)
; -> 230
(cdr (assoc (ucd-canonical-combining-class #x0304) *combining-classes*))
; -> "Distinct marks directly above"
(string #\a (codepoint->char #x0304))
; -> "ā"
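The string built in the last step is still two codepoints long; it only renders as a single glyph because U+0304 is a non-spacing mark. As a rough cross-check, the sketch below uses only functions built into Racket (nothing from this package) to show the same category and the effect of NFC normalization. Note that Racket's own `char-general-category` reports lowercase symbols such as `'mn`, unlike the `'Mn` shown above.

```racket
#lang racket/base
;; Built-in Racket only; none of these names come from the codepoint package.
(define macron (integer->char #x0304))
(define decomposed (string #\a macron))

;; Racket's built-in category lookup uses lowercase symbols.
(char-general-category macron)            ; -> 'mn

;; The decomposed form is two codepoints long.
(string-length decomposed)                ; -> 2

;; NFC normalization composes the pair into the single codepoint U+0101.
(define composed (string-normalize-nfc decomposed))
(string-length composed)                  ; -> 1
(char->integer (string-ref composed 0))   ; -> 257, i.e. #x0101
```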
The functions in codepoint/properties return the values extracted from the
Unicode Character Database, and the only conversion is typically string to
number or string to symbol. Descriptions of the values that are returned are
gathered in codepoint/enums for display purposes.
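To make the "string to number or string to symbol" conversion concrete, here is a minimal sketch of how one semicolon-separated record in the UnicodeData.txt layout could be turned into the values shown earlier. The sample record and the `field->...` helper names are illustrative assumptions; this is not how the package itself is implemented.

```racket
#lang racket/base
;; Illustrative record in the UnicodeData.txt layout (semicolon-separated fields).
(define sample-line
  "0304;COMBINING MACRON;Mn;230;NSM;;;;;N;NON-SPACING MACRON;;;;")

;; regexp-split keeps empty fields, so field positions stay stable.
(define fields (regexp-split #rx";" sample-line))

;; Hypothetical helpers, named here for illustration only.
(define (field->codepoint s) (string->number s 16))        ; hex string -> integer
(define (field->category s) (string->symbol s))            ; string -> symbol
(define (field->combining-class s) (string->number s 10))  ; decimal string -> integer

(define cp (field->codepoint (list-ref fields 0)))
(define name (list-ref fields 1))
(define category (field->category (list-ref fields 2)))
(define ccc (field->combining-class (list-ref fields 3)))

(list cp name category ccc)
```

The name stays a string, while the codepoint, general category, and combining class are converted, which matches the kind of light conversion described above.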
- codepoint -- functions on the type codepoint?.
- codepoint/range -- functions on an inclusive range of codepoint? values.
- codepoint/range-dict -- a dictionary keyed by codepoint-range? values.
- codepoint/properties -- Unicode Character Database (UCD) properties for codepoint? values.
- codepoint/enums -- enumeration values found in UCD properties.
- codepoint/fold -- implementation of case-folding based on UCD properties.
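For readers unfamiliar with case-folding, the sketch below shows the general idea using Racket's built-in `char-foldcase` and `string-foldcase`. It does not use codepoint/fold, whose API is not shown here; it only illustrates the kind of mapping that case-folding performs.

```racket
#lang racket/base
;; Built-in Racket only; codepoint/fold is not used in this sketch.

;; Simple folding maps a single character to its folded form.
(char-foldcase #\A)                       ; -> #\a

;; Full folding can change the length of a string.
(define folded (string-foldcase "Straße"))
folded                                    ; -> "strasse"

;; Folding both sides gives a caseless comparison.
(define same?
  (string=? (string-foldcase "STRASSE") folded))
same?                                     ; -> #t
```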
TBD
Version 0.2
- Most documentation now complete; renamed the scribblings root file from index to codepoint.
- Removed the shell script for fetching UCD files and rewrote it as the ucd module.
Version 0.1
- Initial upload.