This code should print the entire (code point -> canonical name)
mapping, right?
(with-hash-table-iterator (it CL-UNICODE::*CODE-POINTS-TO-UNICODE1-NAMES*)
  (loop
    (multiple-value-bind (more i) (it)
      (when (not more) (return))
      (format t "~A ~A~%" i (gethash i CL-UNICODE::*CODE-POINTS-TO-UNICODE1-NAMES*)))))
But that print-out misses some characters, such as \u2248 and \u2249. Where are they?
Here is an excerpt from the print-out obtained by running the above code:
...
8682 WHITE UP ARROW FROM BAR
8788 COLON EQUAL
8789 EQUAL COLON
8804 LESS THAN OR EQUAL TO
...
Why isn't it printing \u2248 (8776): '≈' or \u2249 (8777): '≉'?
Yet unicode-name resolves them OK:
CL-USER> (CL-UNICODE:unicode-name #\u2248)
-> "ALMOST EQUAL TO"
CL-USER> (CL-UNICODE:unicode-name #\u2249)
-> "NOT ALMOST EQUAL TO"
And they are in Unicode v1:
CL-USER> (CL-UNICODE:age #\u2248)
-> (1 1)
So that symbol is in Unicode v1, so it should have a unicode1 name and hence be in
the hash table? What am I missing? Why doesn't the print-out produced
by the above code include #\ALMOST_EQUAL_TO?
I am just wondering what the rules for inclusion in that table are,
and whether there is a more complete way of printing ALL recognized
code points and names.
Is cl-unicode somehow checking my locale and deciding which version
of Unicode names to include in the table, omitting some because of version issues?
It is very easy to print out a Unicode table with e.g. bash, but not so
easy to browse it by symbol name / meaning :-)
Thanks for cl-unicode!
Best Regards,
Jason
On Fri, 17 Jun 2022 at 16:08, Phoebe Goldman wrote:
`*code-points-to-unicode1-names*` is an internal variable, and shouldn't
be treated as part of CL-UNICODE's interface.
That map contains only Unicode v1.0 code points, and as age is telling
you, the characters you're asking about were introduced in Unicode v1.1.
If you want to print all the Unicode characters known to CL-UNICODE, you
can do:
(defun print-all-unicode-chars (&optional (stream *standard-output*))
  (loop :for i :below cl-unicode:+code-point-limit+
        :for name := (cl-unicode:unicode-name i)
        :when name
          :do (format stream "~&~d ~a ~a~%" i (cl-unicode:age i) name)))
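
For browsing by name rather than by code point, the same loop can be filtered on a name fragment. The following is only a minimal sketch: it assumes CL-UNICODE is already loaded, and `find-code-points-by-name` is a hypothetical helper name, not something CL-UNICODE provides. It relies only on the exported `unicode-name` and `+code-point-limit+` already used above.

```lisp
;; Assumption: CL-UNICODE is already loaded, e.g. via (ql:quickload :cl-unicode).
;; FIND-CODE-POINTS-BY-NAME is a hypothetical helper, not part of CL-UNICODE.
(defun find-code-points-by-name (fragment)
  "Return a list of (code-point . name) pairs whose Unicode name
contains FRAGMENT, compared case-insensitively."
  (loop :for code :below cl-unicode:+code-point-limit+
        :for name := (cl-unicode:unicode-name code)
        ;; UNICODE-NAME returns NIL for code points without a name.
        :when (and name (search fragment name :test #'char-equal))
          :collect (cons code name)))
```

Calling `(find-code-points-by-name "almost equal to")` should return a list that includes the entries for 8776 and 8777, given the names that `unicode-name` reports for them earlier in this thread. The scan visits every code point, so it may take a moment.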