
Fix the unicode character limit (0 .. 0x10ffff)

For some reason I had limited things to 0xffff, it really should be 0x10ffff.

We don't actually support a full 32-bit unicode model anyway, since we
use the high bits for the control/meta/^X/special bits, but there was no
reason to limit things to 16 bits when we had 28 bits available.  And
the real limit for real Unicode characters is 0x10ffff.
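
As a rough illustration of the layout described above (not code from this
commit), the sketch below packs four modifier flags into the top bits of a
32-bit key code and leaves the low 28 bits for the character. The names
KEY_CONTROL, KEY_META, KEY_CTLX, KEY_SPEC, KEY_CHARMASK, key_char, key_flags
and is_valid_unicode are hypothetical, and the exact flag values are an
assumption chosen to match "high bits" and "28 bits available"; it also
assumes a 32-bit unsigned int.

```c
/* Hypothetical modifier flags occupying the top 4 bits of a 32-bit key. */
#define KEY_CONTROL  0x10000000u  /* control flag (assumed value) */
#define KEY_META     0x20000000u  /* meta flag (assumed value) */
#define KEY_CTLX     0x40000000u  /* ^X prefix flag (assumed value) */
#define KEY_SPEC     0x80000000u  /* special/function key flag (assumed value) */

/* The remaining low 28 bits carry the character itself. */
#define KEY_CHARMASK 0x0FFFFFFFu

/* Highest code point defined by Unicode. */
#define UNICODE_MAX  0x10FFFFu

/* Extract the character part of a key code. */
static unsigned key_char(unsigned key)
{
	return key & KEY_CHARMASK;
}

/* Extract only the modifier flags of a key code. */
static unsigned key_flags(unsigned key)
{
	return key & ~KEY_CHARMASK;
}

/* Range check only: does not reject surrogates (0xD800..0xDFFF). */
static int is_valid_unicode(unsigned c)
{
	return c <= UNICODE_MAX;
}
```

With this layout, a key such as KEY_META | 0x1F607 keeps the full code point
in the low bits, and 0x10FFFF fits with room to spare in the 28-bit payload.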

Add a silly example character past the 16-bit range to the UTF8 demo
file:
  'SMILING FACE WITH HALO' (U+1F607)
from the 'emoticons' block.
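
To show why a character past the 16-bit range is a useful test, here is a
minimal sketch of a UTF-8 encoder. The helper name encode_utf8 is hypothetical
and is not the editor's own conversion routine; it only illustrates that code
points above 0xFFFF need a four-byte sequence.

```c
#include <stddef.h>

/*
 * Hypothetical helper: encode code point c as UTF-8 into out, which must
 * hold at least 4 bytes. Returns the number of bytes written, or 0 if c
 * is above 0x10FFFF. Surrogates (0xD800..0xDFFF) are not rejected here.
 */
static size_t encode_utf8(unsigned long c, unsigned char *out)
{
	if (c < 0x80) {			/* 1 byte: 0xxxxxxx */
		out[0] = (unsigned char)c;
		return 1;
	}
	if (c < 0x800) {		/* 2 bytes: 110xxxxx 10xxxxxx */
		out[0] = (unsigned char)(0xC0 | (c >> 6));
		out[1] = (unsigned char)(0x80 | (c & 0x3F));
		return 2;
	}
	if (c < 0x10000) {		/* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
		out[0] = (unsigned char)(0xE0 | (c >> 12));
		out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (c & 0x3F));
		return 3;
	}
	if (c <= 0x10FFFF) {		/* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
		out[0] = (unsigned char)(0xF0 | (c >> 18));
		out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (c & 0x3F));
		return 4;
	}
	return 0;			/* beyond the Unicode range */
}
```

For U+1F607 this yields the four bytes F0 9F 98 87, a sequence that a limit
of 0xffff would never let through as a single character.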

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
torvalds committed Sep 25, 2012
1 parent dbf1a01 commit 8899ed4e1f076d6f3c400731992210513bb629d3
Showing with 3 additions and 1 deletion.
  1. +2 −0 UTF-8-demo.txt
  2. +1 −1 main.c
@@ -210,3 +210,5 @@ Box drawing alignment tests: █
║└─╥─┘║ │╚═╤═╝│ │╘═╪═╛│ │╙─╀─╜│ ┃└─╂─┘┃ ░░▒▒▓▓██ ┊ ┆ ╎ ╏ ┇ ┋ ▏
╚══╩══╝ └──┴──┘ ╰──┴──╯ ╰──┴──╯ ┗━━┻━━┛ ▗▄▖▛▀▜ └╌╌┘ ╎ ┗╍╍┛ ┋ ▁▂▃▄▅▆▇█
▝▀▘▙▄▟
+
+😇
2 main.c
@@ -500,7 +500,7 @@ int execute(int c, int f, int n)
|| (c >= 0x80 && c <= 0xFE)) {
#else
#if VMS || BSD || USG /* 8BIT P.K. */
-	    || (c >= 0xA0 && c <= 0xFFFF)) {
+	    || (c >= 0xA0 && c <= 0x10FFFF)) {
#else
) {
#endif
