Corrected UTF-8 decoder's behavior when encountering a surrogate codepoint #8

Closed
wants to merge 5 commits into from

2 participants

@Ralith

No description provided.

@sionescu
Collaborator

The test (and (> start #xD800) (< start #xDFFF)) is incorrect; it should include both endpoints.

Wups. Thanks for catching that; resolved.

@sionescu
Collaborator

This should be (<= #xD800 start #xDFFF)

Okay. Applied that too.
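
To make the two review comments concrete, here is a minimal sketch comparing the strict test with the inclusive one. The function names `surrogate-strict-p` and `surrogate-p` are hypothetical and exist only for this illustration; they are not part of the patch.

```lisp
;; Hypothetical helper names, for illustration only; not part of the patch.
(defun surrogate-strict-p (code)
  "The test as first written: strict bounds, so both endpoints slip through."
  (and (> code #xD800) (< code #xDFFF)))

(defun surrogate-p (code)
  "The suggested test: inclusive bounds, written as one chained comparison."
  (<= #xD800 code #xDFFF))

(list (surrogate-strict-p #xD800)   ; NIL, endpoint missed
      (surrogate-p #xD800)          ; T
      (surrogate-strict-p #xDFFF)   ; NIL, endpoint missed
      (surrogate-p #xDFFF)          ; T
      (surrogate-p #xD7FF)          ; NIL, just below the range
      (surrogate-p #xE000))         ; NIL, just above the range
```

Common Lisp's `<=` accepts any number of arguments and checks that they are in nondecreasing order, so the chained form says the same thing as two inclusive comparisons joined with `and`.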

@sionescu closed this
Showing 1 changed file with 5 additions and 3 deletions.
  1. +5 −3 src/enc-unicode.lisp
8 src/enc-unicode.lisp
@@ -217,9 +217,11 @@ in 2 to 4 bytes."
          ((and (= u1 #xe0) (< u2 #xa0))
           (handle-error 3 overlong-utf8-sequence))
          ((< u1 #xf0) ; 3 octets
-          (logior (f-ash (f-logand u1 #x0f) 12)
-                  (f-logior (f-ash (f-logand u2 #x3f) 6)
-                            (f-logand u3 #x3f))))
+          (let ((start (f-logior (f-ash (f-logand u1 #x0f) 12)
+                                 (f-ash (f-logand u2 #x3f) 6))))
+            (if (<= #xD800 start #xDFC0)
+                (handle-error 3 character-out-of-range)
+                (logior start (f-logand u3 #x3f)))))
          (t ; 4 octets
           (setq u4 (consume-octet))
           (handle-error-if-icb u4 3)
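
The 3-octet branch in the hunk above can be tried in isolation with a standalone sketch like the following. It is an illustration under stated assumptions, not the library's code: `decode-3-octets` is a hypothetical name, plain `logior`, `logand`, and `ash` replace the `f-` prefixed operators (whose definitions are not shown in this hunk), and a plain `error` stands in for `handle-error`.

```lisp
;; A standalone sketch, not the library's code: DECODE-3-OCTETS is a
;; hypothetical name, and plain ERROR stands in for the decoder's
;; HANDLE-ERROR machinery.
(defun decode-3-octets (u1 u2 u3)
  "Decode one 3-octet UTF-8 sequence, rejecting surrogate code points."
  (let ((start (logior (ash (logand u1 #x0f) 12)
                       (ash (logand u2 #x3f) 6))))
    ;; START has its low six bits clear, so within the surrogate block it
    ;; only takes the values #xD800, #xD840, ..., #xDFC0.
    (if (<= #xD800 start #xDFFF)
        (error "Surrogate code point in UTF-8 input")
        (logior start (logand u3 #x3f)))))

(decode-3-octets #xE2 #x82 #xAC)   ; U+20AC
```

Because `start` is built from the first two octets only, its largest value inside the surrogate block is `#xDFC0`. An upper bound of `#xDFC0` and one of `#xDFFF` therefore reject the same inputs here; the sketch uses `#xDFFF` because it matches the documented end of the surrogate range.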