masakielastic
Contributor

mbstring miscalculates the code point when converting a string from GB18030 to UTF-32. This pull request adds the missing parentheses.

// http://icu-project.org/docs/papers/gb18030.html#h7
// uFirst = 0x10000;
// bFirst = [0x90, 0x30, 0x81, 0x30];

int linear(byte bytes[4]) {
    return ((bytes[0]*10+bytes[1])*126+bytes[2])*10+bytes[3];
}

int mapToUnicode(byte bytes[4]) {
    int lin=linear(bytes);
    for each range {
        if(linear(bFirst)<=lin && lin<=linear(bLast)) {
            // range found
            return uFirst+(lin-linear(bFirst));
        }
    }
    // the byte sequence is not in any known range
    return error;
}
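
For reference, here is a minimal C sketch of the pseudocode above, restricted to the single supplementary-plane range (bFirst = 90 30 81 30, uFirst = 0x10000). It is not the mbstring code touched by this pull request. The function names are hypothetical, and the range end E3 32 9A 35 (U+10FFFF) is my assumption, taken from the GB18030 mapping rather than from this thread.

```c
#include <stdint.h>

/* Linearize a four-byte GB18030 sequence as in the ICU pseudocode.
 * The nested parentheses matter: each step scales the running total
 * before the next byte is added. */
static int32_t gb18030_linear(const uint8_t b[4])
{
    return ((b[0] * 10 + b[1]) * 126 + b[2]) * 10 + b[3];
}

/* Map a four-byte sequence in the supplementary range to a code point.
 * Returns -1 if the bytes are malformed or outside that range.
 * (Hypothetical helper for illustration only.) */
static int32_t gb18030_supplementary_to_unicode(const uint8_t b[4])
{
    static const uint8_t first[4] = {0x90, 0x30, 0x81, 0x30}; /* U+10000 */
    static const uint8_t last[4]  = {0xE3, 0x32, 0x9A, 0x35}; /* U+10FFFF (assumed range end) */
    int32_t lin;

    /* Four-byte form: bytes 1 and 3 in 0x81..0xFE, bytes 2 and 4 in 0x30..0x39. */
    if (b[0] < 0x81 || b[0] > 0xFE || b[1] < 0x30 || b[1] > 0x39 ||
        b[2] < 0x81 || b[2] > 0xFE || b[3] < 0x30 || b[3] > 0x39) {
        return -1;
    }

    lin = gb18030_linear(b);
    if (lin < gb18030_linear(first) || lin > gb18030_linear(last)) {
        return -1; /* not in the supplementary range */
    }
    return 0x10000 + (lin - gb18030_linear(first));
}
```

Under this sketch, the bytes 94 39 FC 36 map to U+1F600, and the first sequence of the range, 90 30 81 30, maps to U+10000.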

@jpauli jpauli added the Bug label Feb 19, 2015
@smalyshev
Contributor

merged

@smalyshev smalyshev closed this Mar 9, 2015