Infinite recursion for utf8proc_decompose_char #82

Closed
jvit opened this issue Aug 29, 2016 · 9 comments

Comments

@jvit
Contributor

jvit commented Aug 29, 2016

In master branch, this code leads to infinite recursion and stack overflow:

int32_t codepoints[10];
int boundclass;
utf8proc_ssize_t decomp_result = utf8proc_decompose_char(64257, codepoints, 10, 
    (utf8proc_option_t)(UTF8PROC_COMPAT | UTF8PROC_DECOMPOSE), &boundclass);

call stack:

utf8proc_decompose_char(939, ...)
seqindex_write_char_decomposed(65535, ...)
utf8proc_decompose_char(933, ...)
seqindex_write_char_decomposed(9062, ...)
utf8proc_decompose_char(939, ...)
seqindex_write_char_decomposed(65535, ...)
utf8proc_decompose_char(933, ...)

In release-1.2, it works fine.

@jiahao
Collaborator

jiahao commented Aug 29, 2016

The code for utf8proc_decompose_char with UTF8PROC_DECOMPOSE was most recently changed in eeebf70#diff-3008d7ea25bc1a2ce66a424402cae328R459 - could you git bisect and verify?

@jvit
Contributor Author

jvit commented Aug 29, 2016

Yes, I verified with git bisect that this error was introduced in that commit.

@ivarne

ivarne commented Aug 29, 2016

cc @benibela

@benibela
Contributor

Weird. It should never call seqindex_write_char_decomposed(65535, ...), since it tests property->decomp_seqindex != UINT16_MAX before calling seqindex_write_char_decomposed(property->decomp_seqindex, ...).

What is UINT16_MAX?

@jvit
Contributor Author

jvit commented Aug 30, 2016

UINT16_MAX is 65535.

I think the problem is in #define UINT16_MAX ~(utf8proc_uint16_t)0.

I don't know why, but 65535 != ~(utf8proc_uint16_t)0 evaluates to 1.

If I use #define UINT16_MAX 65535U everything works fine.

I am using Visual Studio 2015, Update 3. Maybe they have changed something in the compiler with the latest update?
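
The symptom described above can be reproduced in isolation. The following is a minimal sketch, not the utf8proc source: the macro name `SKETCH_UINT16_MAX_BROKEN` and the helper `sentinel_differs` are hypothetical, `uint16_t` from `<stdint.h>` stands in for `utf8proc_uint16_t`, and it assumes a platform where `int` is wider than 16 bits.

```c
#include <stdint.h>

/* Hypothetical name: same shape as the macro quoted in the issue, renamed
 * so that it does not clash with UINT16_MAX from <stdint.h>. */
#define SKETCH_UINT16_MAX_BROKEN (~(uint16_t)0)

/* Mirrors the shape of the guard "decomp_seqindex != UINT16_MAX".
 * Returns 1 when the stored 16-bit value compares unequal to the macro. */
static int sentinel_differs(uint16_t stored)
{
    /* Both operands are promoted to int: stored stays in 0..65535, while
     * the macro evaluates to ~0, which is -1. They can never be equal. */
    return stored != SKETCH_UINT16_MAX_BROKEN;
}
```

Under those assumptions, `sentinel_differs(65535)` returns 1, so a guard of this form would let the sentinel value through, which is consistent with the `seqindex_write_char_decomposed(65535, ...)` frames in the call stack.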

@stevengj
Member

#define UINT16_MAX 65535U seems like an easy fix, can you make a PR?

@stevengj
Member

(Shouldn't we also be using UTF8PROC_UINT16_MAX etc., since UINT16_MAX is already defined in stdint.h?)

@eschnett

eschnett commented Sep 4, 2016

C and C++ apply "integer promotion": Expressions that evaluate to an integer that is smaller than int are automatically converted to int. You are likely missing an explicit conversion to utf8proc_uint16_t after applying the ~ operator. The existing conversion is likely redundant because of integer promotion.
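
The promotion rule and the suggested extra conversion can be sketched as follows. This is an illustration only and not necessarily what the eventual fix looks like: the macro and function names are hypothetical, `uint16_t` stands in for `utf8proc_uint16_t`, and `int` is assumed to be wider than 16 bits.

```c
#include <stdint.h>

/* Hypothetical macro names, for illustration only. */

/* The operand of ~ is promoted to int first, so the result is the int -1. */
#define SKETCH_MAX_PROMOTED (~(uint16_t)0)

/* Converting back to the 16-bit type after ~ keeps only the low 16 bits. */
#define SKETCH_MAX_CAST ((uint16_t)~(uint16_t)0)

/* Value of the macro without the outer conversion. */
static long promoted_value(void)
{
    return (long)SKETCH_MAX_PROMOTED;
}

/* Value of the macro with the outer conversion. */
static long cast_value(void)
{
    return (long)SKETCH_MAX_CAST;
}

/* A guard written against the converted macro recognises the sentinel. */
static int guard_detects_sentinel(uint16_t stored)
{
    return stored == SKETCH_MAX_CAST;
}
```

With these definitions, `promoted_value()` yields -1 and `cast_value()` yields 65535, so `guard_detects_sentinel(65535)` returns 1 while ordinary indices such as 9062 return 0. A literal such as `65535U`, as proposed earlier in the thread, sidesteps the promotion question entirely.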

@stevengj
Member

stevengj commented Sep 4, 2016

Should be fixed by #84.

@stevengj stevengj closed this as completed Sep 4, 2016
bdb added a commit to ProsoftEngineering/core that referenced this issue Nov 12, 2016