Fix segmentation fault #64
I want to add a fallback. Please add |
Sorry, I couldn't understand what that means.
This is |
I think this could confuse users, because it returns Latin-1 letters even though OnigRegexp can handle Unicode. What do you think? |
If string_split returns Latin-1 letters, regexp_split also returns Latin-1 letters.
#ifdef MRUBY_UTF8_STRING
err = onig_new(&rp, (OnigUChar*)ptr, (OnigUChar*)ptr + len, ONIG_OPTION_DEFAULT,
ONIG_ENCODING_UTF8, OnigDefaultSyntax, NULL);
#else
err = onig_new(&rp, (OnigUChar*)ptr, (OnigUChar*)ptr + len, ONIG_OPTION_DEFAULT,
ONIG_ENCODING_ASCII, OnigDefaultSyntax, NULL);
#endif |
OK, I have to confess I didn't understand the Unicode part. I think users do not expect OnigRegexp to be involved in this case:
"あいうえお".split("")
because it doesn't seem to use a Regexp object anywhere. |
Then we will have to raise an error like below.
I just don't like the contradiction between String (without UTF-8) and onig-regexp (which can handle UTF-8).
|
Thank you for the comment. Is it intentional? (mruby-onig-regexp is installed.)
I don't know why the result should be
This test case seems to pass whether or not MRB_UTF8_STRING is defined: mruby-onig-regexp/test/mruby_onig_regexp.rb, line 331 in 071fc4e
|
Yes, I know the current implementation has a contradiction, so I want to fix it.
Because the current implementation does this at https://github.com/mattn/mruby-onig-regexp/blob/master/src/mruby_onig_regexp.c#L833-L840
Let's fix it like below.
|
1 => I agree. I don't understand what the problem means. My goal is this (and fixing the segv).
But that's not your goal, right? |
When MRB_UTF8_STRING is not defined, mruby core and mruby-onig-regexp should NOT handle UTF-8 strings. Agree? So |
Oh, I noticed that mruby-onig-regexp already supports UTF-8, not just
"あいうえお".gsub(/./, "a")
#=> "aaaaa"
"あ".scan(/./)
#=> ["\343\201\202"]
"あ".split(//)
#=> ["\343\201\202"]
This is mruby-onig-regexp's spec, and I respect it. So I understand this behavior.
"あ".split("")
#=> ["\343\201\202"]
This is a unified result. |
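For anyone reading the outputs above: "\343\201\202" is the three UTF-8 bytes of "あ" written as octal escapes, so a one-element array means the character was kept whole rather than split into three one-byte strings. A small sketch that checks this, assuming a UTF-8-aware Ruby such as CRuby rather than an mruby build without MRB_UTF8_STRING:

```ruby
# "あ" (U+3042) occupies three bytes in UTF-8.
ch = "あ"

# Render each byte as an octal escape, the way the outputs above show it.
octal = ch.bytes.map { |b| format("\\%o", b) }.join

puts octal          # prints \343\201\202
puts ch.bytes.size  # prints 3 (bytes)
puts ch.chars.size  # prints 1 (one character on a UTF-8-aware Ruby)
```
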
I rewrote the patch to fix the segv. |
Thanks! |
I think it should delegate to core if the argument is not an OnigRegexp object.
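A rough sketch of that delegation idea, written in plain Ruby so it can be run anywhere. `StubOnigRegexp` and `split_with_fallback` are hypothetical names made up for illustration; the gem itself does this in C, and the regexp branch here simply reuses the built-in `Regexp` as a stand-in for the Onigmo-backed path.

```ruby
# Hypothetical stand-in for OnigRegexp so the sketch runs on plain Ruby.
class StubOnigRegexp
  attr_reader :pattern

  def initialize(source)
    @pattern = Regexp.new(source)
  end
end

# Hypothetical dispatcher: take the regexp-specific path only when the
# separator really is a regexp object, otherwise hand everything to core.
def split_with_fallback(str, sep = nil, limit = 0)
  if sep.is_a?(StubOnigRegexp)
    # Assumed regexp path (the real gem would call into Onigmo here).
    str.split(sep.pattern, limit)
  else
    # Any other argument (String, nil, ...) goes to core String#split.
    str.split(sep, limit)
  end
end

p split_with_fallback("a,b,c", ",")
p split_with_fallback("a1b22c", StubOnigRegexp.new("\\d+"))
```

With this shape, a non-regexp argument never reaches the regexp code at all, which is the property the comment above asks for.
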