ibus-mozc may fail to extract surrounding text correctly #226

Closed
GoogleCodeExporter opened this issue Apr 22, 2015 · 1 comment

Comments

@GoogleCodeExporter

Version: Mozc-1.15.1811.102 (r231)
OS: Ubuntu 12.04

What steps will reproduce the problem?
1. Launch gedit.
2. Make sure the conversion history is cleared and the user dictionary is empty.
3. Make sure Mozc is turned off.
4. Copy and paste '1' into gedit.
5. Hit the Hankaku/Zenkaku key to turn on Mozc.
6. Enter 'ひき' and hit space key to convert it.
7. Hit ESC key.
8. Replace '1' with ' 1' with copy-and-paste.
9. Enter 'ひき' and hit space key to convert it.

What is the expected output?
At step 6, you will see "匹" as the top candidate.
At step 9, you will see "匹" as the top candidate.

What do you see instead?
At step 6, you will see "匹" as the top candidate.
At step 9, you will see "引き" as the top candidate.

Here is the root cause.

https://code.google.com/p/mozc/source/browse/trunk/src/unix/ibus/mozc_engine.cc?r=163#269
>  const uint32 selection_start = min(cursor_pos, anchor_pos);
>  const uint32 selection_length = abs(info->relative_selected_length);
>  info->preceding_text = surrounding_text.substr(0, selection_start);
>  Util::SubString(surrounding_text,
>                  selection_start,
>                  selection_length,
>                  &info->selection_text);
>  info->following_text = surrounding_text.substr(
>      selection_start + selection_length);

|cursor_pos| and |anchor_pos| are counts of Unicode characters, not byte
offsets into the UTF-8 string. However, |info->preceding_text| and
|info->following_text| are extracted with std::string::substr as if
|cursor_pos| and |anchor_pos| were byte offsets. As a result, whenever the
surrounding text contains a multi-byte character, these strings can be cut at
the wrong position and may end up holding an invalid UTF-8 sequence.

Note that |info->selection_text| is correctly initialized with the selected
text.
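
To illustrate the difference between the two kinds of offset, here is a minimal sketch. The helper names (Utf8ByteOffset, Utf8Prefix, Utf8Suffix) are hypothetical and are not part of Mozc; the sketch assumes well-formed UTF-8 input and is not necessarily how the actual fix was implemented.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Mozc API): returns the byte offset of the
// char_index-th Unicode character in a well-formed UTF-8 string, or
// s.size() if the string has fewer characters than char_index.
std::size_t Utf8ByteOffset(const std::string &s, std::size_t char_index) {
  std::size_t pos = 0;
  while (pos < s.size() && char_index > 0) {
    ++pos;  // Step over the lead byte of the current character.
    // Step over its continuation bytes, which all look like 10xxxxxx.
    while (pos < s.size() &&
           (static_cast<unsigned char>(s[pos]) & 0xC0) == 0x80) {
      ++pos;
    }
    --char_index;
  }
  return pos;
}

// The first char_count characters of s (counted in characters, not bytes).
std::string Utf8Prefix(const std::string &s, std::size_t char_count) {
  return s.substr(0, Utf8ByteOffset(s, char_count));
}

// Everything from the char_index-th character to the end of s.
std::string Utf8Suffix(const std::string &s, std::size_t char_index) {
  return s.substr(Utf8ByteOffset(s, char_index));
}
```

For a string made of one 3-byte character followed by 'a', Utf8Prefix(s, 1) returns all three bytes of the first character, whereas s.substr(0, 1) returns only its lead byte, which is not valid UTF-8 on its own.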

It should also be noted that there is a second concern: we never verify that
|selection_start| and |selection_start + selection_length| are within the
range of |surrounding_text|. This is problematic because it means we are using
parameters passed from an external program (the IBus server in this case)
without any range check.
In fact, the crash reported in Red Hat Bug 1100974 would very likely have been
avoided if we had verified these parameters.
https://bugzilla.redhat.com/show_bug.cgi?id=1100974
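
A range check of the kind described above could look like the following sketch. Utf8Length and IsSelectionInRange are hypothetical names introduced here for illustration, not Mozc functions, and the sketch assumes the text is well-formed UTF-8 and that both parameters are counted in Unicode characters.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Mozc API): counts Unicode characters in a
// well-formed UTF-8 string by counting every byte that is not a
// continuation byte (10xxxxxx).
std::size_t Utf8Length(const std::string &s) {
  std::size_t count = 0;
  for (const unsigned char c : s) {
    if ((c & 0xC0) != 0x80) {
      ++count;
    }
  }
  return count;
}

// Returns true only if the selection, given in characters, lies entirely
// within text. The subtraction form avoids overflow in start + length.
bool IsSelectionInRange(const std::string &text,
                        std::uint32_t selection_start,
                        std::uint32_t selection_length) {
  const std::size_t length = Utf8Length(text);
  if (selection_start > length) {
    return false;
  }
  return selection_length <= length - selection_start;
}
```

A caller would reject or ignore the surrounding-text update when IsSelectionInRange returns false, instead of passing the untrusted offsets on to substr.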

Original issue reported on code.google.com by yukawa@google.com on 21 Jun 2014 at 1:51

@GoogleCodeExporter

Should be fixed in Mozc-1.15.1813.102 (r233).

Original comment by yukawa@google.com on 21 Jun 2014 at 4:13

  • Changed state: Fixed
