Crash in shortcuts.run when unicode character falls on chunk boundary #119

lubomir · 2019-10-01T07:43:50Z

shortcuts.run reads the command's output in chunks. If a multibyte unicode character happens to be split between two chunks, decoding the first chunk fails and run crashes.

This bug triggered a failure in Fedora 31 updates-testing compose: https://pagure.io/releng/failed-composes/issue/237
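To illustrate the failure mode, here is a minimal Python 3 sketch. It does not use Kobo's code: decode_chunks_naively is a hypothetical stand-in for any reader that decodes each chunk on its own, and the chunk boundary is chosen by hand so that it falls inside a two-byte character.

```python
text = "Lubomír"
data = text.encode("utf-8")  # 'í' is encoded as two bytes: 0xC3 0xAD

# Hand-picked boundary: the first chunk ends after 0xC3, in the middle of 'í'.
chunks = [data[:6], data[6:]]


def decode_chunks_naively(chunks):
    """Hypothetical reader that decodes every chunk independently."""
    return "".join(chunk.decode("utf-8") for chunk in chunks)


try:
    decode_chunks_naively(chunks)
except UnicodeDecodeError as exc:
    # The first chunk ends with a lone lead byte, so decoding it fails.
    print("decoding failed:", exc.reason)
```

Decoding the same bytes as a single chunk works, which is why the crash only shows up when the boundary happens to land inside a character.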


Relates: release-engineering#119

lubomir · 2019-10-02T12:53:09Z

An easy workaround is to run the command with universal_newlines=True. This decodes the output at the Python stdlib level, so the wrapper only ever sees unicode. It would not work if the command does not produce UTF-8 output, but in that case the decoding in shortcuts.run would not work either.
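A rough sketch of the idea using plain subprocess from the Python 3 stdlib rather than shortcuts.run itself. The run_text helper, the chunk size, and the child command are made up for illustration, and the explicit encoding="utf-8" is my addition so the example does not depend on the locale.

```python
import subprocess
import sys


def run_text(cmd, chunk_size=4):
    """Hypothetical helper: read a command's stdout in text mode, chunk by chunk."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode: the stdlib decodes for us
        encoding="utf-8",         # assumption: the command emits UTF-8
    )
    pieces = []
    while True:
        # In text mode read() counts characters, so a chunk never ends
        # in the middle of a multibyte sequence.
        chunk = proc.stdout.read(chunk_size)
        if not chunk:
            break
        pieces.append(chunk)
    proc.stdout.close()
    proc.wait()
    return "".join(pieces)


# Child process that writes UTF-8 bytes containing a two-byte character.
CHILD = "import sys; sys.stdout.buffer.write('Lubom\\u00edr\\n'.encode('utf-8'))"
output = run_text([sys.executable, "-c", CHILD])
```

Since the pipe is already wrapped in a text stream, the caller has nothing left to decode and the boundary problem does not arise at this level.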

When a chunk ends with an incomplete multibyte sequence, process the data only up to the start of that sequence. When the next chunk is read, prepend the leftover bytes to it. This should complete the sequence, and processing should continue normally.
Fixes: release-engineering#119
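The approach described in this commit message can be sketched roughly as follows. This is not the code from the pull request; incomplete_tail_length and decode_chunks are hypothetical names, and the sketch assumes Python 3 and UTF-8 input.

```python
def incomplete_tail_length(data):
    """Return how many trailing bytes form an unfinished UTF-8 sequence."""
    # A UTF-8 character is at most 4 bytes, so an unfinished one is at most 3.
    for back in range(1, min(3, len(data)) + 1):
        byte = data[-back]
        if byte & 0xC0 == 0x80:
            continue  # continuation byte, keep looking for the lead byte
        if byte >= 0xF0:
            need = 4
        elif byte >= 0xE0:
            need = 3
        elif byte >= 0xC0:
            need = 2
        else:
            need = 1  # plain ASCII
        return back if need > back else 0
    return 0


def decode_chunks(chunks):
    """Decode byte chunks, carrying incomplete sequences over to the next chunk."""
    leftover = b""
    pieces = []
    for chunk in chunks:
        data = leftover + chunk  # prepend what was held back last time
        cut = len(data) - incomplete_tail_length(data)
        pieces.append(data[:cut].decode("utf-8"))
        leftover = data[cut:]
    if leftover:
        # The stream ended mid-character; decoding raises UnicodeDecodeError.
        pieces.append(leftover.decode("utf-8"))
    return "".join(pieces)


text = "Lubomír €"
data = text.encode("utf-8")
# Feed the bytes one at a time, so every character boundary gets split.
result = decode_chunks([data[i:i + 1] for i in range(len(data))])
```

The stdlib offers the same buffering through codecs.getincrementaldecoder("utf-8"), which may be a simpler option where it fits.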

Fix #119

This will mean the output is returned as unicode (decoded as UTF-8). Thus Kobo will not have to do any decoding. This should work around possible errors with breaking multibyte unicode character sequences into different chunks.
Relates: https://pagure.io/releng/failed-composes/issue/237
Relates: release-engineering/kobo#119
Signed-off-by: Lubomír Sedlář <lsedlar@redhat.com>

lubomir added a commit to lubomir/kobo that referenced this issue Oct 1, 2019

Add failing test for unicode character on chunk boundary

b2a8250

Relates: release-engineering#119

lubomir mentioned this issue Oct 1, 2019

Fix #119 #120

Merged

lubomir added a commit to lubomir/kobo that referenced this issue Oct 2, 2019

Add failing test for unicode character on chunk boundary

9efaeeb

Relates: release-engineering#119

lubomir added a commit to lubomir/kobo that referenced this issue Oct 2, 2019

Add failing test for unicode character on chunk boundary

750a0eb

Relates: release-engineering#119

rohanpm mentioned this issue Oct 3, 2019

Fix unicode issues with shortcuts.run #127

Closed

rohanpm closed this as completed in #120 Jan 10, 2020

rohanpm added a commit that referenced this issue Jan 10, 2020

Merge pull request #120 from lubomir/unicode-boundary

047e9ed

Fix #119
