Crash in shortcuts.run when unicode character falls on chunk boundary #119

Closed
lubomir opened this issue Oct 1, 2019 · 1 comment · Fixed by #120

lubomir commented Oct 1, 2019

shortcuts.run reads the command's output in chunks. If a multibyte unicode character happens to be split between two chunks, decoding the chunk fails and run crashes.
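
A minimal sketch of the failure mode, assuming Python 3 and UTF-8 output. The helper `decode_chunks_naively` and the sample payload are hypothetical and only mimic a reader that decodes every chunk on its own; this is not kobo's code.

```python
def decode_chunks_naively(data, buffer_size):
    """Decode each chunk independently, with no carry-over between chunks."""
    parts = []
    for start in range(0, len(data), buffer_size):
        chunk = data[start:start + buffer_size]
        # Raises UnicodeDecodeError if the chunk ends in the middle of a character.
        parts.append(chunk.decode("utf-8"))
    return "".join(parts)


# "é" (U+00E9) is two bytes in UTF-8, so the payload is b"ab\xc3\xa9cd".
payload = "ab\u00e9cd".encode("utf-8")

# Chunks of 2 bytes: b"ab", b"\xc3\xa9", b"cd". No character is split.
print(decode_chunks_naively(payload, 2))

# Chunks of 3 bytes: the first chunk is b"ab\xc3", which ends mid-character.
try:
    decode_chunks_naively(payload, 3)
except UnicodeDecodeError as exc:
    print("decoding failed:", exc)
```

Whether the crash shows up depends only on where the chunk boundary lands, which is why the same command can succeed on one run and fail on another.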

This bug triggered a failure in Fedora 31 updates-testing compose: https://pagure.io/releng/failed-composes/issue/237

lubomir commented Oct 2, 2019

An easy workaround is to run the command with universal_newlines=True. The output is then decoded at the Python stdlib level, and the wrapper only ever sees unicode. This would not work if the command does not emit UTF-8, but in that case the decoding in shortcuts.run would not work either.
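
A rough illustration of why the workaround helps, using plain `subprocess` rather than kobo (how `shortcuts.run` forwards the argument is not shown in this thread). The child command, the `read_text_output` helper, and the pinned `encoding="utf-8"` are assumptions made for the sketch; it needs Python 3.6 or newer.

```python
import subprocess
import sys

# Hypothetical child process: writes 5000 copies of a two-byte UTF-8 character.
CHILD = r"import sys; sys.stdout.buffer.write('\u00e9'.encode('utf-8') * 5000)"


def read_text_output(buffer_size=4095):
    """Read the child's output in chunks while the stdlib does the decoding."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode: the pipe yields str, not bytes
        encoding="utf-8",  # assumption: pinned so the sketch ignores the locale
    )
    chunks = []
    while True:
        # In text mode the size counts characters, so no character is ever split.
        chunk = proc.stdout.read(buffer_size)
        if not chunk:
            break
        chunks.append(chunk)
    proc.stdout.close()
    proc.wait()
    return "".join(chunks)


print(len(read_text_output()))
```

With a byte pipe, an odd chunk size such as 4095 would cut one of the two-byte characters in half; in text mode the caller never handles raw bytes, so there is nothing left to decode.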

lubomir added a commit to lubomir/kobo that referenced this issue Oct 2, 2019
When there is an incomplete multibyte sequence, process the data only
until the start of this sequence. When next chunk is read, prepend the
left overs to it. This should complete the sequence and processing
should continue normally.

Fixes: release-engineering#119
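
The approach described in the commit message above can be sketched roughly as follows. This is not the code from the linked commit: `incomplete_tail_length` and `decode_chunks` are hypothetical names, and the sketch assumes Python 3 and UTF-8 input.

```python
def incomplete_tail_length(data):
    """Return how many trailing bytes belong to an unfinished UTF-8 sequence."""
    # A UTF-8 character is at most 4 bytes, so an unfinished one is at most 3.
    for back in range(1, min(3, len(data)) + 1):
        byte = data[-back]
        if byte & 0xC0 == 0x80:
            continue  # continuation byte: keep looking for its lead byte
        if byte >= 0xF0:
            needed = 4
        elif byte >= 0xE0:
            needed = 3
        elif byte >= 0xC0:
            needed = 2
        else:
            needed = 1  # plain ASCII
        return back if needed > back else 0
    return 0


def decode_chunks(chunks):
    """Decode an iterable of byte chunks, carrying split characters over."""
    leftover = b""
    for chunk in chunks:
        data = leftover + chunk  # prepend what the previous chunk left behind
        cut = len(data) - incomplete_tail_length(data)
        leftover = data[cut:]  # held back until the next chunk arrives
        yield data[:cut].decode("utf-8")
    if leftover:
        # The stream ended mid-character; this sketch substitutes U+FFFD.
        yield leftover.decode("utf-8", errors="replace")


payload = "ab\u00e9cd \u2603".encode("utf-8")
chunks = [payload[i:i + 3] for i in range(0, len(payload), 3)]
print("".join(decode_chunks(chunks)))
```

The standard library offers the same carry-over behaviour through `codecs.getincrementaldecoder("utf-8")`, which may be simpler when no per-chunk processing of the raw bytes is needed.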
wangqingfree pushed a commit to wangqingfree/pungi that referenced this issue Jun 29, 2021
This will mean the output is returned as unicode (decoded as UTF-8).
Thus Kobo will not have to do any decoding. This should work around
possible errors with breaking multibyte unicode character sequences into
different chunks.

Relates: https://pagure.io/releng/failed-composes/issue/237
Relates: release-engineering/kobo#119
Signed-off-by: Lubomír Sedlář <lsedlar@redhat.com>