UnicodeDecodeError when running command #4

Closed
oafilipoai opened this issue Sep 18, 2014 · 4 comments

@oafilipoai

I get the following when I run a command from a dodo.py file:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/utils/python/2.7.6/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/home/utils/python/2.7.6/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/utils/python/2.7.6/lib/python2.7/site-packages/doit/action.py", line 142, in print_process_output
    line = input.readline().decode('utf-8')
  File "/home/utils/python/2.7.6/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa9 in position 10: invalid start byte

It seems that doit cannot handle output that is not UTF-8 encoded. In my case, the tool I am running prints the copyright symbol to stdout.
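
The failure can be reproduced without doit. This is only a minimal sketch: the byte string below is a made-up stand-in for the tool's real output, chosen so that the Latin-1 copyright byte 0xa9 sits at offset 10, as in the traceback.

```python
# Hypothetical sample line; the real tool output is not shown in this report.
# 0xa9 is the copyright sign in ISO-8859-1, but it is not a valid UTF-8 start byte.
raw_line = b"Copyright \xa9 2014\n"

try:
    # Strict UTF-8 decoding, the same call that fails in the traceback
    raw_line.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Shows the codec name, the offending byte and its position
print(error)
```

The same exception is raised whenever the decoded line contains a byte sequence that is not valid UTF-8, regardless of which command produced it.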

I attached a sample of the failing setup. Please change the extension to *.tgz and unpack it. I could not figure out how to attach a non-image file to the issue report.

files

@schettino72
Member

I made 2 changes:

1- You can explicitly specify the encoding of your process output, like:

from doit.action import CmdAction

    return { ...
                'actions': [CmdAction('cat sample.txt', encoding='iso-8859-1')]}

2- doit doesn't use the output of the process (it only captures and redisplays it).
I suspect you don't care much about it, so I changed the default behaviour of
the process output decoding from strict to replace.

So now by default you will get � instead of errors (even the cat command does the same).
If you want strict decoding, you can request it through the decode_error parameter of CmdAction.
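
The effect of the two options can be sketched at the level of plain byte decoding. This is an illustration only, assuming that `encoding` and `decode_error` are passed straight to the decode call on each output line; the sample bytes are hypothetical.

```python
# Hypothetical output line containing the ISO-8859-1 copyright byte
raw_line = b"Copyright \xa9 2014\n"

# New default (decode_error 'replace'): invalid bytes become U+FFFD
replaced = raw_line.decode("utf-8", errors="replace")

# Explicit encoding (encoding='iso-8859-1'): the byte decodes to the real character
latin1 = raw_line.decode("iso-8859-1")

print(repr(replaced))
print(repr(latin1))
```

With the replace handler the output stays readable and nothing is raised; with the correct encoding the copyright sign itself is preserved.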

Check the test cases from the commit.

@schettino72 added this to the 0.27 milestone Sep 19, 2014

@oafilipoai
Author

I verified that the problem is fixed in the latest check-in.

@schettino72
Member

@oafilipoai, so are you specifying the encoding or replacing errors with a question mark?

@oafilipoai
Author

Just replaced the errors. I do not care about the missing characters as long as the output is readable and the script does not crash.
