Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.5] Allow oc_ modules to pass unicode results #4098

Merged

Conversation

mtnbikenc
Copy link
Member

@mtnbikenc mtnbikenc commented May 4, 2017

Backports #4089

Fixes bug 1444806
Fixes bug 1446471

@mtnbikenc mtnbikenc self-assigned this May 4, 2017
@mtnbikenc mtnbikenc requested review from sdodson and kwoodson May 4, 2017 18:46
@mtnbikenc
Copy link
Member Author

aos-ci-test

@enj
Copy link
Contributor

enj commented May 4, 2017

We need some test data that has unicode characters to actually exercise this.

@enj
Copy link
Contributor

enj commented May 4, 2017

Also, it does not really work:

>>> import subprocess
>>> proc = subprocess.Popen(['cat', '/tmp/dev_random_data.txt'], stdout=subprocess.PIPE)
>>> stdout, stderr = proc.communicate()
>>> stdout.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x95 in position 1: invalid start byte

We need to actually deal with errors:

decode(...)
    S.decode([encoding[,errors]]) -> object
    
    Decodes S using the codec registered for encoding. encoding defaults
    to the default encoding. errors may be given to set a different error
    handling scheme. Default is 'strict' meaning that encoding errors raise
    a UnicodeDecodeError. Other possible values are 'ignore' and 'replace'
    as well as any other name registered with codecs.register_error that is
    able to handle UnicodeDecodeErrors.

@openshift-bot
Copy link

success: "aos-ci-jenkins/OS_3.5_NOT_containerized, aos-ci-jenkins/OS_3.5_NOT_containerized_e2e_tests" for 3fdf282 (logs)

@openshift-bot
Copy link

success: "aos-ci-jenkins/OS_3.5_containerized, aos-ci-jenkins/OS_3.5_containerized_e2e_tests" for 3fdf282 (logs)

@ashcrow
Copy link
Member

ashcrow commented May 5, 2017

When it comes to random data I think ignore is fine. When it comes to data we expect to be valid strings/unicode I would expect an error to raise if it is, in fact, not valid.

@mtnbikenc @enj thoughts?

@abutcher
Copy link
Member

abutcher commented May 5, 2017

I don't think we'll encounter any random data so this should be sufficient.

>>> beef = "ÐÊÄÐ-ßÈËF"
>>> beef.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
>>> beef.decode('utf-8')
>>> print beef.decode('utf-8')
ÐÊÄÐ-ßÈËF

@ashcrow
Copy link
Member

ashcrow commented May 5, 2017

Added the unicode used in @abutcher's example to a file, and used subprocess to check out decode('utf-8'):

In [1]: import subprocess

In [2]: p = subprocess.Popen(["cat", "data"], stdout=subprocess.PIPE)

In [3]: o, e = p.communicate()

In [4]: o
Out[4]: b'\xc3\x90\xc3\x8a\xc3\x84\xc3\x90-\xc3\x9f\xc3\x88\xc3\x8bF\n'

In [5]: o.decode('utf-8')
Out[5]: 'ÐÊÄÐ-ßÈËF\n'

In [6]: 

@abutcher abutcher merged commit 5b874ad into openshift:release-1.5 May 5, 2017
@mtnbikenc mtnbikenc deleted the release-1.5-decode-unicode branch May 9, 2017 14:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants