Unicode bug while using fetch module #4014

Closed
mkhattab opened this issue Sep 4, 2013 · 11 comments
Labels
bug This issue/PR relates to a bug.

Comments

@mkhattab

mkhattab commented Sep 4, 2013

First and foremost, here's the stack trace (running off devel@2696135b):

TASK: [fetch static files] ****************************************************
fatal: [192.168.33.10] => Traceback (most recent call last):
  File "/Users/mustafa/.virtualenvs/sandbox/lib/python2.7/site-packages/ansible/runner/__init__.py", line 386, in _executor
    exec_rc = self._executor_internal(host, new_stdin)
  File "/Users/mustafa/.virtualenvs/sandbox/lib/python2.7/site-packages/ansible/runner/__init__.py", line 475, in _executor_internal
    return self._executor_internal_inner(host, self.module_name, self.module_args, inject, port, complex_args=complex_args)
  File "/Users/mustafa/.virtualenvs/sandbox/lib/python2.7/site-packages/ansible/runner/__init__.py", line 650, in _executor_internal_inner
    result = handler.run(conn, tmp, module_name, module_args, inject, complex_args)
  File "/Users/mustafa/.virtualenvs/sandbox/lib/python2.7/site-packages/ansible/runner/action_plugins/fetch.py", line 82, in run
    remote_md5 = utils.md5s(remote_data)
  File "/Users/mustafa/.virtualenvs/sandbox/lib/python2.7/site-packages/ansible/utils/__init__.py", line 397, in md5s
    digest.update(data.encode('utf-8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128)

I think the bug was introduced by commit faf82bf, after which digest.update(data.encode('utf-8')) (L397) is called and the error is thrown.

I'm not sure if doing data.encode('utf-8', 'ignore') would change the md5 sum. That might introduce other issues.
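
For anyone reading along, here is a minimal sketch of why an encode call ends up raising a *decode* error. In Python 2, calling .encode('utf-8') on a byte str first decodes it implicitly with the ASCII codec. The snippet spells that implicit step out in Python 3 syntax. The two-byte sample is the standard gzip magic number, used as an assumed stand-in for the start of the fetched file, and the helper name is hypothetical.

```python
# Assumed stand-in for the start of the fetched file: the gzip magic number.
remote_data = b"\x1f\x8b"


def implicit_py2_encode(data):
    """Mimic Python 2's str.encode('utf-8') on a byte string:
    decode as ASCII first, then encode the resulting text as UTF-8."""
    return data.decode("ascii").encode("utf-8")


try:
    implicit_py2_encode(remote_data)
    error_message = None
except UnicodeDecodeError as exc:
    # The failure comes from the hidden decode step, not from the encode.
    error_message = str(exc)

print(error_message)
```

The first byte (0x1f) is valid ASCII and the second (0x8b) is not, which matches the "position 1" in the traceback above.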

@mpdehaan
Contributor

mpdehaan commented Sep 4, 2013

Thanks for the heads up.

I'd say "exposed" not "introduced" maybe :)


@mkhattab
Author

mkhattab commented Sep 4, 2013

Exposed sounds better. :)

Perhaps it would be better to just pass the variable data directly without trying to encode it. From my quick googling about this issue, it seems like binary data and unicode don't mix well, especially when encoding data to base64.

@mkhattab
Author

mkhattab commented Sep 4, 2013

But I guess it would break things if we just passed data, according to commit 7a8b27f.

@jimi-c
Member

jimi-c commented Sep 4, 2013

If you add the 'ignore', does it fail the md5 check? That would be my only concern with modifying this. (edited: didn't realize bytearray just used str.encode() under the hood).

@mkhattab
Author

mkhattab commented Sep 6, 2013

I'll go ahead and try this. I think it would fail since I'm fetching a binary file (gzipped tarball).

@mkhattab
Author

mkhattab commented Sep 6, 2013

Update: as I expected, it doesn't work.

TASK: [fetch tar file] ******************************************************** 
failed: [192.168.33.10] => {"dest": "/Users/mustafa/sandbox/ansibletest/tmp/192.168.33.10/tmp/temp.tar.gz", "failed": true, "file": "/tmp/temp.tar.gz", "md5sum": "6747c85f736cfbf62df0177c649a1b7f"}
msg: md5 mismatch

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/mustafa/test.retry

192.168.33.10              : ok=1    changed=0    unreachable=0    failed=1   
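
This is not necessarily what happened in the run above, but here is a sketch of why any lossy conversion before hashing produces a mismatch. It uses hashlib directly rather than the Ansible code path. The sample bytes are made up, and the explicit ASCII decode with 'ignore' is an assumed model of dropping undecodable bytes.

```python
import hashlib

# Made-up binary sample: gzip magic number followed by a few more bytes.
original = b"\x1f\x8b\x08\x00data"

# 'ignore' discards every byte the ASCII codec cannot decode (here 0x8b).
lossy = original.decode("ascii", "ignore").encode("utf-8")

md5_original = hashlib.md5(original).hexdigest()
md5_lossy = hashlib.md5(lossy).hexdigest()

print(len(original), len(lossy))
print(md5_original == md5_lossy)
```

The lossy copy is one byte shorter, so its digest can no longer match the digest computed on the remote side from the untouched file.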

@jsmartin
Contributor

jsmartin commented Sep 6, 2013

I'm hitting a wall with this one too; I'm also fetching a gz'd tarball. Trying to understand what the code is doing -- what exactly is it attempting to utf-8 encode?

@jimi-c
Member

jimi-c commented Sep 6, 2013

It's trying to encode the data being fed to the md5 module to prevent it from failing, according to the original patch. The problem is that the same invalid characters that can trip up the md5 module also seem to trip up str.encode().

@jimi-c
Member

jimi-c commented Sep 6, 2013

This patch seems to fix the issue for me, if you guys would like to test it:

diff --git a/lib/ansible/utils/__init__.py b/lib/ansible/utils/__init__.py
index b5f1886..de6ab6b 100644
--- a/lib/ansible/utils/__init__.py
+++ b/lib/ansible/utils/__init__.py
@@ -393,8 +393,9 @@ def merge_hash(a, b):
 def md5s(data):
     ''' Return MD5 hex digest of data. '''
 
+    buf = StringIO.StringIO(data)
     digest = _md5()
-    digest.update(data.encode('utf-8'))
+    digest.update(buf.read())
     return digest.hexdigest()
 
 def md5(filename):

@jimi-c jimi-c closed this as completed in 4e9dee6 Sep 6, 2013
@mkhattab
Author

mkhattab commented Sep 6, 2013

Thanks @jimi1283 -- patch works for me.

@analytically

This breaks ansible for me:

  File "/usr/local/lib/python2.7/dist-packages/ansible/runner/__init__.py", line 382, in _executor
    exec_rc = self._executor_internal(host, new_stdin)
  File "/usr/local/lib/python2.7/dist-packages/ansible/runner/__init__.py", line 471, in _executor_internal
    return self._executor_internal_inner(host, self.module_name, self.module_args, inject, port, complex_args=complex_args)
  File "/usr/local/lib/python2.7/dist-packages/ansible/runner/__init__.py", line 654, in _executor_internal_inner
    result = handler.run(conn, tmp, module_name, module_args, inject, complex_args)
  File "/usr/local/lib/python2.7/dist-packages/ansible/runner/action_plugins/template.py", line 91, in run
    local_md5 = utils.md5s(resultant)
  File "/usr/local/lib/python2.7/dist-packages/ansible/utils/__init__.py", line 409, in md5s
    digest.update(buf.read())
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufb01' in position 2671: ordinal not in range(128)
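
The two tracebacks in this thread are mirror images: fetch hands md5s raw bytes, while template hands it text containing a non-ASCII character (u'\ufb01'). Below is a hedged sketch of a helper that copes with both by encoding only when the input is text. It is written in Python 3 syntax for illustration and is not the change that landed in Ansible. The name md5s_sketch is hypothetical and UTF-8 is an assumed encoding choice.

```python
import hashlib


def md5s_sketch(data):
    """Return the MD5 hex digest of bytes or text (hypothetical helper)."""
    if isinstance(data, str):
        # Text has to become bytes before hashing; UTF-8 is an assumption.
        data = data.encode("utf-8")
    # Bytes are hashed exactly as received, so binary files stay intact.
    return hashlib.md5(data).hexdigest()


binary_digest = md5s_sketch(b"\x1f\x8b\x08\x00")  # raw bytes, e.g. a gzip header
text_digest = md5s_sketch("con\ufb01g")           # text containing U+FB01
```

Branching on the type avoids both failure modes: binary input never goes through a text codec, and text is never passed to the hash function unencoded.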

@ansibot ansibot added bug This issue/PR relates to a bug. and removed bug_report labels Mar 6, 2018
@ansible ansible locked and limited conversation to collaborators Apr 24, 2019