This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

UnicodeDecodeError: 'utf-8' codec can't decode byte #1

Closed
pascalcaron opened this issue Jul 2, 2021 · 2 comments

@pascalcaron

Hello,

When I run the script, it fails with this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte

python main.py -u **** -p ****** -s encoded -o decoded


descended into encoded
uploading 3 files...Traceback (most recent call last):
  File "main.py", line 188, in <module>
    process_files(session, dir, dest, f)
  File "main.py", line 130, in process_files
    res = upload(session, dir, phpfiles)
  File "main.py", line 96, in upload
    files=upload)
  File "/root/myproject/env/lib/python3.6/site-packages/requests/sessions.py", line 590, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/root/myproject/env/lib/python3.6/site-packages/requests/sessions.py", line 528, in request
    prep = self.prepare_request(req)
  File "/root/myproject/env/lib/python3.6/site-packages/requests/sessions.py", line 466, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/root/myproject/env/lib/python3.6/site-packages/requests/models.py", line 319, in prepare
    self.prepare_body(data, files, json)
  File "/root/myproject/env/lib/python3.6/site-packages/requests/models.py", line 507, in prepare_body
    (body, content_type) = self._encode_files(files, data)
  File "/root/myproject/env/lib/python3.6/site-packages/requests/models.py", line 159, in _encode_files
    fdata = fp.read()
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 1709: invalid start byte
@ip-rw
Owner

ip-rw commented Jul 2, 2021

What encoding are your files? Python 3 expects files opened in text mode to match the default encoding (UTF-8 here) unless told otherwise. This didn’t throw an error earlier because I deliberately open the files in binary mode to check that they’re IonCube’d, which sidesteps any decoding errors. I just checked, and requests seems to recommend we do the same thing, so could you try changing the relevant line in upload(...) to:

full = open(os.path.join(dir, file), 'rb')

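or you could try codecs.open. Note that it does not detect the file encoding: without an explicit encoding argument it returns the same binary file object as the built-in open, so it only helps here because of the 'rb' mode (Python’s Unicode handling has always been a bit jumpy and codecs never seemed to help):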

import codecs
...
full = codecs.open(os.path.join(dir, file), 'rb')

If either change fixes it, that’s good enough to close this. Otherwise you’ll need to fix your files or specify the correct encoding, and I should probably add some better error handling.
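To illustrate why binary mode avoids the error, here is a minimal sketch that is not taken from main.py. The sample bytes and the file name sample.php are made up for the demonstration; the only thing borrowed from the traceback is the offending byte 0xbf, which cannot start a UTF-8 sequence.

```python
import os
import tempfile

# Hypothetical sample: 0xbf is a continuation byte, so it is invalid
# as the first byte of a UTF-8 sequence.
payload = b"<?php //" + b"\xbf" + b" encoded payload"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.php")
    with open(path, "wb") as fh:
        fh.write(payload)

    # Text mode: read() has to decode the bytes, so it raises,
    # just like fp.read() inside requests in the traceback.
    try:
        with open(path, "r", encoding="utf-8") as fh:
            fh.read()
        text_mode_error = None
    except UnicodeDecodeError as exc:
        text_mode_error = exc

    # Binary mode: the bytes come back untouched and nothing is decoded.
    with open(path, "rb") as fh:
        raw = fh.read()
```

Since the upload only needs to ship the bytes to the server, there is no reason to decode them at all, which is why 'rb' looks like the simpler fix.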

At the very least, the code should open the uploads in 'rb' mode, catch any exception, and add the files in that batch to the list of files that failed to unpack. Because of the way requests seems to work, we’d need to check the files ourselves in upload(...) if we wanted to fix it properly. Pull requests are always welcome, or I’ll make the updates myself a little later.
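A rough sketch of what that handling could look like, under some assumptions: upload_batch, url, names, failed, and the form field name "files[]" are all hypothetical and are not the names used in main.py. The session argument is anything with a requests-style post(url, files=...) method, such as a requests.Session.

```python
import os
from contextlib import ExitStack


def upload_batch(session, url, directory, names, failed):
    """Post one batch of files; on any error, mark the whole batch as failed.

    All parameter names and the "files[]" field name are assumptions made
    for this sketch, not taken from main.py.
    """
    try:
        with ExitStack() as stack:
            # Binary mode: the bytes are sent as-is, so nothing is decoded.
            files = [
                (
                    "files[]",
                    (name, stack.enter_context(
                        open(os.path.join(directory, name), "rb"))),
                )
                for name in names
            ]
            return session.post(url, files=files)
    except Exception:
        # Deliberately broad: one bad batch should not abort the whole run.
        failed.extend(names)
        return None
```

The ExitStack closes every file that was opened, even if a later open() or the post itself raises. Recording the whole batch as failed is coarse, but it matches the idea of adding that batch to the failed list rather than crashing.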

@ip-rw
Owner

ip-rw commented Jul 2, 2021

try now

@ip-rw ip-rw closed this as completed Jul 5, 2021