File name encoding cannot be set when uploading files #4218

Closed
fym0121 opened this issue Aug 5, 2017 · 6 comments

@fym0121

fym0121 commented Aug 5, 2017

Summary.

The server uses the windows-1251 encoding, and my program's source file is UTF-8. When I upload a file, the file name the server receives is garbled. I cannot pass the file name encoded as windows-1251 bytes, because that raises an exception.

Expected Result

The file name received by the server is correct.

Actual Result

The file name received by the server is garbled.

Reproduction Steps

# Server can receive fym_Пуск
data["RETURN"] = 'fym_Пуск'.encode('windows-1251')

# Server cannot receive fym_Пуск.pdf.
# Passing 'fym_Пуск.pdf'.encode('windows-1251') as the file name raises an exception.
f = {
    'FILE_LNK': (
        'fym_Пуск.pdf',
        open('fym_Пуск.pdf', 'rb'),
        'application/pdf',
    )
}

r = self.session.post(url, files=f, data=data)
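
One common way this kind of garbling arises can be sketched with the standard library alone. This assumes the server decodes every incoming byte as windows-1251, which the report does not confirm, and it ignores any extra escaping the client may apply to the file name on the wire. The variable names are illustrative only.

```python
name = 'fym_Пуск'

# Bytes as sent in the RETURN form field of the reproduction steps.
cp1251_bytes = name.encode('windows-1251')
# Bytes a UTF-8 encoded name would carry instead.
utf8_bytes = name.encode('utf-8')

# Assumption: the server decodes everything as windows-1251.
decoded_ok = cp1251_bytes.decode('windows-1251')
# Each Cyrillic letter is two bytes in UTF-8, so it comes back as two
# unrelated characters when read as single-byte windows-1251.
decoded_garbled = utf8_bytes.decode('windows-1251')

print(decoded_ok == name)       # the windows-1251 bytes round-trip
print(decoded_garbled == name)  # the UTF-8 bytes do not
```

If the server behaves this way, any client that can only send UTF-8 file names will produce names that look corrupted on the server side.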

System Information

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "implementation": {
    "name": "CPython",
    "version": "3.6.1"
  },
  "platform": {
    "release": "7",
    "system": "Windows"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.18.1"
  },
  "system_ssl": {
    "version": "100020bf"
  },
  "urllib3": {
    "version": "1.21.1"
  },
  "using_pyopenssl": false
}


@Lukasa
Member

Lukasa commented Aug 5, 2017

Can you please provide me with the full traceback that occurs when you encode with cp1251?

@fym0121
Author

fym0121 commented Aug 6, 2017

Traceback (most recent call last):
  File "C:\Users\Administrator\vso_python\vso.py", line 453, in <module>
    v.upload_pdf('169519', u'fym_Пуск.pdf')
  File "C:\Users\Administrator\vso_python\vso.py", line 267, in upload_pdf
    r = self.session.post(url, files=f, data=data)
  File "D:\Program Files\python\lib\site-packages\requests\sessions.py", line 549, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "D:\Program Files\python\lib\site-packages\requests\sessions.py", line 488, in request
    prep = self.prepare_request(req)
  File "D:\Program Files\python\lib\site-packages\requests\sessions.py", line 431, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "D:\Program Files\python\lib\site-packages\requests\models.py", line 310, in prepare
    self.prepare_body(data, files, json)
  File "D:\Program Files\python\lib\site-packages\requests\models.py", line 498, in prepare_body
    (body, content_type) = self._encode_files(files, data)
  File "D:\Program Files\python\lib\site-packages\requests\models.py", line 164, in _encode_files
    rf.make_multipart(content_type=ft)
  File "D:\Program Files\python\lib\site-packages\urllib3\fields.py", line 174, in make_multipart
    (('name', self._name), ('filename', self._filename))
  File "D:\Program Files\python\lib\site-packages\urllib3\fields.py", line 134, in _render_parts
    parts.append(self._render_part(name, value))
  File "D:\Program Files\python\lib\site-packages\urllib3\fields.py", line 114, in _render_part
    return format_header_param(name, value)
  File "D:\Program Files\python\lib\site-packages\urllib3\fields.py", line 35, in format_header_param
    if not any(ch in value for ch in '"\\\r\n'):
  File "D:\Program Files\python\lib\site-packages\urllib3\fields.py", line 35, in <genexpr>
    if not any(ch in value for ch in '"\\\r\n'):
TypeError: a bytes-like object is required, not 'str'
[Finished in 37.1s with exit code 1]
[shell_cmd: python -u "C:\Users\Administrator\vso_python\vso.py"]
[dir: C:\Users\Administrator\vso_python]
[path: C:\Users\Public\PTGo\Program\;C:\Program Files (x86)\RSA SecurID Token Common;C:\Program Files (x86)\AMD APP\bin\x86_64;C:\Program Files (x86)\AMD APP\bin\x86;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\ATI Technologies\ATI.ACE\Core-Static;D:\Program Files (x86)\php-7.1.4-Win32-VC14-x64;D:\Program Files\Sublime Text 3;D:\Program Files\nodejs\;D:\Program Files\python\Scripts\;D:\Program Files\python\;C:\Users\Administrator\AppData\Roaming\npm;D:\Program Files\python\Scripts]

I read the code: Requests calls urllib3, but urllib3 seems to accept only str for the file name, not bytes. If I pass a str, it encodes the file name as UTF-8, which my server doesn't recognize. So I don't think it's a bug, but I would be happy if I could encode the file name myself.
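
The behaviour described above can be approximated with the standard library. This is a simplified sketch, not the urllib3 source: `sketch_format_header_param` is a hypothetical name, and it leaves out the check for quotes, backslashes and line breaks that the real function in the traceback performs first. It assumes non-ASCII names fall back to the RFC 2231 form via `email.utils.encode_rfc2231`.

```python
import email.utils


def sketch_format_header_param(name, value):
    """Approximate how a str file name ends up in the part header (assumption)."""
    candidate = '%s="%s"' % (name, value)
    try:
        # Pure ASCII values can be sent as a plain quoted parameter.
        candidate.encode('ascii')
        return candidate
    except UnicodeEncodeError:
        # Non-ASCII values are percent-encoded as UTF-8 in the name*= form.
        return '%s*=%s' % (name, email.utils.encode_rfc2231(value, 'utf-8'))


ascii_param = sketch_format_header_param('filename', 'report.pdf')
utf8_param = sketch_format_header_param('filename', 'fym_Пуск.pdf')

# Passing bytes fails earlier, on the membership test shown in the traceback:
# a str character cannot be looked up inside a bytes object.
try:
    any(ch in 'fym_Пуск.pdf'.encode('windows-1251') for ch in '"\\\r\n')
    bytes_error = None
except TypeError as exc:
    bytes_error = str(exc)

print(ascii_param)
print(utf8_param)
print(bytes_error)
```

Under these assumptions the file name is always sent as UTF-8 once it contains non-ASCII characters, and there is no point at which a caller could substitute windows-1251 bytes.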

@Lukasa
Member

Lukasa commented Aug 6, 2017

Hrm. The server really should recognise it. However, urllib3 should also allow you to provide your own bytes, so I think you should raise this as a urllib3 bug.

@kaos

kaos commented May 27, 2018

I've spent some time with this (related #2117, #2217), where I needed to post a UTF-8 encoded filename to a Confluence server. The issue is that Confluence does not recognise the filename*= parameter and throws a 500 Internal Server Error.

I found the solution in RFC 2047: encoding the filename using only ASCII printable characters, and thus sending it in the old standard filename= parameter, works. :D

Python code to encode a Unicode filename into such a string:

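import base64

# https://tools.ietf.org/html/rfc2047
def rfc2047_encode(s):
    # Make a point of returning a string here! b64encode returns bytes on
    # Python 3, so decode it to avoid embedding the b'...' repr in the result.
    encoded = base64.b64encode(s.encode('utf-8')).decode('ascii')
    return '=?utf-8?B?{}?='.format(encoded)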

Would be nice if this could somehow be provided as a feature in requests (that is, an option whether to use filename*= or encode according to the above and stick with filename=).
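
A minimal sketch of how such an encoded name might be used and checked follows. The helper name `ascii_safe_filename`, the `'file'` field name and the placeholder payload are all hypothetical, and whether a given server decodes RFC 2047 encoded-words in a file name is server-dependent. The round trip uses `email.header.decode_header` from the standard library.

```python
import base64
import io
from email.header import decode_header


def ascii_safe_filename(name):
    # RFC 2047 "B" encoded-word: base64 of the UTF-8 bytes, wrapped in =?...?=
    encoded = base64.b64encode(name.encode('utf-8')).decode('ascii')
    return '=?utf-8?B?{}?='.format(encoded)


safe_name = ascii_safe_filename('fym_Пуск.pdf')

# Shape of the mapping that would be passed as files= to a post call.
# Field name and payload are placeholders for illustration.
files = {
    'file': (safe_name, io.BytesIO(b'%PDF-1.4 placeholder'), 'application/pdf'),
}

# Round trip: decode the encoded-word back to the original name.
raw, charset = decode_header(safe_name)[0]
decoded_name = raw.decode(charset)

print(safe_name)
print(decoded_name)
```

Because the encoded name is pure ASCII, it should go out in the plain filename= parameter rather than the filename*= form.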

@trotamundos

Hi. Have you been able to solve this problem? The server I work with also accepts only windows-1251 encoding, and when I upload files their names become corrupted. Thanks.

@sethmlarson
Member

In an effort to clean up the issue tracker to only have issues that are still relevant to the project we've done a quick pass and decided this issue may no longer be relevant for a variety of potential reasons, including:

  • Applies to a much older version, unclear whether the issue still applies.
  • Change requires a backwards incompatible release and it's unclear if the benefits are worth the migration effort from the community.
  • There isn't a clear demand from the community on the change landing in Requests.

If you think the issue should remain open, please comment so below or open a new issue and link back to the original issue. Again, thank you for opening the issue and for the discussion, it's much appreciated.
