disparity between methods accepting filenames as bytes or str #62
Comments
Please be explicit. Which methods? |
OpenSSL.SSL.Context.use_privatekey_file - "keyfile must be a byte string" Three different methods all accepting a file name and each has different requirements. There are other methods I haven't used that I didn't check. I think |
Thank you. |
This looks like a duplicate of issue #15. |
@flavio - looks related, but not a duplicate. My issue is that there's no consistency between what the different methods take as far as "string" type. Also, that it changed between the versions and broke some of my existing code. |
Considering the direction Python has chosen for handling paths, I think the sensible thing for pyOpenSSL is to:
It would be nice to have a function which performed the necessary check and encoding for a single value. This could be used throughout pyOpenSSL anywhere an API accepts a path. |
I've been using something like this:
As per the docs saying "Return the name of the encoding used to convert Unicode filenames into system file names, or None if the system default encoding is used." (https://docs.python.org/2/library/sys.html#sys.getfilesystemencoding) |
how is this for a utility function:
import sys

from six import PY3, binary_type, text_type

def filename_to_bytes(filename):
    # Bytes are passed through untouched.
    if isinstance(filename, binary_type):
        return filename
    # Text is encoded with the filesystem encoding, falling back to the
    # Python default encoding when none is reported.
    elif isinstance(filename, text_type):
        fs_encoding = sys.getfilesystemencoding()
        if fs_encoding is None:
            fs_encoding = sys.getdefaultencoding()
        return filename.encode(fs_encoding)
    else:
        raise TypeError("filenames must either be a string or bytes") |
I'm not sure it makes sense to apply Another thing to consider is that, according to the documentation (https://docs.python.org/3/library/sys.html#sys.getfilesystemencoding), |
As I quoted, the docs say it returns None when the file encoding is the
|
I don't think so, no. I don't think that making the fallback be either "ascii" or "utf-8" depending on the Python version is a good idea, I don't think that tying it to the Python "default encoding" (which itself is simply a bad idea) is particularly useful, and I don't think that letting people screw with it by using I'm not entirely decided on whether hard-coding "ascii" or "utf-8" makes more sense. The advantage of ascii is that if the encoding succeeds you can be pretty sure you're addressing the file the user meant to address (there are ascii-incompatible encodings but ... for all intents and purposes they are unused). The disadvantage is that many possible path names will not be encodable using ascii. The mitigating factor, perhaps, is that if you're using non-ascii path names your environment really, really should be set up so that The advantage of utf-8 is that it can encode anything. The disadvantage is that it might be the wrong encoding and so the referenced file won't be found. What's worse? Having |
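To make the ascii versus utf-8 trade-off above concrete, here is a minimal sketch. The helper name `try_encode` and the sample file names are made up for illustration and are not part of pyOpenSSL.

```python
def try_encode(path, encoding):
    """Return the encoded bytes, or None if the path cannot be encoded."""
    try:
        return path.encode(encoding)
    except UnicodeEncodeError:
        return None

# Hypothetical file names, used only to illustrate the trade-off.
ascii_only = u"server.pem"
non_ascii = u"cl\u00e9.pem"

# ascii either succeeds unambiguously or refuses outright.
ascii_ok = try_encode(ascii_only, "ascii")
ascii_result = try_encode(non_ascii, "ascii")

# utf-8 always succeeds, but the bytes only name the intended file if the
# filesystem really uses utf-8; latin-1 produces different bytes.
utf8_result = try_encode(non_ascii, "utf-8")
latin1_result = try_encode(non_ascii, "latin-1")
```

With ascii the non-ascii name is rejected up front. With utf-8 it always encodes, but the resulting bytes differ from what a latin-1 filesystem would expect, so the lookup could silently miss the file.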
Well, wouldn't the best thing be to mimic I guess when I came across this issue, that was the most surprising thing... I expected that I could pass it a string in the same way I had been doing with |
Yes. That's fine. How does
|
^_^ I knew that was coming next... I've tried making sense of https://hg.python.org/cpython/file/00c982c9f681/Python/fileutils.c but it's a little beyond me. PyUnicode_EncodeFSDefault says it uses the filesystem default encoding, but if that's not set it falls back to the current locale encoding. That may be what |
sorry.. I meant that's what |
It seems like Py2 does what you're saying... if it has no value for So, maybe this is a good starting point:
import sys

from six import PY3, binary_type, text_type

def filename_to_bytes(filename):
    # Bytes are passed through untouched.
    if isinstance(filename, binary_type):
        return filename
    # Text is encoded with the filesystem encoding, or ascii if none is reported.
    elif isinstance(filename, text_type):
        return filename.encode(sys.getfilesystemencoding() or 'ascii')
    else:
        raise TypeError("filenames must either be a string or bytes") |
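For anyone who wants to try the proposal without installing six, here is a self-contained sketch with a few calls. The `text_type` and `binary_type` stand-ins are my assumption about what six provides on each Python version, and the file name is hypothetical.

```python
import sys

# Stand-ins for six.text_type / six.binary_type (assumption: unicode/str on
# Python 2, str/bytes on Python 3).
text_type = type(u"")
binary_type = bytes

def filename_to_bytes(filename):
    # Bytes are passed through untouched.
    if isinstance(filename, binary_type):
        return filename
    # Text is encoded with the filesystem encoding, or ascii if none is reported.
    elif isinstance(filename, text_type):
        return filename.encode(sys.getfilesystemencoding() or "ascii")
    else:
        raise TypeError("filenames must either be a string or bytes")

# Both spellings of the same (hypothetical) path end up as the same bytes.
from_bytes = filename_to_bytes(b"server.pem")
from_text = filename_to_bytes(u"server.pem")
```

Anything that is neither text nor bytes, such as an integer, still raises TypeError, so callers get one consistent error regardless of which method they call.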
This seems to be fixed by #209 so I'll close it. |
I upgraded from 0.13.1 to 0.14 and got several errors. One was that several methods that took filenames have been changed to only accept binary strings instead of regular unicode strings. So, in py3, bytes instead of str.
I think 0.13.1 actually had the opposite situation for some of those same methods where it only accepted a unicode string and not a binary string.
I think the underlying C library uses binary strings (i.e. no encoding enforced), so it makes sense to use that (on *nix). However, the Python wrapper should be able to accept both and translate accordingly.
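As a point of comparison for "accept both and translate accordingly": on Python 3 the standard library's `os.fsencode` already behaves this way for paths. This is only a sketch of that behaviour, not a claim about what pyOpenSSL does internally, and the file name is hypothetical.

```python
import os

# os.fsencode (Python 3.2+) returns bytes unchanged and encodes str with the
# filesystem encoding, so both spellings give the same result.
from_str = os.fsencode("server.pem")
from_bytes = os.fsencode(b"server.pem")

# os.fsdecode is the inverse, going from bytes back to str.
round_trip = os.fsdecode(from_str)
```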