[numpy_pick_utils] BinaryZlibFile couldn't handle unicode filenames. #384
Tell me if that fix is enough for you and if you think it's maintainable enough.
@@ -299,7 +299,7 @@ def __init__(self, filename, mode="rb", compresslevel=9):
         else:
             raise ValueError("Invalid mode: %r" % (mode,))

-        if isinstance(filename, (str, bytes)):
+        if isinstance(filename, (str, bytes, (sys.version_info < (3,) and unicode) or str)):
Using isinstance(filename, _compat._bytes_or_unicode)
is the right way to go about it I reckon.
It would be good to add a regression test in joblib/test/test_numpy_pickle_utils.py to make sure that b'myfilename', 'myfilename' and u'myfilename' can be used in BinaryZlibFile.
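A regression test along those lines might look like the sketch below. It is only an illustration: `check_filename` is a hypothetical stand-in for the type check in the `BinaryZlibFile` constructor, not joblib code, and a real test would construct `BinaryZlibFile` with each filename instead.

```python
def check_filename(filename):
    """Hypothetical stand-in for the constructor's filename type check."""
    # On Python 3, str covers both 'name' and u'name'; bytes covers b'name'.
    if isinstance(filename, (str, bytes)):
        return "path"
    raise TypeError("unsupported filename type: %r" % (type(filename),))


def test_filename_types():
    # The three spellings mentioned in the review comment.
    for name in (b'myfilename', 'myfilename', u'myfilename'):
        assert check_filename(name) == "path"


test_filename_types()
```

On Python 2 the `u'myfilename'` case is the one that exercises the bug, since `unicode` is a separate type from `str` there.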
Hmm, maybe better to do
from ._compat import _basestring
if isinstance(filename, _basestring):
to be more consistent with other places in the code.
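For readers unfamiliar with this kind of alias, a minimal sketch of how a Python 2/3 string alias can be defined is shown below. The names `string_types` and `is_string_filename` are hypothetical and chosen for illustration; this is not claimed to be how joblib's `_compat` module defines `_basestring`.

```python
try:
    # Python 2: basestring is the common parent of str and unicode.
    string_types = basestring  # noqa: F821
except NameError:
    # Python 3: basestring no longer exists; text strings are all str.
    string_types = str


def is_string_filename(filename):
    """Return True if filename is a string under the alias above."""
    return isinstance(filename, string_types)
```

With an alias like this, a `bytes` filename is not matched on Python 3, which is one difference from a check that accepts both bytes and text.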
Alright! I will make a new commit soon.
Another small thing: it is considered best practice to create a feature branch when you do a PR, i.e. not to use master. Try to remember this for your next PR.
…nicode filenames works for BinaryZlibFile.
Yeah, sorry, I am not used to the fork/PR workflow; thanks for reminding me. In the end I used Is there anything else? Do you want me to squash the commits correctly (one for code and one for tests)?
def teardown_module():
    """Test teardown."""
    shutil.rmtree(env['dir'])
I will add another newline here when I squash my commits.
Don't worry too much about squashing your commits; we tend to do it via the GitHub web interface when we merge the PR. You need Travis to be green, and you still have some flake8 violations; look at https://travis-ci.org/joblib/joblib/jobs/149166066 for more details. You can also run:
Please stick with _basestring to be consistent with the rest of the codebase. We can think about using _bytes_or_unicode in a separate PR. To be honest, I don't think it is very common to pass
Also, if you could add an entry in CHANGES.rst, that would be great.
Oh ok, I won't squash then. Ok, I thought it would be better consistency to use For the
Remove it, since it is not going to work with Python 3, I reckon. We can reintegrate it later when we revisit whether we should use _bytes_or_unicode everywhere.
…ryZlibFile constructor.
Ok, so I have removed it from the tests and the code.
Vincent Latrouite

FIX a bug in the constructor of BinaryZlibFile that would throw an
exception when passing unicode filename.
Add "(Python 2 only)" or something like this at the end of the sentence.
LGTM, merging.
Thanks a lot! Feel free to have a look at using _bytes_or_unicode vs _basestring in a consistent manner across the source tree if you have some spare time!
No problem! Yeah, I will look at the differences and how it's implemented as soon as I can :)
* tag '0.10.2': (55 commits)
  Bump up __version__
  Update release to 0.10.2 in CHANGES.rst
  nosetests should run tests only from the joblib folder.
  API expose joblib.parallel.ParallelBackendBase and joblib.parallel.AutoBatchingMixin
  Update to numpydoc 0.6
  [numpy_pick_utils] Handle unicode filenames in BinaryZlibFile (joblib#384)
  Fix format_stack with compiled code (e.g. .so or .pyd)
  PEP8: cosmit fix (joblib#376)
  FIX typo
  Release 0.10.0
  FIX: __all__ should hold symbol names as strings
  Fix bench_auto_batching.py
  [MRG] Persistence in/from file objects (joblib#351)
  Minor tweaks in auto batching benchmark script
  Improve flake8_diff.sh (joblib#371)
  FIX numpy array persistence with pickle HIGHEST_PROTOCOL (joblib#370)
  DOC: remove easy_install from joblib installation documentation (joblib#363)
  MAINT fix typo
  DOC Add documentation of mmap_mode
  Explicit handling of job cancellation on first collected exception (joblib#361)
  ...
Hi everyone,
It seems that BinaryZlibFile cannot handle unicode filenames in Python 2. I just added the unicode type to the check, but I guess it's not Python 3 compliant now. Does it need to be? Otherwise, is there a workaround to make it work for both Python 2 and Python 3?