[MRG] persistence in/from file objects #351
I think this is in a pretty good shape now, so changing the status to MRG. A few comments though:
@lesteve, I addressed what we discussed IRL. Feel free to have a look when you have time.
joblib/numpy_pickle.py
Outdated
    filename = str(filename)

    isFilename = isinstance(filename, _basestring)
    isFileobj = hasattr(filename, "read") and hasattr(filename, "write")
🚨
Sorry, habits from my old Qt life. I renamed using 🐍 snake_case 🐍 😄
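The check in the diff above can be sketched in isolation as follows. This is only an illustration, assuming Python 3 (so `str` stands in for `_basestring`); the helper name `classify_target` is hypothetical and not part of joblib.

```python
import io


def classify_target(target):
    """Tell a filename apart from a file object (hypothetical helper)."""
    # Same two tests as in the reviewed diff, with snake_case names.
    is_filename = isinstance(target, str)
    is_fileobj = hasattr(target, "read") and hasattr(target, "write")

    if is_filename:
        return "filename"
    elif is_fileobj:
        return "fileobj"
    raise ValueError(
        "expected a filename or a file object, got %r" % type(target))


print(classify_target("/tmp/test.pkl"))  # filename
print(classify_target(io.BytesIO()))     # fileobj
```

Duck typing on `read` and `write` accepts any file-like object (for example `gzip.GzipFile` or `io.BytesIO`) without requiring a particular base class.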
    compress_level)) as f:
        NumpyPickler(f, protocol=protocol).dump(value)

    else:
Use an elif here so that all the conditions are at the same indentation level.
Fixed!
Just for the record, here is the kind of thing that is now possible with this PR: https://gist.github.com/aabadie/074587354d97d872aff6abb65510f618
joblib/test/test_numpy_pickle.py
Outdated
      nose.tools.assert_equal(len(caught_warnings), 1)
      for warn in caught_warnings:
    -     nose.tools.assert_equal(warn.category, DeprecationWarning)
    +     nose.tools.assert_equal(warn.category, Warning)
Why DeprecationWarning -> Warning?
Because Warning is the category actually raised here, and it is the one we want to test: the test checks that a compressed pickle cannot be loaded with mmap_mode, so a DeprecationWarning makes no sense in this case.
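The pattern under discussion can be sketched with the standard library alone. The function `load_sketch` and its message are hypothetical stand-ins, not joblib's code; the assumption taken from the comment above is only that the warning category is the plain `Warning` class.

```python
import warnings


def load_sketch(compressed, mmap_mode=None):
    """Hypothetical loader: memory mapping is dropped for compressed data."""
    if compressed and mmap_mode is not None:
        # Plain Warning rather than DeprecationWarning: nothing is deprecated,
        # the requested option simply cannot be honoured.
        warnings.warn(
            "mmap_mode %r is ignored for compressed files" % mmap_mode,
            Warning, stacklevel=2)
        mmap_mode = None
    return mmap_mode


with warnings.catch_warnings(record=True) as caught_warnings:
    warnings.simplefilter("always")  # make sure no filter hides the warning
    effective_mode = load_sketch(compressed=True, mmap_mode="r")

assert len(caught_warnings) == 1
for warn in caught_warnings:
    assert warn.category is Warning
```

Asserting on the exact category keeps the test honest: a test expecting `DeprecationWarning` would fail here, since `Warning` is the base class and not a subclass of it.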
joblib/numpy_pickle.py
Outdated
    'This feature is not supported by joblib.')
    new_exc.__cause__ = exc
    raise new_exc
    # Raise exception "as-is" with python 2
Reraise exception with Python 2
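A minimal sketch of the two branches discussed here: on Python 3 the original error is attached as `__cause__` of a more explicit exception, while on Python 2 it is simply re-raised. The wrapper name `load_with_context`, the `loader` callable and the error message are assumptions made for illustration.

```python
import sys

PY3 = sys.version_info[0] >= 3


def load_with_context(loader):
    """Call loader() and add context to failures (hypothetical wrapper)."""
    try:
        return loader()
    except Exception as exc:
        if PY3:
            # Chain the original error so both tracebacks are displayed.
            new_exc = ValueError("Loading failed (illustrative message).")
            new_exc.__cause__ = exc
            raise new_exc
        # Reraise exception with Python 2
        raise


try:
    load_with_context(lambda: {}["missing"])
except ValueError as err:
    print(type(err.__cause__).__name__)  # KeyError on Python 3
```

On Python 3 only, the assignment plus `raise new_exc` could be written as `raise new_exc from exc`, but that syntax does not parse on Python 2, which is presumably why the diff sets `__cause__` explicitly.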
I had issues with doctest when using a file object in a context manager. I skipped those lines in the doc, but that may not be a real solution. Any better ideas? Otherwise, I rebased on the latest master (including the recent flake8_diff.sh changes) and it works just fine.
What was the error? It would be great to understand where it comes from and fix it properly ...
Great, thanks a lot!
There were 2 things:
|
Hmmm I am not sure what we should do in this case:
    with gzip.GzipFile('/tmp/test.gz', 'wb') as f:
        res = joblib.dump({'a': [1, 2, 3], 'b': 'asadf'}, f)
Should
Pfff 😭 is there a way we can just skip doctests in Python 2.6?
Seems like scikit-learn has some |
I think returning nothing when a file object is given as input is reasonable: in that case the user knows the destination of the dump in advance, and the returned list is only there, as you said, for historical reasons.
Nice, it works like a charm. The |
doc/persistence_fixture.py
Outdated
    def setup_module(module):
        """Setup module."""
        if _compat.PY26:
            raise SkipTest("Skipping persitence doctest in Python 2.6")
typo in persistence (missing s)
Not entirely convinced, but it is a good point that joblib.dump returning something is mostly for companion files (i.e. npy files) since they are not passed in by the user. Let's ask @GaelVaroquaux and @ogrisel whether they have an opinion on this!
The danger with returning the file object is that it risks having users |
So I guess we can keep it as it is in this PR (returns nothing).
Let's do it like this in this PR. It would be good, though, to make the joblib.dump return type consistent whether it takes a filename or a file object, in a later PR.
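The return-value convention agreed on above can be illustrated with a small standard-library sketch. `dump_sketch` is a hypothetical stand-in built on `pickle`, not joblib's implementation: it returns a list of filenames when given a path and nothing when given a file object.

```python
import gzip
import os
import pickle
import tempfile


def dump_sketch(value, target):
    """Hypothetical dump: the return value depends on the kind of target."""
    if isinstance(target, str):
        with open(target, "wb") as f:
            pickle.dump(value, f)
        # Historical behaviour: report the file(s) that were written.
        return [target]
    elif hasattr(target, "write"):
        # The caller already owns the destination, so nothing is returned.
        pickle.dump(value, target)
        return None
    raise ValueError("expected a filename or a file object")


data = {'a': [1, 2, 3], 'b': 'asadf'}
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, "test.gz")

with gzip.GzipFile(gz_path, "wb") as f:
    res = dump_sketch(data, f)  # res is None for a file object

with gzip.GzipFile(gz_path, "rb") as f:
    restored = pickle.load(f)

filenames = dump_sketch(data, os.path.join(tmp_dir, "test.pkl"))
```

Returning `None` avoids handing back an object whose lifetime is controlled by the caller's `with` block, which is in line with the concern raised about returning the file object itself.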
    def setup_module(module):
        """Setup module."""
        if _compat.PY26:
            raise SkipTest("Skipping persistence doctest in Python 2.6")
Can you add a comment saying that GzipFile cannot be used as a context manager in Python 2.6, or some better explanation?
Comment added.
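For background, `gzip.GzipFile` only gained support for the `with` statement in Python 2.7. One possible workaround, shown here purely as an illustration and not as what this PR does (the PR skips the doctest instead), is to wrap the object in `contextlib.closing`, which works for anything that has a `close()` method. The file name below is arbitrary.

```python
import contextlib
import gzip
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.gz")

# closing() calls f.close() on exit, so the wrapped object does not need
# to implement __enter__/__exit__ itself.
with contextlib.closing(gzip.GzipFile(path, "wb")) as f:
    f.write(b"payload")

with contextlib.closing(gzip.GzipFile(path, "rb")) as f:
    content = f.read()

print(content == b"payload")  # True
```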
Merging, thanks!
* tag '0.10.2': (55 commits)
  Bump up __version__
  Update release to 0.10.2 in CHANGES.rst
  nosetests should run tests only from the joblib folder.
  API expose joblib.parallel.ParallelBackendBase and joblib.parallel.AutoBatchingMixin
  Update to numpydoc 0.6
  [numpy_pick_utils] Handle unicode filenames in BinaryZlibFile (joblib#384)
  Fix format_stack with compiled code (e.g. .so or .pyd)
  PEP8: cosmit fix (joblib#376)
  FIX typo
  Release 0.10.0
  FIX: __all__ should hold symbol names as strings
  Fix bench_auto_batching.py
  [MRG] Persistence in/from file objects (joblib#351)
  Minor tweaks in auto batching benchmark script
  Improve flake8_diff.sh (joblib#371)
  FIX numpy array persistence with pickle HIGHEST_PROTOCOL (joblib#370)
  DOC: remove easy_install from joblib installation documentation (joblib#363)
  MAINT fix typo
  DOC Add documentation of mmap_mode
  Explicit handling of job cancellation on first collected exception (joblib#361)
  ...
This PR should fix #341 and fix #312.
Examples of new usage:
Test functions are still missing.