New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding support for saving matrices to files in npz format #154
Conversation
Codecov Report
@@ Coverage Diff @@
## master #154 +/- ##
==========================================
+ Coverage 96.89% 96.93% +0.04%
==========================================
Files 10 11 +1
Lines 1191 1207 +16
==========================================
+ Hits 1154 1170 +16
Misses 37 37
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this is an awesome effort. We strive for perfection, so you can make changes and I'll approve. 😄
One thing not in this review is that you should add load_npz
and save_npz
into the list of functions in docs/generated/sparse.rst
before building the docs.
sparse/io.py
Outdated
[0. , 0.86522495]]]) | ||
>>> os.remove('mat.npz') | ||
|
||
:param filename: string or file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parameters must be documented Numpy-style. See https://numpydoc.readthedocs.io/en/latest/format.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're using PyCharm, see "Docstring format" https://www.jetbrains.com/help/pycharm/settings-tools-python-integrated-tools.html
sparse/io.py
Outdated
[0. , 0.86522495]]]) | ||
>>> os.remove('mat.npz') | ||
|
||
:param filename: file-like object, string, or pathlib.Path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above.
sparse/tests/test_io.py
Outdated
from sparse.utils import assert_eq | ||
|
||
|
||
def test_save_load_npz_file(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try parametrizing compressed
as True/False
. See example parametrizations in test_coo.py
.
sparse/tests/test_io.py
Outdated
assert_eq(x, z) | ||
assert_eq(y, z.todense()) | ||
|
||
# test exception on wrong format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this into another test.
sparse/io.py
Outdated
""" | ||
|
||
nodes = {'data': matrix.data, | ||
'coords': matrix.coords} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You want to store the shape too, sometimes these aren't enough on their own.
sparse/io.py
Outdated
@@ -0,0 +1,96 @@ | |||
import numpy as np | |||
|
|||
from sparse.coo.core import COO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be .coo.core
. Inside a package, we should always use relative imports.
@@ -0,0 +1,6 @@ | |||
io.load_npz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this file and build the docs. It will be auto-generated, then you can do git add
.
@@ -0,0 +1,6 @@ | |||
io.save_npz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
sparse/tests/test_io.py
Outdated
with pytest.raises(RuntimeError): | ||
load_npz(filename) | ||
|
||
shutil.rmtree(dir_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try os.rmdir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, ignore. I just read the docs on that... It doesn't remove unless empty.
sparse/io.py
Outdated
except KeyError: | ||
raise RuntimeError('The file {} does not contain a valid sparse matrix'.format(filename)) | ||
|
||
return COO(coords=coords, data=data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add sorted=True, has_duplicates=False
and shape. The first two will make it a lot faster, and shape
ensures consistency.
If this is too much for you, let me know and I'll fix it up. 🙂 The PR will still be in your name. |
no worries, just got caught up at work. |
addressed all your comments and added the doc files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like most of the major concerns have been addressed, here are some other recommendations. Good work by the way. 😄
sparse/io.py
Outdated
Whether to save in compressed or uncompressed mode | ||
|
||
Returns | ||
------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this section, as the function isn't returning anything.
sparse/tests/test_io.py
Outdated
|
||
import sparse | ||
|
||
from sparse.io import save_npz, load_npz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still needs to be addressed. In scipy.sparse
, the functions are in scipy.sparse
, not scipy.sparse.io
.
Therefore, here, they should be in sparse
directly, not sparse.io
. You need to import them from sparse
here, not sparse.io
.
with pytest.raises(RuntimeError): | ||
load_npz(filename) | ||
|
||
shutil.rmtree(dir_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flake8
is failing because of a missing newline here.
[[0. , 0. ], | ||
[0. , 0.86522495]]]) | ||
>>> os.remove('mat.npz') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional, but would be really nice to have: A see also section with load_npz
, np.savez
and scipy.sparse.save_npz
. Also a note saying that it's binary incompatible. See examples in our code and here: https://numpydoc.readthedocs.io/en/latest/format.html
[[0. , 0. ], | ||
[0. , 0.86522495]]]) | ||
>>> os.remove('mat.npz') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above:
Optional, but would be really nice to have: A see also section with save_npz
, np.load
and scipy.sparse.load_npz
. Also a note saying that it's binary incompatible.
docs/generated/sparse.rst
Outdated
@@ -53,3 +53,6 @@ API | |||
|
|||
where | |||
|
|||
save_npz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of these should be added in alphabetical order, otherwise they're hard to find.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more changes.
We want the user-facing API to be compatible with scipy.sparse
, so the .io
should be removed everywhere and it should just be treated as an internal submodule.
sparse/io.py
Outdated
>>> mat | ||
<COO: shape=(2, 2, 2), dtype=float64, nnz=2> | ||
>>> sparse.io.save_npz('mat.npz', mat) | ||
>>> loaded_mat = sparse.io.load_npz('mat.npz') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this to sparse.load_npz
sparse/io.py
Outdated
>>> mat = sparse.COO(dense_mat) | ||
>>> mat | ||
<COO: shape=(2, 2, 2), dtype=float64, nnz=2> | ||
>>> sparse.io.save_npz('mat.npz', mat) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this to sparse.save_npz
sparse/io.py
Outdated
>>> mat | ||
<COO: shape=(2, 2, 2), dtype=float64, nnz=2> | ||
>>> sparse.io.save_npz('mat.npz', mat) | ||
>>> loaded_mat = sparse.io.load_npz('mat.npz') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments as above. Also, it would be best to remove the examples section and in the main description, just add
See :obj:`save_npz` for usage examples.
This is in! Thanks, @nimroha! |
thank you! pleasure working with you
…On Sun, May 20, 2018, 16:45 Hameer Abbasi ***@***.***> wrote:
This is in! Thanks, @nimroha <https://github.com/nimroha>!
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#154 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/APXkV-ORzdvI59Af740eEnn13VHn_BeNks5t0XOKgaJpZM4UDK5E>
.
|
Closes #153
100% test coverage python 2.7 and 3.5
added all the docs but could not build HTML locally