New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for saving matrices to files in npz format #153

Closed
nimroha opened this Issue May 17, 2018 · 7 comments

Comments

Projects
None yet
3 participants
@nimroha
Contributor

nimroha commented May 17, 2018

scipy.sparse has a save_npz and load_npz methods.

It would be nice to have the same functionality in this package

I implemented this for myself locally. If you have an interest in this feature, tell me and I'll make it GitHub worthy and open a Pull Request.

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented May 17, 2018

Yes, this would be good, but it must meet a few requirements:

  • It must support an arbitrary amount of dimensions.
  • It musn't use SciPy's methods, our own would be nice.
  • Binary compatibility with scipy.sparse isn't required.
@mrocklin

This comment has been minimized.

Collaborator

mrocklin commented May 17, 2018

@mrocklin

This comment has been minimized.

Collaborator

mrocklin commented May 17, 2018

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented May 17, 2018

This would be nice, but not at the cost of the other two. Someone can easily do

scipy.sparse.save_npz(filename, x.to_scipy_sparse())
x = COO.from_scipy_sparse(scipy.sparse.load_npz(filename))

Our COO layout is fairly different (one array for both rows and columns). With fill-values in the picture, it might have something extra that won't be compatible at all with scipy.sparse.

Of course, we could always fail-over to the methods in scipy.sparse, but I'd prefer coupling be kept as small as possible.

@nimroha

This comment has been minimized.

Contributor

nimroha commented May 17, 2018

@hameerabbasi 's requirements are satisfied

for @mrocklin 's requirement save_npz is easy.
But for load_npz I need to add an argument from_scipy with default False, and users who want to load scipy.sparse.coo files must provide from_scipy=True

Is this design acceptable?

@mrocklin

This comment has been minimized.

Collaborator

mrocklin commented May 17, 2018

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented May 17, 2018

What @mrocklin is suggesting is definitely possible, but it will introduce unneeded complexity and maintenance costs (special cases for ndim=2, fillvalue=0, etc.), @nimroha You're welcome to add a PR. You'll need:

  • Tests (100% coverage is the norm here, if we can't hit something with tests it probably is dead code anyway)
    • Doctests count in coverage.
  • Docstrings for each function with examples
  • Functions added in appropriate places (e.g. docs/generated/sparse.rst)
  • Flake8 compatible code.

Speaking of, I should probably add these to CONTRIBUTING anyway, we've been following them for a decent amount of time but it isn't documented anywhere. 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment