New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add random() function #41
Conversation
Thanks for this. Some notes:c
|
Definitely! What would you like to be tested?
Might be worth investigating. I didn't understand what |
Well, we might ensure a variety of things we know about sprase matrices. Here are a few that come to mind:
|
It attempts to do the same thing that you're doing here, just much less efficiently, and much less generally :) |
The proper way to test a random function is currently a hot debate topic. There seem to be two ways to test it, both which I'm not happy with:
I will also add that |
I don't think we need to test the random number generator within this. We can rely on SciPy/NumPy to produce a decent distribution of numbers. I'm mostly concerned that we test basic user expectations. |
sparse/core.py
Outdated
elements = np.prod(shape) | ||
|
||
return COO.from_scipy_sparse( | ||
scipy.sparse.rand(elements, 1, density, dtype=dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, it would be useful to specify format='coo'
explicitly. Might speed up the conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I am missing something, but scipy.sparse.rand()
defaults to format='coo'
...
It looks like the version of |
That's unfortunate |
Really? Looking at the reference it says
Note the |
Ah, my mistake, the notes on
|
I have replaced |
@@ -9,28 +9,6 @@ | |||
import sparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove random
here, it's unneeded.
sparse/tests/test_core.py
Outdated
x[tuple(random.randint(0, d - 1) for d in x.shape)] = True | ||
return x | ||
|
||
|
||
@pytest.mark.parametrize('reduction,kwargs', [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add an extra newline here.
The issue you were having here earlier was already fixed with #56. I've added comments that should help fix the |
I'm good to merge as soon as Edit: It'd be helpful to add the test @mrocklin described, though. |
Don't merge yet, the tests still randomly fail due to the input data |
You might want to rebase this on master and see if the tests still fail. If we're talking about the one I saw fail ( |
Since it's just failing for the |
If I raise |
Great! Fix Thanks for your work on all this. Edit: You just need to run |
Of course, this is relative tolerance, but since the maximum value of |
While playing around with ways of comparing arrays I realized that |
It would be nice to have an option |
During testing I found that the Edit: I also added tests and rebased this onto |
Long term we might consider copying scipy's logic to produce the coordinates but then create our own data. |
Do you think this is good to merge? It's good from my side. I took a look at SciPy's code, it's got a lot of unnecessary branches and I think we're better off just using it. |
Or we could use the coordinates from SciPy and then do
|
|
That's right... no Idea why I thought it had one... |
Maybe this is starting to overengineer the solution a bit, but we could allow for a
|
No, I actually really like this solution. However, I think it should be in a separate method rather than in |
Merging if there are no comments until 21:00 German time. This can hold tests on other branches back, and they may need to be rebased on this, so I don't want to hold this out too long. |
Actually, I've pushed a commit that implements the parameter, and tests it with a few possible shapes, densities and RVS's (one taken from the |
Hmmm... This seems rather inefficient and wasteful to generate the data twice. Maybe it's worth it to copy this part of the Scipy code if we can guarantee consistency internally. |
@nils-werner Any more changes? If not, we should merge this. |
Slow down... we've just doubled the LOC in |
Good point. I just really don't want to put off changing the test code in the other branches. |
sparse/utils.py
Outdated
@@ -2,7 +2,7 @@ | |||
from .core import COO | |||
|
|||
|
|||
def assert_eq(x, y): | |||
def assert_eq(x, y, rtol=1.e-5, atol=1.e-8): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just pass through **kwargs
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that seems like a better option.
Overall this seems great to me. It looks like we're able to handle most cases and we're staying within the scipy.sparse API. |
I think a
random()
function is could be pretty handy for generating some quick testing data.I tried several methods, like generating
.coords
and.data
myself, and creating aCOO
object from them. In the end it turned out to be the easiest to create a linearscipy.sparse
array and simply reshape it to the desired output shape.