Add random() function #41
Conversation
Thanks for this. Some notes:c
|
Definitely! What would you like to be tested?
Might be worth investigating. I didn't understand what |
Well, we might ensure a variety of things we know about sprase matrices. Here are a few that come to mind:
|
It attempts to do the same thing that you're doing here, just much less efficiently, and much less generally :) |
The proper way to test a random function is currently a hot debate topic. There seem to be two ways to test it, both which I'm not happy with:
I will also add that |
I don't think we need to test the random number generator within this. We can rely on SciPy/NumPy to produce a decent distribution of numbers. I'm mostly concerned that we test basic user expectations. |
elements = np.prod(shape) | ||
|
||
return COO.from_scipy_sparse( | ||
scipy.sparse.rand(elements, 1, density, dtype=dtype) |
hameerabbasi
Dec 27, 2017
Collaborator
Here, it would be useful to specify format='coo'
explicitly. Might speed up the conversion.
Here, it would be useful to specify format='coo'
explicitly. Might speed up the conversion.
nils-werner
Dec 29, 2017
•
Author
Contributor
Maybe I am missing something, but scipy.sparse.rand()
defaults to format='coo'
...
Maybe I am missing something, but scipy.sparse.rand()
defaults to format='coo'
...
It looks like the version of |
That's unfortunate |
Really? Looking at the reference it says
Note the |
Ah, my mistake, the notes on
|
I have replaced |
@@ -9,28 +9,6 @@ | |||
import sparse |
hameerabbasi
Dec 31, 2017
Collaborator
Please remove random
here, it's unneeded.
Please remove random
here, it's unneeded.
x[tuple(random.randint(0, d - 1) for d in x.shape)] = True | ||
return x | ||
|
||
|
||
@pytest.mark.parametrize('reduction,kwargs', [ |
hameerabbasi
Dec 31, 2017
Collaborator
Add an extra newline here.
Add an extra newline here.
The issue you were having here earlier was already fixed with #56. I've added comments that should help fix the |
I'm good to merge as soon as Edit: It'd be helpful to add the test @mrocklin described, though. |
Don't merge yet, the tests still randomly fail due to the input data |
You might want to rebase this on master and see if the tests still fail. If we're talking about the one I saw fail ( |
Since it's just failing for the |
If I raise |
Great! Fix Thanks for your work on all this. Edit: You just need to run |
Of course, this is relative tolerance, but since the maximum value of |
While playing around with ways of comparing arrays I realized that |
It would be nice to have an option |
During testing I found that the Edit: I also added tests and rebased this onto |
Long term we might consider copying scipy's logic to produce the coordinates but then create our own data. |
Do you think this is good to merge? It's good from my side. I took a look at SciPy's code, it's got a lot of unnecessary branches and I think we're better off just using it. |
Or we could use the coordinates from SciPy and then do
|
|
That's right... no Idea why I thought it had one... |
Maybe this is starting to overengineer the solution a bit, but we could allow for a
|
No, I actually really like this solution. However, I think it should be in a separate method rather than in |
Merging if there are no comments until 21:00 German time. This can hold tests on other branches back, and they may need to be rebased on this, so I don't want to hold this out too long. |
Actually, I've pushed a commit that implements the parameter, and tests it with a few possible shapes, densities and RVS's (one taken from the |
Hmmm... This seems rather inefficient and wasteful to generate the data twice. Maybe it's worth it to copy this part of the Scipy code if we can guarantee consistency internally. |
@nils-werner Any more changes? If not, we should merge this. |
Slow down... we've just doubled the LOC in |
Good point. I just really don't want to put off changing the test code in the other branches. |
@@ -2,7 +2,7 @@ | |||
from .core import COO | |||
|
|||
|
|||
def assert_eq(x, y): | |||
def assert_eq(x, y, rtol=1.e-5, atol=1.e-8): |
mrocklin
Jan 1, 2018
Collaborator
Should we just pass through **kwargs
here?
Should we just pass through **kwargs
here?
hameerabbasi
Jan 1, 2018
Collaborator
Yes, that seems like a better option.
Yes, that seems like a better option.
Overall this seems great to me. It looks like we're able to handle most cases and we're staying within the scipy.sparse API. |
I think a
random()
function is could be pretty handy for generating some quick testing data.I tried several methods, like generating
.coords
and.data
myself, and creating aCOO
object from them. In the end it turned out to be the easiest to create a linearscipy.sparse
array and simply reshape it to the desired output shape.