Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix sparse.dot returning incorrect shape for N-D arrays. #204

Merged
merged 9 commits into from
Oct 23, 2018
Merged

Fix sparse.dot returning incorrect shape for N-D arrays. #204

merged 9 commits into from
Oct 23, 2018

Conversation

gongliyu
Copy link
Contributor

@gongliyu gongliyu commented Oct 22, 2018

Only the recursive approach is implemented. Considering the use cases of n-d sparse array, this will be enough. Fixes #202

Copy link
Collaborator

@hameerabbasi hameerabbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Would you possibly add tests to the already existing sparse.dot code for this?

# if b.ndim == 1:
# b_axis = -1

# return tensordot(a, b, axes=(a_axis, b_axis))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need for this code, you will be credited in the PR and in the git history.

@hameerabbasi hameerabbasi changed the title fix #202 Fix sparse.dot returning incorrect shape for N-D arrays. Oct 22, 2018
elif any(mask):
res = [x.todense() if isinstance(x, SparseArray) else x
for x in res]
return np.stack(res)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to use sparse.stack or np.stack here depending on the results of mask, otherwise the result is always dense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it is already depend on the mask, please check the following lines

        if all(mask):
            return stack(res)
        elif any(mask):
            res = [x.todense() if isinstance(x, SparseArray) else x
                   for x in res]
        return np.stack(res)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.stack always gives dense results, that's a problem here. You need use sparse.stack in the first case only.

@gongliyu
Copy link
Contributor Author

Sure, I will add some test cases. Do I need to close this PR and open another one after I add the tests in my forked repo? Does pydata/sparse contains a script to run the tests before push to remote repo? I'm sorry I'm very new to the github PR feature. I will really appreciate it if you can help me on this.

Thanks!

@hameerabbasi
Copy link
Collaborator

Hi! Like you, I was also very new a year or so ago! I'd like to personally thank (and congratulate) you for doing this.

Usually it's recommended to push your changes to a branch and keep your master branch sync'd to the main branch, but for this PR it's okay.

If you push changes to your branch, this PR will be automatically updated.

@hameerabbasi
Copy link
Collaborator

To run the tests, you can run pip install -e .[all] in the root of the repository and then run py.test there. It will run all tests for you, including for code style.

@hameerabbasi
Copy link
Collaborator

This pull request introduces 1 alert when merging 99c15a9 into 87f414c - view on LGTM.com

new alerts:

  • 1 for Syntax error

Comment posted by LGTM.com

@codecov
Copy link

codecov bot commented Oct 22, 2018

Codecov Report

Merging #204 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #204      +/-   ##
==========================================
- Coverage   97.64%   97.63%   -0.01%     
==========================================
  Files          11       11              
  Lines        1444     1483      +39     
==========================================
+ Hits         1410     1448      +38     
- Misses         34       35       +1
Impacted Files Coverage Δ
sparse/coo/core.py 97.14% <100%> (ø) ⬆️
sparse/coo/__init__.py 100% <100%> (ø) ⬆️
sparse/coo/common.py 97.95% <100%> (-0.07%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 87f414c...de1b734. Read the comment docs.

@gongliyu
Copy link
Contributor Author

I realized that np.dot and np.matmul are different for some cases,
https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.matmul.html
So I add a function: common.matmul with some tests, and redirect COO.matmul to common.matmul.

@hameerabbasi
Copy link
Collaborator

This pull request introduces 1 alert when merging 6fe10d5 into 87f414c - view on LGTM.com

new alerts:

  • 1 for Syntax error

Comment posted by LGTM.com

Copy link
Collaborator

@hameerabbasi hameerabbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage is a bit lacking. If you want, I can push to your branch and make sure all the tests are hit. You will have the authorship of the commit still.

return dot(a, b.reshape(b.shape[-2:]))

while a.ndim < b.ndim:
a = a[np.newaxis]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, you're adding just one axis, what if more axes are needed? Should be a = a[(None,) * (b.ndim - a.ndim)].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I didn't know this trick so I used a while loop. I will try this.

while a.ndim < b.ndim:
a = a[np.newaxis]
while a.ndim > b.ndim:
b = b[np.newaxis]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment, same here.

elif any(mask):
res = [x.todense() if isinstance(x, SparseArray) else x
for x in res]
return np.stack(res)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.stack always gives dense results, that's a problem here. You need use sparse.stack in the first case only.

@@ -1404,14 +1404,15 @@ def dot(self, other):
return dot(self, other)

def __matmul__(self, other):
print('__matmul__')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Libraries shouldn't have print statements in them, please remove this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Will do.

if i != 1 and j != 1 and i != j:
raise ValueError('shapes of a and b are not broadcastable')

def _dot_recurser(a, b):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn’t close over anything. Does it need to be defined inside of the other function?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind since there's little point to use it outside the function.

@gongliyu
Copy link
Contributor Author

I tried to read the report and improve it. If it still can not pass the coverage test, please help me add some test cases.

@hameerabbasi
Copy link
Collaborator

hameerabbasi commented Oct 23, 2018

Sure thing! You will need to do one more thing, add docs for matmul. You need to edit the file docs/generated/sparse.rst and add matmul (everything is in alphabetical order). Then run sphinx-build -W -b html . _build/html while in the docs/ directory. And you will see an extra file, docs/generated/sparse.matmul.rst, you can add that.

@hameerabbasi
Copy link
Collaborator

This is the last thing, I promise. :-)

@hameerabbasi hameerabbasi merged commit 5333d7c into pydata:master Oct 23, 2018
@hameerabbasi
Copy link
Collaborator

This is in! Thanks @gongliyu for your first FOSS contribution!

@gongliyu
Copy link
Contributor Author

@hameerabbasi Thank you for the helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants