Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix sparse.dot returning incorrect shape for N-D arrays. #204

Merged
merged 9 commits into from Oct 23, 2018
Merged

Fix sparse.dot returning incorrect shape for N-D arrays. #204

merged 9 commits into from Oct 23, 2018

Conversation

@gongliyu
Copy link
Contributor

@gongliyu gongliyu commented Oct 22, 2018

Only the recursive approach is implemented. Considering the use cases of n-d sparse array, this will be enough. Fixes #202

Copy link
Collaborator

@hameerabbasi hameerabbasi left a comment

This looks good. Would you possibly add tests to the already existing sparse.dot code for this?

# if b.ndim == 1:
# b_axis = -1

# return tensordot(a, b, axes=(a_axis, b_axis))

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 22, 2018
Collaborator

There is no need for this code, you will be credited in the PR and in the git history.

@hameerabbasi hameerabbasi changed the title fix #202 Fix sparse.dot returning incorrect shape for N-D arrays. Oct 22, 2018
elif any(mask):
res = [x.todense() if isinstance(x, SparseArray) else x
for x in res]
return np.stack(res)

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 22, 2018
Collaborator

You may want to use sparse.stack or np.stack here depending on the results of mask, otherwise the result is always dense.

This comment has been minimized.

@gongliyu

gongliyu Oct 22, 2018
Author Contributor

Now it is already depend on the mask, please check the following lines

        if all(mask):
            return stack(res)
        elif any(mask):
            res = [x.todense() if isinstance(x, SparseArray) else x
                   for x in res]
        return np.stack(res)

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018
Collaborator

np.stack always gives dense results, that's a problem here. You need use sparse.stack in the first case only.

@gongliyu
Copy link
Contributor Author

@gongliyu gongliyu commented Oct 22, 2018

Sure, I will add some test cases. Do I need to close this PR and open another one after I add the tests in my forked repo? Does pydata/sparse contains a script to run the tests before push to remote repo? I'm sorry I'm very new to the github PR feature. I will really appreciate it if you can help me on this.

Thanks!

@hameerabbasi
Copy link
Collaborator

@hameerabbasi hameerabbasi commented Oct 22, 2018

Hi! Like you, I was also very new a year or so ago! I'd like to personally thank (and congratulate) you for doing this.

Usually it's recommended to push your changes to a branch and keep your master branch sync'd to the main branch, but for this PR it's okay.

If you push changes to your branch, this PR will be automatically updated.

@hameerabbasi
Copy link
Collaborator

@hameerabbasi hameerabbasi commented Oct 22, 2018

To run the tests, you can run pip install -e .[all] in the root of the repository and then run py.test there. It will run all tests for you, including for code style.

@hameerabbasi
Copy link
Collaborator

@hameerabbasi hameerabbasi commented Oct 22, 2018

This pull request introduces 1 alert when merging 99c15a9 into 87f414c - view on LGTM.com

new alerts:

  • 1 for Syntax error

Comment posted by LGTM.com

@codecov
Copy link

@codecov codecov bot commented Oct 22, 2018

Codecov Report

Merging #204 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #204      +/-   ##
==========================================
- Coverage   97.64%   97.63%   -0.01%     
==========================================
  Files          11       11              
  Lines        1444     1483      +39     
==========================================
+ Hits         1410     1448      +38     
- Misses         34       35       +1
Impacted Files Coverage Δ
sparse/coo/core.py 97.14% <100%> (ø) ⬆️
sparse/coo/__init__.py 100% <100%> (ø) ⬆️
sparse/coo/common.py 97.95% <100%> (-0.07%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 87f414c...de1b734. Read the comment docs.

@gongliyu
Copy link
Contributor Author

@gongliyu gongliyu commented Oct 22, 2018

I realized that np.dot and np.matmul are different for some cases,
https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.matmul.html
So I add a function: common.matmul with some tests, and redirect COO.matmul to common.matmul.

@hameerabbasi
Copy link
Collaborator

@hameerabbasi hameerabbasi commented Oct 22, 2018

This pull request introduces 1 alert when merging 6fe10d5 into 87f414c - view on LGTM.com

new alerts:

  • 1 for Syntax error

Comment posted by LGTM.com

Copy link
Collaborator

@hameerabbasi hameerabbasi left a comment

Test coverage is a bit lacking. If you want, I can push to your branch and make sure all the tests are hit. You will have the authorship of the commit still.

return dot(a, b.reshape(b.shape[-2:]))

while a.ndim < b.ndim:
a = a[np.newaxis]

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018
Collaborator

Here, you're adding just one axis, what if more axes are needed? Should be a = a[(None,) * (b.ndim - a.ndim)].

This comment has been minimized.

@gongliyu

gongliyu Oct 23, 2018
Author Contributor

Thanks! I didn't know this trick so I used a while loop. I will try this.

while a.ndim < b.ndim:
a = a[np.newaxis]
while a.ndim > b.ndim:
b = b[np.newaxis]

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018
Collaborator

See above comment, same here.

elif any(mask):
res = [x.todense() if isinstance(x, SparseArray) else x
for x in res]
return np.stack(res)

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018
Collaborator

np.stack always gives dense results, that's a problem here. You need use sparse.stack in the first case only.

@@ -1404,14 +1404,15 @@ def dot(self, other):
return dot(self, other)

def __matmul__(self, other):
print('__matmul__')

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018
Collaborator

Libraries shouldn't have print statements in them, please remove this line.

This comment has been minimized.

@gongliyu

gongliyu Oct 23, 2018
Author Contributor

Sure. Will do.

if i != 1 and j != 1 and i != j:
raise ValueError('shapes of a and b are not broadcastable')

def _dot_recurser(a, b):

This comment has been minimized.

@ewmoore

ewmoore Oct 23, 2018

This doesn’t close over anything. Does it need to be defined inside of the other function?

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018
Collaborator

I don't mind since there's little point to use it outside the function.

gongliyu added 2 commits Oct 23, 2018
@gongliyu
Copy link
Contributor Author

@gongliyu gongliyu commented Oct 23, 2018

I tried to read the report and improve it. If it still can not pass the coverage test, please help me add some test cases.

@hameerabbasi
Copy link
Collaborator

@hameerabbasi hameerabbasi commented Oct 23, 2018

Sure thing! You will need to do one more thing, add docs for matmul. You need to edit the file docs/generated/sparse.rst and add matmul (everything is in alphabetical order). Then run sphinx-build -W -b html . _build/html while in the docs/ directory. And you will see an extra file, docs/generated/sparse.matmul.rst, you can add that.

@hameerabbasi
Copy link
Collaborator

@hameerabbasi hameerabbasi commented Oct 23, 2018

This is the last thing, I promise. :-)

@hameerabbasi hameerabbasi merged commit 5333d7c into pydata:master Oct 23, 2018
5 checks passed
5 checks passed
LGTM analysis: Python No alert changes
Details
ci/circleci Your tests passed on CircleCI!
Details
@codecov
codecov/patch 100% of diff hit (target 97.64%)
Details
@codecov
codecov/project Absolute coverage decreased by -<.01% but relative coverage increased by +2.35% compared to 87f414c
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@hameerabbasi
Copy link
Collaborator

@hameerabbasi hameerabbasi commented Oct 23, 2018

This is in! Thanks @gongliyu for your first FOSS contribution!

@gongliyu
Copy link
Contributor Author

@gongliyu gongliyu commented Oct 23, 2018

@hameerabbasi Thank you for the helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants