New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix sparse.dot returning incorrect shape for N-D arrays. #204

Merged
merged 9 commits into from Oct 23, 2018

Conversation

Projects
None yet
3 participants
@gongliyu
Copy link
Contributor

gongliyu commented Oct 22, 2018

Only the recursive approach is implemented. Considering the use cases of n-d sparse array, this will be enough. Fixes #202

@hameerabbasi
Copy link
Collaborator

hameerabbasi left a comment

This looks good. Would you possibly add tests to the already existing sparse.dot code for this?

# if b.ndim == 1:
# b_axis = -1

# return tensordot(a, b, axes=(a_axis, b_axis))

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 22, 2018

Collaborator

There is no need for this code, you will be credited in the PR and in the git history.

@hameerabbasi hameerabbasi changed the title fix #202 Fix sparse.dot returning incorrect shape for N-D arrays. Oct 22, 2018

elif any(mask):
res = [x.todense() if isinstance(x, SparseArray) else x
for x in res]
return np.stack(res)

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 22, 2018

Collaborator

You may want to use sparse.stack or np.stack here depending on the results of mask, otherwise the result is always dense.

This comment has been minimized.

@gongliyu

gongliyu Oct 22, 2018

Contributor

Now it is already depend on the mask, please check the following lines

        if all(mask):
            return stack(res)
        elif any(mask):
            res = [x.todense() if isinstance(x, SparseArray) else x
                   for x in res]
        return np.stack(res)

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018

Collaborator

np.stack always gives dense results, that's a problem here. You need use sparse.stack in the first case only.

@gongliyu

This comment has been minimized.

Copy link
Contributor

gongliyu commented Oct 22, 2018

Sure, I will add some test cases. Do I need to close this PR and open another one after I add the tests in my forked repo? Does pydata/sparse contains a script to run the tests before push to remote repo? I'm sorry I'm very new to the github PR feature. I will really appreciate it if you can help me on this.

Thanks!

@hameerabbasi

This comment has been minimized.

Copy link
Collaborator

hameerabbasi commented Oct 22, 2018

Hi! Like you, I was also very new a year or so ago! I'd like to personally thank (and congratulate) you for doing this.

Usually it's recommended to push your changes to a branch and keep your master branch sync'd to the main branch, but for this PR it's okay.

If you push changes to your branch, this PR will be automatically updated.

@hameerabbasi

This comment has been minimized.

Copy link
Collaborator

hameerabbasi commented Oct 22, 2018

To run the tests, you can run pip install -e .[all] in the root of the repository and then run py.test there. It will run all tests for you, including for code style.

@hameerabbasi

This comment has been minimized.

Copy link
Collaborator

hameerabbasi commented Oct 22, 2018

This pull request introduces 1 alert when merging 99c15a9 into 87f414c - view on LGTM.com

new alerts:

  • 1 for Syntax error

Comment posted by LGTM.com

@codecov

This comment has been minimized.

Copy link

codecov bot commented Oct 22, 2018

Codecov Report

Merging #204 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #204      +/-   ##
==========================================
- Coverage   97.64%   97.63%   -0.01%     
==========================================
  Files          11       11              
  Lines        1444     1483      +39     
==========================================
+ Hits         1410     1448      +38     
- Misses         34       35       +1
Impacted Files Coverage Δ
sparse/coo/core.py 97.14% <100%> (ø) ⬆️
sparse/coo/__init__.py 100% <100%> (ø) ⬆️
sparse/coo/common.py 97.95% <100%> (-0.07%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 87f414c...de1b734. Read the comment docs.

@gongliyu

This comment has been minimized.

Copy link
Contributor

gongliyu commented Oct 22, 2018

I realized that np.dot and np.matmul are different for some cases,
https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.matmul.html
So I add a function: common.matmul with some tests, and redirect COO.matmul to common.matmul.

@hameerabbasi

This comment has been minimized.

Copy link
Collaborator

hameerabbasi commented Oct 22, 2018

This pull request introduces 1 alert when merging 6fe10d5 into 87f414c - view on LGTM.com

new alerts:

  • 1 for Syntax error

Comment posted by LGTM.com

@hameerabbasi
Copy link
Collaborator

hameerabbasi left a comment

Test coverage is a bit lacking. If you want, I can push to your branch and make sure all the tests are hit. You will have the authorship of the commit still.

return dot(a, b.reshape(b.shape[-2:]))

while a.ndim < b.ndim:
a = a[np.newaxis]

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018

Collaborator

Here, you're adding just one axis, what if more axes are needed? Should be a = a[(None,) * (b.ndim - a.ndim)].

This comment has been minimized.

@gongliyu

gongliyu Oct 23, 2018

Contributor

Thanks! I didn't know this trick so I used a while loop. I will try this.

while a.ndim < b.ndim:
a = a[np.newaxis]
while a.ndim > b.ndim:
b = b[np.newaxis]

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018

Collaborator

See above comment, same here.

elif any(mask):
res = [x.todense() if isinstance(x, SparseArray) else x
for x in res]
return np.stack(res)

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018

Collaborator

np.stack always gives dense results, that's a problem here. You need use sparse.stack in the first case only.

@@ -1404,14 +1404,15 @@ def dot(self, other):
return dot(self, other)

def __matmul__(self, other):
print('__matmul__')

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018

Collaborator

Libraries shouldn't have print statements in them, please remove this line.

This comment has been minimized.

@gongliyu

gongliyu Oct 23, 2018

Contributor

Sure. Will do.

if i != 1 and j != 1 and i != j:
raise ValueError('shapes of a and b are not broadcastable')

def _dot_recurser(a, b):

This comment has been minimized.

@ewmoore

ewmoore Oct 23, 2018

This doesn’t close over anything. Does it need to be defined inside of the other function?

This comment has been minimized.

@hameerabbasi

hameerabbasi Oct 23, 2018

Collaborator

I don't mind since there's little point to use it outside the function.

gongliyu added some commits Oct 23, 2018

@gongliyu

This comment has been minimized.

Copy link
Contributor

gongliyu commented Oct 23, 2018

I tried to read the report and improve it. If it still can not pass the coverage test, please help me add some test cases.

@hameerabbasi

This comment has been minimized.

Copy link
Collaborator

hameerabbasi commented Oct 23, 2018

Sure thing! You will need to do one more thing, add docs for matmul. You need to edit the file docs/generated/sparse.rst and add matmul (everything is in alphabetical order). Then run sphinx-build -W -b html . _build/html while in the docs/ directory. And you will see an extra file, docs/generated/sparse.matmul.rst, you can add that.

@hameerabbasi

This comment has been minimized.

Copy link
Collaborator

hameerabbasi commented Oct 23, 2018

This is the last thing, I promise. :-)

@hameerabbasi hameerabbasi merged commit 5333d7c into pydata:master Oct 23, 2018

5 checks passed

LGTM analysis: Python No alert changes
Details
ci/circleci Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 97.64%)
Details
codecov/project Absolute coverage decreased by -<.01% but relative coverage increased by +2.35% compared to 87f414c
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@hameerabbasi

This comment has been minimized.

Copy link
Collaborator

hameerabbasi commented Oct 23, 2018

This is in! Thanks @gongliyu for your first FOSS contribution!

@gongliyu

This comment has been minimized.

Copy link
Contributor

gongliyu commented Oct 23, 2018

@hameerabbasi Thank you for the helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment