Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: sparse: Propagate dtype through CSR/CSC constructors #13403

Merged
merged 4 commits into from Jan 30, 2021

Conversation

perimosocordiae
Copy link
Member

Reference issue

Fixes gh-13329

What does this implement/fix?

Propagates the dtype parameter through the intermediate COO format constructor.

@perimosocordiae perimosocordiae changed the title Perimosocordiae patch 5 BUG: sparse: Propagate dtype through CSR/CSC constructors Jan 19, 2021
@perimosocordiae perimosocordiae added scipy.sparse defect A clear bug or issue that prevents SciPy from being installed or used as expected labels Jan 19, 2021
Copy link
Contributor

@tylerjereddy tylerjereddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@perimosocordiae When I run this feature branch with the original example code in the matching issue I still get a traceback (below). Is there still something off in the original reproducer from @matteodellamico?

scipy.__version__: 1.7.0.dev0+7185d00
Traceback (most recent call last):
  File "test.py", line 21, in <module>
    assert v == data[i], (i, data[i], v)
AssertionError: (4, 14115858573662012539, 14115858573662013440)

@tylerjereddy tylerjereddy added the backport-candidate This fix should be ported by a maintainer to previous SciPy versions. label Jan 20, 2021
@perimosocordiae
Copy link
Member Author

Did you use the fixed reproducer? The original version could have duplicate entries.

@tylerjereddy
Copy link
Contributor

I thought I did--I believe the author edited the code and I copy-pasted from there.

Copy link
Member

@WarrenWeckesser WarrenWeckesser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test passes in the master branch, because 2**63 happens to be a number that survives the round trip through floating point. The big value used in the test should be changed to, say, 2**63 + 1.

The test will then fail, because the fix is not complete. The dtype has to be passed to the array function that is eventually called at

self.data = np.array(obj, copy=copy)

@perimosocordiae
Copy link
Member Author

Thanks for the double-checking @tylerjereddy and the explanation @WarrenWeckesser. I learned something new about the way numpy does dtype inference today:

In [1]: np.array([2**64 - 1]).dtype                                                                                                      
Out[1]: dtype('uint64')

In [2]: np.array([2**64 - 1] * 5).dtype                                                                                                  
Out[2]: dtype('uint64')

In [3]: np.array([2**64 - 1, 2**32 - 1]).dtype                                                                                           
Out[3]: dtype('float64')

With my latest commit, this bug should actually now be fixed.

Copy link
Member

@WarrenWeckesser WarrenWeckesser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small PEP-8 change, otherwise looks good.

scipy/sparse/compressed.py Outdated Show resolved Hide resolved
Co-authored-by: Warren Weckesser <warren.weckesser@gmail.com>
@WarrenWeckesser WarrenWeckesser merged commit 4f49abd into scipy:master Jan 30, 2021
@WarrenWeckesser
Copy link
Member

Merged, thanks @perimosocordiae!

@WarrenWeckesser WarrenWeckesser added this to the 1.7.0 milestone Jan 30, 2021
tylerjereddy pushed a commit to tylerjereddy/scipy that referenced this pull request Feb 7, 2021
Propagate the dtype parameter through the intermediate COO format constructor,
so the given dtype is preserved throughout the intermediate stages of creating a
sparse matrix.

Closes scipygh-13329.
tylerjereddy pushed a commit to tylerjereddy/scipy that referenced this pull request Feb 11, 2021
Propagate the dtype parameter through the intermediate COO format constructor,
so the given dtype is preserved throughout the intermediate stages of creating a
sparse matrix.

Closes scipygh-13329.
@tylerjereddy tylerjereddy modified the milestones: 1.7.0, 1.6.1 Feb 12, 2021
@tylerjereddy tylerjereddy removed the backport-candidate This fix should be ported by a maintainer to previous SciPy versions. label Feb 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.sparse
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Large sparse matrices of big integers lose information
3 participants