Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: sparse: speedup LIL matrix slicing #3338

Merged
merged 5 commits into from
Feb 1, 2016
Merged

Conversation

pv
Copy link
Member

@pv pv commented Feb 16, 2014

The current implementation for slicing LIL matrices is very pessimal. Replace it with a more optimal implementation.

Gives a factor of ~1/density speedup:

In []: A = rand(3000, 3000, density=1e-3, format="lil")

# Before (9ccc68475)
In [4]: %timeit A[::2,::-2]
10 loops, best of 3: 87.5 ms per loop

# After (d47e27b0)
In [4]: %timeit A[::2,::-2]
1000 loops, best of 3: 603 µs per loop

Gives speedups also for smaller slices:

# Before
In [5]: %timeit A[3,:]
1000 loops, best of 3: 213 µs per loop
In [6]: %timeit A[:,3]
1000 loops, best of 3: 901 µs per loop
In [7]: %timeit A[1:5,1:5]
10000 loops, best of 3: 83.3 µs per loop

# After
In [5]: %timeit A[3,:]
10000 loops, best of 3: 38.8 µs per loop
In [6]: %timeit A[:,3]
1000 loops, best of 3: 702 µs per loop
In [7]: %timeit A[1:5,1:5]
10000 loops, best of 3: 40.3 µs per loop

(Column slices in LIL are still problematic, but that's due to the matrix format.)

EDIT updated benchmarks

@pv
Copy link
Member Author

pv commented Feb 16, 2014

TBH, the LIL/DOK matrices should be reimplemented e.g. in Cython, with appropriate low-level data storage.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 63988b9 on pv:lil-speed into 2df405a on scipy:master.

@pv pv added the PR label Feb 19, 2014
@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 77929e5 on pv:lil-speed into 2df405a on scipy:master.

@rgommers
Copy link
Member

Guess this can be closed?

@pv
Copy link
Member Author

pv commented Feb 24, 2014

No, this is a different optimization.
Probably needs re-benchmarking and reimplementation in Cython.

@pv
Copy link
Member Author

pv commented Mar 2, 2014

Rebased and cythonized.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling d47e27b on pv:lil-speed into * on scipy:master*.

@@ -220,11 +220,30 @@ def getrowview(self, i):
def getrow(self, i):
"""Returns a copy of the 'i'th row.
"""
if i < 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this is inlined rather than using _check_row_bounds?

@jnothman
Copy link
Contributor

jnothman commented Mar 5, 2014

Otherwise, this LGTM

@pv pv removed the PR label Aug 13, 2014
@rgommers
Copy link
Member

rgommers commented Jan 1, 2015

This needs only a rebase and a very minor tweak it looks like. Time to get it in?

@perimosocordiae
Copy link
Member

Rebased in #5789, closing.

@larsmans
Copy link
Contributor

larsmans commented Feb 1, 2016

Reopening so that the status of this will be "merged" when I merge #5789.

@larsmans larsmans reopened this Feb 1, 2016
@larsmans larsmans merged commit d47e27b into scipy:master Feb 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.sparse
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants