CS matrix prune method should copy data from large unpruned arrays (#6623)

* BUG: CS matrix prune should copy data from larger arrays

Pruning resizes the "data" and "indices" arrays after explicit zeros
have been removed from a sparse matrix. However, this is implemented
using slicing, which just returns a view of the original array. If
the original array is relatively large, a new array should be used
instead of a view, allowing the original to be garbage collected.

* Add self to list of contributors
ninepints authored and perimosocordiae committed Dec 1, 2016
1 parent 4b77de5 commit 9cec221
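
The commit message's point about slicing can be checked directly with NumPy. The following is a minimal sketch, not part of the commit; the names `big`, `view`, and `trimmed` are illustrative only. It shows that a basic slice keeps a reference to the full original buffer, while an explicit copy does not.

```python
import numpy as np

big = np.zeros(1_000_000)   # large original allocation (about 8 MB of float64)
view = big[:10]             # basic slicing returns a view, not a new array

# The view records the array that owns its memory, which keeps the
# whole buffer alive for as long as the view exists.
assert view.base is big
assert np.shares_memory(view, big)

trimmed = view.copy()       # fresh allocation holding only 10 elements

# The copy owns its own data, so `big` can be garbage collected once
# nothing else refers to it.
assert trimmed.base is None
assert not np.shares_memory(trimmed, big)
```

This is why replacing the slice with a copy matters only when the slice is much smaller than the array it came from: copying a nearly full-sized slice would cost an allocation without freeing much.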
Showing 4 changed files with 19 additions and 10 deletions.
1 change: 1 addition & 0 deletions THANKS.txt
@@ -165,6 +165,7 @@ on data.
 Antonio Ribeiro for implementing iirnotch and iirpeak functions.
 Ilhan Polat for bug fixes on Riccati solvers.
 Sebastiano Vigna for code in the stats package related to Kendall's tau.
+John Draper for bug fixes.
 
 Institutions
 ------------
4 changes: 4 additions & 0 deletions doc/release/0.19.0-notes.rst
@@ -82,6 +82,10 @@ FIR filters to minimum phase.
 The functions `scipy.sparse.save_npz` and `scipy.sparse.load_npz` were added,
 providing simple serialization for some sparse formats.
 
+The `prune` method of classes `bsr_matrix`, `csc_matrix`, and `csr_matrix`
+was updated to reallocate backing arrays under certain conditions, reducing
+memory usage.
+
 `scipy.special` improvements
 ----------------------------
 
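
As a rough illustration of the release note above, the sketch below builds a `csr_matrix` from deliberately oversized backing arrays and calls `prune`. The array sizes and values are made up for the example, and whether the trimmed array shares memory with the original buffer depends on the installed SciPy version including this change.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical oversized backing arrays: 1000 slots allocated, 2 in use.
data = np.zeros(1000)
indices = np.zeros(1000, dtype=np.int32)
data[:2] = [5.0, 7.0]
indices[:2] = [0, 1]
indptr = np.array([0, 1, 2], dtype=np.int32)  # row 0 -> col 0, row 1 -> col 1

A = csr_matrix((data, indices, indptr), shape=(2, 2))
A.prune()  # trims data and indices down to nnz entries

assert A.nnz == 2
assert len(A.data) == 2 and len(A.indices) == 2

# Reports whether the trimmed array still points into the 1000-element buffer.
print(np.shares_memory(A.data, data))
```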
10 changes: 10 additions & 0 deletions scipy/_lib/_util.py
@@ -125,6 +125,16 @@ def _aligned_zeros(shape, dtype=float, order="C", align=None):
     return data
 
 
+def _prune_array(array):
+    """Return an array equivalent to the input array. If the input
+    array is a view of a much larger array, copy its contents to a
+    newly allocated array. Otherwise, return the input unchanged.
+    """
+    if array.base is not None and array.size < array.base.size // 2:
+        return array.copy()
+    return array
+
+
 class DeprecatedImport(object):
     """
     Deprecated import, with redirection + warning.
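
To see how the half-size threshold in `_prune_array` behaves, here is a small standalone sketch. The function body is copied from the hunk above under the hypothetical name `prune_array` so that it runs without importing SciPy's private module; the test arrays are illustrative.

```python
import numpy as np

def prune_array(array):
    """Standalone copy of the helper from the diff (renamed, not imported)."""
    if array.base is not None and array.size < array.base.size // 2:
        return array.copy()
    return array

base = np.arange(100)

# A view covering less than half of its base is copied.
small = base[:10]
pruned_small = prune_array(small)
assert pruned_small.base is None
assert not np.shares_memory(pruned_small, base)

# A view covering at least half of its base is returned unchanged.
large = base[:60]
assert prune_array(large) is large

# An array that owns its data (base is None) is returned unchanged.
owner = np.arange(10)
assert prune_array(owner) is owner
```

The threshold trades one extra allocation against retained memory: a copy is made only when it would let more than half of the original buffer be released.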
14 changes: 4 additions & 10 deletions scipy/sparse/compressed.py
@@ -8,6 +8,7 @@
 
 import numpy as np
 from scipy._lib.six import zip as izip
+from scipy._lib._util import _prune_array
 
 from .base import spmatrix, isspmatrix, SparseEfficiencyWarning
 from .data import _data_matrix, _minmax_mixin
@@ -1026,8 +1027,8 @@ def prune(self):
         if len(self.data) < self.nnz:
             raise ValueError('data array has fewer than nnz elements')
 
-        self.data = self.data[:self.nnz]
-        self.indices = self.indices[:self.nnz]
+        self.indices = _prune_array(self.indices[:self.nnz])
+        self.data = _prune_array(self.data[:self.nnz])
 
     ###################
     # utility methods #
@@ -1075,15 +1076,8 @@ def _binopt(self, other, op):
            other.data,
            indptr, indices, data)
 
-        actual_nnz = indptr[-1]
-        indices = indices[:actual_nnz]
-        data = data[:actual_nnz]
-        if actual_nnz < maxnnz // 2:
-            # too much waste, trim arrays
-            indices = indices.copy()
-            data = data.copy()
-
         A = self.__class__((data, indices, indptr), shape=self.shape)
+        A.prune()
 
         return A
 
