📚 references:

1. https://stackoverflow.com/q/19784868
2. https://stackoverflow.com/q/12129948
3. https://stackoverflow.com/q/28427236
4. https://stackoverflow.com/q/15896588

💡 [`csr_matrix` information on indexing](https://stackoverflow.com/a/52299730) \
💡 [`scipy.sparse.coo_matrix`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html#scipy-sparse-coo-matrix) \
💡 [`scipy.sparse.csr_matrix`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html#scipy-sparse-csr-matrix) \
💡 [`scipy.sparse.lil_matrix`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.lil_matrix.html#scipy.sparse.lil_matrix) \
💡 [`%%timeit` magic documentation](https://ipython.readthedocs.io/en/stable/interactive/magics.html) 

### 0. imports

In [157]:
# data science
import scipy as sp
import numpy as np
# random numbers https://numpy.org/doc/stable/reference/random/index.html#random-quick-start
from numpy.random import default_rng
rng = default_rng()
# plotting
import matplotlib.pyplot as plt
# type hints
from scipy.sparse import (
    coo_matrix,
    csr_matrix,
    csc_matrix,
    lil_matrix
)
from IPython.core.magics.execution import TimeitResult

### 1.1. generate sparse matrices

In [49]:
M_coo = sp.sparse.random(
    m = 20000,
    n = 20000,
    density = 0.01,
    format = 'coo',
    dtype = float,
    random_state = 42
)

In [51]:
M_csr = sp.sparse.random(
    m = 20000,
    n = 20000,
    density = 0.01,
    format = 'csr',
    dtype = float,
    random_state = 42
)

In [52]:
M_csc = sp.sparse.random(
    m = 20000,
    n = 20000,
    density = 0.01,
    format = 'csc',
    dtype = float,
    random_state = 42
)

In [140]:
M_lil = sp.sparse.random(
    m = 20000,
    n = 20000,
    density = 0.01,
    format = 'lil',
    dtype = float,
    random_state = 42
)

### 1.2. generate index for matrix

In [60]:
index_matrix: np.ndarray = np.arange(
    start = 0,
    stop = 20000,
    step = 1,
    dtype=int
)

### 1.3. generate list of rows to set to zero

In [83]:
# contains M_coo.shape[0] entries, each 0 or 1, drawn with probabilities 0.2 and 0.8
# entries equal to 1 are masked out below; the indices left over (mask 0) are the rows to zero
array_for_masking: np.ndarray = rng.choice(
    a = [0, 1],
    size = (M_coo.shape[0], ),
    p = [0.2, 0.8]
)

# contains only the indices of rows to be set to zero
index_matrix_zero: np.ndarray = np.ma.masked_array(
    data = index_matrix,
    mask = array_for_masking
).compressed()

In [84]:
index_matrix_zero.shape

(3915,)
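
The masked-array step above keeps the indices whose mask entry is 0. A minimal sketch on a small hand-written mask, showing that `np.flatnonzero` gives the same indices; the names `index_small`, `mask_small`, `via_masked` and `via_flatnonzero` are illustrative and do not appear in the notebook:

```python
import numpy as np

index_small = np.arange(6)
# 1 = entry is masked out (row is kept), 0 = row will be set to zero
mask_small = np.array([1, 0, 1, 1, 0, 1])

# same construction as in the cell above: drop masked entries, keep the rest
via_masked = np.ma.masked_array(data=index_small, mask=mask_small).compressed()

# alternative: ask directly for the positions where the mask is 0
via_flatnonzero = np.flatnonzero(mask_small == 0)

print(via_masked)       # [1 4]
print(via_flatnonzero)  # [1 4]
```

With this variant `index_matrix` is not needed, because the positions come straight from the mask.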

### 2.1. setting rows to zero
#### 2.1.1 ["multiply with a diagonal matrix"](https://stackoverflow.com/a/65364784)

In [124]:
# 1 on the diagonal keeps a row, 0 sets it to zero
array_for_diagonal: np.ndarray = np.ones(
    shape = [M_coo.shape[0]],
    dtype = int
)
array_for_diagonal[index_matrix_zero] = 0

In [125]:
D = sp.sparse.diags(array_for_diagonal)
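
The side on which the diagonal matrix is multiplied matters: a left product scales rows, a right product scales columns. A small sketch with a hand-written 3×3 matrix; `M_small`, `rows_small` and `D_small` are illustrative names, not objects from this notebook:

```python
import numpy as np
from scipy import sparse

M_small = sparse.coo_matrix(np.array([
    [1.0, 2.0, 0.0],
    [0.0, 3.0, 4.0],
    [5.0, 0.0, 6.0],
]))
rows_small = np.array([1])  # row to be set to zero

# diagonal of ones, with zeros at the rows to be removed
diagonal = np.ones(M_small.shape[0])
diagonal[rows_small] = 0
D_small = sparse.diags(diagonal)

rows_zeroed = (D_small @ M_small).toarray()     # left product: row 1 becomes zero
columns_zeroed = (M_small @ D_small).toarray()  # right product: column 1 becomes zero

print(rows_zeroed)
print(columns_zeroed)
```

For a square matrix both products have the same shape, so the result has to be inspected to tell them apart.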

#### 2.1.2 ["take advantage of CSR format: modify data"](https://stackoverflow.com/q/12129948)

💡 [`csr_matrix` information on indexing](https://stackoverflow.com/a/52299730)

In [159]:
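def csr_data_set_rows_to_zero(
    matrix: csr_matrix,
    rows: np.ndarray
) -> csr_matrix:
    """Zero the stored values of the given rows in place and return the same matrix."""

    for row in rows:
        # indptr[row]:indptr[row+1] is the slice of `data` holding this row's stored values
        matrix.data[
            matrix.indptr[row]:matrix.indptr[row+1]
        ] = 0

    # the zeroed entries stay stored explicitly; matrix.eliminate_zeros() would drop them
    return matrix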
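
A small sketch of the same `indptr`/`data` technique on a hand-written 3×3 matrix, to show that overwriting `data` leaves explicit zeros in storage until `eliminate_zeros()` is called. The names `M_small`, `rows_small`, `nnz_before` and `nnz_after` are illustrative only:

```python
import numpy as np
from scipy.sparse import csr_matrix

M_small = csr_matrix(np.array([
    [1.0, 2.0, 0.0],
    [0.0, 3.0, 4.0],
    [5.0, 0.0, 6.0],
]))
rows_small = np.array([0, 2])  # rows to be set to zero

for row in rows_small:
    # stored values of `row` live in data[indptr[row]:indptr[row + 1]]
    start, stop = M_small.indptr[row], M_small.indptr[row + 1]
    M_small.data[start:stop] = 0

nnz_before = M_small.nnz   # counts stored entries, explicit zeros included
M_small.eliminate_zeros()  # drop the explicit zeros from storage
nnz_after = M_small.nnz

print(nnz_before, nnz_after)  # 6 2
print(M_small.toarray())
```

Skipping `eliminate_zeros()` keeps the sparsity structure unchanged, which is what makes the assignment cheap; calling it afterwards trades a little time for less memory.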

#### 2.1.3 ["take advantage of CSR format: use regular indexing notation"](https://stackoverflow.com/a/15900629)

⚠️ this runs too long for `%%timeit` to complete in a reasonable time

In [132]:
def csr_set_rows_to_zero_2(
    matrix: csr_matrix,
    rows: np.ndarray
) -> csr_matrix:

    for row in rows:
        matrix[row,:] = 0
    
    return matrix

In [144]:
# left commented out: the loop takes too long to time
# %%timeit
# M_csr_with_some_rows_zero = csr_set_rows_to_zero_2(
#     matrix = M_csr,
#     rows = index_matrix_zero
# )

#### 2.1.4 ["use lil_matrix"](https://stackoverflow.com/a/69310639)

💡 [`scipy.sparse.lil_matrix`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.lil_matrix.html#scipy.sparse.lil_matrix)

In [141]:
def lil_set_rows_to_zero(
    matrix: lil_matrix,
    rows: np.ndarray
) -> lil_matrix:

    for row in rows:
        matrix[row,:] = 0
    
    return matrix

#### 2.1.5 "convert back-and-forth between dense and sparse matrices"

In [162]:
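def convert_to_dense_and_back(
    matrix: coo_matrix,
    rows: np.ndarray
) -> coo_matrix:

    # assumed completion: the original cell ended after the signature
    # caution: a dense 20000 x 20000 float64 array needs about 3.2 GB of memory
    dense = matrix.toarray()
    dense[rows, :] = 0

    return coo_matrix(dense)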

### 3.1 measuring performance

In [158]:
time_lil: TimeitResult = %timeit -o lil_set_rows_to_zero(M_lil, index_matrix_zero)

1.42 s ± 26.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [160]:
time_csr_data: TimeitResult = %timeit -o csr_data_set_rows_to_zero(M_csr, index_matrix_zero)

9 ms ± 144 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [161]:
# note: M_coo @ D scales columns; zeroing the rows in index_matrix_zero needs D @ M_coo
time_coo: TimeitResult = %timeit -o M_coo @ D

175 ms ± 1.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
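
Outside IPython the `%timeit` magic is not available. A rough sketch of a comparable measurement with the standard-library `timeit` module, on a smaller matrix so that it finishes quickly. The helper names `zero_rows_diagonal` and `zero_rows_csr_data`, the 2000×2000 size and the choice of every fifth row are assumptions made for this example, not part of the notebook:

```python
import timeit

import numpy as np
from scipy import sparse

# smaller test matrix than in the notebook (assumption, to keep the run short)
M_test = sparse.random(m=2000, n=2000, density=0.01, format='csr',
                       dtype=float, random_state=42)
rows_test = np.arange(0, 2000, 5)  # every fifth row is set to zero


def zero_rows_diagonal(matrix, rows):
    """Left-multiply by a diagonal matrix with zeros at `rows`."""
    diagonal = np.ones(matrix.shape[0])
    diagonal[rows] = 0
    return sparse.diags(diagonal) @ matrix


def zero_rows_csr_data(matrix, rows):
    """Overwrite the stored values of `rows` in a copy of a CSR matrix."""
    result = matrix.copy()  # copy so every timing run starts from the same input
    for row in rows:
        result.data[result.indptr[row]:result.indptr[row + 1]] = 0
    return result


number = 5  # calls per measurement
timings = {
    name: min(timeit.repeat(lambda f=func: f(M_test, rows_test),
                            number=number, repeat=3)) / number
    for name, func in [("diagonal", zero_rows_diagonal),
                       ("csr data", zero_rows_csr_data)]
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms per call")
```

The copy inside `zero_rows_csr_data` is included in its timing. It is there because an in-place function would otherwise be timed on an already-zeroed matrix after the first call.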
