# Q25: Large Sparse Matrix & Efficient Storage

Create a large, sparse matrix using NumPy (one that is mostly zeros).
Implement a custom function to efficiently store, manage, and perform basic statistical calculations (e.g., finding the non-zero mean) on this sparse matrix, focusing on minimizing memory usage.

In [4]:
import numpy as np

In [5]:

# Create a large sparse matrix (e.g., 1000x1000, 99.9% zeros)
shape = (1000, 1000)
density = 0.001  # 0.1% non-zero
num_nonzero = int(np.prod(shape) * density)


In [6]:

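# Randomly choose distinct positions for the non-zero elements.
# Sampling flat indices without replacement avoids duplicate (row, col)
# pairs, which would silently overwrite each other in the dictionary.
flat_idx = np.random.choice(shape[0] * shape[1], size=num_nonzero, replace=False)
rows, cols = np.unravel_index(flat_idx, shape)
values = np.random.randn(num_nonzero)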


In [7]:

# Efficient storage: dictionary of coordinates
sparse_dict = {(r, c): v for r, c, v in zip(rows, cols, values)}

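def nonzero_mean(sparse_matrix_dict):
    """Return the mean of the stored (non-zero) values, or 0.0 if there are none."""
    if not sparse_matrix_dict:
        return 0.0
    return np.mean(list(sparse_matrix_dict.values()))

def nonzero_count(sparse_matrix_dict):
    """Return the number of stored (non-zero) entries."""
    return len(sparse_matrix_dict)

def nonzero_sum(sparse_matrix_dict):
    """Return the sum of the stored (non-zero) values."""
    return np.sum(list(sparse_matrix_dict.values()))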


In [8]:

print(f'Non-zero count: {nonzero_count(sparse_dict)}')
print(f'Non-zero mean: {nonzero_mean(sparse_dict):.4f}')
print(f'Non-zero sum: {nonzero_sum(sparse_dict):.4f}')

Non-zero count: 1000
Non-zero mean: -0.0111
Non-zero sum: -11.1315


## Why Dictionary Storage is Efficient for Sparse Matrices
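A dense 1000x1000 NumPy array with the default `float64` dtype occupies about 8 MB (1,000,000 elements at 8 bytes each), zeros included. Storing only the non-zero values in a dictionary keyed by `(row, col)` makes memory grow with the number of non-zero entries rather than with the matrix size, and lookups, updates, and statistics touch only those entries. Each entry does carry Python object overhead (the tuple key, its two integers, and the float value), so the approach pays off only when very few entries are non-zero, as in this 0.1% case.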
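To make the memory argument concrete, here is a rough sketch that compares the two representations. The helper `dict_memory_bytes` and the variable `demo_dict` are hypothetical names introduced for illustration, and the figure is only an estimate: `sys.getsizeof` does not follow references, so the helper adds up the dictionary, its key tuples, their indices, and the values by hand, and it counts any shared small integers more than once.

```python
import sys
import numpy as np

def dict_memory_bytes(sparse_matrix_dict):
    """Rough estimate (hypothetical helper): dict + key tuples + indices + values."""
    total = sys.getsizeof(sparse_matrix_dict)
    for key, value in sparse_matrix_dict.items():
        total += sys.getsizeof(key) + sum(sys.getsizeof(i) for i in key)
        total += sys.getsizeof(value)
    return total

# Build a small stand-in for the notebook's dictionary (seeded so it is repeatable).
shape = (1000, 1000)
rng = np.random.default_rng(0)
flat_idx = rng.choice(shape[0] * shape[1], size=1000, replace=False)
rows, cols = np.unravel_index(flat_idx, shape)
vals = rng.standard_normal(1000)
demo_dict = {(int(r), int(c)): float(v) for r, c, v in zip(rows, cols, vals)}

dense_bytes = np.zeros(shape).nbytes          # float64: 8 bytes per element
sparse_bytes = dict_memory_bytes(demo_dict)   # estimate, varies by Python version

print(f'Dense array: {dense_bytes:,} bytes')
print(f'Dictionary (estimated): {sparse_bytes:,} bytes')
```

The exact dictionary figure depends on the Python version and platform, so no number is quoted here; the point is that it scales with the count of stored entries.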

In [None]:
# Function to get value at (row, col) in sparse matrix
def get_value(sparse_matrix_dict, row, col):
    return sparse_matrix_dict.get((row, col), 0.0)

# Function to set/update value at (row, col) in sparse matrix
def set_value(sparse_matrix_dict, row, col, value):
    if value != 0.0:
        sparse_matrix_dict[(row, col)] = value
    elif (row, col) in sparse_matrix_dict:
        del sparse_matrix_dict[(row, col)]

# Example usage:
print('Value at (0,0):', get_value(sparse_dict, 0, 0))
set_value(sparse_dict, 0, 0, 5.5)
print('Updated value at (0,0):', get_value(sparse_dict, 0, 0))
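If the matrix is read far more often than it is updated, one possible variation is to pack the dictionary into three parallel NumPy arrays (coordinate or "COO" style), which drops the per-entry Python object overhead. The sketch below is an illustration, not part of the original solution: `to_coo_arrays`, `sparse_stats`, and the `int32`/`float64` dtypes are assumptions made for the example. It also shows how statistics that include the zeros, such as the overall mean, can be derived from the stored values and the shape alone.

```python
import numpy as np

def to_coo_arrays(sparse_matrix_dict):
    """Convert {(row, col): value} into three parallel arrays (hypothetical helper)."""
    n = len(sparse_matrix_dict)
    rows = np.fromiter((r for r, _ in sparse_matrix_dict), dtype=np.int32, count=n)
    cols = np.fromiter((c for _, c in sparse_matrix_dict), dtype=np.int32, count=n)
    vals = np.fromiter(sparse_matrix_dict.values(), dtype=np.float64, count=n)
    return rows, cols, vals

def sparse_stats(vals, shape):
    """Compute statistics from stored values only; zeros are handled arithmetically."""
    total = shape[0] * shape[1]
    nnz = vals.size
    value_sum = float(vals.sum())
    return {
        'nonzero_count': nnz,
        'nonzero_mean': value_sum / nnz if nnz else 0.0,
        'overall_mean': value_sum / total,  # zeros add nothing to the sum
        'density': nnz / total,
    }

# Small hand-made example so the numbers are easy to check.
example = {(0, 0): 5.5, (2, 3): -1.5, (999, 999): 2.0}
rows, cols, vals = to_coo_arrays(example)
stats = sparse_stats(vals, (1000, 1000))
print(stats)
```

The trade-off is that arrays are awkward to update in place, so the dictionary remains the more convenient form while entries are still being set or deleted.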