# Estimate Memory

This notebook estimates the memory required to store the matrix `Amat` in SciPy's `csc_matrix` format.

The documentation for the `csc_matrix` format is [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csc_matrix.html).

We can also confirm how the data type of `indices` and `indptr` is determined in [this implementation code](https://github.com/scipy/scipy/blob/v1.11.2/scipy/sparse/_compressed.py#L36); the dtype selection itself is in [this code](https://github.com/scipy/scipy/blob/main/scipy/sparse/_sputils.py#L147).
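
As a rough illustration of what is being counted, the sketch below computes the sizes of the three arrays that make up a CSC matrix: `data` and `indices` hold one entry per non-zero element, and `indptr` holds one entry per column plus one. The function name `csc_memory_bytes` and its parameters are illustrative only and are not part of SciPy or `exputils`; the example values (36,720 columns, 16 non-zeros per column, 1-byte data, 4-byte indices) are assumed to correspond to the n = 4 case.

```python
def csc_memory_bytes(n_cols, nnz, data_itemsize=1, index_itemsize=4):
    """Return (data, indices, indptr) sizes in bytes for a CSC matrix.

    Hypothetical helper: the item sizes are passed in explicitly rather
    than being looked up from SciPy.
    """
    data = data_itemsize * nnz               # one value per non-zero
    indices = index_itemsize * nnz           # one row index per non-zero
    indptr = index_itemsize * (n_cols + 1)   # one offset per column, plus one
    return data, indices, indptr


# Assumed example values: 36,720 columns with 16 non-zeros in each column.
sizes = csc_memory_bytes(n_cols=36_720, nnz=16 * 36_720)
print(sizes, sum(sizes))  # (587520, 2350080, 146884) 3084484
```

Under these assumptions the total agrees with the 3,084,484 B reported for n = 4 in the output of the first cell.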


In [1]:
import numpy as np
from exputils.actual_Amat import get_actual_Amat
from exputils.stabilizer_group import total_stabilizer_group_size


def estimate_each_memory(n: int, K: float = 1.0):
    # https://github.com/scipy/scipy/blob/v1.11.2/scipy/sparse/_compressed.py#L36
    M = 4**n
    N = int(round(K * total_stabilizer_group_size(n)))
    max_val = max(M, N)

    # https://github.com/scipy/scipy/blob/main/scipy/sparse/_sputils.py#L147
    int32max = np.int32(np.iinfo(np.int32).max)
    dtype = np.int32 if np.intc().itemsize == 4 else np.int64
    if max_val > int32max:
        dtype = np.int64
    idx_dtype_size = 4 if dtype == np.int32 else 8
    assert K != 1.0 or (idx_dtype_size == 4 if n <= 6 else idx_dtype_size == 8)

    sz = N

    # the number of non-zero elements is (2**n) * sz

    # 1 byte (np.int8)
    data_estimate = 1 * (2**n) * sz

    # 4 bytes (np.int32) if max_val < 2**31 else 8 bytes (np.int64)
    indices_estimate = idx_dtype_size * ((2**n) * sz)
    indptr_estimate = idx_dtype_size * (sz + 1)

    return (data_estimate, indices_estimate, indptr_estimate)


def check_each_memory(n):
    Amat = get_actual_Amat(n)
    return (Amat.data.nbytes, Amat.indices.nbytes, Amat.indptr.nbytes)


# check the correctness
for n in range(1, 5 + 1):
    assert estimate_each_memory(n) == check_each_memory(n)

for n in range(1, 8 + 1):
    if n <= 6:
        assert total_stabilizer_group_size(n) < 2**31
    else:
        assert total_stabilizer_group_size(n) > 2**31


# estimate the memory
unit = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
for n in range(4, 8 + 1):
    memory = sum(estimate_each_memory(n))
    exponent = 0
    tmp = memory
    while tmp / 1024 >= 1:
        tmp /= 1024
        exponent += 1
    print(n, f"{tmp:>8.3f} {unit[exponent]} ({memory:,} B)")

print("=" * 20)

# estimate the memory with K
for n, K in [(4, 0.1), (5, 0.01), (6, 0.001), (7, 0.00001), (8, 0.00000001)]:
    memory = sum(estimate_each_memory(n, K))
    exponent = 0
    tmp = memory
    while tmp / 1024 >= 1:
        tmp /= 1024
        exponent += 1
    print(n, f"{tmp:>8.3f} {unit[exponent]} ({memory:,} B) with K={K:e}")

4    2.942 MiB (3,084,484 B)
5  379.045 MiB (397,457,284 B)
6   95.068 GiB (102,078,662,404 B)
7   85.757 TiB (94,290,438,528,008 B)
8   85.795 PiB (96,596,327,459,174,408 B)
4  301.223 KiB (308,452 B) with K=1.000000e-01
5    3.790 MiB (3,974,544 B) with K=1.000000e-02
6   97.350 MiB (102,078,796 B) with K=1.000000e-03
7  499.224 MiB (523,474,760 B) with K=1.000000e-05
8  511.608 MiB (536,460,340 B) with K=1.000000e-08


From these results, we can estimate the memory required to store the Amat as follows:

```
n=6: 95 GiB
n=7: 86 TiB
n=8: 86 PiB
```
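
The byte-to-unit conversion that produces these figures could also be factored into a small helper. The following is a minimal sketch using the same division-by-1024 loop as the cell above; the names `UNITS` and `format_bytes` are hypothetical and do not appear in the notebook.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]


def format_bytes(num_bytes):
    """Format a byte count with the largest binary unit that keeps it >= 1."""
    value = float(num_bytes)
    exponent = 0
    # Divide by 1024 until the value drops below 1024 or the units run out.
    while value >= 1024 and exponent < len(UNITS) - 1:
        value /= 1024
        exponent += 1
    return f"{value:.3f} {UNITS[exponent]}"


print(format_bytes(3_084_484))  # 2.942 MiB
```

The guard on `exponent` is an addition to the original loop so that values beyond the PiB range do not index past the end of the unit list.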


In [2]:
assert 0.1 == 10 ** (-1)
assert 0.01 == 10 ** (-2)
assert 0.001 == 10 ** (-3)
assert 0.00001 == 10 ** (-5)
assert 0.00000001 == 10 ** (-8)