#### Code:
Write a large NumPy array to disk in chunks, then read it back as a NumPy memory-mapped array (np.memmap).

Print the total size of the array and the size of one chunk.

In [31]:
%%writefile src/mmarrays.py

import sys
import numpy as np
from memory_profiler import profile


def get_data(chunk_size):
    return np.random.rand(chunk_size).astype(np.float64)

@profile
def write(file_name,chunk_size):

#open a file for data of a single column
    with open(file_name, 'wb') as f:
        #for 1024 "csv files"
        for _ in range(1024):
            csv_data =  get_data(chunk_size)
            f.write(csv_data.tobytes())
    
@profile
def read(file_name):
    a = np.memmap(file_name, dtype=np.float64, mode='r')  # read-only map; data is paged in on access
    return a

if __name__=="__main__":
    CHUNK_SIZE=int(sys.argv[1])
    FILE_NAME = 'build/mmarr.dat'

    write(FILE_NAME,CHUNK_SIZE)
    a = read(FILE_NAME)
    print("Data size = {:.3f} MB".format(a.nbytes*1e-6))
    print("Chunk size = {:.3f} MB".format(CHUNK_SIZE*8*1e-6))


Overwriting src/mmarrays.py


#### Task 1: Compare the number of bytes required for the array and the memory consumed by the program

In [47]:
%%bash
/usr/bin/time --format="Memory used by the program: %M Kb" python src/mmarrays.py 100000

Filename: /workspaces/hdf5-tutorial/python/src/mmarrays.py

Line #    Mem usage    Increment  Occurrences   Line Contents
    10     76.4 MiB     76.4 MiB           1   @profile
    11                                         def write(file_name,chunk_size):
    12                                         
    13                                         #open a file for data of a single column
    14     78.8 MiB      0.0 MiB           2       with open(file_name, 'wb') as f:
    15                                                 #for 1024 "csv files"
    16     78.8 MiB      0.0 MiB        1025           for _ in range(1024):
    17     78.8 MiB      1.4 MiB        1024               csv_data =  get_data(chunk_size)
    18     78.8 MiB      1.0 MiB        1024               f.write(csv_data.tobytes())


Filename: /workspaces/hdf5-tutorial/python/src/mmarrays.py

Line #    Mem usage    Increment  Occurrences   Line Contents
    20     76.8 MiB     76.8 MiB           1   @profile
    21   

Memory used by the program: 80652 Kb


In [48]:
!ls -lh build/mm*

-rw-r--r-- 1 vscode vscode 782M Jan  9 17:34 build/mmarr.dat
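
The file size reported above can be checked by hand. The sketch below assumes the values used in this run (1024 chunks of 100000 float64 values); the constant names are illustrative and not part of src/mmarrays.py. It shows why the script reports the size in MB (10^6 bytes) while ls -lh reports it in binary units (2^20 bytes), which appears to account for the 782M figure.

```python
# Illustrative constants matching the run above (not defined in src/mmarrays.py).
N_CHUNKS = 1024        # iterations of the write loop
CHUNK_SIZE = 100_000   # elements per chunk, from the command line
ITEMSIZE = 8           # bytes per np.float64 value

n_bytes = N_CHUNKS * CHUNK_SIZE * ITEMSIZE
size_mb = n_bytes * 1e-6       # decimal megabytes, as printed by the script
size_mib = n_bytes / 2**20     # binary mebibytes, the unit used by ls -lh
chunk_mb = CHUNK_SIZE * ITEMSIZE * 1e-6

print("Total bytes = {}".format(n_bytes))
print("Data size   = {:.3f} MB = {:.2f} MiB".format(size_mb, size_mib))
print("Chunk size  = {:.3f} MB".format(chunk_mb))
```

Only one chunk (0.8 MB here) is held in memory at a time while writing, which is why the profiler increments stay in the range of a few MiB.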


In [40]:
!rm build/mmarr.dat

### Summary:

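Peak memory stays close to the interpreter's baseline: about 80 MB for the whole program, against a 782M file on disk. The memory consumed by the program grows with the chunk size and does not depend on the total size of the array.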
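
The same idea applies when processing the data after reading it. The following is a minimal sketch, not part of the tutorial's script: chunked_mean is a hypothetical helper, and the demo uses a small temporary file rather than build/mmarr.dat. It reduces a memory-mapped array one slice at a time, so only the current slice needs to be paged in from disk.

```python
import os
import tempfile

import numpy as np


def chunked_mean(file_name, chunk_size, dtype=np.float64):
    """Mean of a flat binary file, reading one slice of the memmap at a time."""
    a = np.memmap(file_name, dtype=dtype, mode="r")  # read-only; nothing loaded yet
    total = 0.0
    for start in range(0, a.size, chunk_size):
        # Slicing the memmap touches only this part of the file.
        total += float(a[start:start + chunk_size].sum())
    return total / a.size


# Small demo on a temporary file (hypothetical path, chosen for illustration).
data = np.arange(10_000, dtype=np.float64)
path = os.path.join(tempfile.mkdtemp(), "demo.dat")
data.tofile(path)
print(chunked_mean(path, chunk_size=1_000))
```

Calling a.sum() directly on the memmap would also work; the explicit loop is used here to make the chunk-by-chunk access pattern visible, mirroring how write() produces the file.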
