# Memory Usage Backend Comparison

This experiment evaluates how different measurement backends report the memory usage of the envelope seismic attribute operator.
In this notebook you will find:
- The problem statement
- The data collection for the experiment
- The evaluation of the experiment results

## Problem statement

To evaluate the memory usage of the envelope seismic attribute operator, we compare the measurements reported by each of the available memory-profiling backends.
The available backends are:
- [resource](https://docs.python.org/3/library/resource.html)
- [psutil](https://psutil.readthedocs.io/en/latest/)
- [tracemalloc](https://docs.python.org/3/library/tracemalloc.html)
- [Direct kernel /proc file system](https://man7.org/linux/man-pages/man5/proc.5.html)
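
When comparing these backends, it helps to keep in mind what each one reports. The following is a minimal sketch using only the standard library; it is not how traceq collects its data, and the helper names `peak_rss_bytes` and `traced_peak_bytes` are hypothetical. It contrasts the peak resident set size reported by `resource` with the peak allocation seen by `tracemalloc`:

```python
import resource
import sys
import tracemalloc


def peak_rss_bytes() -> int:
    """Peak resident set size of this process, normalized to bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


def traced_peak_bytes(func, *args):
    """Run func and return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# Allocate roughly 10 MB through Python's allocator and compare both views.
_, python_peak = traced_peak_bytes(lambda n: bytearray(n), 10_000_000)
print(f"tracemalloc peak: {python_peak / 1e6:.1f} MB")
print(f"resource peak RSS: {peak_rss_bytes() / 1e6:.1f} MB")
```

`resource` reports a process-wide high-water mark, while `tracemalloc` only sees allocations made through Python's memory allocators after tracing starts, so the two numbers are not expected to match exactly.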

After running the experiment, we expect the backends to report consistent, similar memory usage for the same workload.

## Data Collection

The first step in data collection is generating the synthetic data.
For testing purposes, we use the same data for all backends.

### Setup Environment

In [1]:
import sys
import os

seismic_path = os.path.abspath('../tools/seismic')
traceq_path = os.path.abspath('../tools/traceq')

if seismic_path not in sys.path:
    sys.path.append(seismic_path)

if traceq_path not in sys.path:
    sys.path.append(traceq_path)

print(sys.path)

['/Users/delucca/Workspaces/src/unicamp/msc/seismic-attributes-memory-profile', '/Users/delucca/.pyenv/versions/3.8.10/lib/python38.zip', '/Users/delucca/.pyenv/versions/3.8.10/lib/python3.8', '/Users/delucca/.pyenv/versions/3.8.10/lib/python3.8/lib-dynload', '', '/Users/delucca/.pyenv/versions/seismic-attributes-memory-profile/lib/python3.8/site-packages', '/Users/delucca/Workspaces/src/unicamp/msc/seismic-attributes-memory-profile/tools/seismic', '/Users/delucca/Workspaces/src/unicamp/msc/seismic-attributes-memory-profile/tools/traceq']


Now, let's set up some relevant global variables:

In [18]:
from pprint import pprint

NUM_INLINES = 600
NUM_XLINES = 600
NUM_SAMPLES = 600

LOG_TRANSPORTS = ['CONSOLE','FILE']
LOG_LEVEL = 'DEBUG'

print('Experiment config:')
pprint({
    'NUM_INLINES': NUM_INLINES,
    'NUM_XLINES': NUM_XLINES,
    'NUM_SAMPLES': NUM_SAMPLES,
    'LOG_TRANSPORTS': LOG_TRANSPORTS,
    'LOG_LEVEL': LOG_LEVEL,
}, indent=2, sort_dicts=True)

Experiment config:
{ 'LOG_LEVEL': 'DEBUG',
  'LOG_TRANSPORTS': ['CONSOLE', 'FILE'],
  'NUM_INLINES': 600,
  'NUM_SAMPLES': 600,
  'NUM_XLINES': 600}
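
As a rough reference point for the measurements, the size of the raw volume can be estimated from these dimensions. This sketch assumes each sample is stored as a 32-bit float (`BYTES_PER_SAMPLE` is an assumption, not something the notebook states) and ignores SEG-Y headers and any intermediate arrays the operator creates:

```python
NUM_INLINES = 600
NUM_XLINES = 600
NUM_SAMPLES = 600

# Assumption: 4 bytes per sample (32-bit float); adjust if the data uses another type.
BYTES_PER_SAMPLE = 4

num_values = NUM_INLINES * NUM_XLINES * NUM_SAMPLES
volume_bytes = num_values * BYTES_PER_SAMPLE

print(f"{num_values:,} samples -> {volume_bytes / 2**20:.0f} MiB")
# prints "216,000,000 samples -> 824 MiB"
```

Under that assumption, a peak memory reading well below this figure would suggest the backend is not capturing the full working set.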


### Setup Dependencies


In [3]:
%pip install -r ../tools/seismic/requirements.txt
%pip install -r ../tools/traceq/requirements.txt

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


### Setup Output Directory

In [4]:
import uuid
import os

from datetime import datetime

EXPERIMENT_ID = f'001-{datetime.now().strftime("%Y%m%d%H%M%S")}-{uuid.uuid4().hex[:6]}'
OUTPUT_DIR = f'../output/{EXPERIMENT_ID}'

os.makedirs(OUTPUT_DIR)

OUTPUT_DIR

'../output/001-20240909232752-c7ebee'

### Generate Synthetic Data

In [15]:
from seismic.data.synthetic import generate_and_save_synthetic_data

DATA_OUTPUT_DIR = f'{OUTPUT_DIR}/data'

synthetic_data_path = generate_and_save_synthetic_data(
    NUM_INLINES,
    NUM_XLINES,
    NUM_SAMPLES,
    output_dir=DATA_OUTPUT_DIR,
)
print(synthetic_data_path)

../output/001-20240909232752-c7ebee/data/600-600-600.segy


### Collecting Data for Resource Backend

In [None]:
import traceq

from seismic.attributes import envelope

traceq.load_config(
    {
        "output_dir": OUTPUT_DIR,
        "logger": {
            "enabled_transports": LOG_TRANSPORTS,
            "level": LOG_LEVEL,
        },
        "profiler": {
            "session_id": 'resource',
            "memory_usage": {
                "enabled_backends": ['resource'],
            },
        },
    }
)

traceq.profile(envelope.run, synthetic_data_path)

### Collecting Data for Psutil Backend

In [None]:
import traceq

from seismic.attributes import envelope

traceq.load_config(
    {
        "output_dir": OUTPUT_DIR,
        "logger": {
            "enabled_transports": LOG_TRANSPORTS,
            "level": LOG_LEVEL,
        },
        "profiler": {
            "session_id": 'psutil',
            "memory_usage": {
                "enabled_backends": ['psutil'],
            },
        },
    }
)

traceq.profile(envelope.run, synthetic_data_path)

### Collecting Data for Tracemalloc Backend

In [None]:
import traceq

from seismic.attributes import envelope

traceq.load_config(
    {
        "output_dir": OUTPUT_DIR,
        "logger": {
            "enabled_transports": LOG_TRANSPORTS,
            "level": LOG_LEVEL,
        },
        "profiler": {
            "session_id": 'tracemalloc',
            "memory_usage": {
                "enabled_backends": ['tracemalloc'],
            },
        },
    }
)

traceq.profile(envelope.run, synthetic_data_path)

### Collecting Data for Kernel Backend

In [None]:
import traceq

from seismic.attributes import envelope

traceq.load_config(
    {
        "output_dir": OUTPUT_DIR,
        "logger": {
            "enabled_transports": LOG_TRANSPORTS,
            "level": LOG_LEVEL,
        },
        "profiler": {
            "session_id": 'kernel',
            "memory_usage": {
                "enabled_backends": ['kernel'],
            },
        },
    }
)

traceq.profile(envelope.run, synthetic_data_path)
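
The four collection cells above differ only in the backend name. As a sketch of how that repetition could be factored out, the hypothetical helper `build_profiler_config` below builds the same dictionary structure used in those cells. It does not call traceq itself, and the example values passed to it are placeholders:

```python
def build_profiler_config(backend, output_dir, log_transports, log_level):
    """Build a config dict with the same shape as the cells above.

    Hypothetical helper: the backend name doubles as the session ID,
    mirroring how the notebook names each session.
    """
    return {
        "output_dir": output_dir,
        "logger": {
            "enabled_transports": log_transports,
            "level": log_level,
        },
        "profiler": {
            "session_id": backend,
            "memory_usage": {
                "enabled_backends": [backend],
            },
        },
    }


BACKENDS = ["resource", "psutil", "tracemalloc", "kernel"]

# Placeholder values; in the notebook these would be OUTPUT_DIR,
# LOG_TRANSPORTS, and LOG_LEVEL.
configs = {
    backend: build_profiler_config(
        backend, "../output/example", ["CONSOLE", "FILE"], "DEBUG"
    )
    for backend in BACKENDS
}

print(configs["psutil"]["profiler"])
```

Each resulting dictionary could then be passed to `traceq.load_config` before calling `traceq.profile(envelope.run, synthetic_data_path)`, as the individual cells do.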