# Software Analysis - 03 - Heap Memory

Dynamically allocating memory is an everyday task for software developers. Although modern hardware platforms usually offer a large amount of RAM, it remains a finite resource. Also, a single application might not be allowed to use all available memory. Furthermore, embedded Linux systems often still have relatively little RAM. Therefore, reducing heap memory usage is an important task in software engineering.

In the following, the analysis and optimization of an application's heap memory usage is explained.

Shell command calls are handled by the Python module ``subprocess`` (https://docs.python.org/3/library/subprocess.html).

Shell output can be filtered with regular expressions using the Python module ``re`` (https://docs.python.org/3/library/re.html).

Random test data is generated by the Python module ``random`` (https://docs.python.org/3/library/random.html).

Measurement processes can be visualized by a progress bar from the Python module ``tqdm``.

In [None]:
import subprocess
import re
import random
from tqdm import tqdm

## Measuring Heap Memory Usage with ``valgrind``

The Linux tool collection ``valgrind`` (https://valgrind.org) is a typical choice for measuring heap memory usage.

It contains the Dynamic Heap Analysis Tool (DHAT), which enables simple heap memory measurements for a Linux binary by calling it with ``valgrind --tool=dhat`` (https://valgrind.org/docs/manual/dh-manual.html). Note that the execution of the binary under test might take significantly longer because ``valgrind`` intercepts many events at runtime.

In the following, the (useless) operation ``dd if=/dev/urandom of=/dev/null bs=16M count=1``, which copies 16 MiB of random data to ``/dev/null``, is analyzed for its heap usage.

In [None]:
valgrind = subprocess.run(['valgrind', '--tool=dhat', '--dhat-out-file=dhat-output.tmp',
                           'dd', 'if=/dev/urandom', 'of=/dev/null', 'bs=16M', 'count=1'],
                          cwd='./', capture_output=True, text=True)
print(valgrind.stderr)
valgrind = subprocess.run(['rm', 'dhat-output.tmp'], cwd='./',)

Let's repeat the measurement several times and filter the output to see whether heap usage varies from one run to another.

In [None]:
n_measurements = 3
for i in range(n_measurements):
    valgrind = subprocess.run(['valgrind', '--tool=dhat', '--dhat-out-file=dhat-output.tmp',
                               'dd', 'if=/dev/urandom', 'of=/dev/null', 'bs=16M', 'count=1'],
                              cwd='./', capture_output=True, text=True)
    # raw string so that \d and \s are passed to the regex engine unchanged
    print('TRY', i, '(allocated heap bytes at global maximum):', re.search(r'At.t-gmax:.(\d+|\d{1,3}(,\d{3})*)\s', valgrind.stderr).group(1))
    valgrind = subprocess.run(['rm', 'dhat-output.tmp'], cwd='./',)
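
The filtering step can also be wrapped in a small helper. The following is a minimal sketch, assuming that DHAT's summary contains a line of the form ``At t-gmax: <number> bytes``, as the regular expression above expects. The names ``GMAX_PATTERN`` and ``parse_gmax_bytes`` as well as the sample line are hypothetical and only serve as illustration.

```python
import re

# Matches the byte count after "At t-gmax:", with or without thousands separators.
GMAX_PATTERN = re.compile(r'At t-gmax:\s+(\d{1,3}(?:,\d{3})+|\d+)\s')

def parse_gmax_bytes(dhat_stderr):
    """Return the heap bytes at the global maximum as an int, or None if absent."""
    match = GMAX_PATTERN.search(dhat_stderr)
    if match is None:
        return None
    # remove thousands separators before converting to an integer
    return int(match.group(1).replace(',', ''))

# Hypothetical sample line for illustration only, not a real measurement.
sample = '==1== At t-gmax: 16,778,240 bytes in 2 blocks\n'
print(parse_gmax_bytes(sample))
```

Returning ``None`` instead of calling ``.group(1)`` directly avoids an ``AttributeError`` when ``valgrind`` fails and prints no summary.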

## Analysis of Prime Checking Applications in Rust

Checking numbers for primality is a common programming exercise that leaves a lot of room for optimization.

This exercise uses different versions of a command line tool for prime checking.

In [None]:
prime_check_tools = [
    'prime_check_lookup_table',
    'prime_check_dyngen_table',
    'prime_check_no_table',
]

Let's compile all versions.

In [None]:
print('### Compiling Prime Check Tools ###')
for tool in prime_check_tools:
    cargo = subprocess.run(['cargo', 'build', '--release'], cwd='./' + tool + '/')
    print('Compiling', '{:<24}'.format(tool), '--> Return Code:', cargo.returncode)

### Heap Memory Analysis with 1 Request

In the following, all implementations are tested for their heap memory usage when issuing one single prime checking request.

In [None]:
numbers_to_check = ['9', '999983']
heap_gmax_data = [[] for tool in prime_check_tools]

for t in range(len(prime_check_tools)):
    tool = prime_check_tools[t]
    for n in tqdm(range(len(numbers_to_check))):
        valgrind = subprocess.run(['valgrind', '--tool=dhat', '--dhat-out-file=dhat-output.tmp', './' + tool, numbers_to_check[n]],
                                  cwd='./' + tool + '/target/release/', capture_output=True, text=True)
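        # extract the byte count and strip its thousands separators
        heap_gmax_str = re.sub(',', '', re.search(r'At.t-gmax:.(\d+|\d{1,3}(,\d{3})*)\s', valgrind.stderr).group(1))
        heap_gmax_data[t].append(int(heap_gmax_str))
        # DHAT wrote its output file into the working directory of the valgrind call
        subprocess.run(['rm', 'dhat-output.tmp'], cwd='./' + tool + '/target/release/')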
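
Since the same ``valgrind`` invocation is repeated in several cells, it could be factored into a helper. The following is only a sketch under stated assumptions: the function names ``build_dhat_command`` and ``run_under_dhat`` are hypothetical, and writing DHAT's output file into a temporary directory is a swapped-in alternative to the explicit ``rm`` call used in this notebook.

```python
import os
import subprocess
import tempfile

def build_dhat_command(target_cmd, out_file):
    """Prefix target_cmd with the valgrind/DHAT invocation used in this notebook."""
    return ['valgrind', '--tool=dhat', '--dhat-out-file=' + out_file] + list(target_cmd)

def run_under_dhat(target_cmd, cwd='.'):
    """Run target_cmd under DHAT and return the text valgrind wrote to stderr."""
    # The temporary directory, including DHAT's output file, is removed on exit.
    with tempfile.TemporaryDirectory() as tmp_dir:
        out_file = os.path.join(tmp_dir, 'dhat-output.tmp')
        result = subprocess.run(build_dhat_command(target_cmd, out_file),
                                cwd=cwd, capture_output=True, text=True)
    return result.stderr
```

For example, ``run_under_dhat(['./prime_check_no_table', '9'], cwd='./prime_check_no_table/target/release/')`` would return the text that the cell above searches with ``re.search``.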

Let's print the relevant result data.

In [None]:
print('### ALLOCATED HEAP BYTES AT GLOBAL MAXIMUM ###')
print('')

for t in range(len(prime_check_tools)):
    tool = prime_check_tools[t]
    
    # print data
    print('###', tool, '###')
    for n in range(len(numbers_to_check)):
        print('Test number', '{:>6}'.format(numbers_to_check[n]), ':',
              '{:>9}'.format(heap_gmax_data[t][n]), 'bytes')
    print('')

### Heap Memory Analysis with a High Number of Requests

It is interesting to analyze whether the provided tools allocate more heap memory when given a large number of values to check for primality.

For such testing purposes, the module ``random`` can be useful to generate random test data.

Since the provided tools take numbers as command line arguments, one has to respect the maximum length of the argument list. The exact limit is system dependent (it can be queried with ``getconf ARG_MAX``), but 128 x 1024 = 131072 characters is a conservative value that is accepted on many Linux systems. Assuming 6 characters per number plus one space, one can therefore check about 131072 / 7 ≈ 18724 numbers with a single call.

Therefore, in the following, all implementations are tested for their heap memory usage when confronted with a near-maximum number of requests (18700).

The 18700 numbers to check are randomly chosen from the range between 0 and 999999.
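
The estimate above can be reproduced with a few lines. This is a minimal sketch that assumes the 131072-character limit from the text and ignores the length of the program name and of the environment. The names ``ARG_LIMIT``, ``max_numbers_per_call`` and ``fits_in_limit`` are hypothetical.

```python
import random

ARG_LIMIT = 128 * 1024      # character limit assumed in the text (131072)
CHARS_PER_NUMBER = 6 + 1    # six digits plus one separating space

def max_numbers_per_call(limit=ARG_LIMIT, chars_per_number=CHARS_PER_NUMBER):
    """Upper bound on how many numbers fit into one call."""
    return limit // chars_per_number

def fits_in_limit(numbers, limit=ARG_LIMIT):
    """Check whether the space-separated numbers stay within the limit."""
    return len(' '.join(str(x) for x in numbers)) <= limit

rng = random.Random(0)  # fixed seed, only to make this sketch reproducible
sample = rng.sample(range(1000000), 18700)
print(max_numbers_per_call())   # 18724
print(fits_in_limit(sample))    # True
```

Because no number in the range has more than six digits, 18700 numbers need at most 18700 x 6 + 18699 = 130899 characters, which stays below the assumed limit regardless of the random draw.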

In [None]:
n_numbers = 18700
numbers_to_check = random.sample(range(1000000), n_numbers)
heap_gmax_data = []

for t in tqdm(range(len(prime_check_tools))):
    tool = prime_check_tools[t]
    valgrind = subprocess.run(['valgrind', '--tool=dhat', '--dhat-out-file=dhat-output.tmp',
                               './' + tool] + [str(x) for x in numbers_to_check],
                              cwd='./' + tool + '/target/release/', capture_output=True, text=True)
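    # extract the byte count and strip its thousands separators
    heap_gmax_str = re.sub(',', '', re.search(r'At.t-gmax:.(\d+|\d{1,3}(,\d{3})*)\s', valgrind.stderr).group(1))
    heap_gmax_data.append(int(heap_gmax_str))
    # DHAT wrote its output file into the working directory of the valgrind call
    subprocess.run(['rm', 'dhat-output.tmp'], cwd='./' + tool + '/target/release/')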

The following code prints the relevant numbers.

In [None]:
print('### ALLOCATED HEAP BYTES AT GLOBAL MAXIMUM ###')
print('')

for t in range(len(prime_check_tools)):
    tool = prime_check_tools[t]
    
    # print data
    print('###', tool, '###')
    print('{:>12}'.format(heap_gmax_data[t]), 'bytes')
    print('')
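
Large byte counts are easier to compare when printed with binary unit prefixes. The following sketch shows one possible formatting helper; the name ``format_bytes`` is hypothetical and not part of the original notebook.

```python
def format_bytes(n_bytes):
    """Format a byte count with binary units (B, KiB, MiB, GiB)."""
    units = ['B', 'KiB', 'MiB', 'GiB']
    value = float(n_bytes)
    for unit in units:
        # stop at the first unit that keeps the value below 1024,
        # or at the largest unit available
        if value < 1024 or unit == units[-1]:
            return '{:.1f} {}'.format(value, unit)
        value /= 1024

print(format_bytes(16 * 1024 * 1024))
```

It could be applied to each entry of ``heap_gmax_data`` in the loop above, for example ``format_bytes(heap_gmax_data[t])``.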