# Memory Analysis

# References

https://medium.com/survata-engineering-blog/monitoring-memory-usage-of-a-running-python-program-49f027e3d1ba

Memory Allocation and Management in Python - simplified tutorial for beginners
https://www.youtube.com/watch?v=arxWaw-E8QQ


Python Tips & Tricks: How to Check Memory Usage
https://www.youtube.com/watch?v=_OlNOKg3PpY

simplefunde - lots of ComSci concepts
https://www.youtube.com/channel/UC0ob0iy_DRTyL00IsgSMzgw




# ps

```
$ ps -m -o %cpu,%mem,command
%CPU %MEM COMMAND
23.4  7.2 python analyze_data.py
 0.0  0.0 bash
```
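The same number can be pulled from inside a Python script by shelling out to `ps`. This is only a sketch: the helper name `rss_kib_from_ps` is made up for these notes, and it assumes a `ps` that accepts `-o rss=` and `-p` (procps on Linux and the macOS `ps` both do). It returns `None` when `ps` is missing or prints something unexpected.

```python
import os
import shutil
import subprocess


def rss_kib_from_ps(pid=None):
    """Return the resident set size (KiB) that ps reports for pid, or None."""
    if shutil.which("ps") is None:
        return None  # ps is not installed (common in slim containers)
    pid = os.getpid() if pid is None else pid
    # "rss=" asks for the RSS column with an empty header, so only the number is printed
    proc = subprocess.run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        capture_output=True,
        text=True,
        check=False,
    )
    out = proc.stdout.strip()
    return int(out) if out.isdigit() else None


print(rss_kib_from_ps())
```

Spawning a process per measurement is slow, so this is more suited to an occasional check than to tight sampling.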

**watch free -h** - system-wide memory usage, refreshed every two seconds; only a useful proxy if you are running a single app

# Profilers

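- memory-profiler: line-by-line memory usage for Python code (py-spy is a comparable sampling profiler, but for CPU time)
  https://pypi.org/project/memory-profiler/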



Running the analysis inside a Docker container obscures how much memory is being used inside the container.

So we need to monitor memory usage from inside the container.

Your first inclination might be to use the same operating system techniques, but inside the container. While this does technically work, the general advice is that a Docker container should run a single process, so running a second monitoring process inside the container isn’t a good option.

# tracemalloc

## example

```
import tracemalloc

tracemalloc.start()
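my_complex_analysis_method()  # placeholder for the code being measured
# get_traced_memory() returns (current, peak) traced sizes in bytes
current, peak = tracemalloc.get_traced_memory()
print(f"Current memory usage is {current / 10**6:.1f} MB; peak was {peak / 10**6:.1f} MB")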
tracemalloc.stop()
```

There’s a price to be paid for this level of detail, though. tracemalloc injects itself deep into the running Python process, which, as you might expect, comes with a performance cost. In our testing, we observed a 30% slowdown when using tracemalloc on an analysis run.
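The detail tracemalloc offers goes beyond the two totals above: a snapshot can attribute live allocations to source lines. A minimal sketch, where `build_rows` is a hypothetical workload standing in for the real analysis code:

```python
import tracemalloc


def build_rows(n):
    # Hypothetical workload standing in for the real analysis code.
    return [str(i) * 10 for i in range(n)]


tracemalloc.start()
rows = build_rows(50_000)
snapshot = tracemalloc.take_snapshot()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Group the allocations still alive at snapshot time by source line,
# largest first, and show the top three.
top_stats = snapshot.statistics("lineno")
for stat in top_stats[:3]:
    print(stat)
print(f"current={current / 10**6:.1f} MB, peak={peak / 10**6:.1f} MB")
```

Each printed entry names a file and line number with the total size and count of blocks allocated there, which is usually enough to find the biggest consumer.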

# Sampling

Luckily, the Python standard library provides another way to observe memory usage: the resource module (Unix only). It provides basic mechanisms for measuring and controlling the resources a program uses, including memory. `ru_maxrss` is the peak resident set size of the process so far:

```
import resource

usage = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
```
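One catch is the unit of `ru_maxrss`: Linux reports kilobytes, while macOS reports bytes. A small helper can normalize that. The name `peak_rss_mb` is made up for these notes, and the sketch assumes any non-macOS platform behaves like Linux:

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process so far, in megabytes."""
    raw = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # macOS reports bytes; Linux reports kilobytes (assumed for everything else).
    bytes_used = raw if sys.platform == "darwin" else raw * 1024
    return bytes_used / 10**6


print(f"Peak RSS so far: {peak_rss_mb():.1f} MB")
```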

```
import resource

from time import sleep

class MemoryMonitor:
    def __init__(self):
        self.keep_measuring = True

    def measure_usage(self):
        max_usage = 0
        while self.keep_measuring:
            max_usage = max(
                max_usage,
                resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            )
            sleep(0.1)

        return max_usage
```
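But what tells the loop to exit? And where do we call the code being monitored? Both run in separate threads: the monitor loops in one thread while the monitored function runs in another, and the caller sets `keep_measuring` to `False` once that function has finished.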

```
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor() as executor:
    monitor = MemoryMonitor()
    mem_thread = executor.submit(monitor.measure_usage)
    try:
        fn_thread = executor.submit(my_analysis_function)
        result = fn_thread.result()
    finally:
        monitor.keep_measuring = False
        max_usage = mem_thread.result()
        
    # ru_maxrss is in kilobytes on Linux and in bytes on macOS
    print(f"Peak memory usage: {max_usage}")
```
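For reference, here is a self-contained variant of the same idea that can be pasted into a script. It is a sketch, not the article's code: it swaps the boolean flag for a `threading.Event`, and the names `run_with_peak_memory` and `allocate` are made up for these notes.

```python
import resource
import threading
from concurrent.futures import ThreadPoolExecutor


def run_with_peak_memory(fn, *args, interval=0.1, **kwargs):
    """Run fn and return (result, highest ru_maxrss sampled while it ran)."""
    stop = threading.Event()

    def sample():
        peak = 0
        while True:
            peak = max(peak, resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
            # wait() doubles as the sleep and returns True once stop is set
            if stop.wait(interval):
                return peak

    with ThreadPoolExecutor(max_workers=2) as executor:
        sampler = executor.submit(sample)
        try:
            result = executor.submit(fn, *args, **kwargs).result()
        finally:
            # Always stop the sampler, even if fn raised.
            stop.set()
            peak = sampler.result()
    return result, peak


def allocate(n):
    # Hypothetical workload: build a large list and report its length.
    return len([0] * n)


result, peak = run_with_peak_memory(allocate, 1_000_000)
print(f"result={result}, peak ru_maxrss={peak}")
```

Using an `Event` means the sampler wakes up as soon as the work finishes instead of sleeping out its last interval. Keep in mind that `ru_maxrss` is a high-water mark for the whole process, so the reported value also covers memory used before `fn` started.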

# Conclusion

It’s impossible to improve something you aren’t measuring. Armed with more information about the memory usage of our analysis tasks, we’re now in a much better position to optimize our resource usage. And, we’ve been able to collect that information with relatively little code and relatively little performance overhead.

# More References

https://medium.com/the-andela-way/machine-monitoring-tool-using-python-from-scratch-8d10411782fd