# Basic Profiling

A simple way to estimate the environmental impact of our code is to measure its runtime and memory usage with functions from Python's standard `time` and `os` modules and the third-party `psutil` package.

We can easily measure the total runtime. 

Although Python manages memory automatically (through reference counting and a garbage collector), we can still get a very rough idea of the memory used by an operation by recording the process's resident memory before and after it.


In [None]:
import psutil, os, gc, time

# Function to get memory usage
def get_memory_usage():
    process = psutil.Process(os.getpid())
    mem_info = process.memory_info()
    return mem_info.rss / 1024 / 1024  # Memory in MB

# Deliberately naive recursion: the same values are recomputed many times
def fib(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib(n - 1) + fib(n - 2)

def fibs(n):
    return [fib(i) for i in range(n)]

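# Run a garbage collection pass so leftover garbage does not skew the baseline
gc.collect()

mem_before = get_memory_usage()
time_before = time.perf_counter()  # monotonic, high-resolution timer

print(fibs(35))

# Stop the clock first so the memory query is not included in the runtime
time_after = time.perf_counter()
mem_after = get_memory_usage()

print(f"execution time: {(time_after - time_before):,.4f}s")
print(f"memory diff: {(mem_after - mem_before):,.4f} MB")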


## Task 1: Memoization

[Memoization](https://en.wikipedia.org/wiki/Memoization) is a simple technique to avoid repeated calculations.

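A memoized version of the `fib()` function would massively reduce the required runtime: the naive recursion recomputes the same values again and again, so its number of calls grows exponentially with `n`, whereas a memoized version computes each value only once.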

Write one below and see how it changes the runtime and memory usage.
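As a starting point, one possible approach (a sketch, not the only valid solution) is to cache results with `functools.lru_cache` from the standard library. The names `fib_memo` and `fibs_memo` are hypothetical and chosen only for this illustration.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache: each fib_memo(n) is computed once
def fib_memo(n):
    """Return the n-th Fibonacci number, reusing cached results."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

def fibs_memo(n):
    """Return the first n Fibonacci numbers as a list."""
    return [fib_memo(i) for i in range(n)]

print(fibs_memo(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Note the trade-off: the cache keeps every computed value alive, so memoization buys runtime at the cost of some extra memory.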

## Task 2: Pathfinding

Here we will use a graph theory problem as an example of a task that can be accomplished in different ways.

A *[graph](https://en.wikipedia.org/wiki/Graph_(discrete_mathematics)#Undirected_graph)* is a set of *vertices* plus a set of *edges* (vertex pairs) that connect some of the vertices.
We will consider only undirected graphs.

The file `data/facebook_combined.txt` contains a graph of friends taken from Facebook (4,039 nodes and 88,234 edges). Each person is a vertex, labelled with a number.

(You can find more information about this dataset at https://snap.stanford.edu/data/ego-Facebook.html)

Find a shortest path (one with the fewest edges) from person 0 to person 4038.

You will probably find the package [NetworkX](https://networkx.org/documentation/stable/index.html) useful.


In [None]:
import networkx as nx

file_path = "data/facebook_combined.txt"
G = nx.read_edgelist(file_path, nodetype=int)

### (1) Using a brute force method

### (2) Using a more efficient algorithm
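
For an unweighted graph, breadth-first search is one well-known efficient option: it visits vertices in order of their distance from the start, so the first time it reaches the target it has found a path with the fewest edges. The sketch below runs it on a small hypothetical adjacency dictionary (`toy_graph`), not on the Facebook data, and `bfs_shortest_path` is an illustrative name.

```python
from collections import deque

def bfs_shortest_path(adjacency, start, goal):
    """Return one shortest path from start to goal as a list, or None."""
    parents = {start: None}  # doubles as the set of visited vertices
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if vertex == goal:
            # Walk back through the parent links to rebuild the path
            path = []
            while vertex is not None:
                path.append(vertex)
                vertex = parents[vertex]
            return path[::-1]
        for neighbour in adjacency[vertex]:
            if neighbour not in parents:
                parents[neighbour] = vertex
                queue.append(neighbour)
    return None  # goal is not reachable from start

# Hypothetical toy graph (undirected, so each edge is listed in both directions)
toy_graph = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2, 4],
    4: [3],
    5: [],
}

print(bfs_shortest_path(toy_graph, 0, 4))  # [0, 1, 3, 4]
```

NetworkX also provides `nx.shortest_path(G, source, target)`, which can serve as a reference result to check your own implementation against.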

### Compare the runtime and memory usage of your two versions.

## Task 3: More robust estimates

You will have noticed that there can be big variations in both the runtime and the memory usage.

Write a wrapper function `report_stats()` that accepts a function (the operation to be profiled) and an integer (the number of times to repeat the profiling). 

Your function should report the mean and standard deviation for both the runtime and memory usage.
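
The aggregation step could look something like the sketch below, which times a function several times and summarises the samples with the standard library's `statistics` module. It covers runtime only; `time_repeats` and `summarise` are hypothetical helper names, and memory readings collected on each repeat could be summarised in the same way.

```python
import statistics
import time

def time_repeats(func, repeats):
    """Call func() `repeats` times and return each call's runtime in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples

def summarise(samples):
    """Return (mean, sample standard deviation); needs at least two samples."""
    return statistics.mean(samples), statistics.stdev(samples)

# Hypothetical workload used only to exercise the helpers
runtimes = time_repeats(lambda: sum(range(100_000)), repeats=5)
mean, std = summarise(runtimes)
print(f"runtime: {mean:.6f}s +/- {std:.6f}s over {len(runtimes)} runs")
```

`statistics.stdev` computes the sample standard deviation, which is usually the appropriate choice when the repeats are a small sample of all possible runs.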

## Task 4: More detailed profiling


Profiling by function gives us a high-level view of how often each function is called and how long those calls take. One way to do this is with the `cProfile` module, which is part of the Python standard library and so needs no additional packages. Its `cProfile.run()` function takes a string containing the statement to execute, here the call to the function we want to profile. For example:

In [None]:
# Import the cProfile module
import cProfile

# Pass the call to profile as a string; sort the report by time spent inside each function itself
cProfile.run('fibs(35)', sort='tottime')

The output shows the total number of function calls and the total time spent running the code. Then, for each function, it shows:
* `ncalls`: the number of times the function was called. For a recursive function this appears as two numbers, `total/primitive`, where primitive calls are those not made through recursion.
* `tottime`: the total time spent in the function, excluding time spent in functions called by the function.
* `percall`: `tottime` divided by `ncalls`, i.e. the time per call excluding time spent in functions called by the function.
* `cumtime`: the total time spent in the function, including time spent in functions called by the function.
* `percall`: `cumtime` divided by the number of primitive calls, i.e. the time per call including time spent in functions called by the function.
* `filename:lineno(function)`: the filename, line number and function name.

In the output you will see a number of functions which you have neither written nor explicitly called. These are usually called internally by Python as it executes your code. They normally contribute little to the runtime and can often be ignored.
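
To keep the report focused on the functions that matter, one option is to collect the statistics with `cProfile.Profile` and print only the top few rows through the standard library's `pstats` module. The sketch below uses a small hypothetical workload (`slow_sum`) in place of real code.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Hypothetical workload: sum of squares computed in a Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

# Profile a single call and keep its return value
profiler = cProfile.Profile()
result = profiler.runcall(slow_sum, 200_000)

# Write the report to a string buffer, sorted by tottime, limited to 5 rows
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by `"cumtime"` instead of `"tottime"` highlights the functions whose whole call tree is expensive rather than those that are slow in their own body.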

Try using `cProfile.run()` to evaluate your pathfinding code.

---