# Tutorial -- Accelerated Computing (Free Lunch)

This section introduces tools that provide acceleration **without the need to interact with GPU resources**.

## Preparation check

### Check for GPU availability

In [1]:
!nvidia-smi

Thu Mar 21 17:27:03 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.112                Driver Version: 537.42       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA GeForce RTX 2070 ...    On  | 00000000:01:00.0  On |                  N/A |
| 46%   42C    P0              48W / 319W |    883MiB /  8192MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [15]:
import importlib.util

# Import names that differ from their pip package names
PIP_NAMES = {"sklearn": "scikit-learn"}

def check_module_availability(names: list[str]):
    """Report which modules can be imported and suggest a pip command for the rest."""
    remaining = []
    for name in names:
        if importlib.util.find_spec(name) is not None:
            print(f"{name} already installed")
        else:
            print(f"{name} not installed")
            remaining.append(name)

    if remaining:  # not empty, i.e. not all libraries are installed
        print("Install the required modules for the tutorial with the following command:")
        print(f"pip install {' '.join(PIP_NAMES.get(name, name) for name in remaining)}")

names = ["math", "numba", "sklearn", "matplotlib"]

check_module_availability(names)

math already installed
numba already installed
sklearn already installed
matplotlib already installed


### Check for CUDA availability

In [2]:
import numba.cuda

numba.cuda.is_available()

True

## Accelerating functions with numba decorators

### numba @jit compiler 

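We first consider a function that estimates the area under a curve, here a quarter circle of radius 1, by sampling random points in the unit square and counting the fraction that falls under the curve:
(For other curves, if the maximum/minimum does not occur at an end of the range, the extreme value has to be provided manually.)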

In [36]:
import random
import math

def estimate_area_under_quarter_circle(num_samples):
    # Implicitly: area of square = 1
    under_curve_points = 0
    total_points = 0

    for _ in range(num_samples):
        x = random.uniform(0, 1)
        y = random.uniform(0, 1)

        if math.sqrt(x**2 + y**2) < 1:
            under_curve_points += 1
        total_points += 1

    area_estimate = 1 * (under_curve_points / total_points)
    return area_estimate

Calling the function:

In [39]:
n_samples = 1000000

estimated_area = estimate_area_under_quarter_circle(n_samples)

print(f"Estimated area under the curve: {estimated_area}") # should return 0.7853...

Estimated area under the curve: 0.78586
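
The remark about extrema can be made concrete with a more general version of the estimator. The following is a minimal sketch, not part of the original tutorial: the function name `estimate_area_under_curve`, its parameters, and the example curve are hypothetical and chosen only for illustration. It assumes a non-negative curve and an upper bound `y_max` supplied by the caller, because the maximum of the curve need not lie at an end of the range.

```python
import random

def estimate_area_under_curve(f, x_min, x_max, y_max, num_samples, seed=None):
    """Hit-or-miss Monte Carlo estimate of the area under a non-negative f on [x_min, x_max].

    y_max must be an upper bound of f on the interval; the caller provides it
    because the maximum does not have to occur at an end of the range.
    """
    rng = random.Random(seed)
    box_area = (x_max - x_min) * y_max  # area of the bounding box
    hits = 0
    for _ in range(num_samples):
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(0, y_max)
        if y < f(x):  # the sampled point lies under the curve
            hits += 1
    return box_area * hits / num_samples

# Hypothetical example: f(x) = x * (1 - x) peaks at x = 0.5 (value 0.25), not at an endpoint.
area = estimate_area_under_curve(lambda x: x * (1 - x), 0.0, 1.0, 0.25, 200_000, seed=0)
print(f"Estimated area: {area:.4f}")  # the exact value is 1/6
```

For the quarter circle the bounding box is the unit square, which is why the original function can multiply the hit ratio by 1.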


Now, the function can be accelerated simply by adding the `@jit` decorator:

In [40]:
import random
import math
from numba import jit

@jit(nopython = True)
def estimate_area_under_quarter_circle_jit(num_samples):
    # Implicitly: area of square = 1
    under_curve_points = 0
    total_points = 0

    for _ in range(num_samples):
        x = random.uniform(0, 1)
        y = random.uniform(0, 1)

        if math.sqrt(x**2 + y**2) < 1:
            under_curve_points += 1
        total_points += 1

    area_estimate = 1 * (under_curve_points / total_points)
    return area_estimate



In [45]:
estimated_area =  estimate_area_under_quarter_circle_jit(n_samples)

print(f"Estimated area under the curve: {estimated_area}") # should return 0.7853...

Estimated area under the curve: 0.784984


Time Comparison:

In [52]:
# Uncompiled function
%timeit -r 5 -n 10 estimate_area_under_quarter_circle(n_samples)

374 ms ± 3.23 ms per loop (mean ± std. dev. of 5 runs, 10 loops each)


In [53]:
# jit compiled function
%timeit -r 5 -n 10 estimate_area_under_quarter_circle_jit(n_samples)

8.1 ms ± 66.8 µs per loop (mean ± std. dev. of 5 runs, 10 loops each)
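
With the timings above (374 ms vs. 8.1 ms), the compiled version is roughly 46 times faster on this machine. One caveat is that Numba compiles a function lazily, the first time it is called with a given set of argument types, so the first call also pays for compilation. The jitted function appears to have been called once before the `%timeit` cell, so that cost is likely not part of the 8.1 ms. The sketch below shows one way to observe the difference; the names `quarter_circle_area` and `timed_call` are hypothetical, and the `ImportError` fallback is an assumption added only so that the snippet still runs (without any speed-up) when Numba is missing.

```python
import math
import random
import time

try:
    from numba import jit
except ImportError:
    # Assumed fallback: a do-nothing decorator, so the sketch runs without Numba.
    def jit(*args, **kwargs):
        def decorator(func):
            return func
        return decorator

@jit(nopython=True)
def quarter_circle_area(num_samples):
    hits = 0
    for _ in range(num_samples):
        x = random.uniform(0, 1)
        y = random.uniform(0, 1)
        if math.sqrt(x**2 + y**2) < 1:
            hits += 1
    return hits / num_samples

def timed_call(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start

# With Numba installed, the first call includes compilation; the second reuses the compiled code.
first_result, first_time = timed_call(quarter_circle_area, 100_000)
second_result, second_time = timed_call(quarter_circle_area, 100_000)
print(f"first call:  {first_time * 1e3:.1f} ms")
print(f"second call: {second_time * 1e3:.1f} ms")
```

When benchmarking, it is therefore reasonable to call the function once as a warm-up before measuring.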


Notes:
- The `@jit` decorator compiles Python code, but it only supports a [subset](https://numba.pydata.org/numba-doc/dev/reference/pysupported.html) of Python features.
- As of writing (version 0.57.1), `@jit` does not guarantee complete compilation by default: it may fall back to a mode in which the Python interpreter is still involved. Pass the argument `nopython=True`, i.e. `@jit(nopython=True)`, which is equivalent to `@njit`, so that the Python interpreter is not involved (complete compilation).
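
The equivalence between the two spellings can be checked with a small sketch. The function names below are hypothetical, and the `ImportError` fallback is again an assumption so that the snippet runs (as plain Python) even without Numba installed.

```python
try:
    from numba import jit, njit
except ImportError:
    # Assumed fallback: do-nothing decorators, so the sketch runs without Numba.
    def jit(*args, **kwargs):
        return lambda func: func

    def njit(func):
        return func

@jit(nopython=True)
def sum_of_squares_jit(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

@njit  # shorthand for @jit(nopython=True)
def sum_of_squares_njit(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

print(sum_of_squares_jit(10), sum_of_squares_njit(10))  # both print 285
```

Using `@njit` (or `nopython=True`) has the practical advantage that code outside the supported subset raises an error at compile time instead of silently running slowly.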