# Monte-Carlo Estimate of $\pi$

We want to estimate $\pi$ using a [Monte-Carlo method](https://en.wikipedia.org/wiki/Pi#Monte_Carlo_methods). The area of a quarter circle of unit radius is $\pi/4$, so a point chosen uniformly at random in the unit square lies inside the quarter circle centered at a corner of the square with probability $\pi/4$. For $N$ randomly chosen pairs $(x, y)$ with $x\in[0, 1)$ and $y\in[0, 1)$, we count the number $N_{circ}$ of pairs that also satisfy $x^2 + y^2 < 1$ and estimate $\pi \approx 4 \cdot N_{circ} / N$.
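
Before moving to a cluster, the estimator can be tried on a single machine. The following is a minimal NumPy-only sketch of the same idea; the function name `calc_pi_mc_numpy` and the fixed seed are assumptions made for illustration and are not used elsewhere in this notebook.

```python
import numpy as np


def calc_pi_mc_numpy(num_pairs, seed=0):
    """Estimate pi from num_pairs uniform random points in the unit square."""
    rng = np.random.default_rng(seed)

    # One row per (x, y) pair, both coordinates drawn from [0, 1).
    xy = rng.uniform(0, 1, size=(num_pairs, 2))

    # A point lies in the quarter circle if x^2 + y^2 < 1.
    in_circle = (xy ** 2).sum(axis=-1) < 1

    # The fraction of hits approximates pi / 4.
    return 4 * in_circle.mean()


pi_estimate = calc_pi_mc_numpy(1_000_000)
print(f"MC pi: {pi_estimate:.5f}, error: {abs(pi_estimate - np.pi):.2e}")
```

The Dask version further down has the same structure, with the array split into chunks that are processed in parallel.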

[<img src="https://upload.wikimedia.org/wikipedia/commons/8/84/Pi_30K.gif" 
     width="50%" 
     align=top
     alt="PI monte-carlo estimate">](https://en.wikipedia.org/wiki/Pi#Monte_Carlo_methods)

## Core Lessons

- Adaptive clusters
- Tuning the adaptivity

## Set up a Slurm cluster

In [None]:
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

In [None]:
import os

In [None]:
cluster = SLURMCluster(
    cores=24,
    processes=2,
    memory="100GB",
    shebang='#!/usr/bin/env bash',
    queue="batch",
    walltime="00:30:00",
    local_directory='/tmp',
    death_timeout="15s",
    interface="ib0",
    log_directory=f'{os.environ["SCRATCH_cecam"]}/{os.environ["USER"]}/dask_jobqueue_logs/',
    project="ecam")

In [None]:
client = Client(cluster)
client

## The job scripts

In [None]:
print(cluster.job_script())

## Scale the cluster to two nodes

In [None]:
cluster.scale(4)

## The Monte Carlo Method

In [None]:
import dask.array as da
import numpy as np

In [None]:
def calc_pi_mc(size_in_bytes, chunksize_in_bytes=200e6):
    """Calculate PI using a Monte Carlo estimate."""
    
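    # Each double-precision number takes 8 bytes.
    size = int(size_in_bytes / 8)
    chunksize = int(chunksize_in_bytes / 8)
    
    # One row per (x, y) pair; integer division keeps the shape integral.
    xy = da.random.uniform(0, 1,
                           size=(size // 2, 2),
                           chunks=(chunksize // 2, 2))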
    
    in_circle = ((xy ** 2).sum(axis=-1) < 1)
    pi = 4 * in_circle.mean()

    return pi
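
The ratio of `size_in_bytes` to `chunksize_in_bytes` determines how many chunks of random numbers the workers have to generate. The sketch below reproduces that arithmetic in plain Python; the helper `num_chunks` is hypothetical and only mirrors the byte-to-row conversion used in `calc_pi_mc`.

```python
import math


def num_chunks(size_in_bytes, chunksize_in_bytes=200e6):
    """Number of row-chunks of the (N, 2) array of doubles."""
    rows = int(size_in_bytes / 8) // 2                 # (x, y) pairs in total
    rows_per_chunk = int(chunksize_in_bytes / 8) // 2  # pairs per chunk
    return math.ceil(rows / rows_per_chunk)


for size in (1e9, 10e9, 100e9):
    print(f"{size / 1e9:6.0f} GB -> {num_chunks(size)} chunks")
```

With the default chunk size of 200 MB, each tenfold increase of the workload also increases the number of chunks tenfold.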

In [None]:
def print_pi_stats(size, pi, time_delta, num_workers):
    """Print pi, calculate offset from true value, and print some stats."""
    print(f"{size / 1e9} GB\n"
          f"\tMC pi: {pi : 13.11f}"
          f"\tErr: {abs(pi - np.pi) : 10.3e}\n"
          f"\tWorkers: {num_workers}"
          f"\t\tTime: {time_delta : 7.3f}s")

## The actual calculations

We loop over different volumes of double-precision random numbers and estimate $\pi$ as described above.
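
To judge the errors reported in the loop, it helps to know what accuracy to expect. Each point is a Bernoulli trial with success probability $p = \pi/4$, so the standard error of the estimate is $4\sqrt{p(1-p)/N}$. A minimal sketch of that calculation follows, assuming (as in `calc_pi_mc`) 8 bytes per number and two numbers per point; the helper `expected_mc_error` is hypothetical.

```python
import math


def expected_mc_error(size_in_bytes):
    """Standard error of the MC estimate of pi for a given data volume."""
    num_pairs = int(size_in_bytes / 8) // 2  # 8 bytes per double, 2 per point
    p = math.pi / 4                          # probability of a hit
    return 4 * math.sqrt(p * (1 - p) / num_pairs)


for size in (1e9, 10e9, 100e9):
    print(f"{size / 1e9:6.0f} GB -> expected error ~ {expected_mc_error(size):.1e}")
```

The error shrinks only with the square root of the number of points: a hundred times more data gives roughly one more decimal digit.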

In [None]:
from time import time

In [None]:
for size in (1e9 * n for n in (1, 10, 100)):
    
    start = time()
    pi = calc_pi_mc(size).compute()
    elaps = time() - start

    print_pi_stats(size, pi, time_delta=elaps,
                   num_workers=len(cluster.scheduler.workers))

## Scaling the Cluster to twice its size

We double the number of workers and then re-run the experiments.

In [None]:
from time import sleep

In [None]:
new_num_workers = 2 * len(cluster.scheduler.workers)

print(f"Scaling from {len(cluster.scheduler.workers)} to {new_num_workers} workers.")

cluster.scale(new_num_workers)

sleep(3)

In [None]:
client

## Re-run same experiments with doubled cluster

In [None]:
for size in (1e9 * n for n in (1, 10, 100)):

    start = time()
    pi = calc_pi_mc(size).compute()
    elaps = time() - start

    print_pi_stats(size, pi,
                   time_delta=elaps,
                   num_workers=len(cluster.scheduler.workers))

## Automatically scale the cluster towards a target duration

We'll target a wall time of 30 seconds.

_**Watch** how the cluster will scale down to the minimum a few seconds after being made adaptive._
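
The `target_duration` used in the next cell follows from the 30-second wall-time goal. According to the comment in that cell, the value is measured in CPU time per worker, and each worker has `cores / processes = 24 / 2 = 12` cores. A minimal sketch of that conversion (the helper `cpu_time_target` is hypothetical and not part of Dask):

```python
def cpu_time_target(wall_time_s, cores, processes):
    """CPU seconds per worker that correspond to a wall-time target."""
    cores_per_worker = cores // processes
    return wall_time_s * cores_per_worker


# Values taken from the SLURMCluster definition above.
target = cpu_time_target(30, cores=24, processes=2)
print(f'target_duration="{target}s"')
```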

In [None]:
ca = cluster.adapt(
    minimum=2, maximum=30,
    target_duration="360s",  # measured in CPU time per worker
                             # -> 30 seconds at 12 cores / worker
    scale_factor=1.0  # prevent from scaling up because of CPU or MEM need
);

sleep(4)  # Allow for scale-down

In [None]:
client

## Repeat the calculation from above with larger work loads

(And watch the dashboard!)

In [None]:
for size in (n * 1e9 for n in (200, 400, 800)):

    start = time()
    pi = calc_pi_mc(size, min(size / 1000, 500e6)).compute()
    elaps = time() - start

    print_pi_stats(size, pi, time_delta=elaps,
                   num_workers=len(cluster.scheduler.workers))
    
    sleep(20)  # allow for scale-down time

## Summary

- adaptivity with a target duration

## Complete listing of software used here

In [None]:
%pip list

In [None]:
%conda list --explicit