# Monte-Carlo Estimate of $\pi$

We want to estimate the number $\pi$ using a [Monte-Carlo method](https://en.wikipedia.org/wiki/Pi#Monte_Carlo_methods). The method exploits the fact that the area of a quarter circle of unit radius is $\pi/4$, so the probability that a uniformly random point in the unit square lies inside the unit circle centered at a corner of that square is $\pi/4$ as well. For $N$ randomly chosen pairs $(x, y)$ with $x\in[0, 1)$ and $y\in[0, 1)$, we count the number $N_{circ}$ of pairs that also satisfy $x^2 + y^2 < 1$ and estimate $\pi \approx 4 \cdot N_{circ} / N$.

[<img src="https://upload.wikimedia.org/wikipedia/commons/8/84/Pi_30K.gif" 
     width="50%" 
     align=top
     alt="PI monte-carlo estimate">](https://en.wikipedia.org/wiki/Pi#Monte_Carlo_methods)
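
Before moving to a cluster, the estimator can be tried on a single machine. The following is a minimal NumPy sketch of the idea described above, not the distributed version used in the rest of this notebook; the function name `estimate_pi`, the fixed seed, and the sample count are assumptions chosen for illustration.

```python
import numpy as np


def estimate_pi(num_points, seed=0):
    """Estimate pi from `num_points` uniform random points in the unit square."""
    rng = np.random.default_rng(seed)
    # Each row is one (x, y) pair with x and y drawn from [0, 1).
    xy = rng.uniform(0.0, 1.0, size=(num_points, 2))
    # A point lies inside the quarter circle if x^2 + y^2 < 1.
    in_circle = (xy ** 2).sum(axis=-1) < 1
    # The fraction of hits approximates pi / 4.
    return 4 * in_circle.mean()


pi_estimate = estimate_pi(1_000_000)
print(f"pi is approximately {pi_estimate:.5f}")
```

With a million points the estimate is typically accurate to two or three decimal places, which is why much larger samples (and hence a cluster) are used below.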

## Core Lessons

- Setting up SLURM (and other jobqueue) clusters
- Scaling clusters
- Adaptive clusters

## Set up a Slurm cluster

In [1]:
from dask.distributed import Client
from dask_jobqueue import SLURMCluster



In [2]:
cluster = SLURMCluster(cores=24, processes=2, memory="100GB",
                       interface="ib0", ip="10.80.8.37", project="ecam")
client = Client(cluster)
client

0,1
Client  Scheduler: tcp://10.80.8.37:46254  Dashboard: http://10.80.8.37:8787/status,Cluster  Workers: 0  Cores: 0  Memory: 0 B


## The Monte Carlo Method

In [3]:
import dask.array as da
import numpy as np

In [4]:
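def calc_pi_mc(size_in_bytes):
    """Calculate PI using a Monte Carlo estimate."""
    # Each (x, y) pair takes two 8-byte doubles.  Chunks hold 25 million
    # rows (about 400 MB); the chunk size is cast to int because chunk
    # sizes must be whole numbers.
    xy = da.random.uniform(0, 1,
                           size=(int(size_in_bytes / 8 / 2), 2),
                           chunks=(int(200e6 / 8), 2))

    # True where the point falls inside the quarter circle.
    in_circle = ((xy ** 2).sum(axis=-1) < 1)
    pi = 4 * in_circle.mean()

    # Standard error of the mean, scaled by the same factor of 4.
    pi_err = 4 * (in_circle.var() / len(in_circle)) ** 0.5

    return pi.compute(), pi_err.compute()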
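
The error estimate returned by `calc_pi_mc` can be motivated by treating each point as an independent Bernoulli trial with success probability $p = \pi/4$. This derivation is standard statistics and is not spelled out in the original notebook; the symbols $\hat{p}$ and $\sigma_{\hat{\pi}}$ are introduced here for illustration only.

$$
\hat{p} = \frac{N_{circ}}{N}, \qquad
\hat{\pi} = 4\,\hat{p}, \qquad
\operatorname{Var}(\hat{p}) = \frac{p\,(1 - p)}{N}, \qquad
\sigma_{\hat{\pi}} = 4 \sqrt{\frac{p\,(1 - p)}{N}}
$$

In the code, the unknown $p\,(1 - p)$ is replaced by the sample variance `in_circle.var()`. With $p = \pi/4$ this gives $\sigma_{\hat{\pi}} \approx 1.64 / \sqrt{N}$, so the error shrinks by a factor of $\sqrt{10}$ for every tenfold increase in $N$.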

In [5]:
def print_pi_stats(size, pi, pi_err, time_delta, num_workers):
    """Print pi, calculate offset from true value, and print some stats."""
    print(f"{size / 1e9} GB\n"
          f"\tMC pi: {pi : 13.11f} +/- {pi_err: 10.3e}"
          f"\tErr: {abs(pi - np.pi) : 10.3e}\n"
          f"\tWorkers: {num_workers}"
          f"\t\tTime: {time_delta : 7.3f}s")

## Scale cluster to two full nodes

In [6]:
cluster.scale(4)
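
To see why four workers correspond to two full nodes, the following sketch does the bookkeeping for the cluster settings used above (`cores=24`, `processes=2`, `memory="100GB"`). It assumes that each SLURM job occupies exactly one node and that cores and memory are split evenly between the worker processes of a job; the helper `resources_for` is hypothetical and not part of dask-jobqueue.

```python
import math

CORES_PER_JOB = 24       # cores=24 in SLURMCluster(...)
PROCESSES_PER_JOB = 2    # processes=2, i.e. two workers per job
MEMORY_PER_JOB_GB = 100  # memory="100GB"


def resources_for(num_workers):
    """Return the jobs, cores, and memory implied by a worker count."""
    threads_per_worker = CORES_PER_JOB // PROCESSES_PER_JOB
    memory_per_worker_gb = MEMORY_PER_JOB_GB / PROCESSES_PER_JOB
    return {
        "jobs": math.ceil(num_workers / PROCESSES_PER_JOB),
        "cores": num_workers * threads_per_worker,
        "memory_gb": num_workers * memory_per_worker_gb,
    }


two_nodes = resources_for(4)
print(two_nodes)
```

Under these assumptions, four workers give 2 jobs, 48 cores, and 200 GB, which is consistent with the client summary shown further down in this notebook.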

## The actual calculations

We loop over different volumes of double-precision random numbers and estimate $\pi$ as described above.

In [7]:
from time import time

In [8]:
for size in (1e9 * n for n in (1, 10, 100)):
    
    start = time()
    pi, pi_err = calc_pi_mc(size)
    elaps = time() - start

    print_pi_stats(size, pi, pi_err,
                   time_delta=elaps,
                   num_workers=len(cluster.scheduler.workers))

1.0 GB
	MC pi:  3.14147276800 +/-  2.077e-04	Err:  1.199e-04
	Workers: 4		Time:  14.436s
10.0 GB
	MC pi:  3.14156061440 +/-  6.569e-05	Err:  3.204e-05
	Workers: 4		Time:   3.432s
100.0 GB
	MC pi:  3.14159843072 +/-  2.077e-05	Err:  5.777e-06
	Workers: 4		Time:  20.759s
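
As a sanity check, the reported `+/-` values can be compared with the theoretical standard error of a Bernoulli estimate, $4\sqrt{p(1-p)/N}$ with $p = \pi/4$. The sketch below is a plain-Python check and not part of the original notebook; it assumes 16 bytes per point (two doubles), as in `calc_pi_mc`, and the helper name `expected_pi_error` is hypothetical.

```python
import math

BYTES_PER_POINT = 16  # two 8-byte doubles per (x, y) pair


def expected_pi_error(size_in_bytes):
    """Predicted standard error of the pi estimate for a given data volume."""
    num_points = size_in_bytes / BYTES_PER_POINT
    p = math.pi / 4  # probability of hitting the quarter circle
    return 4 * math.sqrt(p * (1 - p) / num_points)


predicted = {gb: expected_pi_error(gb * 1e9) for gb in (1, 10, 100)}
for gb, err in predicted.items():
    print(f"{gb:4d} GB: predicted +/- {err:.3e}")
```

The predicted values agree with the printed errors (2.077e-04, 6.569e-05, and 2.077e-05) to the displayed precision, and the observed offsets from the true value of $\pi$ are all within about one standard error.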


## Scaling the Cluster

We double the number of workers and then re-run the experiments.

In [9]:
from time import sleep

In [10]:
new_num_workers = 2 * len(cluster.scheduler.workers)

print(f"Scaling from {len(cluster.scheduler.workers)} to {new_num_workers} workers.")

cluster.scale(new_num_workers)

sleep(3)

Scaling from 4 to 8 workers.


In [11]:
client

0,1
Client  Scheduler: tcp://10.80.8.37:46254  Dashboard: http://10.80.8.37:8787/status,Cluster  Workers: 4  Cores: 48  Memory: 200.00 GB


In [12]:
for size in (1e9 * n for n in (1, 10, 100)):
    start = time()
    pi, pi_err = calc_pi_mc(size)
    elaps = time() - start

    print_pi_stats(size, pi, pi_err,
                   time_delta=elaps,
                   num_workers=len(cluster.scheduler.workers))

1.0 GB
	MC pi:  3.14133523200 +/-  2.077e-04	Err:  2.574e-04
	Workers: 4		Time:   2.758s
10.0 GB
	MC pi:  3.14155939200 +/-  6.569e-05	Err:  3.326e-05
	Workers: 4		Time:   3.372s
100.0 GB
	MC pi:  3.14159551552 +/-  2.077e-05	Err:  2.862e-06
	Workers: 8		Time:  11.774s
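
Only the 100 GB run reports 8 workers, so it is the only one that can be compared against the earlier 4-worker timing. The sketch below computes the speedup and parallel efficiency from the two wall-clock times printed in this notebook; it is a single measurement per configuration, so the numbers should be read as indicative only.

```python
# Wall-clock times in seconds for the 100 GB run, copied from the outputs.
time_4_workers = 20.759
time_8_workers = 11.774

# Speedup: how many times faster the run became.
speedup = time_4_workers / time_8_workers
# Efficiency: speedup relative to the factor by which workers increased.
efficiency = speedup / (8 / 4)

print(f"speedup: {speedup:.2f}x, parallel efficiency: {efficiency:.0%}")
```

Doubling the workers gives a speedup of roughly 1.76, i.e. somewhat less than the ideal factor of 2.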


## Automatically Scaling the Cluster

We want each calculation to take approximately the same time irrespective of the actual workload.

_**Watch** how the cluster will scale down to the minimum a few (three!) seconds after being made adaptive._

In [13]:
# Check docstring of distributed.Adaptive for keywords
ca = cluster.adapt(
    minimum=4, maximum=40,
    target_duration="240s",  # times threads per process
    scale_factor=1.5);

sleep(4)  # Allow for scale-down

In [14]:
client

0,1
Client  Scheduler: tcp://10.80.8.37:46254  Dashboard: http://10.80.8.37:8787/status,Cluster  Workers: 4  Cores: 48  Memory: 200.00 GB


Repeat the calculation from above with larger workloads.  (And watch the dashboard!)

In [15]:
for size in (n * 1e9 for n in (1, 10, 100, 1000)):
    start = time()
    pi, pi_err = calc_pi_mc(size)
    elaps = time() - start

    print_pi_stats(size, pi, pi_err,
                   time_delta=elaps,
                   num_workers=len(cluster.scheduler.workers))
    
    sleep(4)  # allow for scale-down time

1.0 GB
	MC pi:  3.14156832000 +/-  2.077e-04	Err:  2.433e-05
	Workers: 4		Time:   2.716s
10.0 GB
	MC pi:  3.14167440000 +/-  6.569e-05	Err:  8.175e-05
	Workers: 4		Time:   3.394s


JobQueueCluster.scale_up was called with a number of workers lower that what is already running or pending


100.0 GB
	MC pi:  3.14156795008 +/-  2.077e-05	Err:  2.470e-05
	Workers: 6		Time:  17.806s


JobQueueCluster.scale_up was called with a number of workers lower that what is already running or pending

1000.0 GB
	MC pi:  3.14159015763 +/-  6.569e-06	Err:  2.496e-06
	Workers: 34		Time:  45.215s


## Complete listing of software used here

In [16]:
%pip list

Package            Version          
------------------ -----------------
asciitree          0.3.3            
aspy.yaml          1.2.0            
backcall           0.1.0            
bokeh              1.1.0            
certifi            2019.3.9         
cfgv               1.6.0            
cftime             1.0.3.4          
Click              7.0              
cloudpickle        1.0.0            
cycler             0.10.0           
cytoolz            0.9.0.1          
dask               1.2.0            
dask-jobqueue      0.4.1+32.g9c3371d
decorator          4.4.0            
dist

In [17]:
%conda list --explicit

# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/linux-64/git-lfs-2.7.2-0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/ca-certificates-2019.3.9-hecc5488_0.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/libgcc-ng-8.2.0-hdf63c60_1.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/libgfortran-ng-7.3.0-hdf63c60_0.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/libstdcxx-ng-8.2.0-hdf63c60_1.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/