# Monte-Carlo Estimate of $\pi$

We want to estimate the number $\pi$ using a [Monte-Carlo method](https://en.wikipedia.org/wiki/Pi#Monte_Carlo_methods). The method exploits that the area of a quarter circle of unit radius is $\pi/4$. Hence, the probability that a point chosen uniformly at random in the unit square lies inside the unit circle centered at a corner of that square is $\pi/4$ as well. So for $N$ randomly chosen pairs $(x, y)$ with $x\in[0, 1)$ and $y\in[0, 1)$, we count the number $N_{circ}$ of pairs that also satisfy $(x^2 + y^2) < 1$ and estimate $\pi \approx 4 \cdot N_{circ} / N$.

[<img src="https://upload.wikimedia.org/wikipedia/commons/8/84/Pi_30K.gif" 
     width="50%" 
     align=top
     alt="PI monte-carlo estimate">](https://en.wikipedia.org/wiki/Pi#Monte_Carlo_methods)
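
To make the counting rule concrete before bringing in Dask, here is a minimal serial sketch that uses only the Python standard library. The function name `estimate_pi` and the fixed `seed` are illustrative choices and not part of the notebook.

```python
import math
import random


def estimate_pi(n_points, seed=0):
    """Estimate pi from n_points uniform samples in the unit square."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    n_circ = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()  # x, y in [0, 1)
        if x * x + y * y < 1:              # point lies inside the quarter circle
            n_circ += 1
    return 4 * n_circ / n_points


pi_est = estimate_pi(200_000)
print(f"estimate: {pi_est:.5f}, error: {abs(pi_est - math.pi):.2e}")
```

The Dask version further down computes the same quantity, but on chunked arrays that can be spread over several workers.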

## Core Lessons

- Short Dask recap (assuming that `LocalCluster`, `Client`, and `dask.array` are familiar)
- Scaling (local) clusters
- Adaptive (local) clusters

## Set up a local cluster

In [1]:
from dask.distributed import LocalCluster, Client

In [2]:
cluster = LocalCluster(n_workers=2, threads_per_worker=1, memory_limit=1e9)
client = Client(cluster)
client

Client: Scheduler: tcp://127.0.0.1:43846, Dashboard: http://127.0.0.1:8787/status
Cluster: Workers: 2, Cores: 2, Memory: 2.00 GB


## The Monte Carlo Method

In [3]:
import dask.array as da
import numpy as np

In [4]:
def calc_pi_mc(size_in_bytes):
    """Calculate PI using a Monte Carlo estimate."""
    xy = da.random.uniform(0, 1,
                           size=(int(size_in_bytes / 8 / 2), 2),
                           chunks=(int(100e6 / 8), 2))  # chunk sizes as integers
    
    in_circle = ((xy ** 2).sum(axis=-1) < 1)
    pi = 4 * in_circle.mean()

    return pi.compute()
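
The `size` and `chunks` arguments above determine how many pieces Dask has to process. The following sketch redoes that arithmetic in plain Python, without Dask. The helper name `chunk_layout` is hypothetical, and the sketch assumes that the last chunk is simply shorter when the row count is not a multiple of the chunk length.

```python
import math

BYTES_PER_FLOAT64 = 8


def chunk_layout(size_in_bytes, chunk_rows=int(100e6 / 8)):
    """Return (rows, n_chunks, bytes_per_full_chunk) for the (rows, 2) array."""
    rows = int(size_in_bytes / BYTES_PER_FLOAT64 / 2)     # same as in calc_pi_mc
    n_chunks = math.ceil(rows / chunk_rows)               # last chunk may be shorter
    bytes_per_full_chunk = chunk_rows * 2 * BYTES_PER_FLOAT64
    return rows, n_chunks, bytes_per_full_chunk


for size in (2e9, 4e9, 8e9, 16e9):
    rows, n_chunks, chunk_bytes = chunk_layout(size)
    print(f"{size / 1e9:5.1f} GB: {rows:>13,d} rows, "
          f"{n_chunks:3d} chunks of {chunk_bytes / 1e6:.0f} MB")
```

Note that a chunk of `100e6 / 8` rows with two `float64` columns holds 200 MB of random numbers, not 100 MB.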

In [5]:
def print_pi_stats(size, pi, time_delta, num_workers):
    """Print pi, calculate offset from true value, and print some stats."""
    print(f"{size / 1e9} GB\n"
          f"\tMC pi: {pi : 13.11f}"
          f"\tErr: {abs(pi - np.pi) : 10.3e}\n"
          f"\tWorkers: {num_workers}"
          f"\t\tTime: {time_delta : 7.3f}s")

## The actual calculations

We loop over different volumes of double-precision random numbers and estimate $\pi$ as described above.

In [6]:
from time import time

In [7]:
for size in (1e9 * n for n in (2, 4, 8, 16)):
    
    start = time()
    pi = calc_pi_mc(size)
    elaps = time() - start

    print_pi_stats(size, pi,
                   time_delta=elaps,
                   num_workers=len(cluster.workers))

2.0 GB
	MC pi:  3.14165280000	Err:  6.015e-05
	Workers: 2		Time:   3.535s
4.0 GB
	MC pi:  3.14143952000	Err:  1.531e-04
	Workers: 2		Time:   6.481s
8.0 GB
	MC pi:  3.14139753600	Err:  1.951e-04
	Workers: 2		Time:  12.800s
16.0 GB
	MC pi:  3.14158570400	Err:  6.950e-06
	Workers: 2		Time:  25.747s
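
To judge whether errors of this size are plausible, we can compare them with the standard error of the estimator. Each pair is an independent Bernoulli trial with success probability $p = \pi/4$, so the standard deviation of $4 \cdot N_{circ} / N$ is $4 \sqrt{p (1 - p) / N}$. The sketch below evaluates this for the sizes used above. The function name `expected_std_error` is illustrative, and the calculation assumes independent uniform samples.

```python
import math


def expected_std_error(size_in_bytes):
    """Standard error of the pi estimate for a given volume of float64 numbers."""
    n_pairs = int(size_in_bytes / 8 / 2)  # two 8-byte numbers per (x, y) pair
    p = math.pi / 4                       # probability of landing in the circle
    return 4 * math.sqrt(p * (1 - p) / n_pairs)


for size in (2e9, 4e9, 8e9, 16e9):
    print(f"{size / 1e9:5.1f} GB  expected std. error: "
          f"{expected_std_error(size):.2e}")
```

Each observed error is a single random draw, so it scatters around these values rather than decreasing monotonically with the size. On average, quadrupling the volume only halves the error.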


## Scaling the Cluster

We double the number of workers and then re-run the experiments.

In [8]:
from time import sleep

In [9]:
new_num_workers = 2 * len(cluster.workers)

print(f"Scaling from {len(cluster.workers)} to {new_num_workers} workers.")

cluster.scale(new_num_workers)

sleep(3)

Scaling from 2 to 4 workers.
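
The fixed `sleep(3)` assumes that the new workers are up within three seconds. A more defensive variant polls until a condition holds. The helper `wait_until` below is a hypothetical, standard-library-only sketch and not part of `dask.distributed`. It is demonstrated with a stand-in condition so that it runs without a cluster.

```python
from time import monotonic, sleep


def wait_until(predicate, timeout=30.0, interval=0.1):
    """Poll predicate() until it is true or until timeout seconds have passed."""
    deadline = monotonic() + timeout
    while monotonic() < deadline:
        if predicate():
            return True
        sleep(interval)
    return bool(predicate())  # one last check at the deadline


# Stand-in condition that becomes true on the third poll.
polls = {"count": 0}


def three_polls_done():
    polls["count"] += 1
    return polls["count"] >= 3


print(wait_until(three_polls_done, timeout=5.0, interval=0.01))
```

In the notebook, the condition could be `lambda: len(cluster.workers) >= new_num_workers`, assuming that `cluster.workers` reflects the workers that have actually started.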


In [10]:
client

Client: Scheduler: tcp://127.0.0.1:43846, Dashboard: http://127.0.0.1:8787/status
Cluster: Workers: 4, Cores: 4, Memory: 4.00 GB


In [11]:
for size in (1e9 * n for n in (2, 4, 8, 16)):
    
    start = time()
    pi = calc_pi_mc(size)
    elaps = time() - start
    print_pi_stats(size, pi, time_delta=elaps,
                   num_workers=len(cluster.workers))

2.0 GB
	MC pi:  3.14174988800	Err:  1.572e-04
	Workers: 4		Time:   2.168s
4.0 GB
	MC pi:  3.14155224000	Err:  4.041e-05
	Workers: 4		Time:   3.630s
8.0 GB
	MC pi:  3.14148824000	Err:  1.044e-04
	Workers: 4		Time:   7.018s
16.0 GB
	MC pi:  3.14160793200	Err:  1.528e-05
	Workers: 4		Time:  13.904s
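
To quantify the effect of doubling the workers, the sketch below computes the speedup and the parallel efficiency from the wall-clock times printed above. The timings are copied from the two runs. The variable and function names are illustrative.

```python
sizes_gb = (2, 4, 8, 16)
t_2_workers = (3.535, 6.481, 12.800, 25.747)  # seconds, from the 2-worker run
t_4_workers = (2.168, 3.630, 7.018, 13.904)   # seconds, from the 4-worker run


def speedups(t_before, t_after):
    """Ratio of wall-clock times before and after scaling, one per problem size."""
    return [before / after for before, after in zip(t_before, t_after)]


worker_ratio = 4 / 2
for size, s in zip(sizes_gb, speedups(t_2_workers, t_4_workers)):
    print(f"{size:3d} GB  speedup: {s:.2f}  efficiency: {s / worker_ratio:.0%}")
```

In these runs, the speedup moves closer to the ideal factor of 2 as the problem size grows, which would be consistent with a fixed overhead that matters less for larger workloads.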


## Automatically Scaling the Cluster

We want each calculation to take approximately the same time irrespective of the actual workload.

_**Watch** how the cluster will scale down to the minimum a few (three!) seconds after being made adaptive._

In [12]:
# Check docstring of distributed.Adaptive for keywords
ca = cluster.adapt(
    minimum=1, maximum=8,
    target_duration="15s",
    scale_factor=1);

sleep(4)  # Allow for scale-down

In [13]:
client

Client: Scheduler: tcp://127.0.0.1:43846, Dashboard: http://127.0.0.1:8787/status
Cluster: Workers: 1, Cores: 1, Memory: 1000.00 MB


Repeat the calculation from above with larger workloads.  (And watch the dashboard!)

In [14]:
for size in (n * 1e9 for n in (2, 4, 8, 16, 32)):
    
    start = time()
    pi = calc_pi_mc(size)
    elaps = time() - start
    
    print_pi_stats(size, pi, time_delta=elaps,
                   num_workers=len(cluster.workers))
    
    sleep(4)  # allow for scale-down time

2.0 GB
	MC pi:  3.14140288000	Err:  1.898e-04
	Workers: 1		Time:   6.816s
4.0 GB
	MC pi:  3.14146401600	Err:  1.286e-04
	Workers: 1		Time:  13.075s
8.0 GB
	MC pi:  3.14165064800	Err:  5.799e-05
	Workers: 2		Time:  15.139s
16.0 GB
	MC pi:  3.14153852000	Err:  5.413e-05
	Workers: 4		Time:  15.182s
32.0 GB
	MC pi:  3.14157405800	Err:  1.860e-05
	Workers: 8		Time:  16.313s
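
One way to read the adaptive run is to look at the aggregate throughput, that is, the volume of random numbers processed per second of wall-clock time. The sketch below computes it from the output above. The sizes, timings, and worker counts are copied from that output, and the worker count is the value reported by `len(cluster.workers)` after each calculation, so it may not describe the whole duration of a run.

```python
sizes_gb = (2, 4, 8, 16, 32)
times_s = (6.816, 13.075, 15.139, 15.182, 16.313)  # from the adaptive run
workers = (1, 1, 2, 4, 8)                          # reported after each run


def throughput(sizes, times):
    """Aggregate throughput in GB per second of wall-clock time."""
    return [size / t for size, t in zip(sizes, times)]


for size, n, t, rate in zip(sizes_gb, workers, times_s,
                            throughput(sizes_gb, times_s)):
    print(f"{size:3d} GB  workers: {n}  time: {t:6.3f}s  "
          f"throughput: {rate:.2f} GB/s")
```

The two smallest workloads finish below the 15 second target on a single worker. For the larger ones, the wall-clock time stays near the target while the throughput roughly doubles with each doubling of the workload.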


## Complete listing of software used here

In [15]:
%pip list

Package          Version 
---------------- --------
asciitree        0.3.3   
backcall         0.1.0   
bokeh            1.1.0   
certifi          2019.3.9
cftime           1.0.3.4 
Click            7.0     
cloudpickle      1.0.0   
cycler           0.10.0  
cytoolz          0.9.0.1 
dask             1.2.2   
dask-jobqueue    0.4.1   
decorator        4.4.0   
distributed      1.28.1  
docrep           0.2.5   
fasteners        0.14.1  
heapdict         1.0.0   
ipykernel        5.1.1   
ipython          7.5.0   
ipython-genutils 0.2.0   
jedi             0.13.3  
Jinja2           2.10.1 

In [16]:
%conda list --explicit

# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/linux-64/git-lfs-2.7.2-0.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/ca-certificates-2019.3.9-hecc5488_0.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/libgcc-ng-8.2.0-hdf63c60_1.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/libgfortran-ng-7.3.0-hdf63c60_0.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/libstdcxx-ng-8.2.0-hdf63c60_1.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/