In [7]:
import tensorflow as tf

## Setting up the Monte Carlo


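Imagine a 2x2 square centred on the origin, with a circle of radius $r = 1$ inscribed in it. We throw random darts at the square.
The fraction $p$ of darts that land no more than 1 unit from the centre estimates the ratio of the circle's area to the square's area. This fraction is what the code calls `area`.

Since the circle's area is $\pi r^2$ and the square's area is $(2r)^2 = 4r^2$,

$p = \frac{\pi r^2}{4 r^2} = \frac{\pi}{4}$, so $\pi = 4p$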
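
To make the estimator concrete before the TensorFlow version, here is a minimal plain-Python sketch of the same idea. It samples points uniformly in the square $[-1, 1] \times [-1, 1]$, counts the fraction within distance 1 of the centre, and multiplies by 4. The name `estimate_pi`, the sample count, and the seed are illustrative assumptions, not part of the notebook.

```python
import random

def estimate_pi(num_sims=200_000, seed=0):
    """Estimate pi from the fraction of uniform points that fall inside the unit circle."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    inside = 0
    for _ in range(num_sims):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # Comparing the squared distance with 1 avoids a square root.
        if x * x + y * y < 1.0:
            inside += 1
    # The fraction inside approximates pi / 4, so scale by 4.
    return 4 * inside / num_sims

print(estimate_pi())
```

A pure-Python loop like this is far too slow for 100 million samples, which is why the notebook vectorises the computation with TensorFlow.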


In [8]:
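def calc_pi(num_sims=100000000):
  """Estimate pi by sampling num_sims points uniformly in [-1, 1) x [-1, 1)."""
  # tf.random.uniform samples from [0, 1); scale and shift to [-1, 1).
  x = tf.random.uniform(shape=[num_sims]) * 2 - 1
  y = tf.random.uniform(shape=[num_sims]) * 2 - 1

  # Distance of each point from the centre of the square.
  distance = tf.math.sqrt(
      x*x + y*y, name='Distance'
  )

  # 1 for points inside the unit circle, 0 otherwise.
  within_radius = tf.cast(distance < 1, tf.int32)
  # Fraction of points inside the circle, which approximates pi / 4.
  area = tf.reduce_sum(within_radius) / num_sims
  return 4 * area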

In [13]:
calc_pi().numpy()

3.14152004
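
As a sanity check on the printed value, the sketch below compares its error with the standard error expected from a Monte Carlo estimate of this size. It assumes each dart is an independent Bernoulli trial with success probability $p = \pi/4$, so the estimate $4\hat{p}$ has standard error $4\sqrt{p(1-p)/N}$. The variable names are illustrative.

```python
import math

num_sims = 100_000_000
estimate = 3.14152004  # value printed by calc_pi().numpy() above

# Each dart is a Bernoulli trial that lands inside the circle with probability pi / 4.
p = math.pi / 4
std_err = 4 * math.sqrt(p * (1 - p) / num_sims)
abs_err = abs(estimate - math.pi)

print(f"standard error: {std_err:.2e}")
print(f"observed error: {abs_err:.2e}")
```

The observed error comes out smaller than one standard error, so the result is consistent with what this many samples should give. Because the error shrinks like $1/\sqrt{N}$, each additional correct digit needs roughly 100 times more samples.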

## CPU

In [5]:
tf.config.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

In [6]:
%%timeit
calc_pi()

1 loop, best of 3: 2.03 s per loop
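
Outside a notebook, a similar best-of-three measurement can be approximated with the standard library `timeit` module. This is only a sketch: `best_of` is a hypothetical helper name, and `work` is a stand-in workload used so the example runs without TensorFlow.

```python
import timeit

def best_of(fn, repeat=3, number=1):
    """Return the best per-call time in seconds over `repeat` runs of `number` calls each."""
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    return min(totals) / number

def work():
    # Stand-in workload; in the notebook this would be calc_pi.
    return sum(i * i for i in range(100_000))

print(f"best of 3: {best_of(work):.6f} s per loop")
```

In the notebook one could pass `calc_pi` directly, or `lambda: calc_pi().numpy()` to include the time needed to copy the result back to the host.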


## GPU

In [9]:
tf.config.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

In [10]:
%%timeit
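# Times only the final multiplication on an existing `area` tensor, not a full calc_pi() call,
# so this figure is not directly comparable with the CPU timing of calc_pi().
4 * area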

The slowest run took 26.28 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 71.4 µs per loop


## TPU

In [9]:
tf.config.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

In [10]:
%%timeit
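# Times only the final multiplication on an existing `area` tensor, not a full calc_pi() call,
# so this figure is not directly comparable with the CPU timing of calc_pi().
4 * area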


The slowest run took 177.85 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 24.9 µs per loop
