# Estimating Pi Using PySpark

This script demonstrates how to use **PySpark** to estimate the value of **π (Pi)** using **Monte Carlo simulation**. It distributes the computation across a Spark cluster, leveraging parallel processing.

## How It Works
The **Monte Carlo method** estimates Pi by generating random points inside a unit square and counting how many fall inside the quarter-circle of radius 1. Four times the fraction of points inside approximates π:

  $$
  \pi \approx 4 \times \frac{\text{Points inside}}{\text{Total points generated}}
  $$

In [3]:
import findspark
findspark.init()
from pyspark.sql import SparkSession
import random

spark = SparkSession.builder \
    .master('spark://localhost:7077') \
    .appName('Pi-Estimation') \
    .getOrCreate()

NUM_SAMPLES = 1000000  # Number of random points to sample

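def inside(_):
    # The RDD element is ignored; each call draws one fresh random point.
    x, y = random.random(), random.random()
    return x*x + y*y < 1

# Draw NUM_SAMPLES points in parallel and count those inside the quarter-circle.
count = spark.sparkContext.parallelize(range(NUM_SAMPLES)) \
        .filter(inside).count()
print('Pi is roughly {}'.format(4.0 * count / NUM_SAMPLES))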

spark.stop()


Pi is roughly 3.140176
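
To make the distributed step more concrete, here is a minimal plain-Python sketch (no Spark involved) of what `parallelize(...).filter(inside).count()` does conceptually: the samples are split into chunks, each chunk is counted on its own, and the partial counts are added up. The names `count_inside` and `estimate_pi_partitioned`, the number of partitions, and the per-chunk seeds are illustrative assumptions, not part of the script above or of Spark's API.

```python
import random

def count_inside(num_points, seed):
    """Count random points in the unit square that land inside the quarter-circle."""
    rng = random.Random(seed)  # hypothetical per-chunk seed, for reproducibility
    hits = 0
    for _ in range(num_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1:
            hits += 1
    return hits

def estimate_pi_partitioned(num_samples, num_partitions=4, base_seed=0):
    """Split the samples into chunks, count each chunk, then combine the counts."""
    base, extra = divmod(num_samples, num_partitions)
    sizes = [base + (1 if i < extra else 0) for i in range(num_partitions)]
    # In Spark these partial counts would be computed on different executors.
    partial_counts = [count_inside(n, base_seed + i) for i, n in enumerate(sizes)]
    return 4.0 * sum(partial_counts) / num_samples

print('Pi is roughly {}'.format(estimate_pi_partitioned(200_000)))
```

Because each chunk only returns a single integer, very little data has to travel back to the driver, which is why this problem parallelizes so well.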


# Mathematics Behind the Monte Carlo Pi Estimation

The **Monte Carlo method** is a statistical simulation technique that uses random sampling to estimate numerical results. In our example, we use it to estimate the value of **π (Pi)**.

## Understanding the Problem: Estimating Pi (π)
We inscribe a **quarter-circle** inside a **unit square** and use random points to estimate the ratio between their areas.

### 1. The Geometric Setup
We consider:

- A **unit square** with side length **1**, covering the range **(0 ≤ x ≤ 1, 0 ≤ y ≤ 1)**.
- A **quarter-circle** of radius **1**, centered at **(0,0)**, made up of the points of the square that satisfy:

  $$
  x^2 + y^2 \leq 1
  $$

- The **area** of the quarter-circle is:

  $$
  A_{\text{circle}} = \frac{\pi r^2}{4} = \frac{\pi}{4}
  $$

- The **area** of the unit square is:

  $$
  A_{\text{square}} = 1^2 = 1
  $$
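
As a quick sanity check on these two areas, the following sketch approximates the quarter-circle area deterministically by laying a regular grid over the unit square and counting the cells whose midpoint satisfies the inequality. The function name `quarter_circle_area` and the grid size are assumptions chosen for illustration; this is not part of the original script.

```python
import math

def quarter_circle_area(grid=500):
    """Approximate the quarter-circle area with a midpoint grid over the unit square."""
    cell = 1.0 / grid
    inside = 0
    for i in range(grid):
        x = (i + 0.5) * cell  # midpoint of the cell in x
        for j in range(grid):
            y = (j + 0.5) * cell  # midpoint of the cell in y
            if x * x + y * y <= 1:
                inside += 1
    # Each cell has area cell * cell; the square itself has area 1.
    return inside * cell * cell

area = quarter_circle_area()
print('grid estimate:', area, ' pi/4:', math.pi / 4)
```

The Monte Carlo approach replaces this regular grid with random points, which is what makes it easy to split across workers.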

### 2. Using Random Sampling
We generate **random points** $(x, y)$ inside the unit square:

  $$
  x, y \sim U(0,1)
  $$

  *(Uniform distribution between 0 and 1)*

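We check whether each point **falls inside the quarter-circle** (the boundary has zero area, so using a strict inequality here does not change the estimate):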

  $$
  x^2 + y^2 < 1
  $$

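The **fraction** of generated points that fall inside the quarter-circle is:

  $$
  P = \frac{\text{Points inside the quarter-circle}}{\text{Total points generated}}
  $$

Because the points are uniformly distributed, the probability that a point lands inside the quarter-circle equals the **ratio of areas**, and $P$ approximates that probability:

  $$
  P \approx \frac{A_{\text{circle}}}{A_{\text{square}}} = \frac{\pi}{4}
  $$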

We can solve for **π**:

  $$
  \pi \approx 4 \times P = 4 \times \frac{\text{Points inside}}{\text{Total points generated}}
  $$
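
As a worked example, the run above printed an estimate of 3.140176 from 1,000,000 points. The script does not print the raw count, so the count of 785,044 used here is inferred from that printed estimate rather than read from the output:

  $$
  P = \frac{785{,}044}{1{,}000{,}000} = 0.785044, \qquad \pi \approx 4 \times 0.785044 = 3.140176
  $$

This differs from the true value of π (3.141593 to six decimal places) by about 0.0014.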

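This approximation gets **more accurate** as the number of random points $N$ increases, but slowly: the typical error shrinks in proportion to $1/\sqrt{N}$, so it takes roughly 100 times more points to gain one more correct digit.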
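
The following plain-Python sketch illustrates how the error behaves as the sample size grows. It compares the observed error with the standard error of a binomial proportion, scaled by 4, which is the usual textbook expression for this estimator: $4\sqrt{p(1-p)/N}$ with $p = \pi/4$. The function names, the seed, and the sample sizes are assumptions chosen for illustration.

```python
import math
import random

def estimate_pi(num_points, rng):
    """Monte Carlo estimate of pi from num_points uniform points in the unit square."""
    inside = 0
    for _ in range(num_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1:
            inside += 1
    return 4.0 * inside / num_points

def standard_error(num_points):
    """Theoretical standard error of the estimate: 4 * sqrt(p * (1 - p) / N), p = pi/4."""
    p = math.pi / 4
    return 4.0 * math.sqrt(p * (1 - p) / num_points)

rng = random.Random(2024)  # arbitrary fixed seed so the run is reproducible
for n in (100, 10_000, 100_000):
    est = estimate_pi(n, rng)
    print(n, est, abs(est - math.pi), standard_error(n))
```

For the 1,000,000 points used in the Spark run, this formula gives a standard error of about 0.0016, so an estimate that is off in the third decimal place is what one would expect.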
