# Estimating the Value of Pi Using Spark

The `os` module provides a portable way of using operating system dependent functionality.

In [1]:
import os 

The `sys` module provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter. 

In [2]:
import sys 

Install `py4j`, then set `SPARK_HOME` and add Spark's Python directory to `sys.path` so that `pyspark` can be imported.

In [3]:
os.environ['SPARK_HOME'] = "C:/Apache/spark-1.5.0"
sys.path.append("C:/Apache/spark-1.5.0/python")
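The hard-coded paths above can be generalized. The following is a minimal sketch, not part of the original notebook: the helper names `spark_python_paths` and `configure_spark` are hypothetical, and it assumes the Spark distribution bundles `py4j` as a zip archive under `python/lib`, which is common but should be checked for your installation.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the entries PySpark is assumed to need on sys.path."""
    python_dir = os.path.join(spark_home, "python")
    # Assumption: py4j ships as a zip under python/lib. The file name varies
    # by Spark version, so match it by pattern instead of hard-coding it.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*.zip")))
    return [python_dir] + py4j_zips


def configure_spark(spark_home):
    """Set SPARK_HOME and extend sys.path without adding duplicates."""
    os.environ["SPARK_HOME"] = spark_home
    for path in spark_python_paths(spark_home):
        if path not in sys.path:
            sys.path.append(path)
```

Calling `configure_spark("C:/Apache/spark-1.5.0")` would then have the same effect as the cell above, plus picking up a bundled `py4j` archive if one is present.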

Import `SparkContext` from the `pyspark` module.

In [4]:
from pyspark import SparkContext

The `random` module implements pseudo-random number generators for various distributions.

In [5]:
from random import random

The `operator` module exports a set of efficient functions corresponding to the intrinsic operators of Python. Here, `add` is used as the reduction function that sums the per-point results.

In [6]:
from operator import add

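Run a Monte Carlo simulation: draw `n` random points (x, y) uniformly from the square [-1, 1) × [-1, 1) and count how many fall inside the unit circle. The fraction of points inside the circle approximates π/4, the ratio of the circle's area to the square's area.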
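The reasoning behind the estimate can be written out explicitly. In the code below, each trial draws a point uniformly from a square of side 2 centered at the origin and tests whether it lies inside the unit circle. The symbol $N_{\text{in}}$ is introduced here for illustration and corresponds to `count` in the code:

$$
P\left(x^2 + y^2 < 1\right)
= \frac{\text{area of unit circle}}{\text{area of square}}
= \frac{\pi \cdot 1^2}{2 \cdot 2}
= \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx 4 \cdot \frac{N_{\text{in}}}{n}
$$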

In [7]:
sc = SparkContext(appName="PythonPi")

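n = 100000  # number of random points to sample

def f(_):
    # Sample a point uniformly from the square [-1, 1) x [-1, 1)
    x = random() * 2 - 1
    y = random() * 2 - 1
    # 1 if the point lies inside the unit circle, otherwise 0
    return 1 if x ** 2 + y ** 2 < 1 else 0

# Split the n trials into 3 partitions, map each trial to 0 or 1, and sum the hits
count = sc.parallelize(range(1, n + 1), 3).map(f).reduce(add)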


In [8]:
print("Pi is roughly %f" % (4.0 * count / n))

Pi is roughly 3.145160
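For comparison, the same map-and-reduce logic can be run locally without Spark. This is a minimal sketch rather than the notebook's method: the function name `estimate_pi` and its `seed` parameter are hypothetical additions for reproducibility, and `functools.reduce` with the built-in `map` stands in for the RDD's `map` and `reduce`.

```python
from functools import reduce
from operator import add
from random import Random


def estimate_pi(n, seed=None):
    """Estimate pi from n >= 1 random points in the square [-1, 1) x [-1, 1)."""
    rng = Random(seed)  # private generator so a seed gives repeatable runs

    def inside_unit_circle(_):
        x = rng.random() * 2 - 1
        y = rng.random() * 2 - 1
        return 1 if x ** 2 + y ** 2 < 1 else 0

    # Same shape as the Spark pipeline: map each trial to 0/1, then sum with add
    count = reduce(add, map(inside_unit_circle, range(n)))
    return 4.0 * count / n


print("Pi is roughly %f" % estimate_pi(100000, seed=42))
```

Because each trial is an independent yes/no outcome with probability π/4, the standard error of the estimate is roughly 1.64 / √n, or about 0.005 for n = 100000. The result above, 3.145160, lies within that margin of π.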
