In this exercise, we run a Monte Carlo simulation for a value at risk (VaR) analysis: specifically, the 99% VaR for Apple stock over a 20-day horizon, expressed as the price the stock stays above in 99% of the simulated paths. We time the serial version and then run the same simulation in parallel to compare performance.

In [67]:
# IPython.parallel moved to the separate ipyparallel package as of IPython 4.
import ipyparallel as parallel
import pandas as pd
import numpy as np
from pandas import Series
appleRaw = 'https://raw.githubusercontent.com/dstern04/MSDA-Work/master/IS602/apple.2011.csv'
# pandas can read the CSV straight from the URL.
appleData = pd.read_csv(appleRaw)
appleData.columns = ['date','price','percent change']
# Skip the first row of the percent-change column, then fit a normal distribution.
percent = Series(appleData['percent change'][1:],dtype=float)
mu, sigma = np.mean(percent), np.std(percent)
finalPrice = []

In [68]:
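def price_floor_99(n):
    # Keep results in a local list so repeated calls (e.g. under %timeit) start fresh
    # instead of accumulating prices from earlier runs.
    finalPrice = []
    # Every simulated path starts from the price at index 251.
    startPrice = appleData['price'][251]
    for i in range(n):
        lastPrice = startPrice
        # Draw 20 daily percent changes from the fitted normal distribution.
        twentyChanges = np.random.normal(mu, sigma, 20)
        for each in twentyChanges:
            # Compound each daily change onto the running price.
            lastPrice = lastPrice + lastPrice*each
        finalPrice.append(lastPrice)
    # The 1st percentile of the day-20 prices is the 99% price floor.
    return np.percentile(finalPrice, 1)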

In [69]:
price_floor_99(10000)

348.22629521954127
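
The number above is a price floor rather than a loss. A minimal sketch of turning it into a VaR figure follows, assuming a hypothetical current price of 405.0 (the notebook does not print `appleData['price'][251]`, so this value is a placeholder) and the floor rounded from the output above. The function name `value_at_risk` is introduced here for illustration only.

```python
def value_at_risk(current_price, price_floor):
    """Loss per share between the current price and the simulated price floor."""
    return current_price - price_floor

current_price = 405.0   # hypothetical placeholder, not read from the data set
price_floor = 348.23    # rounded from the simulation output above

var_99 = value_at_risk(current_price, price_floor)
print("99% 20-day VaR per share: {:.2f} ({:.1%} of the current price)".format(
    var_99, var_99 / current_price))
```

With the real starting price substituted in, the same subtraction gives the 99% VaR per share.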

In [70]:
%timeit price_floor_99(10000)

1 loops, best of 3: 461 ms per loop
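
Most of the serial run time goes to the Python-level loops. As a point of comparison, the same compounding can be sketched with a single NumPy array operation. This is not the version timed above: the function name `price_floor_99_vectorized` is hypothetical, and the inputs used in the call (`last_price=400.0`, `mu=0.001`, `sigma=0.0165`) are placeholders rather than values from the Apple data.

```python
import numpy as np

def price_floor_99_vectorized(n, last_price, mu, sigma, days=20, seed=None):
    """1st percentile of n simulated prices after `days` compounded daily changes."""
    rng = np.random.default_rng(seed)
    # One row per simulated path, one column per trading day.
    changes = rng.normal(mu, sigma, size=(n, days))
    # Compounding: final = start * (1 + r_1) * ... * (1 + r_days),
    # which is what the inner loop computes one step at a time.
    final_prices = last_price * np.prod(1.0 + changes, axis=1)
    return np.percentile(final_prices, 1)

# Placeholder inputs, not taken from the Apple data set.
floor = price_floor_99_vectorized(10000, last_price=400.0, mu=0.001,
                                  sigma=0.0165, seed=0)
print(floor)
```

In the notebook, `appleData['price'][251]`, `mu`, and `sigma` would take the place of the placeholder arguments.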


Before running the next function, I started a cluster with 4 parallel engines by typing the following in the terminal: "ipcluster start -n 4". The following command confirms that all 4 engines are running.

In [75]:
# Connect to the running cluster; clients.ids lists the engine IDs.
clients = parallel.Client()
len(clients.ids)

4

In [72]:
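def price_floor_99_parallel(n):
    clients = parallel.Client()
    clients.block = True
    dview = clients.direct_view()
    # Each engine runs an equal share of the simulations; // keeps the count an integer.
    # The engines must already have np, appleData, mu and sigma in their namespaces.
    results = dview.apply(price_floor_99, n // len(clients.ids))
    # Average the per-engine 1st percentiles into a single estimate.
    return np.mean(results)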

In [73]:
price_floor_99_parallel(10000)

346.78009812275434
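
The parallel function averages the 1st percentile reported by each engine, which is close to, but not the same as, the 1st percentile of all 10,000 simulated prices taken together. The sketch below compares the two ways of combining results using plain NumPy, with four chunks standing in for the four engines. The helper `simulate_final_prices` and all input values are assumptions made for illustration and do not come from the notebook.

```python
import numpy as np

def simulate_final_prices(n, last_price, mu, sigma, days=20, rng=None):
    """Return n simulated prices after `days` compounded daily changes."""
    rng = np.random.default_rng() if rng is None else rng
    changes = rng.normal(mu, sigma, size=(n, days))
    return last_price * np.prod(1.0 + changes, axis=1)

rng = np.random.default_rng(0)
# Four chunks of 2,500 paths stand in for the four engines (placeholder inputs).
chunks = [simulate_final_prices(2500, 400.0, 0.001, 0.0165, rng=rng)
          for _ in range(4)]

# What the parallel function does: average the per-chunk percentiles.
mean_of_percentiles = np.mean([np.percentile(c, 1) for c in chunks])
# Alternative: pool every simulated price, then take one percentile.
pooled_percentile = np.percentile(np.concatenate(chunks), 1)

print(mean_of_percentiles, pooled_percentile)
```

Pooling would require each engine to return its array of final prices instead of a single percentile, which moves more data back to the client; the averaging approach trades a small approximation for less communication.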

In [74]:
%timeit price_floor_99_parallel(10000)

1 loops, best of 3: 264 ms per loop


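The two results are very close (348.23 serial vs. 346.78 parallel); some difference is expected because the simulations are random and the parallel function averages the per-engine percentiles. The parallel approach takes a little over half the time: 264 ms vs. 461 ms per loop, a speedup of about 1.7x on 4 engines.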