# IS602 -- Assignment 11 -- James Hamski

1. Take your solution from Homework 11 and complete the Monte Carlo step (step 6) in parallel.  There are many ways you can go about doing this, and I'm not looking for anything too complicated.  If you can get multiple processes crunching the data together, that is great.  Using IPython’s built-in tools would be a great method.
2. Compare the timing for your solution in homework 11 and this parallel solution.  This is similar to what you did in homeworks 6 and 7.  Ideally, you'll see some speed improvement.  The amount you see will largely be based on the capabilities of your hardware, and less on the software implementation.  There is additional overhead for running an operation in parallel, so speed gains will be more obvious with a larger number of calculations.

*Python Modules*

In [66]:
%pylab inline

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
matplotlib.style.use('ggplot')
import ipyparallel

Populating the interactive namespace from numpy and matplotlib


In [2]:
! ipcluster start -n 4 --daemon

*Data Import and Formatting* 

In [3]:
apple = pd.read_csv('apple.2011.csv', header=0, na_values='XXXXX', parse_dates=True)
apple.columns = ['Date', 'Last', 'PercentChange']

In [4]:
apple['Date'] = pd.to_datetime(apple['Date'])

*Generate random numbers with the same probability distribution as the Percent Change column*

In [5]:
apple_mean_df = apple.mean()
apple_std_df = apple.std()

In [6]:
apple_mean = apple_mean_df['PercentChange']
print apple_mean

0.000957355207171


In [7]:
apple_std = apple_std_df['PercentChange']
print apple_std

0.0165205562984


In [8]:
def GeneratePercentChange(mean, std):
    return np.random.normal(loc=mean, scale=std, size=20)

In [9]:
def PriceQuote(start, PercentChange):
    price = start
    for i in PercentChange:
        price = price+(price*i)
    return price
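
The loop in `PriceQuote` compounds one daily change at a time, so the end price equals the start price multiplied by the product of the `(1 + change)` terms. The following is a minimal sketch checking that equivalence. The helper names `price_quote_loop` and `price_quote_vectorized` and the example changes are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def price_quote_loop(start, percent_changes):
    # Same logic as PriceQuote: compound one change at a time.
    price = start
    for change in percent_changes:
        price = price + (price * change)
    return price

def price_quote_vectorized(start, percent_changes):
    # Hypothetical vectorized equivalent: start * prod(1 + change).
    return start * np.prod(1.0 + np.asarray(percent_changes))

# Illustrative values only, not taken from the Apple data.
example_changes = [0.01, -0.02, 0.005]
loop_price = price_quote_loop(100.0, example_changes)
vector_price = price_quote_vectorized(100.0, example_changes)
```

Both functions should agree up to floating-point rounding.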

Value at Risk with 99% confidence:

In [48]:
def MonteCarloSimulation(iterations):
    prices = []
    for i in range(0, iterations):
        random_walk = GeneratePercentChange(apple_mean, apple_std)
        end_price = PriceQuote(apple_start_price, random_walk)
        prices.append(end_price)
        
    prices_array = np.array(prices)
    
    VaR_99 = np.percentile(prices_array, 1)
    
    return VaR_99

In [49]:
iterations = 100000
apple_start_price = int(apple['Last'].tail(1))
print MonteCarloSimulation(iterations)

346.474411205


In [26]:
%timeit MonteCarloSimulation(iterations)

1 loops, best of 3: 796 ms per loop


In [53]:
def MonteCarloSimulationComb(iterations, mean, std, start):
    prices = []
    for _ in range(iterations):
        # Draw 20 daily percent changes and compound them from the start price.
        PercentChange = np.random.normal(loc=mean, scale=std, size=20)
        price = start
        for change in PercentChange:
            price = price + (price * change)
        prices.append(price)

    prices_array = np.array(prices)

    # The 1st percentile of the simulated end prices is the 99% VaR level.
    VaR_99 = np.percentile(prices_array, 1)

    return VaR_99

In [54]:
MonteCarloSimulationComb(iterations, apple_mean, apple_std, apple_start_price)

346.63373368147433

In [55]:
%timeit MonteCarloSimulationComb(iterations, apple_mean, apple_std, apple_start_price)

1 loops, best of 3: 726 ms per loop
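
Most of the time in the functions above is likely spent in the Python-level loops. As a point of comparison, and not the approach this assignment asks for, here is a sketch that draws all paths in one call and compounds them along an axis. The function name `monte_carlo_vectorized`, the `days` and `seed` parameters, and the start price of 400.0 are assumptions for illustration. The mean and standard deviation are rounded from the values printed earlier.

```python
import numpy as np

def monte_carlo_vectorized(iterations, mean, std, start, days=20, seed=None):
    # Hypothetical helper, not part of the original notebook.
    rng = np.random.RandomState(seed)
    # One row per simulated path, one column per trading day.
    changes = rng.normal(loc=mean, scale=std, size=(iterations, days))
    # Compound each row: start * prod(1 + change) over the days.
    end_prices = start * np.prod(1.0 + changes, axis=1)
    # 1st percentile of the simulated end prices, as in the loop versions.
    return np.percentile(end_prices, 1)

# 400.0 is a placeholder start price, not the value read from the CSV.
var_99 = monte_carlo_vectorized(100000, 0.000957, 0.0165, 400.0, seed=0)
```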


## Parallel Computing Task

First, confirm ipyparallel is working. 

In [13]:
from ipyparallel import Client
c = Client()
c.ids

set([0,1,2,3])

c[:].apply_sync(lambda: "Hello World")

['Hello World', 'Hello World', 'Hello World', 'Hello World']

The work will be split among the four engines started above, so each engine runs one quarter of the total iterations.
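
Dividing by four works cleanly here because 100000 is a multiple of four. When the total is not evenly divisible, a sketch like the one below could be used to compute per-engine counts. `split_iterations` is a hypothetical helper, not something provided by ipyparallel.

```python
def split_iterations(total, n_engines):
    # Hypothetical helper: give each engine total // n_engines iterations
    # and spread any remainder over the first few engines.
    base, extra = divmod(total, n_engines)
    return [base + 1 if i < extra else base for i in range(n_engines)]

even_split = split_iterations(100000, 4)    # four equal shares
uneven_split = split_iterations(100001, 3)  # remainder goes to the first engines
```

The counts always sum to the requested total, so no iterations are lost or duplicated.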

In [87]:
iterations_parallel = 100000 // 4  # integer division so range() receives an int

In [88]:
def MonteCarloSimulationParallel(iterations, mean, std, start):
    prices = []
    for _ in range(iterations):
        # Draw 20 daily percent changes and compound them from the start price.
        PercentChange = np.random.normal(loc=mean, scale=std, size=20)
        price = start
        for change in PercentChange:
            price = price + (price * change)
        prices.append(price)

    prices_array = np.array(prices)

    # No percentile here: return the raw end prices so the caller can
    # combine the arrays from all engines before computing VaR.
    return prices_array

In [89]:
c.block = True
dview = c.direct_view()
dview.block = False

dview.execute('import numpy as np') 

<AsyncResult: execute>

In [90]:
def ParallelCalculation():
    result_arrays = c[:].apply_sync(MonteCarloSimulationParallel, iterations_parallel, apple_mean, apple_std, apple_start_price)
    results = np.concatenate(result_arrays)
    VaR_99 = np.percentile(results, 1)
    return VaR_99

In [91]:
ParallelCalculation()

346.27347287551618
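
`MonteCarloSimulationParallel` returns raw end prices rather than a per-engine VaR because the percentile of the pooled sample is what the serial version computes, and in general it is not the same number as an average of per-engine percentiles. The sketch below illustrates this without a cluster, using synthetic normally distributed chunks. The chunk parameters (mean 400, standard deviation 30) are arbitrary assumptions and are not derived from the Apple data.

```python
import numpy as np

rng = np.random.RandomState(0)
# Four synthetic chunks standing in for the arrays returned by four engines.
chunks = [rng.normal(loc=400.0, scale=30.0, size=25000) for _ in range(4)]

# Pool first, then take the percentile (what ParallelCalculation does).
pooled = np.concatenate(chunks)
pooled_var_99 = np.percentile(pooled, 1)

# Alternative that is NOT equivalent in general: average per-chunk percentiles.
mean_of_chunk_var_99 = np.mean([np.percentile(chunk, 1) for chunk in chunks])
```

For equal-sized chunks drawn from the same distribution the two values tend to be close, but only the pooled percentile matches the definition used in the serial code.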

In [92]:
%timeit ParallelCalculation()

1 loops, best of 3: 403 ms per loop
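
To put the timings side by side, the speedup and per-engine efficiency can be computed from the `%timeit` results reported above. This is simple arithmetic on those two numbers. A single `%timeit` run on one machine is only a rough indication.

```python
serial_ms = 726.0    # MonteCarloSimulationComb, from %timeit above
parallel_ms = 403.0  # ParallelCalculation, from %timeit above
n_engines = 4

speedup = serial_ms / parallel_ms
efficiency = speedup / n_engines
print("speedup: %.2fx, efficiency: %.0f%%" % (speedup, 100 * efficiency))
# prints "speedup: 1.80x, efficiency: 45%"
```

A speedup below 4x on four engines is consistent with the overhead of dispatching work and transferring result arrays mentioned in the assignment text.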


Confirm that the resulting array is the same size as what was used in the non-parallelized function.

In [93]:
result_arrays = c[:].apply_sync(MonteCarloSimulationParallel, iterations_parallel, apple_mean, apple_std, apple_start_price)
results = np.concatenate(result_arrays)
results.size

100000

In [94]:
! ipcluster stop

2015-12-13 11:49:38.571 [IPClusterStop] Stopping cluster [pid=447] with [signal=2]
