### PyWren RISECamp, 2017

In [101]:
%pylab inline
import boto3
import cloudpickle
import itertools
import concurrent.futures as fs
import io
import numpy as np
import time

Populating the interactive namespace from numpy and matplotlib


Welcome to the hands-on tutorial for PyWren.

This tutorial consists of a set of exercises that will have you working directly with PyWren:
- simple matrix multiplication
- data analysis on a Wikipedia dataset
- some machine learning algorithms (Eric's) 


## 0. Hello World

First, let's write a simple hello program to test out PyWren.



In [2]:
# first we need to load PyWren and create an executor instance
import pywren
pwex = pywren.default_executor()

We can use the `call_async()` API on the PyWren executor to run a function in the cloud.
The workflow is simple and looks like this:

```python
def my_func(param):
    # do something
    return some_result
    
handler = pwex.call_async(my_func, param)
result = handler.result()
```

**Exercise**: modify the following code block to run hello world with pywren

**TODO: I think we need a helper function to test against the output. This gives attendees more incentive to get things right.**

In [7]:
# first we need a basic hello world function
def hello_world(param):
    return "hello world!"

handler = pwex.call_async(hello_world, 0) 
# on success, this line should print out "hello world!"
print(handler.result())

hello world!


The above example runs a single function call in the cloud.
PyWren also has a `map()` API that runs a single function over a list of parameters:

```python
handlers = pwex.map(my_func, param_list)
pywren.wait(handlers)

results = [h.result() for h in handlers]
```
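
The same submit, wait, and collect shape can be tried locally before touching the cloud. The sketch below is not PyWren: it uses the standard library's `concurrent.futures.ThreadPoolExecutor` as a stand-in, and `square` is a hypothetical example function.

```python
import concurrent.futures as fs

def square(x):
    # Hypothetical stand-in for my_func: any function of one argument works.
    return x * x

param_list = [1, 2, 3, 4]

with fs.ThreadPoolExecutor(max_workers=4) as pool:
    # One future per parameter, analogous to the handlers returned by pwex.map().
    futures = [pool.submit(square, p) for p in param_list]
    # Block until every future has finished, analogous to pywren.wait().
    fs.wait(futures)
    # Collect results in the same order as param_list.
    results = [f.result() for f in futures]

print(results)
```

The point of the analogy is only the call pattern; with PyWren each call runs remotely rather than in a local thread.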

**Exercise**: modify the following code block to print "hello world"

In [None]:
# do not modify code here
def hello_world(param):
    if param == 1:
        return "hello"
    if param == 2:
        return "world!"
# do not modify code above

param_list = []
handlers = pwex.call_async(hello_world, None)

results = [h.result() for h in handlers] 
print(" ".join(results))

## 1. Matrix Multiplication

One nice thing about PyWren is that it lets users integrate existing Python libraries easily.
For the following exercise, we are going to use a popular Python library, NumPy, to work on some matrix multiplication problems.

In [None]:
import numpy as np

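def my_function(b):
    # Draw a random vector and a random 1024x1024 matrix with standard deviation b,
    # then return their matrix-vector product.
    x = np.random.normal(0, b, 1024)
    A = np.random.normal(0, b, (1024, 1024))
    return np.dot(A, x)

pwex = pywren.default_executor()
# Run my_function once for each of 100 values of b; res is a list of futures.
res = pwex.map(my_function, np.linspace(0.1, 10, 100))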
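
Before launching 100 cloud invocations, it can help to run the function locally on a few values of `b` and check what comes back. A minimal sketch using only NumPy; the function body is copied from the cell above, and the names `scales`, `local_results`, and `stacked` are illustrative.

```python
import numpy as np

def my_function(b):
    # Same body as the tutorial cell: a random 1024x1024 matrix times a random
    # vector, both drawn from a normal distribution with standard deviation b.
    x = np.random.normal(0, b, 1024)
    A = np.random.normal(0, b, (1024, 1024))
    return np.dot(A, x)

# Run locally on the first three of the values the tutorial maps over.
scales = np.linspace(0.1, 10, 100)[:3]
local_results = [my_function(b) for b in scales]

# Each call returns a length-1024 vector, so stacking gives one row per parameter.
stacked = np.vstack(local_results)
print(stacked.shape)
```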


## 2. Data Analytics with Wikipedia Dataset

## 3. Some Machine Learning

In [11]:
# `matrix` is assumed to be the helper module that accompanies this tutorial; it provides ShardedMatrix
import matrix
import numpy as np
import pywren

In [4]:
X = np.random.randn(32768,128)

In [7]:
X_sharded = matrix.ShardedMatrix("x", bucket="vaishaalpywren", shape=X.shape, shard_size_0=4096)

In [8]:
X_sharded.shard_matrix(X, n_jobs=10)

[((0, 4096), (0, 128)), ((4096, 8192), (0, 128)), ((8192, 12288), (0, 128)), ((12288, 16384), (0, 128)), ((16384, 20480), (0, 128)), ((20480, 24576), (0, 128)), ((24576, 28672), (0, 128)), ((28672, 32768), (0, 128))]


0
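
The list printed above shows how the rows are partitioned: eight row blocks of 4096 rows, each spanning all 128 columns. The helper below is a hypothetical reconstruction of that partitioning in plain Python, not the `matrix` module's actual code; `block_ranges` and its parameters are names made up for illustration.

```python
def block_ranges(shape, shard_size_0, shard_size_1=None):
    """Return ((row_start, row_end), (col_start, col_end)) for each block."""
    n_rows, n_cols = shape
    # Assumption: when no column shard size is given, a block spans every column.
    if shard_size_1 is None:
        shard_size_1 = n_cols
    blocks = []
    for r in range(0, n_rows, shard_size_0):
        for c in range(0, n_cols, shard_size_1):
            blocks.append(((r, min(r + shard_size_0, n_rows)),
                           (c, min(c + shard_size_1, n_cols))))
    return blocks

blocks = block_ranges((32768, 128), shard_size_0=4096)
print(len(blocks))
print(blocks[0], blocks[-1])
```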

In [16]:
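def xyt(x, y, z, blocks):
    # Compute one block of x.dot(y.T) and write it into the sharded output z.
    block0, block1 = blocks
    submatrix_0 = x.get_block(block0, 0)  # row block `block0` of x
    submatrix_1 = y.get_block(block1, 0)  # row block `block1` of y
    out_matrix = submatrix_0.dot(submatrix_1.T)
    z.put_block(block0, block1, out_matrix)
    return 0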


x_sharded = matrix.ShardedMatrix("x", bucket="vaishaalpywren", shape=X.shape, shard_size_0=4096)
xxt_sharded = matrix.ShardedMatrix("z", bucket="vaishaalpywren", shape=(X.shape[0], X.shape[0]), shard_size_0=4096, shard_size_1=4096)

pwex = pywren.default_executor()

futures0 = pwex.map(lambda a: xyt(x_sharded, x_sharded, xxt_sharded, a), xxt_sharded.block_idxs_not_exist)
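
Each mapped task computes one output block: block (i, j) of `X.dot(X.T)` is row block i of `X` times the transpose of row block j. The sketch below checks that identity locally with NumPy on a small matrix. It uses in-memory slices instead of `ShardedMatrix`, and `shard`, `n_blocks`, and the sizes are illustrative assumptions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 8))   # small stand-in for the 32768 x 128 matrix
shard = 16                         # rows per block (4096 in the tutorial)
n_blocks = X.shape[0] // shard

out = np.zeros((X.shape[0], X.shape[0]))
# One iteration per (i, j) pair plays the role of one mapped task.
for i, j in itertools.product(range(n_blocks), repeat=2):
    rows_i = slice(i * shard, (i + 1) * shard)
    rows_j = slice(j * shard, (j + 1) * shard)
    out[rows_i, rows_j] = X[rows_i].dot(X[rows_j].T)

# The assembled blocks should match the direct product.
print(np.allclose(out, X.dot(X.T)))
```

Because no block depends on any other, the (i, j) pairs can run in any order, which is what makes the computation a good fit for `map()`.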


In [18]:
pywren.wait(futures0)

([<pywren.future.ResponseFuture at 0x1197f10b8>,
  <pywren.future.ResponseFuture at 0x1178c40f0>,
  <pywren.future.ResponseFuture at 0x118382080>,
  ...

In [19]:
[f.result() for f in futures0]

[0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0]

In [21]:
xxt_sharded.get_block(0,0)


array([[ 140.17207881,    5.51050477,   -8.44600443, ...,  -20.87518709,
           0.90908415,   10.89966011],
       [   5.51050477,  163.04454963,  -11.39085155, ...,    6.66628488,
           1.35222017,    9.0063894 ],
       [  -8.44600443,  -11.39085155,  136.94271241, ...,   21.13998887,
         -11.62821219,   -3.53400135],
       ..., 
       [ -20.87518709,    6.66628488,   21.13998887, ...,  167.26547952,
          -7.57375346,   -6.03390309],
       [   0.90908415,    1.35222017,  -11.62821219, ...,   -7.57375346,
         149.16393331,   -9.14872823],
       [  10.89966011,    9.0063894 ,   -3.53400135, ...,   -6.03390309,
          -9.14872823,  124.86665548]])
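
Two properties of this block can be checked without the cloud. The (0, 0) block of `X.dot(X.T)` is symmetric, and because each row of `X` holds 128 standard normal entries, its diagonal (the squared row norms) should average about 128. A small local sketch with illustrative sizes, where `X0` and `G` are made-up names:

```python
import numpy as np

rng = np.random.default_rng(0)
X0 = rng.standard_normal((256, 128))  # stand-in for the first row block of X
G = X0.dot(X0.T)                      # local analogue of block (0, 0)

print(np.allclose(G, G.T))            # symmetry check
print(G.diagonal().mean())            # expected to be close to 128
```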