# PyWren RISECamp, 2017

Welcome to the hands-on tutorial for PyWren.

This tutorial consists of a set of exercises that will have you working directly with PyWren:
- Some introductory exercises (covered in this notebook)
- Data analysis on a Wikipedia dataset (see analyze-wikipedia.ipynb)
- Matrix multiplication with PyWren (see matrix.ipynb)
- Hyperparameter optimization (see hyperparameter-optimization.ipynb)




## Introduction to PyWren

You can find solutions for this notebook at:
https://github.com/ucbrise/risecamp/tree/master/pywren/solution/pywren-intro-solution.ipynb

First, let's write a simple Hello world program to test out PyWren.

In [None]:
# some libraries that are useful for this tutorial
import sys
from training import *

# We need to load PyWren and create an executor instance
import pywren
pwex = pywren.default_executor()

## 1. call_async() -- our single invocation API
We use the `call_async()` API of the PyWren executor to run a function in the cloud.
The workflow is simple and looks like this:

```python
def my_func(param):
    # do something
    return some_result
    
handler = pwex.call_async(my_func, param)
result = handler.result()
```
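For intuition, the same submit-then-fetch pattern can be tried locally with Python's standard `concurrent.futures` module. This is only an analogy, not PyWren itself: `ThreadPoolExecutor.submit()` stands in for `pwex.call_async()`, and `add_one` is a hypothetical function chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def add_one(x):
    # stand-in for work that PyWren would run remotely
    return x + 1

with ThreadPoolExecutor(max_workers=1) as pool:
    # submit() returns a future right away, without waiting for the work
    local_future = pool.submit(add_one, 41)
    # result() blocks until the value is available
    local_result = local_future.result()

print(local_result)
```

The key idea carried over to PyWren is that submitting work and collecting its result are two separate steps.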

**Exercise**: modify the following code block to run hello world with PyWren.



In [None]:
# first we need a basic hello world function
def hello_world(param):
    if param == 42:
        return "Hello world!"

future = pwex.call_async()
# on success, this line should print out "Hello world!"
check_result_1(future.result())

## 2. map() -- parallel execution in the cloud
The above example executes a function once in the cloud. This is pretty neat, but PyWren *really* shines when we want to run a function many times in parallel.
To do this, we can use PyWren's `map()` API, which calls a function once for each parameter in a list:

```python
handlers = pwex.map(my_func, param_list)
pywren.wait(handlers)

results = [h.result() for h in handlers]
```


## 3. wait() API and multiple jobs

`map` returns a list of `futures`, each representing a separate Lambda invocation that may not have completed yet and so may not have a result. To track the progress of the whole job set, we use the `wait` API.


In [None]:
import numpy as np

def my_function(b):
    x = np.random.normal(0, b, 1024)
    A = np.random.normal(0, b, (1024, 1024))
    return np.dot(A, x)

pwex = pywren.default_executor()
futures = pwex.map(my_function, np.linspace(0.1, 10, 100))
pywren.wait(futures, return_when=pywren.ALL_COMPLETED)

for fut in futures:
    print(fut.result())


`wait` polls S3 for the results of any finished jobs and returns two lists: finished and unfinished jobs.

By default it blocks until all jobs have completed, though you can also make it block until at least one job has completed with `return_when=ANY_COMPLETED`, or return immediately with `ALWAYS`.
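The difference between the blocking modes can be seen locally with the standard library's `concurrent.futures.wait`. This is a rough analogue under stated assumptions, not PyWren code: `FIRST_COMPLETED` plays the role of `ANY_COMPLETED`, the standard library returns sets rather than lists, and `sleep_then_return` is a hypothetical helper used only to make tasks finish at different times.

```python
import time
from concurrent.futures import (
    ThreadPoolExecutor, wait, FIRST_COMPLETED, ALL_COMPLETED,
)

def sleep_then_return(seconds):
    # hypothetical task: pretend to work for a while
    time.sleep(seconds)
    return seconds

with ThreadPoolExecutor(max_workers=3) as pool:
    local_futures = [pool.submit(sleep_then_return, s) for s in (0.01, 0.3, 0.3)]

    # returns as soon as at least one task has finished
    done_first, pending_first = wait(local_futures, return_when=FIRST_COMPLETED)

    # returns only once every task has finished
    done_all, pending_all = wait(local_futures, return_when=ALL_COMPLETED)

print(len(done_first), len(pending_first), len(done_all), len(pending_all))
```

In both calls the two returned collections together account for every submitted future, which is what lets you loop on the unfinished ones to track progress.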

**Exercise**: modify the following code block to return a list of the squares of the integers 0...9

In [None]:
# do not modify code here
param_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# do not modify code above

def square(param):
    return param

futures = pwex.call_async(square, None)

pywren.wait(futures)

results = [f.result() for f in futures]

## 4. Visualization and Debugging
From the talk, you know what happens behind every PyWren execution. Let's see it all in action!

**Exercise**: inspect PyWren's execution by running the plotting code below

In [None]:
plot_pywren_execution(futures)

Another useful tool is printing the CloudWatch logs, which give you more information about the latest Lambda execution.

In [None]:
!pywren print_latest_logs

This concludes our introductory section. You can find more documentation on PyWren APIs and usage at http://pywren.io/