# Using Python in Amazon's AWS Lambda Service

Austin Godber  
@godber

<img style="float: right" src="lambda.png">
DesertPy - 5/25/2016

# What is AWS Lambda?

"AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume - there is no charge when your code is not running."

"You can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app."

https://aws.amazon.com/lambda/

# Features

* Extend other AWS services with custom logic
* Build custom back-end services
* Completely Automated Administration
* Built-in Fault Tolerance
* Automatic Scaling
* Integrated Security Model
* Bring Your Own Code
* Pay Per Use

# How it works

<img src="Lambda_HowItWorks.png">

# Use Cases

# Real Time File Processing

<img src="Lambda_FileProcessing.png">

# Real Time Stream Processing

<img src="Lambda_StreamProcessing.png">

# Extract, Transform, Load

<img src="Lambda_ETL.png">

# Backend - IoT

<img src="Lambda_IoT.png">

# Backend - Mobile

<img src="Lambda_MobileBackends.png">

# Backend - Web

<img src="Lambda_WebApplications.png">

# Let's consider a file processing example

<img src="Lambda_FileProcessing.png">

A canonical example is described in the following **S3 Walkthrough**:

http://docs.aws.amazon.com/lambda/latest/dg/with-s3.html

**IMPORTANT!! Much AWS Setup hidden here!!**

# But let's mix it up a little bit.


### Let's do something that requires external dependencies.

We will follow the example of generating a GeoJSON outline:

http://www.perrygeo.com/running-python-with-compiled-code-on-aws-lambda.html

That is, taking a GeoTIFF Digital Elevation Model like this

<img src="srtm_21_09.png" height="400px" width="400px">

And generating a GeoJSON outline of the area that looks like this

<img src="srtm_21_09_geojson.png" height="400px" width="400px">
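
To make the output format concrete, here is a minimal sketch of a GeoJSON outline in its simplest form: a bounding-box polygon. The function name `bbox_to_feature` and the coordinates are made up for illustration; they are not the real extent of `srtm_21_09`, and the linked article traces the outline from the raster itself rather than assuming a rectangle.

```python
import json


def bbox_to_feature(left, bottom, right, top, properties=None):
    """Return a GeoJSON Feature whose geometry is the bounding-box polygon."""
    # GeoJSON positions are [longitude, latitude]; a linear ring must end
    # on the same point it starts with.
    ring = [
        [left, bottom],
        [right, bottom],
        [right, top],
        [left, top],
        [left, bottom],
    ]
    return {
        "type": "Feature",
        "properties": properties or {},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Hypothetical 5x5 degree tile, NOT the real bounds of srtm_21_09
feature = bbox_to_feature(-75.0, 0.0, -70.0, 5.0, {"name": "example-tile"})
geojson_text = json.dumps({"type": "FeatureCollection", "features": [feature]})
```

`geojson_text` is the kind of document the worker would write to disk and the handler would upload.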

# Process Overview

1. Sign up for an AWS account
1. Set up user, buckets, events and Lambda function as shown in the **S3 Walkthrough**
1. **Launch EC2 instance** using Amazon Linux AMI
1. SSH to EC2 instance, **build shared libraries** from source
1. **Create a virtualenv** with all your Python dependencies

# Process Overview (cont.)

6. Write top-level Python **handler** function to respond to events and interact with other parts of AWS
7. Write Python **worker**, as a standalone command line interface, to process the data
8. **Create a zipfile** containing your code, virtualenv and the binary libs
9. **Publish the zip file to AWS Lambda**
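
Steps 8 and 9 can be sketched in Python as well. This is a rough sketch, not the Makefile used in the talk: `build_zip` and `publish` are hypothetical helper names, the function is assumed to exist already, and `zip_path` is assumed to live outside `source_dir` so the archive does not end up inside itself.

```python
import os
import zipfile


def build_zip(source_dir, zip_path):
    """Zip everything under source_dir, storing paths relative to it."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full = os.path.join(root, name)
                # Handler, site-packages and shared libs must sit at the
                # paths Lambda will see after unpacking the archive.
                zf.write(full, os.path.relpath(full, source_dir))
    return zip_path


def publish(zip_path, function_name):
    """Upload the zip as the new code of an existing Lambda function."""
    import boto3  # imported here so build_zip works without boto3 installed

    with open(zip_path, "rb") as fh:
        boto3.client("lambda").update_function_code(
            FunctionName=function_name, ZipFile=fh.read()
        )
```

Something like `publish(build_zip("build", "/tmp/lambda.zip"), "handler")` would then cover both steps, given AWS credentials with permission to update the function.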

# Bonus Tips

* Make sure you zip **all** of your compiled dependencies
* Write a debug handler to run on EC2 to test your worker
* Become familiar with the CloudWatch Log Viewer
* Use the test setup from the **S3 Walkthrough**
* Write a `Makefile` to speed up iterations

# The Handler

* Receives and parses event (see the test procedure in the **S3 Walkthrough**)
* Grabs file from S3
* Calls worker on local tempfile
* Uploads result to S3

Source: `code/handler.py`
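
The four bullets above might look roughly like this. It is a sketch under stated assumptions, not the contents of `code/handler.py`: it targets a Python 3 runtime, expects a `worker.py` next to the handler, and guesses from the bucket names in this deck that results go to `<source bucket>-geojson`.

```python
import os
import subprocess
import sys
import tempfile
from urllib.parse import unquote_plus


def parse_s3_event(event):
    """Return (bucket, key) pairs from an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (a space shows up as '+')
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context):
    import boto3  # imported lazily so parse_s3_event is testable without it

    s3 = boto3.client("s3")
    for bucket, key in parse_s3_event(event):
        workdir = tempfile.mkdtemp()
        local_in = os.path.join(workdir, os.path.basename(key))
        s3.download_file(bucket, key, local_in)
        # Assumption: worker.py prints the path of its result file to STDOUT
        out = subprocess.check_output([sys.executable, "worker.py", local_in])
        local_out = out.decode("utf-8").strip()
        # Assumption: output bucket is named "<source bucket>-geojson"
        s3.upload_file(local_out, bucket + "-geojson", os.path.basename(local_out))
```

Keeping the event parsing in its own function means it can be checked against the sample event from the **S3 Walkthrough** without touching AWS.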

# The Worker

* Should run standalone on EC2 instance
* Could be ANYTHING that runs on Linux: Py3, C, Fortran
* Does the actual processing
* Prints the path of its output file to STDOUT, where the handler reads it

Source: `code/worker.py`
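
The worker's contract (one input path in, one output path printed) can be sketched without any geospatial libraries. The `process` body below is a placeholder that writes an empty GeoJSON collection; it is not the real outline extraction in `code/worker.py`.

```python
import json
import os
import sys


def process(input_path):
    """Placeholder processing step: write a GeoJSON file next to the input."""
    output_path = os.path.splitext(input_path)[0] + ".geojson"
    # Stand-in for the real work (e.g. tracing the raster's outline)
    result = {"type": "FeatureCollection", "features": []}
    with open(output_path, "w") as fh:
        json.dump(result, fh)
    return output_path


def main(argv):
    """Command line entry point; argv is the argument list without argv[0]."""
    if len(argv) != 1:
        sys.stderr.write("usage: worker.py INPUT_FILE\n")
        return 2
    # The output path is the only thing written to STDOUT, so the caller
    # can read it back without any parsing.
    print(process(argv[0]))
    return 0
```

A real `worker.py` would end with the usual `__main__` guard calling `sys.exit(main(sys.argv[1:]))`. Taking `argv` as a parameter keeps `main` callable from a debug script.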

# The Makefile

Source: `code/Makefile`

# Check Out S3 Buckets

https://console.aws.amazon.com/s3/home?region=us-west-2

* asg-python-lambda-source
* asg-python-lambda-source-geojson

Make sure your buckets and Lambda functions are all in the same AWS Region. Note the bucket's **Event Notifications** config.
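
The same Event Notifications setting can be expressed with boto3 instead of the console. This is a sketch only: the `.tif` suffix filter and the helper names are assumptions, the function ARN must be supplied by you, and S3 still needs permission to invoke the function as set up in the **S3 Walkthrough**.

```python
def notification_config(function_arn, suffix=".tif"):
    """Build the NotificationConfiguration dict for new-object events."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
                # Assumption: only react to uploaded GeoTIFF files
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


def apply_notification(bucket, function_arn):
    """Attach the configuration to the source bucket."""
    import boto3  # imported lazily; only needed when talking to AWS

    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=notification_config(function_arn),
    )
```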

# Check Out Lambda Config

https://us-west-2.console.aws.amazon.com/lambda/home?region=us-west-2#/functions/handler

# Check Out CloudWatch Log Viewer

https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2
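
The same logs can also be pulled from a script, which is handy while iterating. A minimal sketch, assuming the function is named `handler` as in the console URL above; the helper names are made up, and only the first page of results is returned.

```python
import time


def log_group_for(function_name):
    """Lambda writes each function's logs to /aws/lambda/<function name>."""
    return "/aws/lambda/" + function_name


def log_lines_since(function_name, minutes=10, region="us-west-2"):
    """Return log messages from roughly the last `minutes` minutes."""
    import boto3  # imported lazily; only needed when talking to AWS

    logs = boto3.client("logs", region_name=region)
    start_ms = int((time.time() - minutes * 60) * 1000)
    resp = logs.filter_log_events(
        logGroupName=log_group_for(function_name), startTime=start_ms
    )
    # Only the first page of events; follow resp["nextToken"] for more
    return [event["message"].rstrip() for event in resp["events"]]
```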

# Other File Processing Ideas

* PDF Text/Image Extraction
* Raw Text Data Cleanup/Munging
* File encoding/conversions (WAV to mp3)
* Image calibration pipelines

Imagine long pipelines that you drop raw data into and gradually build on over time.