## EyeWatchOne Engineering Prototype

Hyperlocal, weather-related accident prediction.

Provides a browser- and mobile-based heatmap overlay on satellite images, showing where accidents are predicted to occur given historical accidents and incoming weather.


Targeted at government planning for road emergency services, but it can also be used for insurance claims adjustment, logistics companies' route planning, and much more.

##### This notebook can be run on Windows, OS X, Linux, or any other platform with aws-cli installed.

### 1. Fix errors in AWS command line interface

I had an issue with Windows not associating .py files with my Python install.

Fix for the .py file association error:
http://superuser.com/questions/429604/passing-arguments-to-a-python-script-file-association-not-found-windows-7-on-i

### 2. Upload test insurance data to S3 bucket: eyewatchone

Code adapted from http://boto.cloudhackers.com/en/latest/s3_tut.html

Insurance data:  
470 MB zipped csv  
from https://www.kaggle.com/c/ClaimPredictionChallenge/

In [4]:
! aws s3 cp data/example_compressed_entry.zip s3://eyewatchone --grants \
    read=uri=http://acs.amazonaws.com/groups/global/AllUsers full=emailaddress=hollisnolan@gmail.com

upload: data/example_compressed_entry.zip to s3://eyewatchone/example_compressed_entry.zip


In [5]:
! aws s3 cp data/dictionary.html s3://eyewatchone --grants \
    read=uri=http://acs.amazonaws.com/groups/global/AllUsers full=emailaddress=hollisnolan@gmail.com

upload: data/dictionary.html to s3://eyewatchone/dictionary.html


#### Move the full sets of Insurance Data to S3

In [None]:
! aws s3 cp data/test_set.zip s3://eyewatchone --grants \
    read=uri=http://acs.amazonaws.com/groups/global/AllUsers full=emailaddress=hollisnolan@gmail.com

In [None]:
! aws s3 cp data/train_set.zip s3://eyewatchone --grants \
    read=uri=http://acs.amazonaws.com/groups/global/AllUsers full=emailaddress=hollisnolan@gmail.com

In [6]:
! aws s3 ls s3://eyewatchone

2016-03-07 18:50:42      24768 dictionary.html
2016-03-07 18:50:36    5120252 example_compressed_entry.zip
2016-02-27 20:29:50  138124104 test_set.zip
2016-02-27 20:31:27  380553491 train_set.zip
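
The listing can also be checked from Python, for example to confirm that all four files arrived and to total their sizes. This is a minimal sketch that assumes the four-column `aws s3 ls` output format shown above; `parse_s3_listing` is a hypothetical helper, not part of the AWS CLI.

```python
from datetime import datetime

# Sample text copied from the `aws s3 ls` output above.
LISTING = """\
2016-03-07 18:50:42      24768 dictionary.html
2016-03-07 18:50:36    5120252 example_compressed_entry.zip
2016-02-27 20:29:50  138124104 test_set.zip
2016-02-27 20:31:27  380553491 train_set.zip
"""


def parse_s3_listing(text):
    """Parse `aws s3 ls` object lines into (modified, size_bytes, key) tuples."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 3)  # date, time, size, key (key may contain spaces)
        if len(parts) != 4 or not parts[2].isdigit():
            continue  # skip blank lines and "PRE" prefix lines
        modified = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
        entries.append((modified, int(parts[2]), parts[3]))
    return entries


entries = parse_s3_listing(LISTING)
total_bytes = sum(size for _, size, _ in entries)
print("{} objects, {} bytes in total".format(len(entries), total_bytes))
```

In the notebook, the sample text could be replaced by the captured output of the `aws s3 ls` cell.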


### 3. Spin up analytics EC2 instance

(Right-click to paste in Windows.)

Create an instance from the Data Science Toolbox AMI

In [7]:
# create an instance: DS-toolbox AMI, XL size

!aws ec2 run-instances --image-id ami-d1737bb8 --count 1 --instance-type m3.xlarge --key-name .boto --security-groups my-sg

You must specify a region. You can also configure your region by running "aws configure".
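
The error says the call needs a region. Below is a sketch of assembling the same command with an explicit `--region`. The value `us-east-1` is an assumption taken from the later cells of this notebook, and `build_run_instances_cmd` is a hypothetical helper. The command is only printed here, nothing is launched.

```python
try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2


def build_run_instances_cmd(image_id, instance_type, key_name, security_group,
                            region, count=1):
    """Return the `aws ec2 run-instances` argument list with an explicit region."""
    return [
        "aws", "ec2", "run-instances",
        "--image-id", image_id,
        "--count", str(count),
        "--instance-type", instance_type,
        "--key-name", key_name,
        "--security-groups", security_group,
        "--region", region,
    ]


# Values copied from the cell above; the region is assumed from later cells.
cmd = build_run_instances_cmd("ami-d1737bb8", "m3.xlarge", ".boto", "my-sg",
                              region="us-east-1")
# Print a copy-pasteable command line; nothing is executed.
print(" ".join(quote(arg) for arg in cmd))
```

The list form can be handed to `subprocess.check_call(cmd)` once the values are confirmed. Setting a default region with `aws configure`, as the error message suggests, works as well.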


Historical weather data:  
20 GB EBS snapshot  
https://aws.amazon.com/datasets/daily-global-weather-measurements-1929-2009-ncdc-gsod/  

In [None]:
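# attach EBS historical weather data
# (attach-volume expects a volume ID, so a volume must first be created from snapshot snap-ac47f4c5)

ec2-attach-volume snap-ac47f4c5 --instance i-582bf5dc --device  /dev/sdf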

Or restart a stopped instance

In [8]:
# restart a stopped instance 

!aws ec2 start-instances --instance-ids i-582bf5dc --region us-east-1

{
    "StartingInstances": [
        {
            "InstanceId": "i-582bf5dc", 
            "CurrentState": {
                "Code": 0, 
                "Name": "pending"
            }, 
            "PreviousState": {
                "Code": 80, 
                "Name": "stopped"
            }
        }
    ]
}


In [9]:
# get PublicDnsName and IP
!aws ec2 describe-instances --instance-ids i-582bf5dc --region us-east-1 

{
    "Reservations": [
        {
            "OwnerId": "563534492411", 
            "ReservationId": "r-47e34b95", 
            "Groups": [], 
            "Instances": [
                {
                    "Monitoring": {
                        "State": "disabled"
                    }, 
                    "PublicDnsName": "ec2-54-88-161-196.compute-1.amazonaws.com", 
                    "RootDeviceType": "ebs", 
                    "State": {
                        "Code": 16, 
                        "Name": "running"
                    }, 
                    "EbsOptimized": true, 
                    "LaunchTime": "2016-03-08T03:17:16.000Z", 
                    "PublicIpAddress": "54.88.161.196", 
                    "PrivateIpAddress": "172.31.10.190", 
                    "ProductCodes": [], 
                    "VpcId": "vpc-66511502", 
                    "StateTransitionReason": "", 
                    "InstanceId": "i-582bf5dc", 
           

SSH into the instance and start a notebook
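
The host name for the SSH command comes from the `describe-instances` response. A small sketch of pulling it out in Python follows; the JSON is a trimmed-down stand-in for that response, and `instance_addresses` is a hypothetical helper.

```python
import json

# Trimmed-down stand-in for the `describe-instances` response shown above.
RESPONSE = """
{
    "Reservations": [
        {
            "Instances": [
                {
                    "InstanceId": "i-582bf5dc",
                    "State": {"Code": 16, "Name": "running"},
                    "PublicDnsName": "ec2-54-88-161-196.compute-1.amazonaws.com",
                    "PublicIpAddress": "54.88.161.196"
                }
            ]
        }
    ]
}
"""


def instance_addresses(response_text):
    """Map each instance ID to its (state, public DNS name, public IP)."""
    response = json.loads(response_text)
    result = {}
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            result[instance["InstanceId"]] = (
                instance["State"]["Name"],
                instance.get("PublicDnsName"),    # may be missing or empty while stopped
                instance.get("PublicIpAddress"),  # may be missing while stopped
            )
    return result


addresses = instance_addresses(RESPONSE)
print(addresses["i-582bf5dc"])
```

The AWS CLI's own `--query` option can do a similar extraction without Python.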

In [None]:
# ssh in (in terminal) 
ssh -X -i ~/.ssh/aws.pem ubuntu@ec2-54-88-161-196.compute-1.amazonaws.com

# update 
sudo apt-get update && sudo apt-get upgrade

# full release upgrade (in terminal, very long)
sudo do-release-upgrade

# update pip 
sudo pip install pip --upgrade

# install notebook
sudo pip install --upgrade "ipython[notebook]"

# open ports: in the Security Group, select Inbound, then Edit, then Add Rule; set the Port Range to 8888 and the Source to 0.0.0.0/0

# start tmux 
tmux new -s notebook

# new notebook with ip specified
jupyter notebook --no-browser --ip=0.0.0.0

# grab ip address of instance and navigate to 
http://54.88.161.196:8888/tree
        
# save this notebook, close it locally, and use the "upload" button to move it into the aws instance
# rerun notebook from there
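
Before opening the browser, it can help to check that the notebook port is reachable through the security group. This is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper, and the demonstration runs against a throwaway local listener rather than the real instance.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


# Demonstrate against a local listener instead of the real instance.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
server.listen(1)
local_port = server.getsockname()[1]
reachable = port_is_open("127.0.0.1", local_port)
server.close()
print(reachable)
```

Against the instance, the call would be `port_is_open("54.88.161.196", 8888)`, using whatever public IP the instance currently has.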

In [27]:
! scp -i /users/hn/.ssh/aws.pem 09-Demo.ipynb ubuntu@ec2-54-88-161-196.compute-1.amazonaws.com:~

09-Demo.ipynb                                 100%   52KB  51.8KB/s   00:00


### 4. Check historical weather in EBS public data set

How to attach an EBS volume:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-attaching-volume.html

Public weather data set:
https://aws.amazon.com/datasets/daily-global-weather-measurements-1929-2009-ncdc-gsod/

In [2]:
! pwd

/home/ubuntu


List disks 

In [1]:
! lsblk

NAME  MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvdf  202:80   0  50G  0 disk 
xvda1 202:1    0   8G  0 disk /
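
The disk to mount is the one with an empty MOUNTPOINT column. A sketch of picking it out of the `lsblk` text follows, assuming the flat seven-column layout shown above (real output may also contain tree characters for partitions, which this does not handle); `unmounted_devices` is a hypothetical helper.

```python
# Sample text copied from the `lsblk` output above.
LSBLK = """\
NAME  MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvdf  202:80   0  50G  0 disk
xvda1 202:1    0   8G  0 disk /
"""


def unmounted_devices(lsblk_text):
    """Return /dev paths of devices whose MOUNTPOINT column is empty."""
    devices = []
    for line in lsblk_text.splitlines()[1:]:   # skip the header row
        fields = line.split()
        if len(fields) == 6:                   # six columns: no mount point
            devices.append("/dev/" + fields[0])
    return devices


print(unmounted_devices(LSBLK))
```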


Mount the disk (help from the link below):
https://help.ubuntu.com/community/InstallingANewHardDrive

In [None]:
!sudo mkdir -p /hist_weather && sudo mount /dev/xvdf /hist_weather/

In [3]:
# install aws command line interface 
!sudo pip install awscli --upgrade

The directory '/home/ubuntu/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ubuntu/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Requirement already up-to-date: awscli in /usr/local/lib/python2.7/dist-packages
Requirement already up-to-date: botocore==1.4.1 in /usr/local/lib/python2.7/dist-packages (from awscli)
Requirement already up-to-date: rsa<=3.3.0,>=3.1.2 in /usr/local/lib/python2.7/dist-packages (from awscli)
Requirement already up-to-date: colorama<=0.3.3,>=0.2.5 in /usr/local/lib/python2.7/dist-packages (from awscli)
Requirement already up-to-date: s3transfer==0.0.1 in /usr/local/lib/python2.7/dist-pa

In [4]:
# add credentials (in command line, not notebook)
!aws configure 

AWS Access Key ID [****************BYWA]: ^C



### 5. Move historical insurance data to EBS

In [5]:
!aws s3 ls s3://eyewatchone

2016-03-08 02:50:42      24768 dictionary.html
2016-03-08 02:50:36    5120252 example_compressed_entry.zip
2016-02-28 04:29:50  138124104 test_set.zip
2016-02-28 04:31:27  380553491 train_set.zip


In [6]:
# make a directory for analytics 
!sudo mkdir eyewatchone

mkdir: cannot create directory ‘eyewatchone’: File exists


In [7]:
# copy files from s3
!sudo aws s3 cp s3://eyewatchone/example_compressed_entry.zip eyewatchone/

download: s3://eyewatchone/example_compressed_entry.zip to eyewatchone/example_compressed_entry.zip


In [9]:
# unzip
!sudo unzip eyewatchone/example_compressed_entry.zip

/bin/sh: 1: no: not found
Archive:  eyewatchone/example_compressed_entry.zip
replace example_compressed_entry.csv? [y]es, [n]o, [A]ll, [N]one, [r]ename:  NULL
(EOF or read error, treating as "[N]one" ...)
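
The `unzip` prompt gets no answer inside a notebook cell, so the archive is not re-extracted. One way around this is Python's `zipfile` module, which overwrites existing files without asking. This is a sketch only: `extract_zip` is a hypothetical helper, and the demonstration uses a throwaway archive with made-up contents rather than the real data set.

```python
import os
import tempfile
import zipfile


def extract_zip(archive_path, dest_dir):
    """Extract every member of archive_path into dest_dir, overwriting silently."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Self-contained demonstration on a throwaway archive (not the real data set).
work_dir = tempfile.mkdtemp()
demo_zip = os.path.join(work_dir, "demo.zip")
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("demo.csv", "id,claim\n1,0\n")

out_dir = os.path.join(work_dir, "out")
extracted = extract_zip(demo_zip, out_dir)
extract_zip(demo_zip, out_dir)  # second run overwrites without prompting
print(extracted)
```

For the real file, the call would be `extract_zip('eyewatchone/example_compressed_entry.zip', 'eyewatchone')`, run with write permission on that directory.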


### 6. Move a sample image to the EC2 analytics instance

Historical satellite images:  
very large, on S3  
1 GB / image, updated daily

Commands to explore files, and what PRE means:
http://docs.aws.amazon.com/cli/latest/reference/s3/ls.html

In [13]:
! pwd

/home/ubuntu


In [19]:
! 

ls: cannot access datasat --recursive: No such file or directory


In [10]:
!aws s3 ls s3://nasanex/Landsat/gls/2000/001/012 --recursive 

2013-11-09 01:13:37   57716532 Landsat/gls/2000/001/012/p001r012_7x20010613.tar.gz


In [38]:
!aws s3 cp s3://nasanex/Landsat/gls/2000/001/012/p001r012_7x20010613.tar.gz .

download: s3://nasanex/Landsat/gls/2000/001/012/p001r012_7x20010613.tar.gz to ./p001r012_7x20010613.tar.gz


In [39]:
! tar -xvzf p001r012_7x20010613.tar.gz

./p001r012_7dt20010613_z24_10.tif
./p001r012_7dt20010613_z24_20.tif
./p001r012_7dt20010613_z24_30.tif
./p001r012_7dt20010613_z24_40.tif
./p001r012_7dt20010613_z24_50.tif
./p001r012_7dk20010613_z24_61.tif
./p001r012_7dk20010613_z24_62.tif
./p001r012_7dt20010613_z24_70.tif
./p001r012_7dp20010613_z24_80.tif
./p001r012_7x20010613.met


In [42]:
! rm p001r012_7x20010613.tar.gz

In [20]:
import matplotlib.pyplot as plt
image = plt.imread('p001r012_7dt20010613_z24_10.tif')
image

IOError: [Errno 2] No such file or directory: 'p001r012_7dt20010613_z24_10.tif'

Landsat-specialized Python utility:
http://landsat-util.readthedocs.org/en/latest/installation.html#ubuntu-14-10


In [None]:
# install dependencies
!sudo apt-get -y install python-pip python-numpy python-scipy \
libgdal-dev libatlas-base-dev gfortran libfreetype6-dev

In [None]:
# install utility (recommended at the command line, takes a long time)
!yes | sudo pip install landsat-util

Alternate image source
https://aws.amazon.com/datasets/ccafs-climate-data/?tag=datasets%23keywords%23climate  
6 TB, S3    
    

In [17]:
! aws s3 ls s3://cgiardata/ccafs/ccafs-climate/data/eta/eta_south_america/baseline/1970s/hadcm_high/20min/ --recursive

2014-11-10 14:37:38          0 ccafs/ccafs-climate/data/eta/eta_south_america/baseline/1970s/hadcm_high/20min/
2014-11-10 14:37:38    1992179 ccafs/ccafs-climate/data/eta/eta_south_america/baseline/1970s/hadcm_high/20min/hadcm_high_baseline_1970s_prec_20min_sa_eta_asc.zip
2014-11-10 14:37:39    1889826 ccafs/ccafs-climate/data/eta/eta_south_america/baseline/1970s/hadcm_high/20min/hadcm_high_baseline_1970s_tmax_20min_sa_eta_asc.zip
2014-11-10 14:37:40    1886173 ccafs/ccafs-climate/data/eta/eta_south_america/baseline/1970s/hadcm_high/20min/hadcm_high_baseline_1970s_tmean_20min_sa_eta_asc.zip
2014-11-10 14:37:40    1688460 ccafs/ccafs-climate/data/eta/eta_south_america/baseline/1970s/hadcm_high/20min/hadcm_high_baseline_1970s_tmin_20min_sa_eta_asc.zip


### 7. Query Weather API, Store to EBS

code adapted from http://stackoverflow.com/questions/12965203/how-to-get-json-from-webpage-into-python-script
and https://www.wunderground.com/weather/api/d/docs?MR=1

In [None]:
# create a small instance for streaming from the DS-toolbox AMI
# (m3.micro is not an offered size; the smallest M3 type is m3.medium)
!aws ec2 run-instances --image-id ami-d1737bb8 --count 1 --instance-type m3.micro --key-name .boto --security-groups my-sg

In [10]:
# start instance 
!aws ec2 start-instances --instance-ids i-bb95183f --region us-east-1 

{
    "StartingInstances": [
        {
            "InstanceId": "i-bb95183f", 
            "CurrentState": {
                "Code": 0, 
                "Name": "pending"
            }, 
            "PreviousState": {
                "Code": 80, 
                "Name": "stopped"
            }
        }
    ]
}


In [11]:
# get PublicDnsName
!aws ec2 describe-instances --instance-ids i-bb95183f --region us-east-1 

{
    "Reservations": [
        {
            "OwnerId": "563534492411", 
            "ReservationId": "r-151696c7", 
            "Groups": [], 
            "Instances": [
                {
                    "Monitoring": {
                        "State": "disabled"
                    }, 
                    "PublicDnsName": "ec2-54-164-234-226.compute-1.amazonaws.com", 
                    "RootDeviceType": "ebs", 
                    "State": {
                        "Code": 16, 
                        "Name": "running"
                    }, 
                    "EbsOptimized": false, 
                    "LaunchTime": "2016-03-08T03:46:41.000Z", 
                    "PublicIpAddress": "54.164.234.226", 
                    "PrivateIpAddress": "172.31.11.202", 
                    "ProductCodes": [], 
                    "VpcId": "vpc-66511502", 
                    "StateTransitionReason": "", 
                    "InstanceId": "i-bb95183f", 
        

In [None]:
# ssh in (in terminal); the public DNS name changes on restart, so use the current PublicDnsName
ssh -X -i ~/.ssh/aws.pem ubuntu@ec2-54-164-113-126.compute-1.amazonaws.com

# new notebook with ip specified
jupyter notebook --no-browser --ip=0.0.0.0

# grab ip address of instance and navigate to 
http://52.91.161.73:8888/tree
        
# move notebook to instance

Here we are going to use Docker to quickly get Kafka working 
https://github.com/tobegit3hub/standalone-kafka

In [None]:
!sudo apt-get install -y docker.io

In [None]:
! docker run -d --net=host -e HOSTNAME=localhost tobegit3hub/standalone-kafka

In [None]:
%cd kafka

In [None]:
! bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic weather_stream

In [None]:
# open ports: in the Security Group, select Inbound, then Edit, then Add Rule; set the Port Range to 2181 and the Source to 0.0.0.0/0

In [None]:
! bin/kafka-console-producer.sh --broker-list localhost:9092 --topic weather_stream

In [None]:
! bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic weather_stream --from-beginning

In [2]:
import urllib, json
from pprint import pprint

url = "http://api.wunderground.com/api/     api key   /conditions/q/CA/San_Francisco.json"
response = urllib.urlopen(url)
data = json.loads(response.read())
pprint(data)

{u'current_observation': {u'UV': u'0',
                          u'dewpoint_c': 7,
                          u'dewpoint_f': 45,
                          u'dewpoint_string': u'45 F (7 C)',
                          u'display_location': {u'city': u'San Francisco',
                                                u'country': u'US',
                                                u'country_iso3166': u'US',
                                                u'elevation': u'47.00000000',
                                                u'full': u'San Francisco, CA',
                                                u'latitude': u'37.77500916',
                                                u'longitude': u'-122.41825867',
                                                u'magic': u'1',
                                                u'state': u'CA',
                                                u'state_name': u'California',
                                                u'wmo': u'99999',
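
To store observations, the nested response can be flattened to the few fields needed and appended to a file as one JSON line per observation. This is a sketch only: the sample dictionary reuses keys and values from the truncated output above, while `to_record`, `append_record`, and the `weather.jsonl` file name are hypothetical.

```python
import json
import os
import tempfile

# Stand-in for the parsed API response; keys and values are taken from the
# truncated output above, and nothing beyond it is assumed to exist.
sample = {
    "current_observation": {
        "UV": "0",
        "dewpoint_c": 7,
        "dewpoint_f": 45,
        "display_location": {
            "city": "San Francisco",
            "state": "CA",
            "latitude": "37.77500916",
            "longitude": "-122.41825867",
        },
    }
}


def to_record(response):
    """Flatten the fields needed downstream into one flat dict."""
    obs = response["current_observation"]
    loc = obs["display_location"]
    return {
        "city": loc["city"],
        "state": loc["state"],
        "latitude": float(loc["latitude"]),    # the API returns these as strings
        "longitude": float(loc["longitude"]),
        "dewpoint_c": obs["dewpoint_c"],
    }


def append_record(record, path):
    """Append one record to path as a single JSON line."""
    with open(path, "a") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


record = to_record(sample)
out_path = os.path.join(tempfile.mkdtemp(), "weather.jsonl")  # hypothetical file
append_record(record, out_path)
print(record)
```

One line per observation keeps the file appendable, and the same serialized line could be sent to the `weather_stream` topic.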
       

### 8. Produce a Prediction

In [19]:
import numpy as np

# placeholder: a random number stands in for the model's prediction
pred = np.random.rand(1)
pred

array([ 0.88791482])

### 9. Tag the Image, Send to Team

In [None]:
# bundle the prediction with the image array and write both to a single file
with open('image_tagged', 'wb') as f:
    np.savez(f, pred=pred, image=image)

In [None]:
! aws s3 cp image_tagged s3://eyewatchone --grants \
    read=uri=http://acs.amazonaws.com/groups/global/AllUsers full=emailaddress=hollisnolan@gmail.com

### 10. Clone the Analytics EC2, Scale for Production

In [None]:
# create an AMI from the EC2 analytics instance
!aws ec2 create-image --instance-id i-582bf5dc --region us-east-1 --name "eyewatchone-analytics" --description \
"AMI for EyeWatchOne's prediction and analytics. EC2 XL or higher recommended."

### 11. Stop the instances

In [28]:
# stop analytics instance 
!aws ec2 stop-instances --instance-ids i-582bf5dc --region us-east-1

{
    "StoppingInstances": [
        {
            "InstanceId": "i-582bf5dc", 
            "CurrentState": {
                "Code": 64, 
                "Name": "stopping"
            }, 
            "PreviousState": {
                "Code": 16, 
                "Name": "running"
            }
        }
    ]
}


In [29]:
# stop streaming instance 
!aws ec2 stop-instances --instance-ids i-bb95183f --region us-east-1

{
    "StoppingInstances": [
        {
            "InstanceId": "i-bb95183f", 
            "CurrentState": {
                "Code": 64, 
                "Name": "stopping"
            }, 
            "PreviousState": {
                "Code": 16, 
                "Name": "running"
            }
        }
    ]
}


##### Always double-check that the instances have stopped

https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:sort=instanceId
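
The same check can be scripted against the `stop-instances` response. A minimal sketch follows; the JSON is copied from the output above, and `not_stopping` is a hypothetical helper.

```python
import json

# Response copied from the `stop-instances` output above.
RESPONSE = """
{
    "StoppingInstances": [
        {
            "InstanceId": "i-bb95183f",
            "CurrentState": {"Code": 64, "Name": "stopping"},
            "PreviousState": {"Code": 16, "Name": "running"}
        }
    ]
}
"""


def not_stopping(response_text):
    """Return IDs whose current state is neither 'stopping' nor 'stopped'."""
    response = json.loads(response_text)
    return [
        item["InstanceId"]
        for item in response.get("StoppingInstances", [])
        if item["CurrentState"]["Name"] not in ("stopping", "stopped")
    ]


leftover = not_stopping(RESPONSE)
print(leftover if leftover else "all instances are stopping or stopped")
```

An instance in the `stopping` state has not finished shutting down, so the console check is still worth doing.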