# Sentiment Analysis

## Building an API in SageMaker

_Deep Learning Nanodegree Program | Deployment_

---

**TODO** Intro


## Instructions

Some template code has already been provided for you, and you will need to implement additional functionality to successfully complete this notebook. You will not need to modify the included code beyond what is requested. Sections that begin with '**TODO**' in the header indicate that you need to complete or implement some portion within them. Instructions will be provided for each section and the specifics of the implementation are marked in the code block with a `# TODO: ...` comment. Please be sure to read the instructions carefully!

In addition to implementing code, there will be questions for you to answer which relate to the task and your implementation. Each section where you will answer a question is preceded by a '**Question:**' header. Carefully read each question and provide your answer below the '**Answer:**' header by editing the Markdown cell.

> **Note**: Code and Markdown cells can be executed using the **Shift+Enter** keyboard shortcut. In addition, a cell can typically be edited by clicking it (double-click for Markdown cells) or by pressing **Enter** while it is highlighted.

## Modifying the inference code

In the previous notebook we constructed a custom model and two different Docker containers to work with it. We used the first container for training and the second for inference. When constructing the inference container, however, we assumed that the input would be a review described as a sequence of integers. Our goal is to create a simple web app that allows a user to type out a review and then tells the user whether their review is positive or negative. This means we need to modify our inference code to accept a string and then transform the input inside the inference container.

To begin with, let us remind ourselves how we process a review in order to send it off for inference. First, we will read one of the reviews in our test set.

In [1]:
import os

review_text = None
with open(os.path.join('aclImdb', 'test', 'pos', '10000_7.txt')) as f:
    review_text = f.read()

In [2]:
review_text

'Actor turned director Bill Paxton follows up his promising debut, the Gothic-horror "Frailty", with this family friendly sports drama about the 1913 U.S. Open where a young American caddy rises from his humble background to play against his Bristish idol in what was dubbed as "The Greatest Game Ever Played." I\'m no fan of golf, and these scrappy underdog sports flicks are a dime a dozen (most recently done to grand effect with "Miracle" and "Cinderella Man"), but some how this film was enthralling all the same.<br /><br />The film starts with some creative opening credits (imagine a Disneyfied version of the animated opening credits of HBO\'s "Carnivale" and "Rome"), but lumbers along slowly for its first by-the-numbers hour. Once the action moves to the U.S. Open things pick up very well. Paxton does a nice job and shows a knack for effective directorial flourishes (I loved the rain-soaked montage of the action on day two of the open) that propel the plot further or add some unexpec

Now that we've read a sample review, the first thing we need to do is remove the HTML tags and stop words, and then stem the remaining words.

In [3]:
import nltk
nltk.download("stopwords")
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
stemmer = PorterStemmer()

[nltk_data] Downloading package stopwords to
[nltk_data]     /home/ec2-user/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [4]:
import re
from bs4 import BeautifulSoup

def review_to_words(review):
    english_stopwords = set(stopwords.words("english")) # Build the stop-word set once per call
    text = BeautifulSoup(review, "html.parser").get_text() # Remove HTML tags
    text = re.sub(r"[^a-zA-Z0-9]", " ", text.lower()) # Lower-case and replace non-alphanumeric characters with spaces
    words = text.split() # Split string into words
    words = [w for w in words if w not in english_stopwords] # Remove stop words
    words = [stemmer.stem(w) for w in words] # Stem each word with the stemmer created above
    
    return words

In [5]:
review_words = review_to_words(review_text)
review_words

['actor',
 'turn',
 'director',
 'bill',
 'paxton',
 'follow',
 'promis',
 'debut',
 'gothic',
 'horror',
 'frailti',
 'famili',
 'friendli',
 'sport',
 'drama',
 '1913',
 'u',
 'open',
 'young',
 'american',
 'caddi',
 'rise',
 'humbl',
 'background',
 'play',
 'bristish',
 'idol',
 'dub',
 'greatest',
 'game',
 'ever',
 'play',
 'fan',
 'golf',
 'scrappi',
 'underdog',
 'sport',
 'flick',
 'dime',
 'dozen',
 'recent',
 'done',
 'grand',
 'effect',
 'miracl',
 'cinderella',
 'man',
 'film',
 'enthral',
 'film',
 'start',
 'creativ',
 'open',
 'credit',
 'imagin',
 'disneyfi',
 'version',
 'anim',
 'open',
 'credit',
 'hbo',
 'carnival',
 'rome',
 'lumber',
 'along',
 'slowli',
 'first',
 'number',
 'hour',
 'action',
 'move',
 'u',
 'open',
 'thing',
 'pick',
 'well',
 'paxton',
 'nice',
 'job',
 'show',
 'knack',
 'effect',
 'directori',
 'flourish',
 'love',
 'rain',
 'soak',
 'montag',
 'action',
 'day',
 'two',
 'open',
 'propel',
 'plot',
 'add',
 'unexpect',
 'psycholog',
 'dept

And, now that we've converted our review into usable words we need to map those words to integers using the `word_dict` that we created using the training set. We also need to pad or truncate the resulting sequence if it isn't the correct size.

In [6]:
import pickle

word_dict = None
with open("word_dict.pkl", "rb") as f:
    word_dict = pickle.load(f)

In [7]:
def convert_and_pad(data, word_dict, pad=500):
    NOWORD = 0 # Use 0 to represent the no word category
    INFREQ = 1 # Use 1 to represent infrequent words
    
    working_sentence = [NOWORD] * pad
    
    # We go through each word in the (possibly truncated) review and convert the words to integers
    for word_index, word in enumerate(data[:pad]):
        if word in word_dict:
            working_sentence[word_index] = word_dict[word]
        else:
            working_sentence[word_index] = INFREQ
            
    return working_sentence, min(len(data), pad)

In [8]:
review_data, review_length = convert_and_pad(review_words, word_dict)

In [9]:
review_length, review_data

(186,
 [68,
  99,
  100,
  819,
  1,
  237,
  1124,
  2147,
  3435,
  132,
  1,
  180,
  2914,
  1605,
  568,
  1,
  3471,
  248,
  107,
  212,
  1,
  1387,
  3998,
  984,
  32,
  1,
  1,
  1352,
  704,
  465,
  64,
  32,
  175,
  1,
  1,
  1,
  1605,
  591,
  1,
  1696,
  524,
  185,
  1698,
  176,
  4801,
  2843,
  88,
  5,
  1,
  5,
  94,
  1276,
  248,
  532,
  449,
  1,
  242,
  247,
  248,
  532,
  4859,
  1,
  1,
  1,
  318,
  1436,
  27,
  424,
  294,
  155,
  223,
  3471,
  248,
  38,
  500,
  51,
  1,
  227,
  264,
  29,
  1,
  176,
  3200,
  1,
  33,
  2344,
  1,
  3460,
  155,
  194,
  48,
  248,
  1,
  65,
  480,
  2096,
  1459,
  1178,
  3293,
  1212,
  18,
  426,
  603,
  990,
  1,
  1142,
  754,
  1,
  271,
  1181,
  362,
  2547,
  993,
  180,
  1,
  612,
  8,
  46,
  1,
  383,
  26,
  14,
  264,
  571,
  816,
  193,
  1235,
  320,
  3755,
  1,
  576,
  300,
  1605,
  692,
  1,
  1890,
  532,
  26,
  266,
  110,
  909,
  1056,
  2672,
  311,
  1116,
  939,
  285,
  815,

And now we have input that can be sent to the neural network that we trained previously. To reiterate, given a review in string form, we need to do the following in order to determine the sentiment of the review:

- Convert the review to words (clean html, remove stopwords, etc.)
- Transform the words to integers using `word_dict` and pad / truncate the sequence
- Send the data through the neural network.

The important takeaway here isn't the additional pre-processing of the input. That part is relatively easy: you just need to modify the code in either `train` or `model.py` to incorporate this step. Instead, it is important to note that we need to include the `word_dict.pkl` file so that our inference code can use it to perform the second item above.
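
The first two steps above can be sketched with the standard library alone. This is a simplified illustration, not the code that ships in the container: the tiny stop-word set and `TOY_WORD_DICT` are hypothetical stand-ins for nltk's stop-word list and the real `word_dict`, tags are stripped with a crude regular expression instead of BeautifulSoup, and stemming is omitted. The third step (sending the sequence through the network) is left out.

```python
import re

# Hypothetical stand-ins: the notebook uses nltk's English stop-word list,
# a Porter stemmer, and the word_dict built from the training set.
TOY_STOPWORDS = {"the", "a", "was", "and", "this"}
TOY_WORD_DICT = {"film": 2, "great": 3, "boring": 4}

NOWORD = 0  # padding value
INFREQ = 1  # word that does not appear in the dictionary


def preprocess(review, word_dict, pad=500):
    """Turn a raw review string into (padded integer sequence, true length)."""
    # Step 1: strip tags crudely, lower-case, keep alphanumerics, drop stop words.
    text = re.sub(r"<[^>]+>", " ", review)
    text = re.sub(r"[^a-zA-Z0-9]", " ", text.lower())
    words = [w for w in text.split() if w not in TOY_STOPWORDS]

    # Step 2: map words to integers, then pad or truncate to `pad` entries.
    ids = [word_dict.get(w, INFREQ) for w in words[:pad]]
    length = len(ids)
    ids += [NOWORD] * (pad - length)
    return ids, length


sequence, length = preprocess("The film was great!<br />Never boring.", TOY_WORD_DICT, pad=6)
print(sequence, length)  # [2, 3, 1, 4, 0, 0] 4
```

Here "never" is not in the toy dictionary, so it maps to `1`, and the two trailing zeros are padding.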

## Step 1: Build and Push new inference code

Now that we know what our inference code needs to do, we can make the necessary changes. The code for this has been provided and resides in the `api_container` folder. In particular, note that the `sentiment_api.py` file contains the code shown above to pre-process incoming data. Of course, in order to do this we need to make sure to include `word_dict.pkl`.

In [10]:
%cp word_dict.pkl api_container/sentiment/

To recap, the changes between the original inference code and our new inference code are the following:

- `predictor.py` has been modified to pre-process incoming data,
- `sentiment_api.py` has been added, implementing the pre-processing methods,
- `word_dict.pkl` has been added,
- `train` has been removed so that this container can't accidentally be used for training, and
- `Dockerfile.cpu` has been modified so that our code has access to the nltk and BeautifulSoup libraries.
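
As a rough sketch of why `word_dict.pkl` has to travel with the container, the inference code might load it at start-up along these lines. The function name `load_word_dict` and the directory argument are assumptions made for illustration; the provided `sentiment_api.py` may organize this differently.

```python
import os
import pickle
import tempfile


def load_word_dict(directory, filename="word_dict.pkl"):
    """Load the word-to-integer mapping that ships with the inference code."""
    path = os.path.join(directory, filename)
    if not os.path.isfile(path):
        raise FileNotFoundError("{} is missing; was it copied into the container?".format(path))
    with open(path, "rb") as f:
        return pickle.load(f)


# Demonstration with a throwaway directory standing in for the container folder.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "word_dict.pkl"), "wb") as f:
        pickle.dump({"film": 2, "great": 3}, f)
    loaded = load_word_dict(tmp)

print(loaded["film"])  # 2
```

Failing loudly when the file is absent makes a forgotten copy step show up when the container starts rather than on the first request.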

Now that this is done, we can run the `build_and_push.sh` script to make our container available on Amazon Elastic Container Registry (ECR).

In [11]:
%cd api_container
!chmod +x ./build_and_push.sh
!./build_and_push.sh
%cd ..

/home/ec2-user/SageMaker/api_container
Login Succeeded
Sending build context to Docker daemon  121.3kB
Step 1/13 : FROM ubuntu:16.04
16.04: Pulling from library/ubuntu

f539f7a1: Pulling fs layer
2d420b43: Pulling fs layer
bbeb6b91: Pulling fs layer
2841ad7a: Pulling fs layer
Digest: sha256:b050c1822d37a4463c01ceda24d0fc4c679b0dd3c43e742730e2884d3c582e3a
Status: Downloaded newer image for ubuntu:16.04
 ---> 5e8b97a2a082
Step 2/13 : RUN apt-get update && apt-get install -y     wget     curl     nginx     ca-certificates     sudo     git     bzip2     libx11-6  && rm -rf /var/lib/apt/lists/*
 ---> Running in 872a579e4cec
Get:1

Get:23 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 isc-dhcp-client amd64 4.3.3-5ubuntu12.10 [224 kB]
Get:24 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 isc-dhcp-common amd64 4.3.3-5ubuntu12.10 [105 kB]
Get:25 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 less amd64 481-2.1ubuntu0.2 [110 kB]
Get:26 http://archive.ubuntu.com/ubuntu xenial/main amd64 libbsd0 amd64 0.8.2-1 [41.7 kB]
Get:27 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libexpat1 amd64 2.1.0-7ubuntu0.16.04.3 [71.2 kB]
Get:28 http://archive.ubuntu.com/ubuntu xenial/main amd64 libffi6 amd64 3.2.1-4 [17.8 kB]
Get:29 http://archive.ubuntu.com/ubuntu xenial/main amd64 libgmp10 amd64 2:6.1.0+dfsg-2 [240 kB]
Get:30 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libnettle6 amd64 3.2-1ubuntu0.16.04.1 [93.5 kB]
Get:31 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libhogweed4 amd64 3.2-1ubuntu0.16.04.1 [136 kB]
Get:32 http://archive.ubuntu.com/ubuntu xenia

(Reading database ... 4768 files and directories currently installed.)
Preparing to unpack .../perl-base_5.22.1-9ubuntu0.5_amd64.deb ...
Unpacking perl-base (5.22.1-9ubuntu0.5) over (5.22.1-9ubuntu0.3) ...
Setting up perl-base (5.22.1-9ubuntu0.5) ...
Selecting previously unselected package libatm1:amd64.
(Reading database ... 4768 files and directories currently installed.)
Preparing to unpack .../libatm1_1%3a2.5.1-1.5_amd64.deb ...
Unpacking libatm1:amd64 (1:2.5.1-1.5) ...
Selecting previously unselected package libmnl0:amd64.
Preparing to unpack .../libmnl0_1.0.3-5_amd64.deb ...
Unpacking libmnl0:amd64 (1.0.3-5) ...
Selecting previously unselected package libpopt0:amd64.
Preparing to unpack .../libpopt0_1.16-10_amd64.deb ...
Unpacking libpopt0:amd64 (1.16-10) ...
Selecting previously unselected package libgdbm3:amd64.
Preparing to unpack .../libgdbm3_1.8.3-13.1_amd64.deb ...
Unpacking libgdbm3:amd64 (1.8.3-13.1) ...
Selecting previously unselected package libxau6:amd64.
Preparing to 

Selecting previously unselected package libkrb5support0:amd64.
Preparing to unpack .../libkrb5support0_1.13.2+dfsg-5ubuntu2_amd64.deb ...
Unpacking libkrb5support0:amd64 (1.13.2+dfsg-5ubuntu2) ...
Selecting previously unselected package libk5crypto3:amd64.
Preparing to unpack .../libk5crypto3_1.13.2+dfsg-5ubuntu2_amd64.deb ...
Unpacking libk5crypto3:amd64 (1.13.2+dfsg-5ubuntu2) ...
Selecting previously unselected package libkeyutils1:amd64.
Preparing to unpack .../libkeyutils1_1.5.9-8ubuntu1_amd64.deb ...
Unpacking libkeyutils1:amd64 (1.5.9-8ubuntu1) ...
Selecting previously unselected package libkrb5-3:amd64.
Preparing to unpack .../libkrb5-3_1.13.2+dfsg-5ubuntu2_amd64.deb ...
Unpacking libkrb5-3:amd64 (1.13.2+dfsg-5ubuntu2) ...
Selecting previously unselected package libgssapi-krb5-2:amd64.
Preparing to unpack .../libgssapi-krb5-2_1.13.2+dfsg-5ubuntu2_amd64.deb ...
Unpacking libgssapi-krb5-2:amd64 (1.13.2+dfsg-5ubuntu2) ...
Selecting previously unselected package libhcrypto4-heimdal:

Unpacking nginx (1.10.3-0ubuntu0.16.04.2) ...
Selecting previously unselected package patch.
Preparing to unpack .../patch_2.7.5-1ubuntu0.16.04.1_amd64.deb ...
Unpacking patch (2.7.5-1ubuntu0.16.04.1) ...
Selecting previously unselected package rename.
Preparing to unpack .../archives/rename_0.20-4_all.deb ...
Unpacking rename (0.20-4) ...
Processing triggers for libc-bin (2.23-0ubuntu10) ...
Processing triggers for systemd (229-4ubuntu21.2) ...
Setting up libatm1:amd64 (1:2.5.1-1.5) ...
Setting up libmnl0:amd64 (1.0.3-5) ...
Setting up libpopt0:amd64 (1.16-10) ...
Setting up libgdbm3:amd64 (1.8.3-13.1) ...
Setting up libxau6:amd64 (1:1.0.8-1) ...
Setting up libxdmcp6:amd64 (1:1.1.2-1.1) ...
Setting up libxcb1:amd64 (1.11.1-1ubuntu1) ...
Setting up libx11-data (2:1.6.3-1ubuntu2) ...
Setting up libx11-6:amd64 (2:1.6.3-1ubuntu2) ...
Setting up libxext6:amd64 (2:1.3.3-1) ...
Setting up sgml-base (1.26+nmu4ubuntu1) ...
Setting up libjpeg-turbo8:amd64 (1.4.2-0ubuntu3) ...
Setting up perl-mo

installing: cffi-1.11.4-py36h9745a5d_0 ...
installing: setuptools-38.4.0-py36_0 ...
installing: cryptography-2.1.4-py36hd09be54_0 ...
installing: wheel-0.30.0-py36hfd4bba0_1 ...
installing: pip-9.0.1-py36h6c6f9ce_4 ...
installing: pyopenssl-17.5.0-py36h20ba746_0 ...
installing: urllib3-1.22-py36hbe7ace6_0 ...
installing: requests-2.18.4-py36he2e5f8d_1 ...
installing: conda-4.4.10-py36_0 ...
installation finished.
Removing intermediate container c36071426bd5
 ---> c3e7ffc1694e
Step 4/13 : ENV PATH=/opt/conda/bin:$PATH
 ---> Running in f5f1d2743f41
Removing intermediate container f5f1d2743f41
 ---> 4f9d1e4fe419
Step 5/13 : ENV CONDA_AUTO_UPDATE_CONDA=false
 ---> Running in e3f859cedce7
Removing intermediate container e3f859cedce7
 ---> af2f71afc6c2
Step 6/13 : RUN conda install -y "conda>=4.4.11" && conda clean -ya
 ---> Running in d865d5d30391
Solving environment: ...working... done
conda 4.5.4: ########## | 100%
certifi 2018.4.16: ########## | 100%
openssl

mkl-2018.0.3         | 198.7 MB | ########## | 100%

Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Cache location: /opt/conda/pkgs
Will remove the following tarballs:

/opt/conda/pkgs
---------------
pandas-0.23.1-py36h637b7d7_0.tar.bz2        11.9 MB
pytz-2018.4-py36_0.tar.bz2                   212 KB
gevent-1.3.2.post0-py36h14c3975_0.tar.bz2     1.9 MB
click-6.7-py36h5253387_0.tar.bz2             104 KB
markupsafe-1.0-py36hd9260cd_1.tar.bz2         24 KB
greenlet-0.4.13-py36h14c3975_0.tar.bz2        19 KB
html5lib-1.0.1-py36h2f9c1c0_0.tar.bz2        181 KB
flask-1.0.2-py36_1.tar.bz2                   119 KB
werkzeug-0.14.1-py36_0.tar.bz2               423 KB
beautifulsoup4-4.6.0-py36h49b8c8c_1.tar.bz2     133 KB
gunicorn-19.8.1-py36_0.tar.bz2               172 KB
itsdangerous-0.24-py36h93cc618_1.tar.bz2      20 KB
cffi-1.11.5-py36h9745a5d_0.tar.bz2           212 KB
jinja2-2.10-py36ha16c418_0.tar.bz2           184 KB
python-dateutil-2.7.3-py36_0.ta

5693120a: Pushing  920.1MB/989.3MB

5693120a: Pushed   993.7MB/989.3MB
latest: digest: sha256:3118eb29c9b07cea004ece464ba9af65a7bb2a9e839e947ed91f8b8866374c58 size: 2627
/home/ec2-user/SageMaker


## Step 2: Test the new inference container

Before getting into the details of setting up the web app we should make sure that our new inference container behaves the way we expect it to. To do this we will deploy and test our new container. Now, the way in which we will do this is a little different from the way that we did it in the previous notebook. This is because we want to use the model artifacts that we created in the previous notebook rather than creating a new model and training it from scratch.

### Creating the endpoint

Of course, we need to know where those model artifacts are stored. Luckily, we can look this up using the SageMaker console. First, click on **Models** to see the models that you have created. Then, select the model you would like to use. In our case it will likely be the most recently created model. Lastly, look at the details of the model settings and copy and paste the link displayed under **Location of model artifacts**. This link should begin with `s3://` and end with `model.tar.gz`.

In [12]:
model_artifacts = "s3://sagemaker-us-east-1-337425718252/output/sentiment-pytorch-gpu-2018-06-13-10-49-41-996/output/model.tar.gz"
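
Since the link is pasted by hand, a quick sanity check can catch a truncated or mistyped value before an endpoint is created from it. The helper below is a hypothetical convenience, not part of the SageMaker SDK, and the bucket and key in the example are made up.

```python
from urllib.parse import urlparse


def split_model_artifact_uri(uri):
    """Return (bucket, key) for an S3 model artifact URI, or raise ValueError."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("expected a URI beginning with s3://, got {!r}".format(uri))
    if not parsed.path.endswith("/model.tar.gz"):
        raise ValueError("expected the URI to end with model.tar.gz")
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical bucket and key, used only for illustration.
bucket, key = split_model_artifact_uri("s3://example-bucket/output/example-job/output/model.tar.gz")
print(bucket)  # example-bucket
print(key)     # output/example-job/output/model.tar.gz
```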

Now that we know where the model artifacts are stored, we can construct an endpoint using these model artifacts along with the inference container that we've built. To do this we will use the `endpoint_from_model_data()` method provided by the SageMaker Session object. For more details and additional methods provided by the Session object please consult the [SageMaker documentation](http://sagemaker.readthedocs.io).

**Note**: It is important to name the endpoint something that you will remember as it will be required in the Lambda function that we create later to access the inference code.

In [13]:
import sagemaker as sage

sess = sage.Session() # Store the current SageMaker session
role = sage.get_execution_role() # Store our current IAM role

# We will also need our current account number and region in order to completely specify
# the name of the docker container we created earlier
account = sess.boto_session.client('sts').get_caller_identity()['Account']
region = sess.boto_session.region_name

inference_image = '{}.dkr.ecr.{}.amazonaws.com/sentiment-pytorch-api'.format(account, region)

In [14]:
model_endpoint = sess.endpoint_from_model_data(model_artifacts, # Where the model artifacts are stored
                                              inference_image,  # Which container to use for inference
                                              1, 'ml.m4.xlarge',# What sort of compute instance to use
                                              name="SentimentAnalysisEndpoint", # The name of the endpoint
                                              role = role)      # Our current role

INFO:sagemaker:Creating endpoint with name SentimentAnalysisEndpoint


----------------------------------------------------------------!

### Testing our inference code

Now that we have constructed the endpoint it is time to use it. To do so we will first create a predictor object and then send some data to it. In order to construct the predictor we need to know the name of the endpoint that we've just created. Fortunately, this is returned by the `endpoint_from_model_data()` method used earlier. We also need to tell SageMaker the format that we expect to use in order to send data. Since we want to send a string (the review itself) we set the content type to `text/plain`.

In [15]:
predictor = sage.predictor.RealTimePredictor(model_endpoint, content_type='text/plain')

And lastly, we send some reviews to the endpoint.

In [16]:
import glob

def test_reviews(data_dir='aclImdb', stop=250):
    
    results = []
    ground = []
    
    # We make sure to test both positive and negative reviews    
    for sentiment in ['pos', 'neg']:
        
        path = os.path.join(data_dir, 'test', sentiment, '*.txt')
        files = glob.glob(path)
        
        files_read = 0
        
        print('Starting ', sentiment, ' files')
        
        # Iterate through the files and send them to the predictor
        for f in files:
            with open(f) as review:
                # First, we store the ground truth (was the review positive or negative)
                if sentiment == 'pos':
                    ground.append(1)
                else:
                    ground.append(0)
                # Read in the review and convert to 'utf-8' for transmission via HTTP
                review_input = review.read().encode('utf-8')
                # Send the review to the predictor and store the results
                results.append(float(predictor.predict(review_input)))
                
            # Sending reviews to our endpoint one at a time takes a while so we
            # only send a small number of reviews
            files_read += 1
            if files_read == stop:
                break
            
    return ground, results

In [17]:
ground, results = test_reviews()

Starting  pos  files
Starting  neg  files


In [18]:
from sklearn.metrics import accuracy_score
accuracy_score(ground, results)

0.82
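
`accuracy_score` expects discrete class labels, so the cell above relies on the endpoint returning 0 or 1. The notebook does not say what the container returns in general; if it returned a probability instead, the scores would need to be thresholded first. A minimal sketch of that step, with made-up scores and a hand-rolled accuracy function so that it runs without scikit-learn:

```python
def to_labels(scores, threshold=0.5):
    """Map raw scores to 0/1 labels; scores at or above the threshold count as positive."""
    return [1 if s >= threshold else 0 for s in scores]


def accuracy(ground_truth, predictions):
    """Fraction of predictions that match the ground truth."""
    if len(ground_truth) != len(predictions):
        raise ValueError("inputs must have the same length")
    if not ground_truth:
        raise ValueError("inputs must not be empty")
    correct = sum(1 for g, p in zip(ground_truth, predictions) if g == p)
    return correct / len(ground_truth)


# Made-up scores for illustration only.
example_ground = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.2, 0.7]
print(accuracy(example_ground, to_labels(example_scores)))  # 0.5
```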

## Step 3: Exposing our endpoint to the outside world

So far we have been accessing the model endpoint by constructing a predictor object which uses the endpoint and then using the predictor object to perform inference. What if we wanted to create a web app which accessed our model? The way things are set up currently makes that impractical, since in order to access a SageMaker endpoint the app would first have to authenticate with AWS using an IAM role which included access to SageMaker endpoints. However, there is an easier way! We just need to use some additional AWS services.

There are two services that we will be using to allow access to our model from the outside world. The first is called Lambda and the second is API Gateway.

Lambda is a service which allows someone to write some relatively simple code and have it executed whenever a chosen trigger occurs. For example, you may want to update a database whenever new data is uploaded to a folder stored on S3.

API Gateway is a service that allows you to create HTTP endpoints (URLs) which are connected to other AWS services. One of the benefits of this is that you get to decide what credentials, if any, are required to access these endpoints.

In our case we are going to set up an HTTP endpoint through API Gateway which is open to the public. Then, whenever anyone sends data to our public endpoint we will have that trigger a Lambda function which will send the input (in our case a review) to the inference container and return the result.

> TODO: Include an image to help describe this.

### Setting up a Lambda function

The first thing we are going to do is set up a Lambda function. This Lambda function will be executed whenever our public API has data sent to it. When it is executed it will receive the data, perform any sort of processing that is required, send the data (the review) to the SageMaker endpoint we've created and then return the result.

#### Part A: Create an IAM Role for the Lambda function

Since we want the Lambda function to call a SageMaker endpoint, we need to make sure that it has permission to do so. To do this, we will construct a role that we can later give the Lambda function.

Using the AWS Console, navigate to the **IAM** page and click on **Roles**. Then, click on **Create role**. Make sure that the **AWS service** is the type of trusted entity selected and choose **Lambda** as the service that will use this role, then click **Next: Permissions**.

In the search box type `sagemaker` and select the check box next to the **AmazonSageMakerFullAccess** policy. Then, click on **Next: Review**.

Lastly, give this role a name. Make sure you use a name that you will remember later on, for example `LambdaSageMakerRole`. Then, click on **Create role**.

#### Part B: Create a Lambda function

Now it is time to actually create the Lambda function.

Using the AWS Console, navigate to the AWS Lambda page and click on **Create a function**. When you get to the next page, make sure that **Author from scratch** is selected. Now, name your Lambda function, using a name that you will remember later on, for example `sentiment_analysis_func`. Make sure that the **Python 3.6** runtime is selected and then choose the role that you created in the previous part. Then, click on **Create Function**.

On the next page you will see some information about the Lambda function you've just created. If you scroll down you should see an editor in which you can write the code that will be executed when your Lambda function is triggered. In our example, we will use the code below:

```python
# We need to use the low-level library to interact with SageMaker since the SageMaker API
# is not available natively through Lambda.
import boto3

def lambda_handler(event, context):

    # The SageMaker runtime is what allows us to invoke the endpoint that we've created.
    runtime = boto3.Session().client('sagemaker-runtime')

    # Now we use the SageMaker runtime to invoke our endpoint, sending the review we were given
    response = runtime.invoke_endpoint(EndpointName = 'SentimentAnalysisEndpoint', # The name of the endpoint we created
                                       ContentType = 'text/plain',                 # The data format that is expected
                                       Body = event['body'])                       # The actual review

    # The response is an HTTP response whose body contains the result of our inference
    result = response['Body'].read().decode('utf-8')

    return {
        'statusCode' : 200,
        'headers' : { 'Content-Type' : 'text/plain', 'Access-Control-Allow-Origin' : '*' },
        'body' : result
    }
```

Once you have copied and pasted the code above into the code editor, click on **Save** and your Lambda function will be up and running. Now we need to create a way for our web app to execute the Lambda function.
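
The handler above assumes that every request carries a review. A slightly more defensive variant is sketched below. It is an illustration, not the notebook's code: `handle_review` and `make_response` are hypothetical helper names, and the SageMaker runtime client is passed in as an argument so that the logic can be exercised locally with a fake client instead of real AWS credentials.

```python
import base64
import io


def make_response(status_code, body):
    """Build a response in the shape that Lambda proxy integration expects."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "text/plain", "Access-Control-Allow-Origin": "*"},
        "body": body,
    }


def handle_review(event, runtime, endpoint_name="SentimentAnalysisEndpoint"):
    """Validate the request, call the endpoint, and wrap the result."""
    body = event.get("body")
    if not body:
        return make_response(400, "Request body must contain a review.")
    if event.get("isBase64Encoded"):
        # API Gateway may deliver the body base64-encoded; decode it back to text.
        body = base64.b64decode(body).decode("utf-8")

    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name, ContentType="text/plain", Body=body
    )
    return make_response(200, response["Body"].read().decode("utf-8"))


class FakeRuntime:
    """Hypothetical stand-in for boto3's sagemaker-runtime client, for local testing."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        return {"Body": io.BytesIO(b"1.0")}


print(handle_review({"body": "Loved it"}, FakeRuntime())["body"])  # 1.0
print(handle_review({}, FakeRuntime())["statusCode"])              # 400
```

Inside Lambda, `lambda_handler(event, context)` would call `handle_review(event, boto3.client('sagemaker-runtime'))` and return its result.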

### Setting up API Gateway

Now that our Lambda function is set up, it is time to create a new API using API Gateway that will trigger the Lambda function we have just created.

Using the AWS Console, navigate to **Amazon API Gateway** and then click on **Get started**.

On the next page, make sure that **New API** is selected and give the new API a name, for example, `sentiment_analysis_api`. Then, click on **Create API**.

We have now created an API, but it doesn't do anything yet. What we want it to do is trigger the Lambda function that we created earlier.

Select the **Actions** dropdown menu and click **Create Method**. A new blank method will be created. Select its dropdown menu and choose **POST**, then click on the check mark beside it.

For the integration point, make sure that **Lambda Function** is selected and check the **Use Lambda Proxy integration** option. This option makes sure that the data that is sent to the API is then sent directly to the Lambda function with no processing. It also means that the return value must be a proper response object as it will also not be processed by API Gateway.
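
Because API Gateway passes the Lambda return value through untouched, a malformed response only shows up as an error at request time. The check below is a rough local sanity test based on the response shape used in the handler earlier (an integer `statusCode`, a `headers` dictionary, a string `body`). The function name is hypothetical and the check is not exhaustive.

```python
def check_proxy_response(response):
    """Return a list of problems with a Lambda proxy integration response (empty if none)."""
    if not isinstance(response, dict):
        return ["response must be a dict"]
    problems = []
    if not isinstance(response.get("statusCode"), int):
        problems.append("statusCode must be an integer")
    if "headers" in response and not isinstance(response["headers"], dict):
        problems.append("headers must be a dict")
    if "body" in response and not isinstance(response["body"], str):
        problems.append("body must be a string")
    return problems


good = {"statusCode": 200, "headers": {"Content-Type": "text/plain"}, "body": "1.0"}
bad = {"statusCode": "200", "body": 1.0}
print(check_proxy_response(good))  # []
print(check_proxy_response(bad))   # ['statusCode must be an integer', 'body must be a string']
```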

Type the name of the Lambda function you created earlier into the **Lambda Function** text entry box and then click on **Save**. Click on **OK** in the pop-up box that then appears, giving permission to API Gateway to invoke the Lambda function you created.

The last step in creating the API Gateway is to select the **Actions** dropdown and click on **Deploy API**. You will need to create a new Deployment stage and name it anything you like, for example `prod`.

You have now successfully set up a public API to access your SageMaker model. Make sure to copy or write down the URL provided to invoke your newly created public API as this will be needed in the next step. This URL can be found at the top of the page, highlighted in blue next to the text **Invoke URL**.
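
Before wiring the URL into a web page, it can be handy to try the API from Python. The sketch below only builds the request; `API_URL` is a placeholder that must be replaced with your own Invoke URL, and the actual network call is left commented out for that reason.

```python
import urllib.request

# Placeholder only: substitute the Invoke URL shown in the API Gateway console.
API_URL = "https://example.invalid/prod"


def build_review_request(review_text, url=API_URL):
    """Prepare (but do not send) a POST request carrying the review as plain text."""
    return urllib.request.Request(
        url,
        data=review_text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


request = build_review_request("A wonderful, heartfelt film.")
print(request.get_method(), request.full_url)

# To actually call the API once API_URL points at a deployed stage:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```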

## Step 4: Deploying our web app

Now that we have a publicly available API, we can start using it in a web app. For our purposes, we have provided a simple static HTML file which can make use of the public API you created earlier.

In the `website` folder there should be a file called `index.html`. Download the file to your computer and open it in a text editor of your choice. There should be a line which contains **\*\*REPLACE WITH PUBLIC API URL\*\***. Replace this string with the URL that you wrote down in the last step and then save the file.

Now, if you open `index.html` on your local computer, your browser will load the page directly from disk (no web server is needed) and you can use the provided site to interact with your SageMaker model.

If you'd like to go further, you can host this HTML file anywhere you'd like, for example using GitHub or hosting a static site on Amazon S3. Once you have done this you can share the link with anyone you'd like and have them play with it too!

> **Important Note** In order for the web app to communicate with the SageMaker endpoint, the endpoint has to actually be deployed and running. This means that you are paying for it. Make sure that the endpoint is running when you want to use the web app but that you shut it down when you don't need it, otherwise you will end up with a surprisingly large AWS bill.

### Delete the endpoint

Now that we are done testing our model we need to delete the endpoint so that it is no longer running.

In [19]:
sess.delete_endpoint(model_endpoint)

INFO:sagemaker:Deleting endpoint with name: SentimentAnalysisEndpoint
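
To double-check that nothing is left running (and billing), the endpoints in the account can be listed afterwards. The helper below is a sketch: it takes the client as a parameter, so the fake client used here can be swapped for `boto3.client('sagemaker')` in a real session, and it only looks at the first page of results returned by `list_endpoints`.

```python
def endpoints_still_running(sagemaker_client):
    """Return the names of endpoints that still exist in the account."""
    response = sagemaker_client.list_endpoints()
    return [e["EndpointName"] for e in response.get("Endpoints", [])]


class FakeSageMakerClient:
    """Hypothetical stand-in for boto3.client('sagemaker'), so the sketch runs offline."""

    def __init__(self, names):
        self._names = names

    def list_endpoints(self):
        return {"Endpoints": [{"EndpointName": n, "EndpointStatus": "InService"} for n in self._names]}


print(endpoints_still_running(FakeSageMakerClient([])))
print(endpoints_still_running(FakeSageMakerClient(["SentimentAnalysisEndpoint"])))
```

An empty list means there is nothing left to pay for. An endpoint that was just deleted may still appear briefly while the deletion completes.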
