# Text Classification Using Keras & TensorFlow on Amazon SageMaker

Full lab guide can be found here: https://github.com/aws-samples/amazon-sagemaker-keras-text-classification

In [1]:
%cd data
!wget https://archive.ics.uci.edu/ml/machine-learning-databases/00359/NewsAggregatorDataset.zip && unzip NewsAggregatorDataset.zip
!wget http://nlp.stanford.edu/data/glove.6B.zip && unzip glove.6B.zip
!rm 2pageSessions.csv glove.6B.200d.txt glove.6B.50d.txt glove.6B.300d.txt glove.6B.zip readme.txt NewsAggregatorDataset.zip && rm -rf __MACOSX/    

/home/ec2-user/SageMaker/amazon-sagemaker-keras-text-classification/data
--2019-06-11 07:06:43--  https://archive.ics.uci.edu/ml/machine-learning-databases/00359/NewsAggregatorDataset.zip
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 29224203 (28M) [application/x-httpd-php]
Saving to: ‘NewsAggregatorDataset.zip’


2019-06-11 07:06:45 (14.2 MB/s) - ‘NewsAggregatorDataset.zip’ saved [29224203/29224203]

Archive:  NewsAggregatorDataset.zip
  inflating: 2pageSessions.csv       
   creating: __MACOSX/
  inflating: __MACOSX/._2pageSessions.csv  
  inflating: newsCorpora.csv         
  inflating: __MACOSX/._newsCorpora.csv  
  inflating: readme.txt              
  inflating: __MACOSX/._readme.txt   
--2019-06-11 07:06:46--  http://nlp.stanford.edu/data/glove.6B.zip
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
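
After the cleanup command in the cell above, two data files should remain in `data/`: `newsCorpora.csv` and, because the 50d, 200d and 300d GloVe files are deleted, `glove.6B.100d.txt`. Below is a minimal sketch of a sanity check, assuming that file list. `missing_files` is a hypothetical helper and is not part of the lab.

```python
import os

# Files expected to survive the cleanup step (assumption derived from the rm command).
EXPECTED_FILES = ("newsCorpora.csv", "glove.6B.100d.txt")


def missing_files(directory, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in `directory`."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(directory, name))]


if __name__ == "__main__":
    # The notebook has already changed into the data directory via %cd data.
    missing = missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected data files are present.")
```

If a file is reported missing, re-running the download cell is probably the simplest fix.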

# Lab 1: Data Exploration

In [2]:
import pandas as pd
import tensorflow as tf
import re
import numpy as np
import os

from tensorflow.python.keras.preprocessing.text import Tokenizer
from tensorflow.python.keras.preprocessing.sequence import pad_sequences
from tensorflow.python.keras.utils import to_categorical
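
The three utilities imported above cover the usual text-preprocessing steps: `Tokenizer` maps words to integer indices, `pad_sequences` brings every sequence to the same length, and `to_categorical` one-hot encodes class labels. The sketch below illustrates those three ideas in plain Python so that it runs without TensorFlow. It is a simplified stand-in, not the Keras implementation; the helper names (`build_word_index`, `texts_to_sequences`, `pad`, `one_hot`) and the sample titles are made up for illustration.

```python
from collections import Counter


def build_word_index(texts):
    """Give each word an integer index, most frequent first; 0 is reserved for padding."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # most_common() keeps first-seen order for words with equal counts (Python 3.7+).
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


def texts_to_sequences(texts, word_index):
    """Replace every known word with its index; unknown words are skipped."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]


def pad(sequences, maxlen):
    """Left-pad with zeros and keep only the last `maxlen` items (maxlen >= 1)."""
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]
        padded.append([0] * (maxlen - len(seq)) + seq)
    return padded


def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    return [[1 if i == label else 0 for i in range(num_classes)] for label in labels]


if __name__ == "__main__":
    titles = ["fed sees weak data", "weak data hurts stocks"]  # hypothetical titles
    word_index = build_word_index(titles)
    sequences = texts_to_sequences(titles, word_index)
    print(pad(sequences, maxlen=5))
    print(one_hot([0, 2], num_classes=4))
```

Left padding and left truncation are used here because those are the `pad_sequences` defaults; the real `Tokenizer` additionally strips punctuation, which this sketch leaves out.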

In [3]:
column_names = ["TITLE", "URL", "PUBLISHER", "CATEGORY", "STORY", "HOSTNAME", "TIMESTAMP"]
news_dataset = pd.read_csv(os.path.join('.', 'newsCorpora.csv'), names=column_names, header=None, delimiter='\t')
news_dataset.head()

| | TITLE | URL | PUBLISHER | CATEGORY | STORY | HOSTNAME | TIMESTAMP |
|---|---|---|---|---|---|---|---|
| 1 | Fed official says weak data caused by weather,... | http://www.latimes.com/business/money/la-fi-mo... | Los Angeles Times | b | ddUyU0VZz0BRneMioxUPQVP6sIxvM | www.latimes.com | 1394470370698 |
| 2 | Fed's Charles Plosser sees high bar for change... | http://www.livemint.com/Politics/H2EvwJSK2VE6O... | Livemint | b | ddUyU0VZz0BRneMioxUPQVP6sIxvM | www.livemint.com | 1394470371207 |
| 3 | US open: Stocks fall after Fed official hints ... | http://www.ifamagazine.com/news/us-open-stocks... | IFA Magazine | b | ddUyU0VZz0BRneMioxUPQVP6sIxvM | www.ifamagazine.com | 1394470371550 |
| 4 | Fed risks falling 'behind the curve', Charles ... | http://www.ifamagazine.com/news/fed-risks-fall... | IFA Magazine | b | ddUyU0VZz0BRneMioxUPQVP6sIxvM | www.ifamagazine.com | 1394470371793 |
| 5 | Fed's Plosser: Nasty Weather Has Curbed Job Gr... | http://www.moneynews.com/Economy/federal-reser... | Moneynews | b | ddUyU0VZz0BRneMioxUPQVP6sIxvM | www.moneynews.com | 1394470372027 |


In [4]:
news_dataset_sampled = news_dataset.sample(frac=0.00005)
# Map the single-letter category codes to readable names.
category_names = {"b": "Business", "t": "Science & Technology", "e": "Entertainment", "m": "Health & Medicine"}
# Print each sampled title with its category; unrecognised codes are shown as "unknown".
for n, (title, code) in enumerate(zip(news_dataset_sampled["TITLE"], news_dataset_sampled["CATEGORY"]), start=1):
    category = category_names.get(code, "unknown")
    print("{}. {} - {}".format(n, title, category))

1. Progress on job growth: Really! - Business
2. Strike a Pose! Kim Kardashian Finally Lands Vogue Magazine Cover! - Entertainment
3. Fargo: “Morton's Fork” - Entertainment
4. See the 7-Carat Engagement Ring George Clooney Gave Amal Alamuddin! - Entertainment
5. Will Smith may star in NFL concussion movie 'Game Brain' - Entertainment
6. Yandex to buy car classified website for $175m - Science & Technology
7. Moderate losses for bullions - Business
8. Facebook raises revenue from mobile ads - Business
9. Report: Morgan Stanley eyes compensation cuts - Business
10. US auto sales climb in March - Business
11. Men's Wearhouse, Jos. A. Bank merger elicits caution from financial analysts  ... - Business
12. Rick Case, AutoNation among new-vehicle dealers innovating to boost used-car  ... - Business
13. Here's how your iPhone just got a whole lot better - Science & Technology
14. Dropbox Buys Photo Stream Alternative 'Loom' For Its New Carousel App - Science & Technology
15. McDonald's offers
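
The single-letter codes in the `CATEGORY` column map to four topics, and a quick look at how often each one occurs helps judge class balance before training. Below is a small sketch of such a summary using only the standard library. The `codes` list is a made-up example rather than real dataset output, and `category_counts` is a hypothetical helper.

```python
from collections import Counter

CATEGORY_NAMES = {
    "b": "Business",
    "t": "Science & Technology",
    "e": "Entertainment",
    "m": "Health & Medicine",
}


def category_counts(codes):
    """Count readable category names; unrecognised codes are grouped under 'unknown'."""
    return Counter(CATEGORY_NAMES.get(code, "unknown") for code in codes)


if __name__ == "__main__":
    codes = ["b", "e", "e", "t", "b", "x"]  # hypothetical sample of CATEGORY values
    for name, count in category_counts(codes).most_common():
        print("{}: {}".format(name, count))
```

Because iterating over a pandas Series yields its values, `category_counts(news_dataset["CATEGORY"])` should work on the full DataFrame as well.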

![STOP HERE](images/sm-keras-stop-sign.jpg)

## Stop here. Please switch back to the [lab guide](https://github.com/aws-samples/amazon-sagemaker-keras-text-classification) on GitHub and complete labs 2 and 3. You will come back to this notebook later in Lab 4.




# Lab 2: Building the SageMaker TensorFlow Container

In [5]:
%cd ~
!git clone https://github.com/aws/sagemaker-tensorflow-container.git
%cd sagemaker-tensorflow-container/docker/1.8.0/base
!docker build -t tensorflow-base:1.8.0-cpu-py2 -f Dockerfile.cpu .
!docker images
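
Re-running this cell makes `git clone` print `fatal: destination path ... already exists and is not an empty directory` when the checkout is already there; the build still proceeds, but the message can be avoided. Below is a minimal sketch of a guard, assuming it is acceptable to reuse an existing checkout as-is. `clone_if_missing` is a hypothetical helper, and its `run` parameter exists only so the function can be exercised without invoking git.

```python
import os
import subprocess


def clone_if_missing(repo_url, dest, run=subprocess.check_call):
    """Clone `repo_url` into `dest` unless `dest` is already a non-empty directory.

    Returns True if a clone was started and False if the existing checkout was kept.
    """
    if os.path.isdir(dest) and os.listdir(dest):
        return False
    run(["git", "clone", repo_url, dest])
    return True
```

In the notebook this could be called as `clone_if_missing("https://github.com/aws/sagemaker-tensorflow-container.git", os.path.expanduser("~/sagemaker-tensorflow-container"))` before changing into the `docker/1.8.0/base` directory.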

/home/ec2-user
fatal: destination path 'sagemaker-tensorflow-container' already exists and is not an empty directory.
/home/ec2-user/sagemaker-tensorflow-container/docker/1.8.0/base
Sending build context to Docker daemon  8.704kB
Step 1/9 : FROM ubuntu:16.04
16.04: Pulling from library/ubuntu

e2e5f967: Pulling fs layer
6638ac9f: Pulling fs layer
7d6d954b: Pulling fs layer
Digest: sha256:cad5e101ab30bb7f7698b277dd49090f520fe063335643990ce8fbd15ff920ef
Status: Downloaded newer image for ubuntu:16.04
 ---> 2a697363a870
Step 2/9 : RUN apt-get update && apt-get install -y --no-install-recommends         build-essential         curl         git         libcurl3-dev         libfreetype6-de

0 upgraded, 239 newly installed, 0 to remove and 15 not upgraded.
Need to get 183 MB of archives.
After this operation, 801 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu xenial/main amd64 libjson-c2 amd64 0.11-4ubuntu2 [22.3 kB]
Get:2 http://archive.ubuntu.com/ubuntu xenial/main amd64 libpopt0 amd64 1.16-10 [26.0 kB]
Get:3 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libssl1.0.0 amd64 1.0.2g-1ubuntu4.15 [1084 kB]
Get:4 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libpython3.5-minimal amd64 3.5.2-2ubuntu0~16.04.5 [524 kB]
Get:5 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libexpat1 amd64 2.1.0-7ubuntu0.16.04.3 [71.2 kB]
Get:6 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 python3.5-minimal amd64 3.5.2-2ubuntu0~16.04.5 [1598 kB]
Get:7 http://archive.ubuntu.com/ubuntu xenial/main amd64 python3-minimal amd64 3.5.1-3 [23.3 kB]
Get:8 http://archive.ubuntu.com/ubuntu xenial/main amd64 mime-support all 3.5

Get:78 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libroken18-heimdal amd64 1.7~git20150920+dfsg-4ubuntu1.16.04.1 [41.4 kB]
Get:79 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libasn1-8-heimdal amd64 1.7~git20150920+dfsg-4ubuntu1.16.04.1 [174 kB]
Get:80 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libkrb5support0 amd64 1.13.2+dfsg-5ubuntu2.1 [31.2 kB]
Get:81 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libk5crypto3 amd64 1.13.2+dfsg-5ubuntu2.1 [81.3 kB]
Get:82 http://archive.ubuntu.com/ubuntu xenial/main amd64 libkeyutils1 amd64 1.5.9-8ubuntu1 [9904 B]
Get:83 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libkrb5-3 amd64 1.13.2+dfsg-5ubuntu2.1 [273 kB]
Get:84 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libgssapi-krb5-2 amd64 1.13.2+dfsg-5ubuntu2.1 [120 kB]
Get:85 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 libhcrypto4-heimdal amd64 1.7~git20150920+dfsg-4ubuntu1.16.04.1 [85.0 kB]
Get:86 ht

Get:157 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 ca-certificates-java all 20160321ubuntu1 [12.5 kB]
Get:158 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 curl amd64 7.47.0-1ubuntu2.13 [139 kB]
Get:159 http://archive.ubuntu.com/ubuntu xenial/main amd64 liberror-perl all 0.17-1.2 [19.6 kB]
Get:160 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 git-man all 1:2.7.4-0ubuntu1.6 [736 kB]
Get:161 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 git amd64 1:2.7.4-0ubuntu1.6 [3176 kB]
Get:162 http://archive.ubuntu.com/ubuntu xenial/main amd64 libasound2-data all 1.1.0-0ubuntu1 [29.4 kB]
Get:163 http://archive.ubuntu.com/ubuntu xenial/main amd64 libasound2 amd64 1.1.0-0ubuntu1 [350 kB]
Get:164 http://archive.ubuntu.com/ubuntu xenial/main amd64 libatk1.0-data all 2.18.0-1 [17.1 kB]
Get:165 http://archive.ubuntu.com/ubuntu xenial/main amd64 libatk1.0-0 amd64 2.18.0-1 [56.9 kB]
Get:166 http://archive.ubuntu.com/ubuntu xenial/main amd64 libpixman-1

Get:238 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 vim amd64 2:7.4.1689-3ubuntu1.2 [1036 kB]
Get:239 http://archive.ubuntu.com/ubuntu xenial/main amd64 zip amd64 3.0-11 [158 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 183 MB in 5s (36.0 MB/s)
Selecting previously unselected package libjson-c2:amd64.
(Reading database ... 4777 files and directories currently installed.)
Preparing to unpack .../libjson-c2_0.11-4ubuntu2_amd64.deb ...
Unpacking libjson-c2:amd64 (0.11-4ubuntu2) ...
Selecting previously unselected package libpopt0:amd64.
Preparing to unpack .../libpopt0_1.16-10_amd64.deb ...
Unpacking libpopt0:amd64 (1.16-10) ...
Selecting previously unselected package libssl1.0.0:amd64.
Preparing to unpack .../libssl1.0.0_1.0.2g-1ubuntu4.15_amd64.deb ...
Unpacking libssl1.0.0:amd64 (1.0.2g-1ubuntu4.15) ...
Selecting previously unselected package libpython3.5-minimal:amd64.
Preparing to unpack .../libpython3.5-minimal_3.5.2-2

Selecting previously unselected package libxshmfence1:amd64.
Preparing to unpack .../libxshmfence1_1.2-1_amd64.deb ...
Unpacking libxshmfence1:amd64 (1.2-1) ...
Selecting previously unselected package x11-common.
Preparing to unpack .../x11-common_1%3a7.7+13ubuntu3.1_all.deb ...
dpkg-query: no packages found matching nux-tools
Unpacking x11-common (1:7.7+13ubuntu3.1) ...
Selecting previously unselected package libxtst6:amd64.
Preparing to unpack .../libxtst6_2%3a1.2.2-1_amd64.deb ...
Unpacking libxtst6:amd64 (2:1.2.2-1) ...
Selecting previously unselected package libxxf86vm1:amd64.
Preparing to unpack .../libxxf86vm1_1%3a1.1.4-1_amd64.deb ...
Unpacking libxxf86vm1:amd64 (1:1.1.4-1) ...
Selecting previously unselected package perl-modules-5.22.
Preparing to unpack .../perl-modules-5.22_5.22.1-9ubuntu0.6_all.deb ...
Unpacking perl-modules-5.22 (5.22.1-9ubuntu0.6) ...
Selecting previously unselected package libperl5.22:amd64.
Preparing to unpack .../libperl5.22_5.22.1-9ubuntu0.6_amd64.deb

Selecting previously unselected package libhcrypto4-heimdal:amd64.
Preparing to unpack .../libhcrypto4-heimdal_1.7~git20150920+dfsg-4ubuntu1.16.04.1_amd64.deb ...
Unpacking libhcrypto4-heimdal:amd64 (1.7~git20150920+dfsg-4ubuntu1.16.04.1) ...
Selecting previously unselected package libheimbase1-heimdal:amd64.
Preparing to unpack .../libheimbase1-heimdal_1.7~git20150920+dfsg-4ubuntu1.16.04.1_amd64.deb ...
Unpacking libheimbase1-heimdal:amd64 (1.7~git20150920+dfsg-4ubuntu1.16.04.1) ...
Selecting previously unselected package libwind0-heimdal:amd64.
Preparing to unpack .../libwind0-heimdal_1.7~git20150920+dfsg-4ubuntu1.16.04.1_amd64.deb ...
Unpacking libwind0-heimdal:amd64 (1.7~git20150920+dfsg-4ubuntu1.16.04.1) ...
Selecting previously unselected package libhx509-5-heimdal:amd64.
Preparing to unpack .../libhx509-5-heimdal_1.7~git20150920+dfsg-4ubuntu1.16.04.1_amd64.deb ...
Unpacking libhx509-5-heimdal:amd64 (1.7~git20150920+dfsg-4ubuntu1.16.04.1) ...
Selecting previously unselected packa

Selecting previously unselected package libquadmath0:amd64.
Preparing to unpack .../libquadmath0_5.4.0-6ubuntu1~16.04.11_amd64.deb ...
Unpacking libquadmath0:amd64 (5.4.0-6ubuntu1~16.04.11) ...
Selecting previously unselected package libgcc-5-dev:amd64.
Preparing to unpack .../libgcc-5-dev_5.4.0-6ubuntu1~16.04.11_amd64.deb ...
Unpacking libgcc-5-dev:amd64 (5.4.0-6ubuntu1~16.04.11) ...
Selecting previously unselected package gcc-5.
Preparing to unpack .../gcc-5_5.4.0-6ubuntu1~16.04.11_amd64.deb ...
Unpacking gcc-5 (5.4.0-6ubuntu1~16.04.11) ...
Selecting previously unselected package gcc.
Preparing to unpack .../gcc_4%3a5.3.1-1ubuntu1_amd64.deb ...
Unpacking gcc (4:5.3.1-1ubuntu1) ...
Selecting previously unselected package libstdc++-5-dev:amd64.
Preparing to unpack .../libstdc++-5-dev_5.4.0-6ubuntu1~16.04.11_amd64.deb ...
Unpacking libstdc++-5-dev:amd64 (5.4.0-6ubuntu1~16.04.11) ...
Selecting previously unselected package g++-5.
Preparing to unpack .../g++-5_5.4.0-6ubuntu1~16.04.11_amd6

Selecting previously unselected package libexpat1-dev:amd64.
Preparing to unpack .../libexpat1-dev_2.1.0-7ubuntu0.16.04.3_amd64.deb ...
Unpacking libexpat1-dev:amd64 (2.1.0-7ubuntu0.16.04.3) ...
Selecting previously unselected package libflac8:amd64.
Preparing to unpack .../libflac8_1.3.1-4_amd64.deb ...
Unpacking libflac8:amd64 (1.3.1-4) ...
Selecting previously unselected package zlib1g-dev:amd64.
Preparing to unpack .../zlib1g-dev_1%3a1.2.8.dfsg-2ubuntu4.1_amd64.deb ...
Unpacking zlib1g-dev:amd64 (1:1.2.8.dfsg-2ubuntu4.1) ...
Selecting previously unselected package libpng12-dev:amd64.
Preparing to unpack .../libpng12-dev_1.2.54-1ubuntu1.1_amd64.deb ...
Unpacking libpng12-dev:amd64 (1.2.54-1ubuntu1.1) ...
Selecting previously unselected package libfreetype6-dev:amd64.
Preparing to unpack .../libfreetype6-dev_2.6.1-0.1ubuntu2.3_amd64.deb ...
Unpacking libfreetype6-dev:amd64 (2.6.1-0.1ubuntu2.3) ...
Selecting previously unselected package libtiff5:amd64.
Preparing to unpack .../libtiff

Selecting previously unselected package nginx-common.
Preparing to unpack .../nginx-common_1.10.3-0ubuntu0.16.04.3_all.deb ...
Unpacking nginx-common (1.10.3-0ubuntu0.16.04.3) ...
Selecting previously unselected package nginx-core.
Preparing to unpack .../nginx-core_1.10.3-0ubuntu0.16.04.3_amd64.deb ...
Unpacking nginx-core (1.10.3-0ubuntu0.16.04.3) ...
Selecting previously unselected package nginx.
Preparing to unpack .../nginx_1.10.3-0ubuntu0.16.04.3_all.deb ...
Unpacking nginx (1.10.3-0ubuntu0.16.04.3) ...
Selecting previously unselected package openjdk-8-jre:amd64.
Preparing to unpack .../openjdk-8-jre_8u212-b03-0ubuntu1.16.04.1_amd64.deb ...
Unpacking openjdk-8-jre:amd64 (8u212-b03-0ubuntu1.16.04.1) ...
Selecting previously unselected package openjdk-8-jdk-headless:amd64.
Preparing to unpack .../openjdk-8-jdk-headless_8u212-b03-0ubuntu1.16.04.1_amd64.deb ...
Unpacking openjdk-8-jdk-headless:amd64 (8u212-b03-0ubuntu1.16.04.1) ...
Selecting previously unselected package openjdk-8-jd

Setting up librtmp1:amd64 (2.4+20151223.gitfa8646d-1ubuntu0.1) ...
Setting up libcurl3-gnutls:amd64 (7.47.0-1ubuntu2.13) ...
Setting up libdbus-1-3:amd64 (1.10.6-1ubuntu3.3) ...
Setting up libdbus-glib-1-2:amd64 (0.106-1) ...
Setting up libdrm-common (2.4.91-2~16.04.1) ...
Setting up libdrm2:amd64 (2.4.91-2~16.04.1) ...
Setting up libedit2:amd64 (3.1-20150325-1ubuntu2) ...
Setting up libelf1:amd64 (0.165-3ubuntu1.2) ...
Setting up libgeoip1:amd64 (1.6.9-1) ...
Setting up libicu55:amd64 (55.1-7ubuntu0.4) ...
Setting up libxml2:amd64 (2.9.3+dfsg1-1ubuntu0.6) ...
Setting up python-apt-common (1.1.0~beta1ubuntu0.16.04.4) ...
Setting up rsync (3.1.1-3ubuntu1.2) ...
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of restart.
Setting up shared-mime-info (1.5-2ubuntu0.2) ...
Setting up wget (1.17.1-1ubuntu1.5) ...
Setting up binutils (2.26.1-1ubuntu1~16.04.8) ...
Setting up libc-dev-bin (2.23-0ubuntu11) ...
Setting up linux-libc-dev:amd64 (4.4.0-150.

running python post-rtupdate hooks for python3.5...
Setting up lsb-release (9.20160110ubuntu0.2) ...
Setting up python3-apt (1.1.0~beta1ubuntu0.16.04.4) ...
Setting up python3-dbus (1.2.0-3) ...
Setting up python3-gi (3.20.0-0ubuntu1) ...
Setting up libnss3-nssdb (2:3.28.4-0ubuntu0.16.04.5) ...
Setting up libnss3:amd64 (2:3.28.4-0ubuntu0.16.04.5) ...
Setting up openjdk-8-jre-headless:amd64 (8u212-b03-0ubuntu1.16.04.1) ...
update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/rmid to provide /usr/bin/rmid (rmid) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/clhsdb to provide /usr/bin/clhsdb (clhsdb) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java to provide /usr/bin/java (java) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/keytool to provide /usr/bin/keytool (keytool) in auto mode
update-alternatives: using /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/hsdb to provid

Adding debian:Comodo_Trusted_Services_root.pem
Adding debian:Cybertrust_Global_Root.pem
Adding debian:D-TRUST_Root_Class_3_CA_2_2009.pem
Adding debian:D-TRUST_Root_Class_3_CA_2_EV_2009.pem
Adding debian:DST_ACES_CA_X6.pem
Adding debian:DST_Root_CA_X3.pem
Adding debian:Deutsche_Telekom_Root_CA_2.pem
Adding debian:DigiCert_Assured_ID_Root_CA.pem
Adding debian:DigiCert_Assured_ID_Root_G2.pem
Adding debian:DigiCert_Assured_ID_Root_G3.pem
Adding debian:DigiCert_Global_Root_CA.pem
Adding debian:DigiCert_Global_Root_G2.pem
Adding debian:DigiCert_Global_Root_G3.pem
Adding debian:DigiCert_High_Assurance_EV_Root_CA.pem
Adding debian:DigiCert_Trusted_Root_G4.pem
Adding debian:E-Tugra_Certification_Authority.pem
Adding debian:EC-ACC.pem
Adding debian:EE_Certification_Centre_Root_CA.pem
Adding debian:Entrust.net_Premium_2048_Secure_Server_CA.pem
Adding debian:Entrust_Root_Certification_Authority.pem
Adding debian:Entrust_Root_Certification_Authority_-_EC1.pem
Adding debian:Entrust_Root_Certificatio

Collecting six (from h5py)
  Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
Building wheels for collected packages: sklearn
  Building wheel for sklearn (setup.py): started
  Building wheel for sklearn (setup.py): finished with status 'done'
  Stored in directory: /tmp/pip-ephem-wheel-cache-fTsRuz/wheels/76/03/bb/589d421d27431bcd2c6da284d5f2286c8e3b2ea3cf1594c074
Successfully built sklearn
Installing collected packages: numpy, scipy, scikit-learn, sklearn, six, python-dateutil, pytz, pandas, Pillow, h5py
Successfully installed Pillow-6.0.0 h5py-2.9.0 numpy-1.16.4 pandas-0.24.2 python-dateutil-2.8.0 pytz-2019.1 scikit-learn-0.20.3 scipy-1.2.2 six-1.12.0 sklearn-0.0
Removing intermediate container 73c047f69f37
 ---> cc398721113f
Step 5/9 : WORKDIR /root
 ---> Running in 0e9cc1ada7b1
Removing intermediate container 0e9cc1ada7b1
 ---> f0e1d510d69d
Step 6/9 : ENV TF_SERVING_VERSION=1.7.0


  Downloading https://files.pythonhosted.org/packages/89/ac/48dd71c2bdc8d31e367f9b72f25ccb3b89bc6b9d664fee21f9a8efa5714d/tensorboard-1.13.1-py2-none-any.whl (3.2MB)
Collecting absl-py>=0.1.6 (from tensorflow>=1.7.0->tensorflow-serving-api==1.7.0)
  Downloading https://files.pythonhosted.org/packages/da/3f/9b0355080b81b15ba6a9ffcf1f5ea39e307a2778b2f2dc8694724e8abd5b/absl-py-0.7.1.tar.gz (99kB)
Collecting backports.weakref>=1.0rc1 (from tensorflow>=1.7.0->tensorflow-serving-api==1.7.0)
  Downloading https://files.pythonhosted.org/packages/88/ec/f598b633c3d5ffe267aaada57d961c94fdfa183c5c3ebda2b6d151943db6/backports.weakref-1.0.post1-py2.py3-none-any.whl
Collecting termcolor>=1.1.0 (from tensorflow>=1.7.0->tensorflow-serving-api==1.7.0)
  Downloading https://files.pythonhosted.org/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz
Collecting tensorflow-estimator<1.14.0rc0,>=1.13.0 (from tensorflow>=1.7.0->tensorflow-serving-api==1.7.0)
  Down

 80650K .......... ...[0m[91m..

 83650K [0m[91m.......[0m[91m... .....[0m[91m..... ...[0m[91m....... .[0m[91m......... .......[0m[91m... 93% 76.4M 0s
 83700K .......... ...[0m[91m....... ..[0m[91m........ .......... .......... 93% 90.7M 0s
 83750K .......... .......... .......... .......... .......... 93%  249M 0s
 83800K .......... .......... .......... .......... .......... 93%  176M 0s
 83850K .......... .......... .......... .......... .......... 93%  181M 0s
 83900K .......... .......... .......... ....[0m[91m...... .......... 93% 17.2M 0s
 83950K .......... .......... .......... .......... .......... 93%  220M 0s
 84000K .......... .......... .......... .......... .......... 93%  331M 0s
 84050K .......... .......... .......... .......... .......... 93%  342M 0s
 84100K .......... .......... .......... .......... .......... 93%  312M 0s
 84150K .......... .......... ....[0m[91m...... .......... .......... 93%  259M 0s
 84200K .......... .......... ..[0m[91m........ .......... .......... 93

[0mSelecting previously unselected package tensorflow-model-server.
(Reading database ... 17711 files and directories currently installed.)
Preparing to unpack tensorflow-model-server_1.7.0_all.deb ...
Unpacking tensorflow-model-server (1.7.0) ...
Setting up tensorflow-model-server (1.7.0) ...
Removing intermediate container f3fa2fcd533c
 ---> 8099d629a3f4
Step 9/9 : RUN add-apt-repository ppa:ubuntu-toolchain-r/test -y &&     apt-get update &&     apt-get install -y libstdc++6
 ---> Running in fd23f7859b63
gpg: keyring `/tmp/tmpavas8iqb/secring.gpg' created
gpg: keyring `/tmp/tmpavas8iqb/pubring.gpg' created
gpg: requesting key BA9EF27F from hkp server keyserver.ubuntu.com
gpg: /tmp/tmpavas8iqb/trustdb.gpg: trustdb created
gpg: key BA9EF27F: public key "Launchpad Toolchain builds" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
OK
Get:1 http://security.ubuntu.com/ubuntu xenial-security InRelease [109 kB]
Get:2 http:

In [35]:
%cd ~/SageMaker/amazon-sagemaker-keras-text-classification/container/

/home/ec2-user/SageMaker/amazon-sagemaker-keras-text-classification/container


In [7]:
%%writefile Dockerfile
# Build an image that can do training and inference in SageMaker

FROM tensorflow-base:1.8.0-cpu-py2

ENV PATH="/opt/program:${PATH}"

# Set up the program in the image
COPY sagemaker_keras_text_classification /opt/program
WORKDIR /opt/program

Writing Dockerfile


In [36]:
!docker build -t sagemaker-keras-text-class:latest .

Sending build context to Docker daemon  459.7MB
Step 1/4 : FROM tensorflow-base:1.8.0-cpu-py2
 ---> f0bfaa074d3e
Step 2/4 : ENV PATH="/opt/program:${PATH}"
 ---> Using cache
 ---> c48badb6665c
Step 3/4 : COPY sagemaker_keras_text_classification /opt/program
 ---> b45e426f17c2
Step 4/4 : WORKDIR /opt/program
 ---> Running in 296900a0392c
Removing intermediate container 296900a0392c
 ---> 31e4ff58ddce
Successfully built 31e4ff58ddce
Successfully tagged sagemaker-keras-text-class:latest


In [37]:
!docker images

REPOSITORY                                                                         TAG                 IMAGE ID            CREATED             SIZE
sagemaker-keras-text-class                                                         latest              31e4ff58ddce        2 seconds ago       1.91GB
741855114961.dkr.ecr.us-east-1.amazonaws.com/sagemaker-keras-text-classification   latest              8e715bc36ca2        2 hours ago         1.91GB
sagemaker-keras-text-classification                                                latest              8e715bc36ca2        2 hours ago         1.91GB
tensorflow-base                                                                    1.8.0-cpu-py2       f0bfaa074d3e        2 hours ago         1.91GB
ubuntu                                                                             16.04               2a697363a870        3 weeks ago         119MB


# Lab 3: Local Testing of Training & Inference Code

In [10]:
%cd ~/SageMaker/amazon-sagemaker-keras-text-classification/data
!cp -a . ../container/local_test/test_dir/input/data/training/

/home/ec2-user/SageMaker/amazon-sagemaker-keras-text-classification/data


In [38]:
%cd ~/SageMaker/amazon-sagemaker-keras-text-classification/container/local_test
!./train_local.sh sagemaker-keras-text-class:latest

/home/ec2-user/SageMaker/amazon-sagemaker-keras-text-classification/container/local_test
Instructions for updating:
Colocations handled automatically by placer.
2019-06-11 08:52:04.531505: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-06-11 08:52:04.537304: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2400070000 Hz
2019-06-11 08:52:04.537483: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0xb95aaf0 executing computations on platform Host. Devices:
2019-06-11 08:52:04.537504: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
Instructions for updating:
Use tf.cast instead.
Starting the training.
                                               TITLE  ...      TIMESTAMP
1  Fed official says weak data caused by weather,...  ...  1394470370698
2  Fed's Charles Plosser sees high bar for change..

In [None]:
!./serve_local.sh sagemaker-keras-text-class:latest

In [48]:
%cd ~/SageMaker/amazon-sagemaker-keras-text-classification/container/local_test/
!./predict.sh input.json application/json

/home/ec2-user/SageMaker/amazon-sagemaker-keras-text-classification/container/local_test
curl: (7) Failed to connect to localhost port 8080: Connection refused


# Lab 4: Training and Hosting your Algorithm in Amazon SageMaker


### Building and registering the container

The following shell code shows how to build the container image using `docker build` and push the container image to ECR using `docker push`. 

This code looks for an ECR repository in the account you're using and the current default region (if you're using a SageMaker notebook instance, this will be the region where the notebook instance was created). If the repository doesn't exist, the script will create it.

In [49]:
%%sh

# The name of our algorithm
algorithm_name=sagemaker-keras-text-classification

cd ~/SageMaker/amazon-sagemaker-keras-text-classification/container

chmod +x sagemaker_keras_text_classification/train
chmod +x sagemaker_keras_text_classification/serve

account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration (default to us-west-2 if none defined)
region=$(aws configure get region)
region=${region:-us-west-2}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.

aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

# Get the login command from ECR and execute it directly
$(aws ecr get-login --region ${region} --no-include-email)

# Build the docker image locally with the image name and then push it to ECR
# with the full name.

# On a SageMaker Notebook Instance, the docker daemon may need to be restarted in order
# to detect your network configuration correctly.  (This is a known issue.)
if [ -d "/home/ec2-user/SageMaker" ]; then
  sudo service docker restart
fi

docker build  -t ${algorithm_name} .
docker tag ${algorithm_name} ${fullname}

docker push ${fullname}

Login Succeeded
Stopping docker: [  OK  ]
Starting docker:	.[  OK  ]
Sending build context to Docker daemon  459.7MB
Step 1/4 : FROM tensorflow-base:1.8.0-cpu-py2
 ---> f0bfaa074d3e
Step 2/4 : ENV PATH="/opt/program:${PATH}"
 ---> Using cache
 ---> c48badb6665c
Step 3/4 : COPY sagemaker_keras_text_classification /opt/program
 ---> f74ebb27711d
Step 4/4 : WORKDIR /opt/program
 ---> Running in 06e4f795d1ed
Removing intermediate container 06e4f795d1ed
 ---> 3435689f7906
Successfully built 3435689f7906
Successfully tagged sagemaker-keras-text-classification:latest
The push refers to repository [741855114961.dkr.ecr.us-east-1.amazonaws.com/sagemaker-keras-text-classification]
70ee6f0fa5a3: Preparing
00b2dbe4c1b6: Preparing
e4b51eac015b: Preparing
f9a193cdb19d: Preparing
fe604d50586c: Preparing
9801f0d644e2: Preparing
19d4850d25d7: Preparing
4c54072a5034: Preparing
49652298c779: Preparing
e15278fcccca: Preparing
739482a9723d: Preparing
9801f0d644e2: Waiting
19d4850d25d7: Waiting
4c54072a

https://docs.docker.com/engine/reference/commandline/login/#credentials-store
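The repository check and image naming done by the shell cell above can also be expressed in Python. The following is a minimal sketch, not part of the lab: the helper names `ecr_image_uri` and `ensure_repository` are hypothetical, and the ECR client is assumed to come from `boto3.client("ecr")`.

```python
def ecr_image_uri(account, region, repository, tag="latest"):
    """Build the full ECR image name, mirroring ${fullname} in the shell cell."""
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(account, region, repository, tag)


def ensure_repository(ecr_client, repository):
    """Create the ECR repository if it is missing.

    `ecr_client` is assumed to be a boto3 ECR client.
    Returns True if the repository was created, False if it already existed.
    """
    try:
        ecr_client.describe_repositories(repositoryNames=[repository])
        return False
    except ecr_client.exceptions.RepositoryNotFoundException:
        ecr_client.create_repository(repositoryName=repository)
        return True


# Example with a placeholder account ID (not a real account)
print(ecr_image_uri("123456789012", "us-west-2", "sagemaker-keras-text-classification"))
```

In a notebook this would be called as `ensure_repository(boto3.client("ecr"), "sagemaker-keras-text-classification")`; building and pushing the image still requires the `docker` commands shown above.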



Once you have your container packaged, you can use it to train and serve models. Let's do that with the algorithm we made above.

## Set up the environment

Here we specify the S3 prefix under which the data will be stored and the IAM role that SageMaker will use.

In [42]:
# S3 prefix
prefix = 'sagemaker-keras-text-classification'

# Define IAM role
import boto3
import re

import os
import numpy as np
import pandas as pd
from sagemaker import get_execution_role

role = get_execution_role()

## Create the session

The session remembers our connection parameters to SageMaker. We'll use it to perform all of our SageMaker operations.

In [43]:
import sagemaker as sage
from time import gmtime, strftime

sess = sage.Session()

## Upload the data for training

When training large models with huge amounts of data, you'll typically use big data tools, like Amazon Athena, AWS Glue, or Amazon EMR, to create your data in S3.  

We can use the tools provided by the SageMaker Python SDK to upload the data to a default bucket.

In [50]:
WORK_DIRECTORY = '/home/ec2-user/SageMaker/amazon-sagemaker-keras-text-classification/data'

data_location = sess.upload_data(WORK_DIRECTORY, key_prefix=prefix)
print(data_location)

s3://sagemaker-us-east-1-741855114961/sagemaker-keras-text-classification
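To make the printed location easier to interpret, here is a small sketch of how the S3 names appear to be composed. It assumes the default bucket follows the `sagemaker-<region>-<account>` pattern visible in the output above and that each file is stored under the key prefix at its path relative to the uploaded directory. The helper names are hypothetical and nothing is uploaded.

```python
import os


def default_bucket_name(region, account):
    """Default bucket name, assuming the sagemaker-<region>-<account> pattern."""
    return "sagemaker-{}-{}".format(region, account)


def uploaded_prefix_uri(bucket, key_prefix):
    """S3 URI for an uploaded directory, as printed after upload_data."""
    return "s3://{}/{}".format(bucket, key_prefix)


def object_key(key_prefix, local_root, local_file):
    """S3 key for one file, assuming keys mirror paths relative to local_root."""
    relative = os.path.relpath(local_file, local_root)
    return "/".join([key_prefix] + relative.split(os.sep))


bucket = default_bucket_name("us-east-1", "741855114961")
print(uploaded_prefix_uri(bucket, "sagemaker-keras-text-classification"))
print(object_key("sagemaker-keras-text-classification",
                 os.path.join("data"),
                 os.path.join("data", "newsCorpora.csv")))
```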


## Create an estimator and fit the model

To use SageMaker to fit our algorithm, we'll create an `Estimator` that defines how to use the container to train. This includes the configuration we need to invoke SageMaker training:

* The __container name__. This is constructed as in the shell commands above.
* The __role__. As defined above.
* The __instance count__ which is the number of machines to use for training.
* The __instance type__ which is the type of machine to use for training.
* The __output path__ determines where the model artifact will be written.
* The __session__ is the SageMaker session object that we defined above.

Then we use fit() on the estimator to train against the data that we uploaded above.
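As a reading aid, the settings listed above can be collected as named arguments before the estimator is constructed. This is a sketch only: `estimator_kwargs` is a hypothetical helper, the account, role, and bucket values are placeholders, and the keyword names (`image_name`, `train_instance_count`, `train_instance_type`) are assumed to match the version 1 SageMaker Python SDK that this notebook was written against.

```python
def estimator_kwargs(account, region, role, bucket,
                     repository="sagemaker-keras-text-classification"):
    """Collect the Estimator settings from the list above as keyword arguments."""
    image = "{}.dkr.ecr.{}.amazonaws.com/{}".format(account, region, repository)
    return {
        "image_name": image,                             # container name
        "role": role,                                    # IAM role
        "train_instance_count": 1,                       # instance count
        "train_instance_type": "ml.c5.2xlarge",          # instance type
        "output_path": "s3://{}/output".format(bucket),  # model artifact location
    }


# Placeholder values for illustration only
kwargs = estimator_kwargs("123456789012", "us-east-1",
                          "arn:aws:iam::123456789012:role/ExampleRole",
                          "example-bucket")
print(kwargs["image_name"])
print(kwargs["output_path"])
```

Under those assumptions the dictionary could be passed as `sage.estimator.Estimator(sagemaker_session=sess, **kwargs)`, which makes the positional `1, 'ml.c5.2xlarge'` arguments explicit.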

In [51]:
account = sess.boto_session.client('sts').get_caller_identity()['Account']
region = sess.boto_session.region_name
image = '{}.dkr.ecr.{}.amazonaws.com/sagemaker-keras-text-classification'.format(account, region)

tree = sage.estimator.Estimator(image,
                       role, 1, 'ml.c5.2xlarge',
                       output_path="s3://{}/output".format(sess.default_bucket()),
                       sagemaker_session=sess)

tree.fit(data_location)

2019-06-11 09:30:23 Starting - Starting the training job...
2019-06-11 09:30:26 Starting - Launching requested ML instances......
2019-06-11 09:31:40 Starting - Preparing the instances for training......
2019-06-11 09:32:37 Downloading - Downloading input data
2019-06-11 09:32:37 Training - Downloading the training image...
2019-06-11 09:33:13 Training - Training image download completed. Training in progress....
Instructions for updating:
Colocations handled automatically by placer.
2019-06-11 09:33:36.119620: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-06-11 09:33:36.168543: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3000000000 Hz
2019-06-11 09:33:36.170397: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0xb4e0880 executing computations on platform Host. Devices:
2019-06-11 09:

## Deploy the model

Deploying the model to SageMaker hosting just requires a `deploy` call on the fitted model. This call takes an instance count, instance type, and optionally serializer and deserializer functions. These are used when the resulting predictor is created on the endpoint.

In [52]:
from sagemaker.predictor import json_serializer
predictor = tree.deploy(1, 'ml.m5.xlarge', serializer=json_serializer)

---------------------------------------------------------------------------------------------------!

In [53]:
request = { "input": "Deadpool 2 Has More Swearing, Slicing and Dicing from Ryan Reynolds"}

print(predictor.predict(request).decode('utf-8'))

{"result": "Entertainment"}
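To illustrate what the serializer contributes, the round trip can be sketched without an endpoint. This assumes `json_serializer` behaves roughly like `json.dumps` for a plain dictionary and that the endpoint returns UTF-8 encoded JSON with a `result` field, as in the output above. The helper names are hypothetical.

```python
import json


def serialize_request(request):
    """Approximate what the JSON serializer sends as the request body."""
    return json.dumps(request)


def parse_response(raw_bytes):
    """Decode the endpoint's byte response and extract the predicted label."""
    return json.loads(raw_bytes.decode("utf-8"))["result"]


body = serialize_request(
    {"input": "Deadpool 2 Has More Swearing, Slicing and Dicing from Ryan Reynolds"})
label = parse_response(b'{"result": "Entertainment"}')
print(body)
print(label)
```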


In [65]:
import json

# Map the dataset's single-letter category codes to the labels returned by the endpoint
category_names = {
    "b": "Business",
    "t": "Science & Technology",
    "e": "Entertainment",
    "m": "Health & Medicine",
}

# Score a small random sample (0.01% of the rows) against the deployed endpoint
news_dataset_sampled = news_dataset.sample(frac=0.0001)
titles = news_dataset_sampled["TITLE"]
codes = news_dataset_sampled["CATEGORY"]
for n, (title, code) in enumerate(zip(titles, codes), start=1):
    category = category_names.get(code, "unknown")
    result = json.loads(predictor.predict({"input": title}).decode('utf-8'))["result"]
    print("{}. {} - Expected: {}, Predicted: {}".format(n, title, category, result))
    

1. A closer look at Amazon's new Fire smartphone - Expected: Business, Predicted: Science & Technology
2. Lake Forest Businessman Steps in to Save Crumbs - Expected: Business, Predicted: Science & Technology
3. Rob Kardashian Spotted at Gym Amid Sizzurp Concerns; Does He Need Rehab  ... - Expected: Entertainment, Predicted: Entertainment
4. Rite Aid to Administer MMR Vaccines in Ohio - Expected: Health & Medicine, Predicted: Health & Medicine
5. The Linux Foundation Aims To Prevent Future Heartbleed Bugs With Its Core  ... - Expected: Science & Technology, Predicted: Science & Technology
6. Google Earnings Impacted by Nest: What Wall Street's Saying - Expected: Business, Predicted: Science & Technology
7. MacRumors: Apple Close to Acquiring Radio Streaming Service Swell for $30  ... - Expected: Science & Technology, Predicted: Science & Technology
8. Netflix raises prices by a $1 for new subscribers (Update) - Expected: Science & Technology, Predicted: Science & Technology
9. Slowdown 
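A listing like this can be summarized as a single accuracy figure. The sketch below is an illustration only: the `accuracy` helper is hypothetical, and the pairs are transcribed from the eight complete rows shown above, so the number describes this tiny sample and not the model as a whole.

```python
def accuracy(pairs):
    """Fraction of (expected, predicted) pairs that match."""
    if not pairs:
        return 0.0
    correct = sum(1 for expected, predicted in pairs if expected == predicted)
    return correct / float(len(pairs))


# (expected, predicted) for rows 1-8 of the output above
pairs = [
    ("Business", "Science & Technology"),
    ("Business", "Science & Technology"),
    ("Entertainment", "Entertainment"),
    ("Health & Medicine", "Health & Medicine"),
    ("Science & Technology", "Science & Technology"),
    ("Business", "Science & Technology"),
    ("Science & Technology", "Science & Technology"),
    ("Science & Technology", "Science & Technology"),
]
print("{:.3f}".format(accuracy(pairs)))  # prints 0.625
```

All three misses in this sample are Business headlines predicted as Science & Technology, so a per-category breakdown on a larger sample would likely be more informative than the overall figure.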

## Optional cleanup

When you're done with the endpoint, you'll want to clean it up.

In [None]:
sess.delete_endpoint(predictor.endpoint)