# Gender Prediction from name, using Deep Learning

Deep neural networks can extract features from their input and derive higher-level abstractions, a technique used regularly in vision, speech, and text analysis. In this exercise, we build a deep learning model that identifies low-level features in people's names and classifies each name into one of two categories: Male or Female.

## Recurrent Neural Networks and Long Short Term Memory
Since we have to process a sequence of characters, Recurrent Neural Networks (RNNs) are a good fit for this problem. Traditional feed-forward neural networks fail whenever they have to retain what they learned from previously seen data. Recurrent Neural Networks contain loops in the graph that allow them to persist information in memory. Effectively, the loops pass information from one step of the network on to the next.
<details>
<summary><strong>Recurrent Neural Network - Loops (expand to view diagram)</strong></summary><p>
    ![Recurrent Neural Network - Loops](images/RNN-unrolled.png "Recurrent Neural Network - Loops")
</p></details>
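
The loop described above can be illustrated with a minimal NumPy sketch. It is not the model built later in this notebook: the weight names (`W_xh`, `W_hh`, `b_h`), the hidden size, and the random initialisation are all assumptions chosen for illustration. The point is only that the hidden state `h` computed at one character is fed back in when the next character is processed.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a plain RNN over a sequence and return the final hidden state."""
    h = np.zeros(W_hh.shape[0])
    for x_t in inputs:
        # The new state depends on the current input AND the previous state
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
    return h

def one_hot_sequence(name, alphabet_size=26):
    """Encode a lowercase a-z name as one row per character."""
    seq = np.zeros((len(name), alphabet_size))
    for t, ch in enumerate(name.lower()):
        seq[t, ord(ch) - ord("a")] = 1
    return seq

# Hypothetical sizes and random weights, for illustration only
rng = np.random.default_rng(0)
alphabet_size, hidden_size = 26, 8
W_xh = rng.normal(scale=0.1, size=(hidden_size, alphabet_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

h_mary = rnn_forward(one_hot_sequence("mary"), W_xh, W_hh, b_h)
h_yram = rnn_forward(one_hot_sequence("yram"), W_xh, W_hh, b_h)
print(h_mary.shape)
print(np.allclose(h_mary, h_yram))
```

Because the state is carried from step to step, the same letters in a different order lead to a different final state, which is what makes the network sensitive to character sequences rather than to character counts.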


In practice, however, when we need to selectively memorize or forget patterns seen in the past based on the context, plain vanilla RNNs do not perform so well. Instead we can use a special type of RNN that can retain information over the long term, and thus works better at understanding the contextual relation between the patterns observed. These are known as Long Short Term Memory (LSTM) networks.

The nodes in an LSTM network consist of remember/forget gates that retain or pass on patterns learnt in sequence that are useful for predicting the target variable. These gates are a way to optionally let information through, and they give LSTM networks the ability to remove or add information to the cell state in a regulated manner.
<details>
<summary><strong>LSTM - Chains (expand to view diagram)</strong></summary><p>
    ![LSTM - Chains](images/LSTM3-chain.png "LSTM - Chains")
</p></details>
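
As a rough sketch of how such gates regulate the cell state, the following NumPy code performs LSTM time steps by hand. The gate ordering inside `W`, the sizes, and the random weights are assumptions made for this illustration; Keras lays out its LSTM weights in its own way, so this is not a drop-in replica of the `LSTM` layer used later.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step.

    W has shape (4 * hidden, input + hidden) and b has shape (4 * hidden,).
    The four blocks are, in this sketch's own ordering: forget gate,
    input gate, output gate, candidate values.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b
    f = sigmoid(z[0:hidden])                # how much of the old cell state to keep
    i = sigmoid(z[hidden:2 * hidden])       # how much new information to add
    o = sigmoid(z[2 * hidden:3 * hidden])   # how much of the cell state to expose
    g = np.tanh(z[3 * hidden:4 * hidden])   # candidate values to add
    c = f * c_prev + i * g                  # regulated update of the cell state
    h = o * np.tanh(c)
    return h, c

# Hypothetical sizes and random weights, for illustration only
rng = np.random.default_rng(0)
input_size, hidden_size = 26, 4
W = rng.normal(scale=0.1, size=(4 * hidden_size, input_size + hidden_size))
b = np.zeros(4 * hidden_size)

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for ch in "mary":
    x_t = np.zeros(input_size)
    x_t[ord(ch) - ord("a")] = 1
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape, c.shape)
```

The forget gate `f` scales the previous cell state and the input gate `i` scales the new candidate values, which is the "remove or add information in a regulated manner" behaviour described above.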


## Network Architecture
The problem we are trying to solve is to predict whether a given name belongs to a male or a female. We will use supervised learning, where the character sequence making up each name is the `X` variable, and the flag indicating **Male(M)** or **Female(F)** is the `Y` variable.

We use a stacked 2-layer LSTM model with a final dense layer with softmax activation as our network architecture. We use categorical cross-entropy as the loss function, with the adam optimizer. We also add a 20% dropout layer after each LSTM layer for regularization, to avoid over-fitting.
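
To make the output layer and the loss concrete, here is a small NumPy sketch of softmax followed by categorical cross-entropy for two classes. The logits and labels are made-up numbers, and the column order (first column Male, second column Female) is an assumption for the example; the functions are simplified stand-ins, not the Keras implementations.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_cross_entropy(y_true, y_prob, eps=1e-12):
    # Mean over samples of -sum(true * log(predicted))
    return -np.mean(np.sum(y_true * np.log(y_prob + eps), axis=-1))

# Made-up raw scores from the dense layer for two names
logits = np.array([[2.0, 0.5],
                   [0.1, 1.2]])
# One-hot labels for the same two names
y_true = np.array([[1.0, 0.0],
                   [0.0, 1.0]])

probs = softmax(logits)
loss = categorical_cross_entropy(y_true, probs)
print(probs)
print(loss)
```

Softmax turns the two raw scores into probabilities that sum to one, and the loss shrinks as the probability assigned to the correct class approaches one.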

## Dependencies
* We will use the Keras deep learning library to build the network. Therefore we import the symbolic interfaces needed.
* We also use pandas data frames to load and slice-and-dice the data
* Finally, we need numpy for matrix manipulation
* When running on a SageMaker Notebook Instance, we choose the conda_tensorflow kernel, so that the Keras code uses TensorFlow as its backend.
* If you choose a P2 or P3 class of instance for your Notebook, using TensorFlow ensures the low-level code takes advantage of all available GPUs, so no further dependencies need to be installed.


In [None]:
import os
import time
import boto3
import numpy as np
import pandas as pd
from numpy import genfromtxt
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.layers import LSTM
from keras.models import load_model
from sklearn.utils import shuffle
import re
import string
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem.porter import PorterStemmer
nltk.download('punkt')
nltk.download('stopwords')
stop_words = set(stopwords.words('english'))

## Data preparation
* The training data for the LSTM model is derived from the US Government's SSA records of registered baby names.
* The original dataset is split into separate text files, one for the names registered in each year, starting from 1880.
* Each record in each year's file contains the name, the gender identifier, and a count showing how many times that name was registered.
* The first step in data preparation is to concatenate the data from all the year-specific files into a single file.
* We then load the raw data into memory as a Pandas data frame and get rid of the temporary download folder

In [None]:
! mkdir download ;  cd download ; wget https://www.ssa.gov/oact/babynames/names.zip ; unzip -qq names.zip 
! cat download/yob*.txt > download/allnames.txt
filename = 'download/allnames.txt'
data=pd.read_csv(filename, sep=',', names = ["Name", "Gender", "Count"])
!rm -rf download
!pip install tqdm

## Data cleanup
Since the training data contains duplicates, and names that have been used for both Male and Female genders, we'll clean it up using the following approach:
* Order the names by Name and Gender
* Add up the counts for each unique Name-Gender combination
* Iterate through the unique groups, and where a name is used for both Male and Female, retain the entry with the higher count
* Create a new clean data frame containing only unique records, mapping each name to a single gender
* Create a dictionary that has the names as keys and the gender (with the higher total count) as values.
* We loop through the indexes of the grouped data frame and populate the entries into this dictionary following the logic described above.
* After the dictionary is populated, we create a clean data frame using the keys and values as columns.
* Shuffle the data and save the clean data into a file, which we'll also use in subsequent phases of model training.
* Finally, save the data into an S3 bucket of your choice (if you were to orchestrate a pipeline to train, deploy and host the model, the container you create will need access to the data in an S3 bucket)
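
The grouping logic in the steps above can be tried out on a handful of rows before running it on the full dataset. The sketch below uses a tiny made-up table (the counts are invented, not SSA figures) and a vectorised `idxmax` instead of an explicit loop; it is an alternative illustration of the same idea, not the code this notebook runs.

```python
import pandas as pd

# Invented toy records in the same Name / Gender / Count layout
raw = pd.DataFrame(
    [
        ("Mary", "F", 100), ("Mary", "M", 5), ("Mary", "F", 90),
        ("John", "M", 120), ("John", "F", 3),
        ("Jessie", "F", 40), ("Jessie", "M", 25),
    ],
    columns=["Name", "Gender", "Count"],
)

# Add up the counts for each unique Name-Gender combination
totals = raw.groupby(["Name", "Gender"], as_index=False)["Count"].sum()

# For each name, keep the row whose total count is highest
winners = totals.loc[totals.groupby("Name")["Count"].idxmax(), ["Name", "Gender"]]
clean = winners.reset_index(drop=True)

# Same shape of result as the dictionary described above
name_to_gender = dict(zip(clean["Name"], clean["Gender"]))
print(name_to_gender)
```

Each name ends up mapped to exactly one gender, the one with the larger total count across all years.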

In [None]:
from tqdm import tqdm, trange
grouped_data = data.groupby( [ "Name", "Gender"] ).apply(lambda x: x.Count.sum()).to_frame()
grouped_data.columns = ['Count']
names={}

max_count = len(grouped_data.index.values)

for i in tqdm(range(max_count), unit=" records"):
    if i > 0 and grouped_data.index[i][0] == grouped_data.index[i-1][0]:
        if grouped_data.values[i][0] > grouped_data.values[i-1][0]:
            names[grouped_data.index[i][0]] = grouped_data.index[i][1]
        else:
            names[grouped_data.index[i][0]] = grouped_data.index[i-1][1]
    else:
        names[grouped_data.index[i][0]] = grouped_data.index[i][1]
        
clean_data = pd.DataFrame(list(names.items()), columns=['Name', 'Gender'])
# Shuffle the rows once and reset the index
clean_data = clean_data.sample(frac=1).reset_index(drop=True)
print(clean_data.shape)
print(clean_data.loc[clean_data['Name'] == 'Mary'])
print(clean_data.loc[clean_data['Name'] == 'John'])
!mkdir -p ../data
clean_data.to_csv('../data/name-gender.txt',index=False,header=False)
s3bucketname = input("Enter the Name of your S3 bucket : ")
s3 = boto3.resource('s3')
s3.meta.client.upload_file('../data/name-gender.txt', s3bucketname, 'data/name-gender.txt')
s3.ObjectAcl(s3bucketname,'data/name-gender.txt').put(ACL='public-read')

## Feature representation
Before we start building the model, we need to represent the data in a format that we can feed into the LSTM model. We do so using the following steps.

* Although we already have the cleaned data loaded as a data frame, we re-load the data fresh from the S3 location. That way we verify that the file we saved can be read back as expected.
* We need to convert the names into numeric arrays, using a one-hot encoding scheme. The arrays representing the names need to be as long as the longest name record we have. Therefore we find the length of the longest name and keep it in a variable.
* Next we derive the set of characters used in all the names together.
* In order for one-hot encoding to work, we need to assign an index value to each of these characters. Since all characters of the English alphabet are used in names, we naturally use sequential values as indices.



In [None]:
filename = "https://s3.amazonaws.com/{}/data/name-gender.txt".format(s3bucketname)
training_data=pd.read_csv(filename, sep=',', names = ["Name", "Gender"])
training_data = shuffle(training_data)

#number of names
num_names = training_data.shape[0]
print("Number of names: {}".format(num_names))

# length of longest name
max_name_length = (training_data['Name'].map(len).max())
print("Maximum length of a name: {}".format(max_name_length))

#Concatenate all names into one long string, case insensitive
names = training_data['Name'].values
txt = ""
for n in names:
    txt += n.lower()

#Create a sorted list of the distinct characters used in the names
chars = sorted(set(txt))
alphabet_size = len(chars)
print('Alphabet size: {}'.format(alphabet_size))
print("Alphabets used : {}".format(chars))

#Assign indices to characters
char_indices = dict((str(chr(c)), i) for i, c in enumerate(range(97,123)))
alphabet_size = 123-97
char_indices['max_name_length'] = max_name_length
print("Character indices: ", char_indices)


The one-hot encoded array has dimensions `n * m * a`, where:
* `n` = Number of name records, 
* `m` = Maximum length of a record, and 
* `a` = Size of alphabet

Each of the `n` name records is represented by a 2-dimensional matrix of fixed size. This matrix has as many rows as the maximum length of a name record, and each row has as many elements as the alphabet size.<p>
For each character position in a given name, the corresponding row of this 2-dimensional matrix is either all zeroes (if the name has no character at that position), or a row vector with a `1` at the index assigned to that character (and zeroes in all other positions). 
    
So the name `Mary` would be encoded as the four rows below, followed of course by 11 rows of all zeroes (because the length of every encoded name has to be 15) <p>
m => [0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]<br>
a => [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]<br>
r => [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]<br>
y => [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0]    
    
* We begin encoding the `X` variable by creating a tensor containing all zeroes. Observe that its dimensions match the description above.
* Then we iterate through each character in each name record and selectively turn the matching elements (as given by the character index) to ones.
* We encode the `Y` variable as a two-column one-hot array, with `[1, 0]` representing Male and `[0, 1]` representing Female, and check to ensure that the dimensions of `X` and `Y` are compatible.

In [None]:
X = np.zeros((num_names, max_name_length, alphabet_size))
print(X.shape)
for i,name in enumerate(names):
    name = name.lower()
    for t, char in enumerate(name):
        X[i, t,char_indices[char]] = 1

np.set_printoptions(linewidth=200)
print(X[np.where(names=='Mary'),:,:])

Y = np.ones((num_names,2))
Y[training_data['Gender'] == 'F',0] = 0
Y[training_data['Gender'] == 'M',1] = 0
print(Y.shape)
data_dim = alphabet_size
timesteps = max_name_length
num_classes = 2

## Model training
We build a stacked LSTM network with a final dense layer with softmax activation (many-to-one setup).<p>
Categorical cross-entropy loss is used with the adam optimizer.<p>
A 20% dropout layer is added after each LSTM layer for regularization, to avoid over-fitting.<p>
The cell below trains the model for a single epoch with a batch size of 128; raise `epochs` to train for longer. Too large a batch size may result in an out-of-memory error.<p>
During training we set aside 10% of the training data as validation data. Keras takes this slice from the end of the arrays, which is effectively a random sample here because the data was shuffled when it was loaded. Validation data is never presented to the model during training; instead it is used to check that the model works well on data that it has never seen.<p>
This helps confirm that we are not over-fitting, that is, that the model is not simply memorizing the data it sees and can generalize its learning.<p>
After training for just one epoch, you should see about 80% accuracy.<p>
With a `t2.medium` instance, training one epoch takes about 15 minutes (11 ms/step). Training speed increases dramatically if you use a `p2` or `p3` instance type.<p>
Using a `p3.16xlarge` instance, for example, we can run 20 epochs of training on this model in about 5 minutes (430 µs/step) and achieve above 90% accuracy.

In [None]:
model = Sequential()
model.add(LSTM(512, return_sequences=True, input_shape=(timesteps, data_dim)))
model.add(Dropout(0.2))
model.add(LSTM(512, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(num_classes))
# Softmax pairs with categorical cross-entropy and matches the architecture described above
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', 
              optimizer='adam',
              metrics=['accuracy'])

model.fit(X, Y, validation_split=0.10, epochs=1, batch_size=128)

## Model testing
To test the accuracy of the model, we need the same pre-processing as we applied to the training data (one-hot encoding using the same character indices).<p>
We feed this one-hot encoded test data to the model, and `predict` generates an array similar to the training labels array we used before, except that in this case it contains what the model thinks is the gender represented by each of the test records.<p>
To present the data intuitively, we simply map each prediction back to `Male` / `Female` by comparing its two scores.
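
The mapping from prediction vectors back to labels can be sketched without a trained model. The array `fake_predictions` below is invented for illustration, and the helper name `decode_predictions` is hypothetical; the column order (column 0 for Male, column 1 for Female) follows the encoding of `Y` used in this notebook.

```python
import numpy as np

def decode_predictions(names, predictions):
    """Map each two-element prediction row to 'Male' or 'Female'."""
    predictions = np.asarray(predictions)
    # Column 0 scores Male, column 1 scores Female
    labels = np.where(predictions[:, 0] > predictions[:, 1], "Male", "Female")
    return {name: str(label) for name, label in zip(names, labels)}

# Invented scores standing in for the output of model.predict
fake_predictions = np.array([[0.08, 0.92],
                             [0.95, 0.05]])
decoded = decode_predictions(["Mary", "John"], fake_predictions)
print(decoded)
```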

In [None]:
test_name = input("Enter name : ")
val_no_special = re.compile('[%s]' % re.escape(string.punctuation)).sub(' ', test_name)
val_space_collapse = re.sub(r'\s+', ' ', val_no_special).strip()
names_test = word_tokenize(val_space_collapse)
num_test = len(names_test)

X_test = np.zeros((num_test, max_name_length, alphabet_size))

for i,name in enumerate(names_test):
    name = name.lower()
    for t, char in enumerate(name):
        X_test[i, t,char_indices[char]] = 1
        
predictions = model.predict(X_test)

for i,name in enumerate(names_test):
    print("{} is {}".format(names_test[i],"Male" if predictions[i][0]>predictions[i][1] else "Female")) 

## Model saving
Our job here is done: we have satisfied ourselves that the scheme works, and we have a somewhat useful model that we can use to predict the gender of people from their names.<p>
In order to orchestrate the ML pipeline, however, we need to save the model file (containing the weights) and the character indices (including the maximum name length) to an S3 location.<p>
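
Because the character indices are a Python dictionary, `np.save` stores them as a pickled object, and recent NumPy versions only read such files back when `allow_pickle=True` is passed to `np.load`. The round trip can be sketched as follows; the temporary directory and the file name `indices.npy` are assumptions for the example, and the value 15 stands in for the real maximum name length.

```python
import os
import tempfile
import numpy as np

# Same structure as the char_indices dictionary built earlier
char_indices = {chr(c): i for i, c in enumerate(range(97, 123))}
char_indices["max_name_length"] = 15  # placeholder value for illustration

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "indices.npy")  # hypothetical file name
    np.save(path, char_indices)
    # .item() unwraps the zero-dimensional object array back into a dict
    restored = np.load(path, allow_pickle=True).item()

print(restored["m"], restored["max_name_length"])
```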

In [None]:
!mkdir -p ../model
model.save('../model/lstm-gender-classifier-model.h5')
np.save('../model/lstm-gender-classifier-indices.npy', char_indices) 
!tar -zcvf ../model.tar.gz ../model
s3.meta.client.upload_file('../model.tar.gz', s3bucketname, 'model/model.tar.gz')
!rm -rf ../model

# Model hosting

Amazon SageMaker provides a powerful orchestration framework that you can use to productionize any machine learning algorithm of your own, using any machine learning framework and programming language.<p>
This is possible because SageMaker, as a manager of containers, has standardized ways of interacting with your code running inside a Docker container. Since you are free to build a Docker container using whatever code and dependencies you like, this gives you the freedom to bring your own machinery.<p>
In the following steps, we'll containerize the prediction code and host the model behind an API endpoint.<p>
This allows us to use the model from a web application and put it to real use.<p>
The boilerplate code, which we affectionately call the `Dockerizer` framework, was made available on this Notebook instance by the Lifecycle Configuration that you used. Just look into the folder and ensure the necessary files are available as shown.<p>
    
    <home>    
    |
    ├── container
        │
        ├── byoa
        |   |
        │   ├── train
        |   |
        │   ├── predictor.py
        |   |
        │   ├── serve
        |   |
        │   ├── nginx.conf
        |   |
        │   └── wsgi.py
        |
        ├── build_and_push.sh
        │   
        ├── Dockerfile.cpu
        │        
        └── Dockerfile.gpu

In [None]:
os.chdir('../container')
os.getcwd()
!ls -Rl 

* `Dockerfile` describes the container image, and the accompanying script `build_and_push.sh` does the heavy lifting of building the container and uploading it into an Amazon ECR repository
* The SageMaker container that we'll be building serves prediction requests using a Flask-based application. `wsgi.py` is a wrapper to invoke the Flask application, `nginx.conf` is the configuration for the nginx front end, and `serve` is the program that launches the gunicorn server. These files can be used as-is, and are required to build the web server stack serving prediction requests, following the architecture shown:
![Request serving stack](images/stack.png "Request serving stack")

* The file named `predictor.py` is where we package the code that generates inferences using the trained model that was saved to an S3 bucket location.<p>
* We'll write code into this file using the Jupyter magic command `writefile`.<p>

In [None]:
%%writefile byoa/predictor.py
# This is the file that implements a flask server to do inferences. It's the file that you will modify to
# implement the scoring for your own algorithm.

from __future__ import print_function

import os
import json
import pickle
from io import StringIO
import sys
import signal
import traceback

import numpy as np

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import LSTM
from keras.models import load_model
import flask

import tensorflow as tf

import pandas as pd

from os import listdir, sep
from os.path import abspath, basename, isdir
from sys import argv

prefix = '/opt/ml/model'
model_path = os.path.join(prefix, 'model')

# A singleton for holding the model. This simply loads the model and holds it.
# It has a predict function that does a prediction based on the model and the input data.

class ScoringService(object):
    model_type = None           # Where we keep the model type, qualified by hyperparameters used during training
    model = None                # Where we keep the model when it's loaded
    graph = None
    indices = None              # Where we keep the indices of Alphabet when it's loaded
    
    @classmethod
    def get_indices(cls):
        #Get the indices for Alphabet for this instance, loading it if it's not already loaded        
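        if cls.indices is None:
            model_type='lstm-gender-classifier'
            index_path = os.path.join(model_path, '{}-indices.npy'.format(model_type))
            if os.path.exists(index_path):
                # The indices were saved as a pickled dict, which newer NumPy only loads with allow_pickle=True
                cls.indices = np.load(index_path, allow_pickle=True).item()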
            else:
                print("Character Indices not found.")
        return cls.indices

    @classmethod
    def get_model(cls):
        #Get the model object for this instance, loading it if it's not already loaded  
        if cls.model is None:
            model_type='lstm-gender-classifier'
            mod_path = os.path.join(model_path, '{}-model.h5'.format(model_type))
            if os.path.exists(mod_path):
                cls.model = load_model(mod_path)
                cls.model._make_predict_function()
                cls.graph = tf.get_default_graph()
            else:
                print("LSTM Model not found.")
        return cls.model
    
    @classmethod
    def predict(cls, input):

        mod = cls.get_model()
        ind = cls.get_indices()

        result = {}

        if mod is None or ind is None:
            print("Model or character indices not loaded.")
        else:
            if 'max_name_length' not in ind:
                max_name_length = 15
                alphabet_size = 26
            else:
                # Read the length without mutating the cached indices,
                # so that later requests still find this key
                max_name_length = ind['max_name_length']
                alphabet_size = len(ind) - 1

            inputs_list = input.strip('\n').split(",")
            num_inputs = len(inputs_list)

            X_test = np.zeros((num_inputs, max_name_length, alphabet_size))

            for i,name in enumerate(inputs_list):
                name = name.lower().strip('\n')
                # Ignore characters beyond the length the model was trained on
                for t, char in enumerate(name[:max_name_length]):
                    if char in ind:
                        X_test[i, t,ind[char]] = 1

            with cls.graph.as_default():
                predictions = mod.predict(X_test)

            for i,name in enumerate(inputs_list):
                result[name] = 'M' if predictions[i][0]>predictions[i][1] else 'F'
                print("{} ({})".format(inputs_list[i],"M" if predictions[i][0]>predictions[i][1] else "F"))

        return json.dumps(result)
    
# The flask app for serving predictions
app = flask.Flask(__name__)

@app.route('/ping', methods=['GET'])
def ping():
    #Determine if the container is working and healthy.
    # Declare it healthy if we can load the model successfully.
    model = ScoringService.get_model()
    indices = ScoringService.get_indices()
    health = model is not None and indices is not None
    status = 200 if health else 404
    return flask.Response(response='\n', status=status, mimetype='application/json')

@app.route('/invocations', methods=['POST'])
def transformation():
    #Do an inference on a single batch of data
    data = None

    # Decode the CSV request body into a string
    if flask.request.content_type == 'text/csv':
        data = flask.request.data.decode('utf-8')
    else:
        return flask.Response(response='This predictor only supports CSV data', status=415, mimetype='text/plain')

    print('Invoked with {} records'.format(data.count(",")+1))

    # Do the prediction
    predictions = ScoringService.predict(data)

    # ScoringService.predict already returns a JSON string
    result = predictions

    return flask.Response(response=result, status=200, mimetype='text/csv')

## Container publishing

In order to host and deploy the trained model using SageMaker, we need to build the `Docker` container, publish it to an `Amazon ECR` repository, and then use either the SageMaker console or the API to create the endpoint configuration and deploy the stages.<p>

Conceptually, the steps required for publishing are:<p>
1. Make the `predictor.py` file executable
2. Create an ECR repository within your default region
3. Build a Docker container with an identifiable name
4. Tag the image and publish it to the ECR repository
<p><br>
All of these are conveniently encapsulated inside the `build_and_push` script. We simply run it with the unique name of our production run.

In [None]:
run_type='cpu'
instance_class = "p2" if run_type.lower()=='gpu' else "t2"
instance_type = "ml.{}.xlarge".format(instance_class)
pipeline_name = 'gender-classifier'
run='1'

run_name = pipeline_name+"-"+run
if run_type == "cpu":
    !cp "Dockerfile.cpu" "Dockerfile"

if run_type == "gpu":
    !cp "Dockerfile.gpu" "Dockerfile"
    
!sh build_and_push.sh $run_name

## Orchestration

At this point, we can head to the ECR console, grab the URI of the repository where we published the Docker image, and use the SageMaker console to create the hosted model and the endpoint.<p>
However, it is often more convenient to automate these steps. In this notebook we do exactly that using the `boto3 SageMaker` API.<p>
The steps are as follows:<p>
    
* First we create a model hosting definition, by providing the S3 location of the model artifact and the URI of the container image in ECR.
* Using the model hosting definition, our next step is to create the configuration of a hosted endpoint that will be used to serve prediction requests.
* Creating the endpoint is the last step in the ML cycle; it prepares your model to serve client requests from applications.
* We wait until provisioning is completed and the endpoint is in service. At this point we can send requests to this endpoint and obtain gender predictions.
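
Once the endpoint is in service, requests can also be sent from Python rather than from the AWS CLI. The sketch below wraps the `invoke_endpoint` call of the boto3 `sagemaker-runtime` client; the helper name `predict_genders` and the `FakeRuntimeClient` stand-in are hypothetical and exist only so that the example runs without AWS credentials. It assumes the response body is the JSON string produced by `predictor.py`.

```python
import io
import json

def predict_genders(names, endpoint_name, runtime_client):
    """Send comma-separated names to the endpoint and parse the JSON reply."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=",".join(names),
    )
    return json.loads(response["Body"].read().decode("utf-8"))

class FakeRuntimeClient:
    """Hypothetical stand-in that mimics the shape of the real response."""
    def invoke_endpoint(self, EndpointName, ContentType, Body):
        # Answers '?' for every name; a real endpoint would return 'M' or 'F'
        answer = {name: "?" for name in Body.split(",")}
        return {"Body": io.BytesIO(json.dumps(answer).encode("utf-8"))}

result = predict_genders(["Mary", "John"], "gender-classifier-1-endpoint", FakeRuntimeClient())
print(result)
```

Against a live endpoint, the fake client would be replaced with `boto3.client("sagemaker-runtime")`, and the endpoint name with the one created in this section.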


In [None]:
import sagemaker
sm_role = sagemaker.get_execution_role()
print("Using Role {}".format(sm_role))
acc = boto3.client('sts').get_caller_identity().get('Account')
reg = boto3.session.Session().region_name
sagemaker = boto3.client('sagemaker')

#Check if model already exists
model_name = "{}-model".format(run_name)
models = sagemaker.list_models(NameContains=model_name)['Models']
model_exists = False
if len(models) > 0:
    for model in models:
        if model['ModelName'] == model_name:
            model_exists = True
            break
#Delete model, if chosen
if model_exists == True:    
    choice = input("Model already exists, do you want to delete and create a fresh one (Y/N) ? ")
    if choice.upper()[0:1] == "Y":
        sagemaker.delete_model(ModelName = model_name)
        model_exists = False
    else:
        print("Model - {} already exists".format(model_name))

if model_exists == False:    
    model_response = sagemaker.create_model(
        ModelName=model_name,
        PrimaryContainer={
            'Image': '{}.dkr.ecr.{}.amazonaws.com/{}:latest'.format(acc, reg, run_name),
            'ModelDataUrl': 's3://{}/model/model.tar.gz'.format(s3bucketname)
        },
        ExecutionRoleArn=sm_role,
        Tags=[
            {
                'Key': 'Name',
                'Value': model_name
            }
        ]
    )
    print("{} Created at {}".format(model_response['ModelArn'], 
                                    model_response['ResponseMetadata']['HTTPHeaders']['date']))

In [None]:
#Check if endpoint configuration already exists
endpoint_config_name = "{}-endpoint-config".format(run_name)
endpoint_configs = sagemaker.list_endpoint_configs(NameContains=endpoint_config_name)['EndpointConfigs']
endpoint_config_exists = False
if len(endpoint_configs) > 0:
    for endpoint_config in endpoint_configs:
        if endpoint_config['EndpointConfigName'] == endpoint_config_name:
            endpoint_config_exists = True
            break
            
#Delete endpoint configuration, if chosen
if endpoint_config_exists == True:    
    choice = input("Endpoint Configuration already exists, do you want to delete and create a fresh one (Y/N) ? ")
    if choice.upper()[0:1] == "Y":
        sagemaker.delete_endpoint_config(EndpointConfigName = endpoint_config_name)
        endpoint_config_exists = False
    else:
        print("Endpoint Configuration - {} already exists".format(endpoint_config_name))
        
if endpoint_config_exists == False:           
    endpoint_config_response = sagemaker.create_endpoint_config(
        EndpointConfigName=endpoint_config_name,
        ProductionVariants=[
            {
                'VariantName': 'default',
                'ModelName': model_name,
                'InitialInstanceCount': 1,
                'InstanceType': instance_type,
                'InitialVariantWeight': 1
            },
        ],
        Tags=[
            {
                'Key': 'Name',
                'Value': endpoint_config_name
            }
        ]
    )
    print("{} Created at {}".format(endpoint_config_response['EndpointConfigArn'], 
                                    endpoint_config_response['ResponseMetadata']['HTTPHeaders']['date']))

In [None]:
from ipywidgets import widgets
from IPython.display import display

#Check if endpoint already exists
endpoint_name = "{}-endpoint".format(run_name)
endpoints = sagemaker.list_endpoints(NameContains=endpoint_name)['Endpoints']
endpoint_exists = False
if len(endpoints) > 0:
    for endpoint in endpoints:
        if endpoint['EndpointName'] == endpoint_name:
            endpoint_exists = True
            break
            
#Delete endpoint, if chosen
if endpoint_exists == True:    
    choice = input("Endpoint already exists, do you want to delete and create a fresh one (Y/N) ? ")
    if choice.upper()[0:1] == "Y":
        sagemaker.delete_endpoint(EndpointName = endpoint_name)
        print("Deleting Endpoint - {} ...".format(endpoint_name))
        waiter = sagemaker.get_waiter('endpoint_deleted')
        waiter.wait(EndpointName=endpoint_name,
                   WaiterConfig = {'Delay':1,'MaxAttempts':100})
        endpoint_exists = False
        print("Endpoint - {} deleted".format(endpoint_name))
        
    else:
        print("Endpoint - {} already exists".format(endpoint_name))
        
if endpoint_exists == False:  

    endpoint_response = sagemaker.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name,
        Tags=[
            {
                'Key': 'Name',
                'Value': endpoint_name
            }
        ]
    )
    status='Creating'
    sleep = 3

    print("{} Endpoint : {}".format(status,endpoint_name))
    bar = widgets.FloatProgress(min=0, description="Progress") # instantiate the bar
    display(bar) # display the bar

    while status != 'InService' and status != 'Failed' and status != 'OutOfService':    
        endpoint_response = sagemaker.describe_endpoint(
            EndpointName=endpoint_name
        )
        status = endpoint_response['EndpointStatus']
        time.sleep(sleep)
        bar.value = bar.value + 1 
        if bar.value >= bar.max-1:
            bar.max = int(bar.max*1.05)
        if status != 'InService' and status != 'Failed' and status != 'OutOfService':        
            print(".", end='')

    bar.max = bar.value     
    html = widgets.HTML(
        value="<H2>Endpoint <b><u>{}</u></b> - {}</H2>".format(endpoint_response['EndpointName'], status)
    )
    display(html)

In [None]:
test_name = input("Enter name : ")
val_no_special = re.compile('[%s]' % re.escape(string.punctuation)).sub(' ', test_name)
val_space_collapse = re.sub(r'\s+', ' ', val_no_special).strip()
names_test = word_tokenize(val_space_collapse)
request_body = ",".join(names_test)

!aws sagemaker-runtime invoke-endpoint --endpoint-name "$run_name-endpoint" --body "$request_body" --content-type text/csv outfile
!cat outfile