## Deploying as a FaaS

### Verifying the model

The first part of this milestone is to verify our model. Doing so confirms that we have everything needed to run it in a remote environment such as a FaaS, and that it can classify new text.

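### Download corpora

We still need the stop-word list and the Punkt tokenizer data (punkt), which word_tokenize relies on, to prepare our input. The punctuation list comes from Python's string module, so it needs no download. The movie reviews corpus is not needed either, because what the model learned from it is already stored in the trained model.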

In [None]:
from nltk import download

download('punkt')
download('stopwords')
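In a FaaS environment the default NLTK data directory may not be writable, so it can help to choose the download location explicitly. The sketch below is one possible approach, not part of the original notebook: `ensure_corpora`, its `downloader` parameter, and the `/tmp/nltk_data` path are hypothetical names introduced for illustration. It relies on the `download_dir` and `quiet` arguments of `nltk.download` and on the `nltk.data.path` search list.

```python
import os

def ensure_corpora(resources, target_dir, downloader=None):
    """Download each NLTK resource into target_dir and return {name: success}."""
    if downloader is None:
        # Imported lazily so the helper can also be exercised with a stub downloader.
        import nltk
        if target_dir not in nltk.data.path:
            # Tell NLTK to search this directory when loading resources.
            nltk.data.path.append(target_dir)
        downloader = nltk.download
    os.makedirs(target_dir, exist_ok=True)
    return {
        name: bool(downloader(name, download_dir=target_dir, quiet=True))
        for name in resources
    }

# Hypothetical usage (requires network access and NLTK installed):
# ensure_corpora(['punkt', 'stopwords'], '/tmp/nltk_data')
```

The path is an assumption; which directories are writable depends on the platform the function is deployed to.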

### Prepare feature extraction and bag-of-words converter

This implementation must match the one used to train the model; otherwise the features produced here will not line up with the features the model learned.

In [4]:
from nltk.corpus import stopwords
from nltk.stem.lancaster import LancasterStemmer
from nltk import everygrams
from string import punctuation as punctuation_list

from nltk.tokenize import word_tokenize

stopword_list = stopwords.words('english')
stemmer = LancasterStemmer()

def extract_features(input_string):
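    # First pass: split the input into word tokens.
    words = word_tokenize(input_string)
    # Second pass: stem each token once, then drop stems that are stop words or punctuation.
    # Filtering happens after stemming, so a stem such as 'thi' (from 'this') is kept.
    stems = [stemmer.stem(word) for word in words]
    features = [stem for stem in stems if stem not in stopword_list and stem not in punctuation_list]

    # Third pass: generate every n-gram of length 1 to 3.
    n_grams = everygrams(features, max_len=3)

    return n_grams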

def bag_of_words(words):
    bag = {}
    for word in words:
        bag[word] = bag.get(word, 0) + 1
    return bag

tokens = list(extract_features('I Really did not likE this movie all that much'))
print(bag_of_words(tokens))

{('real',): 1, ('real', 'lik'): 1, ('real', 'lik', 'thi'): 1, ('lik',): 1, ('lik', 'thi'): 1, ('lik', 'thi', 'movy'): 1, ('thi',): 1, ('thi', 'movy'): 1, ('thi', 'movy', 'al'): 1, ('movy',): 1, ('movy', 'al'): 1, ('movy', 'al', 'much'): 1, ('al',): 1, ('al', 'much'): 1, ('much',): 1}
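To make the n-gram step easier to follow, here is a minimal pure-Python sketch of what it produces for the stems in the output above. `every_ngrams` is a hypothetical helper written for illustration; it approximates the behaviour seen here and is not NLTK's implementation of `everygrams`.

```python
from collections import Counter

def every_ngrams(tokens, max_len=3):
    """Yield every contiguous n-gram of length 1..max_len, grouped by start position."""
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            if start + length <= len(tokens):
                yield tuple(tokens[start:start + length])

# Stems that survive filtering for the example sentence above.
stems = ['real', 'lik', 'thi', 'movy', 'al', 'much']

# Counter plays the same role as bag_of_words: n-gram -> number of occurrences.
bag = Counter(every_ngrams(stems))
print(len(bag))  # 6 unigrams + 5 bigrams + 4 trigrams = 15
```

Each n-gram is a tuple, which is why the keys of the bag are tuples rather than strings.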


### Import trained model

In [8]:
import pickle

MODEL_PATH = '../model/sa_classifier.pickle'

with open(MODEL_PATH, 'rb') as file:
    model = pickle.load(file)

### Define prediction method

In [6]:
def get_sentiment(review):
    words = extract_features(review)
    words = bag_of_words(words)
    return model.classify(words)
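Once the prediction function works locally, it can be wrapped in a request handler for the FaaS. The sketch below assumes a generic handler that receives a dictionary with a JSON string under `body` and returns a dictionary with `statusCode` and `body`; that request shape, `make_handler`, and `stub_sentiment` are assumptions for illustration and are not tied to any specific provider. In the notebook, `get_sentiment` would be passed in place of the stub.

```python
import json

def make_handler(classify):
    """Build a handler that classifies the 'review' field of a JSON request body."""
    def handler(event, context=None):
        try:
            payload = json.loads(event.get('body') or '{}')
        except (json.JSONDecodeError, TypeError):
            return {'statusCode': 400,
                    'body': json.dumps({'error': 'body must be valid JSON'})}
        review = payload.get('review') if isinstance(payload, dict) else None
        if not isinstance(review, str) or not review.strip():
            return {'statusCode': 400,
                    'body': json.dumps({'error': "missing 'review' text"})}
        return {'statusCode': 200,
                'body': json.dumps({'sentiment': classify(review)})}
    return handler

# Stand-in classifier so the sketch runs without the pickled model.
def stub_sentiment(review):
    return 'pos' if 'best' in review.lower() else 'neg'

handler = make_handler(stub_sentiment)
response = handler({'body': json.dumps({'review': 'The best movie ever.'})})
print(response)
```

Passing the classifier in as an argument keeps the handler testable without loading the model.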

### Validate model is working as expected

In [9]:
positive_review = 'This movie is probably the best movie I have ever seen in my entire life.'
print('positive_review: '+get_sentiment(positive_review))

negative_review = 'Trash, trash, and more trash. Two thumbs down! I would not recommend this movie to my worst enemy.'
print('negative_review: '+get_sentiment(negative_review))

positive_review: pos
negative_review: neg
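The manual check above can also be turned into a small repeatable smoke test, which is handy to rerun after deployment. This is a minimal sketch: `run_smoke_test` and `keyword_sentiment` are hypothetical names, and the keyword-based stub only stands in for `get_sentiment` so the example runs without the model.

```python
def run_smoke_test(classify, cases):
    """Return (review, expected, actual) for every case the classifier gets wrong."""
    failures = []
    for review, expected in cases:
        actual = classify(review)
        if actual != expected:
            failures.append((review, expected, actual))
    return failures

# Stand-in classifier; in the notebook, pass get_sentiment instead.
def keyword_sentiment(review):
    return 'neg' if 'trash' in review.lower() else 'pos'

cases = [
    ('This movie is probably the best movie I have ever seen in my entire life.', 'pos'),
    ('Trash, trash, and more trash. Two thumbs down! '
     'I would not recommend this movie to my worst enemy.', 'neg'),
]

failures = run_smoke_test(keyword_sentiment, cases)
assert not failures, failures
```

Two reviews only confirm that the pipeline runs end to end; they say little about the model's accuracy.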
