# Deployed with Flask

Using scikit-learn, I trained a classifier on the IMDB dataset to label movie reviews by sentiment. A positive review should get a score of 1 (*i.e.,* thumbs-up) and a negative review should get a score of 0 (*i.e.,* thumbs-down).

I then used Flask to set up a local server at http://localhost:5000/classify. (The corresponding code is in 'app/api.py'.) Upon receiving a POST request whose JSON payload contains a list of (text) movie reviews, the server applies the trained model and returns one prediction per review.
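Since 'app/api.py' is not reproduced in this notebook, the following is only a minimal sketch of what such an endpoint could look like, not the actual server code. The `'reviews'` and `'labels'` keys match the client code used further down. The `StubModel` class, its toy rule, and the error response are assumptions made purely for illustration; the real server would use the trained scikit-learn model instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


class StubModel:
    """Hypothetical stand-in for the trained scikit-learn classifier."""

    def predict(self, texts):
        # Toy rule for illustration only: thumbs-up if the text mentions "great".
        return [1 if 'great' in text.lower() else 0 for text in texts]


model = StubModel()


@app.route('/classify', methods=['POST'])
def classify():
    # Expect a JSON body of the form {"reviews": ["...", "..."]}.
    payload = request.get_json(silent=True)
    reviews = payload.get('reviews') if isinstance(payload, dict) else None
    if not isinstance(reviews, list):
        return jsonify({'error': "expected a JSON object with a 'reviews' list"}), 400
    # One integer label per review, in the same order as the input.
    labels = [int(label) for label in model.predict(reviews)]
    return jsonify({'labels': labels})
```

With a real model in place of the stub, calling `app.run(port=5000)` would serve this route at the URL given above.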

To try out the trained classifier, I use samples from the set of unlabelled (*i.e.,* unsupervised) movie reviews in the IMDB dataset.

## Imports

In [1]:
import numpy as np
import pandas as pd
import requests

from datasets import load_dataset

from app.cleaner.preprocessor import Preprocessor
from app.cleaner.tokenizer import Tokenizer

In [2]:
rng = np.random.default_rng()

## Load Dataset

In [3]:
ds = load_dataset('imdb')
all_reviews = ds['unsupervised']['text']

Here's an example movie review:

In [4]:
rng.choice(all_reviews)

"I was very disappointed in this movie. Hearing the DVD commentary by the director, producer, and screenwriter explained why it was so bad but annoyed me because they didn't seem to know the book as well as anyone adapting a book into a movie should. It is understandable that the director and producer were much better acquainted with the screenplay than with its source, but the screenwriter has no excuse. She frequently said that adapting a popular book is difficult because fans of the book are so devoted to the text that they are upset because their own favorite episodes are left out. I enjoyed the book but am not that devoted a fan, and though I was sorry to see characters and scenes I liked left out, I know it would be impossible to make a movie that retained everything in a full-length novel. But to take the title, characters, and some of the events of a book, but change significant motivations, events, or characters beyond recognition is inexcusable. It was especially annoying whe

## Predictions

In [5]:
# URL of the movie-review-classifier server
URL = 'http://localhost:5000/classify'

In [6]:
def get_predictions(reviews: list[str]) -> list[int]:
    """Return a list of movie-review-classifier predictions.

    Makes a POST request to the movie-review classifier and returns
    a list of 0's and 1's where a 0 indicates a thumbs-down
    classification of the corresponding movie review and a 1 indicates
    a thumbs-up.

    :param reviews: A list of movie reviews
    :type reviews: list[str]

    :return: A list of predictions from the movie-review-classifier
    :rtype: list[int]
    """
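    payload = {'reviews': reviews}
    # fail fast if the server is unreachable or returns an error status
    response = requests.post(URL, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()['labels']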

In [7]:
# sample without replacement so the same review is not drawn twice
reviews = rng.choice(all_reviews, size=4, replace=False).tolist()
reviews

['In my private library Bernard Cornwell\'s novels are the most read. And the Sharpe novels do show some wear and tear. As it is with the movies. As soon as my partners hears the song "Over the Hills..." she knows it is time to visit her friend. I am so glad they succeeded in merging the first three books into one movies and make it believable that everything takes place after Waterloo. There is of course one setback: I do so miss Obediah Hakeswill. But Sergeant Bickerstaff is a worthy replacement. And that while Bickerstaff is such a nice guy in the novel. But Dodd is there as evil as he is in Sharpe\'s Fortress. A great movie and a worthy successor to series. Am I the only one who misses the rest of the gang? Micheal Mears (Rifleman Cooper), John Tams (Daniel Hagman)and Jason Salkey (Rifleman Harris. A pity but not all of us have the luck of the Irish.',
 "Those who think that Cole Porter and the Cold War make strange bedfellows aren't going to have their minds changed by this movie.

In [8]:
predictions = get_predictions(reviews)
predictions

[1, 0, 0, 1]
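A bare list of 0s and 1s is hard to read next to the long review texts. As a small sketch, the predictions can be paired with the start of each review and a readable label. The helper name `summarize` and the `width` parameter are hypothetical and not part of the app.

```python
def summarize(reviews: list[str], predictions: list[int], width: int = 60) -> list[str]:
    """Pair each review with a readable label (hypothetical helper)."""
    # The server is expected to return exactly one label per review.
    if len(reviews) != len(predictions):
        raise ValueError('expected exactly one prediction per review')
    names = {0: 'thumbs-down', 1: 'thumbs-up'}
    # Keep only the first `width` characters of each review for display.
    return [f'{names[p]}: {r[:width]}' for r, p in zip(reviews, predictions)]
```

In the notebook, `print('\n'.join(summarize(reviews, predictions)))` would then show one labelled line per sampled review.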