# 5-Deployment

Using Google Colab, I fine-tuned an LLM (DistilBERT) to classify movie reviews. A positive review should get a score of 1 (*i.e.,* thumbs-up) and a negative review should get a score of 0 (*i.e.,* thumbs-down). Google Colab let me perform the fine-tuning with GPU support. I saved the resulting tokenizer and fine-tuned model to a local directory. (The directory is over 250 MB in size, so I did not upload it to this project's GitHub repo.)

I then used Flask to set up a local server at http://localhost:5000/classify. (The corresponding code is in 'notebooks/api.py'.) Upon receiving a POST request whose payload is a list of (text) movie reviews, the server applies the model to generate the corresponding movie-review classifications. Each classification is returned as a pair of probabilities, one for label 1 and one for label 0.
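As a rough illustration of where such a pair of probabilities can come from: a two-class classifier typically emits two raw scores (logits), and a softmax turns them into probabilities that sum to 1. The sketch below assumes the server works this way (the contents of `api.py` are not shown here), and the logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one review: [score for label 0, score for label 1]
logits = [8.9, -8.8]
prob_0, prob_1 = softmax(logits)
print(prob_0, prob_1)
```

Under this assumption, a wide gap between the two logits is enough to push the smaller probability very close to zero, which would be consistent with a confident classifier.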

## Import

In [1]:
import pandas as pd
import requests

## Utils

In [2]:
def collapse_dicts(d: dict) -> dict:
    """
    Collapse a two-key dictionary to its values.

    Given
        {k1: v1, k2: v2},
    return
        {v1: v2}.
    """
    values = list(d.values())
    return {values[0]: values[1]}

In [3]:
def merge_dicts(dict_pair) -> dict:
    """
    Merge a pair of dictionaries.

    Given
        [{k1: v1}, {k2: v2}],
    return
        {k1: v1, k2: v2}.
    """
    return {**dict_pair[0], **dict_pair[1]}
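
To see how the two helpers fit together, here is a small worked example on a single review's result. The input shape is an assumption inferred from how the helpers are used: one two-key dictionary per label. The key names `'label'` and `'score'` and the probability values are hypothetical.

```python
# Helpers repeated from the cells above so this snippet runs on its own.
def collapse_dicts(d: dict) -> dict:
    values = list(d.values())
    return {values[0]: values[1]}


def merge_dicts(dict_pair) -> dict:
    return {**dict_pair[0], **dict_pair[1]}


# Hypothetical server output for one review; the key names 'label' and
# 'score' are assumed, not taken from api.py.
one_review = [
    {'label': 'LABEL_0', 'score': 0.98},
    {'label': 'LABEL_1', 'score': 0.02},
]

# Step 1: turn each {'label': name, 'score': p} into {name: p}.
collapsed = [collapse_dicts(d) for d in one_review]

# Step 2: combine the two single-key dictionaries into one.
merged = merge_dicts(collapsed)

print(collapsed)
print(merged)
```

Note that `collapse_dicts` depends on dictionary insertion order: the label must come before the score in each input dictionary, otherwise the key and value would be swapped.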

In [4]:
def get_predictions_df(reviews_scores: dict) -> pd.DataFrame:
    """Return a DataFrame of reviews, scores, and classifier predictions.

    Makes a POST request to my movie-review-classifier and organizes
    the request and results into a DataFrame.

    :param reviews_scores: Text reviews and corresponding scores (0 or 1)
    :type reviews_scores: dict

    :return: A DataFrame of reviews, scores, and classifier predictions
    :rtype: pd.DataFrame
    """
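    reviews: list = reviews_scores['reviews']
    # Read the scores from the argument rather than relying on a global
    scores: list = reviews_scores['scores']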

    # The URL of the movie-review-classifier (local) server
    url = 'http://localhost:5000/classify'
    payload = {'reviews': reviews}
    response = requests.post(url, json=payload)
    # Fail early on an HTTP error instead of parsing an error body
    response.raise_for_status()
    response_json: list = response.json()

    results = []
    for rj in response_json:
        new: dict = merge_dicts([collapse_dicts(d) for d in rj])
        results.append(new)

    probs_of_0 = [result['LABEL_0'] for result in results]
    probs_of_1 = [result['LABEL_1'] for result in results]

    return pd.DataFrame({
        'review': reviews,
        'score': scores,
        'prob_0': probs_of_0,
        'prob_1': probs_of_1
    })

## Tests

In [5]:
# A list of movie reviews
reviews = [
"""I hate this movie.""",
"""Retroactively enriching Fury Road with greater emotional heft if not quite matching it in propulsive throttle, 
Furiosa is another glorious swerve in mastermind George Miller's breathless race towards cinematic Valhalla.""",
"Absolutely riveting. I enjoyed every minute of this film.",
"Pathetically predictable; not even deserving a ½ star; don't waste your movie time on this."
]

# The scores corresponding to the above movie reviews. A 1 is a thumbs-up.
# A 0 is a thumbs-down.
scores = [0, 1, 0, 0]

reviews_scores = {
    'reviews': reviews,
    'scores': scores
}


In [6]:
df = get_predictions_df(reviews_scores)
df

|   | review | score | prob_0 | prob_1 |
|---|---|---|---|---|
| 0 | I hate this movie. | 0 | 1.0 | 2.021951e-08 |
| 1 | Retroactively enriching Fury Road with greater... | 1 | 2.471612e-08 | 1.0 |
| 2 | Absolutely riveting. I enjoyed every minute of... | 0 | 2.583128e-08 | 1.0 |
| 3 | Pathetically predictable; not even deserving a... | 0 | 1.0 | 1.418502e-08 |
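
One way to read these results is to turn each probability pair into a predicted label and compare it with the recorded score. The sketch below is not part of the original notebook. It uses plain lists with the values copied from the output above, and the rule "predict 1 when `prob_1` exceeds `prob_0`" is an assumed decision rule.

```python
# Values copied from the output table above.
scores = [0, 1, 0, 0]
prob_0 = [1.0, 2.471612e-08, 2.583128e-08, 1.0]
prob_1 = [2.021951e-08, 1.0, 1.0, 1.418502e-08]

# Assumed decision rule: predict 1 when label 1 gets more probability.
predicted = [int(p1 > p0) for p0, p1 in zip(prob_0, prob_1)]

# Row indices where the prediction differs from the recorded score.
mismatches = [i for i, (s, p) in enumerate(zip(scores, predicted)) if s != p]

# Fraction of rows where prediction and recorded score agree.
accuracy = sum(s == p for s, p in zip(scores, predicted)) / len(scores)

print(predicted, mismatches, accuracy)
```

With these values, the only disagreement is row 2, where the recorded score is 0 while the classifier puts nearly all of its probability on label 1. Since that review's text reads as positive, the recorded score for that row may be worth rechecking.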
