In [1]:
import pandas as pd

# show the full text of each column instead of truncating it
pd.set_option('display.max_colwidth', None)

# load the movie data
data_df = pd.read_csv('../resources/Course Materials/Data/movie_reviews_sentiment.csv')

# keep three columns and give them shorter names
sentiment_df = (
    data_df[['movie_title', 'movie_info', 'sentiment_vader']]
    .rename(columns={'movie_title': 'title', 'movie_info': 'info', 'sentiment_vader': 'vader'})
)
sentiment_df.head()

Unnamed: 0,title,info,vader
0,A Dog's Journey,"Bailey (voiced again by Josh Gad) is living the good life on the Michigan farm of his ""boy,"" Ethan (Dennis Quaid) and Ethan's wife Hannah (Marg Helgenberger). He even has a new playmate: Ethan and Hannah's baby granddaughter, CJ. The problem is that CJ's mom, Gloria (Betty Gilpin), decides to take CJ away. As Bailey's soul prepares to leave this life for a new one, he makes a promise to Ethan to find CJ and protect her at any cost. Thus begins Bailey's adventure through multiple lives filled with love, friendship and devotion as he, CJ (Kathryn Prescott), and CJ's best friend Trent (Henry Lau) experience joy and heartbreak, music and laughter, and few really good belly rubs.",0.9837
1,A Dog's Way Home,"Separated from her owner, a dog sets off on an 400-mile journey to get back to the safety and security of the place she calls home. Along the way, she meets a series of new friends and manages to bring a little bit of comfort and joy to their lives.",0.9237
2,A Tuba to Cuba,"The leader of New Orleans' famed Preservation Hall Jazz Band seeks to fulfill his late father's dream of retracing their musical roots to the shores of Cuba in search of the indigenous music that gave birth to New Orleans jazz. A TUBA TO CUBA celebrates the triumph of the human spirit expressed through the universal language of music and challenges us to resolve to build bridges, not walls.",0.936
3,A Vigilante,"A once abused woman, Sadie (Olivia Wilde), devotes herself to ridding victims of their domestic abusers while hunting down the husband she must kill to truly be free. A Vigilante is a thriller inspired by the strength and bravery of real domestic abuse survivors and the incredible obstacles to safety they face.",-0.0334
4,After,"Based on Anna Todd's best-selling novel which became a publishing sensation on social storytelling platform Wattpad, AFTER follows Tessa (Langford), a dedicated student, dutiful daughter and loyal girlfriend to her high school sweetheart, as she enters her first semester in college. Armed with grand ambitions for her future, her guarded world opens up when she meets the dark and mysterious Hardin Scott (Tiffin), a magnetic, brooding rebel who makes her question all she thought she knew about herself and what she wants out of life.",0.9349


In [2]:
import torch

if torch.cuda.is_available():
    device = torch.device("cuda")
    print(f"Using GPU: {torch.cuda.get_device_name(0)}")
    # Check available GPU memory
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
    torch.cuda.empty_cache()  # Clear cache before starting
elif torch.backends.mps.is_available():
    device = torch.device("mps")
    print("Using Apple MPS")
else:
    device = torch.device("cpu")
    print("Using CPU")

Using GPU: NVIDIA GeForce GTX 1650 with Max-Q Design
GPU Memory: 4.29 GB


In [3]:
# sentiment analysis with hugging face
from transformers import pipeline

In [4]:
%%time

# run on the device selected above (GPU if one is available);
# truncation=True cuts inputs that exceed the model's maximum length
sentiment_analyzer = pipeline("sentiment-analysis",
                              model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
                              device=device,
                              truncation=True)

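# Note: Series.apply calls the pipeline once per row and forwards batch_size to each call,
# so rows are still processed one at a time (hence the "sequentially on GPU" warning in the output).
# Real batching requires passing a list of texts to the pipeline in a single call.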
sentiment_scores = sentiment_df['info'].apply(sentiment_analyzer, batch_size=8)

if torch.cuda.is_available():
    print(f"Memory used: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
    
sentiment_scores[:5]

Device set to use cuda
You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset


Memory used: 0.28 GB
CPU times: total: 2.8 s
Wall time: 3.07 s


0    [{'label': 'POSITIVE', 'score': 0.9982469081878662}]
1    [{'label': 'POSITIVE', 'score': 0.9995336532592773}]
2    [{'label': 'POSITIVE', 'score': 0.9994434714317322}]
3    [{'label': 'POSITIVE', 'score': 0.9994601607322693}]
4    [{'label': 'POSITIVE', 'score': 0.9972022771835327}]
Name: info, dtype: object
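
The warning in the output above appears because `Series.apply` hands the pipeline one description at a time. Below is a minimal sketch of batched scoring, assuming the pipeline accepts a list of strings together with `batch_size` and returns one result dict per input. `score_in_batches` and `fake_analyzer` are hypothetical names; the stand-in analyzer is not a real model and exists only so the sketch runs without downloading one.

```python
from typing import Callable, Dict, List


def score_in_batches(texts: List[str], analyzer: Callable, batch_size: int = 8) -> List[List[Dict]]:
    """Score all texts in a single analyzer call so they can be batched on the device."""
    results = analyzer(list(texts), batch_size=batch_size)
    # wrap each result in a list to match the per-row shape produced by Series.apply
    return [[result] for result in results]


def fake_analyzer(texts, batch_size=1):
    """Hypothetical stand-in for the Hugging Face pipeline (NOT a real model)."""
    # always returns the same result; only the output shape matters here
    return [{'label': 'POSITIVE', 'score': 1.0} for _ in texts]


scores = score_in_batches(["first text", "second text", "third text"], fake_analyzer, batch_size=2)
print(len(scores))
print(scores[0])
```

With the real pipeline, the call would presumably look like `score_in_batches(sentiment_df['info'].tolist(), sentiment_analyzer)`, and wrapping the result in `pd.Series(..., index=sentiment_df.index)` should keep the shape that the next cell expects.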

In [5]:
# extract the predicted label and its confidence score for every row
sentiment_df['Label_HF'] = sentiment_scores.apply(lambda x: x[0]['label'])
sentiment_df['Score_HF'] = sentiment_scores.apply(lambda x: x[0]['score'])

# combine label and score into one signed value: +score for POSITIVE, -score for NEGATIVE
# Note: DataFrame.apply defaults to axis=0, which passes each column to the lambda;
# axis=1 passes each row instead, which is what we need here
sentiment_df['Sentiment_HF'] = sentiment_df.apply(lambda row: row['Score_HF'] if row['Label_HF'] == 'POSITIVE' else -row['Score_HF'], axis=1)
sentiment_df.head()

Unnamed: 0,title,info,vader,Label_HF,Score_HF,Sentiment_HF
0,A Dog's Journey,"Bailey (voiced again by Josh Gad) is living the good life on the Michigan farm of his ""boy,"" Ethan (Dennis Quaid) and Ethan's wife Hannah (Marg Helgenberger). He even has a new playmate: Ethan and Hannah's baby granddaughter, CJ. The problem is that CJ's mom, Gloria (Betty Gilpin), decides to take CJ away. As Bailey's soul prepares to leave this life for a new one, he makes a promise to Ethan to find CJ and protect her at any cost. Thus begins Bailey's adventure through multiple lives filled with love, friendship and devotion as he, CJ (Kathryn Prescott), and CJ's best friend Trent (Henry Lau) experience joy and heartbreak, music and laughter, and few really good belly rubs.",0.9837,POSITIVE,0.998247,0.998247
1,A Dog's Way Home,"Separated from her owner, a dog sets off on an 400-mile journey to get back to the safety and security of the place she calls home. Along the way, she meets a series of new friends and manages to bring a little bit of comfort and joy to their lives.",0.9237,POSITIVE,0.999534,0.999534
2,A Tuba to Cuba,"The leader of New Orleans' famed Preservation Hall Jazz Band seeks to fulfill his late father's dream of retracing their musical roots to the shores of Cuba in search of the indigenous music that gave birth to New Orleans jazz. A TUBA TO CUBA celebrates the triumph of the human spirit expressed through the universal language of music and challenges us to resolve to build bridges, not walls.",0.936,POSITIVE,0.999443,0.999443
3,A Vigilante,"A once abused woman, Sadie (Olivia Wilde), devotes herself to ridding victims of their domestic abusers while hunting down the husband she must kill to truly be free. A Vigilante is a thriller inspired by the strength and bravery of real domestic abuse survivors and the incredible obstacles to safety they face.",-0.0334,POSITIVE,0.99946,0.99946
4,After,"Based on Anna Todd's best-selling novel which became a publishing sensation on social storytelling platform Wattpad, AFTER follows Tessa (Langford), a dedicated student, dutiful daughter and loyal girlfriend to her high school sweetheart, as she enters her first semester in college. Armed with grand ambitions for her future, her guarded world opens up when she meets the dark and mysterious Hardin Scott (Tiffin), a magnetic, brooding rebel who makes her question all she thought she knew about herself and what she wants out of life.",0.9349,POSITIVE,0.997202,0.997202
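
The signed score puts the Hugging Face result on the same -1 to 1 scale as the VADER column. The rule from the cell above can be sketched as a plain function; `signed_score` is a hypothetical name, and the inputs are values copied from the tables in this notebook.

```python
def signed_score(label: str, score: float) -> float:
    """Return +score for a POSITIVE label and -score for any other label."""
    return score if label == 'POSITIVE' else -score


# label/score pairs taken from the notebook output
print(signed_score('POSITIVE', 0.998247))   # 0.998247
print(signed_score('NEGATIVE', 0.999203))   # -0.999203
```

On the DataFrame itself, a vectorized alternative that should give the same column is `sentiment_df['Score_HF'].where(sentiment_df['Label_HF'] == 'POSITIVE', -sentiment_df['Score_HF'])`, which avoids the row-wise `apply`.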


In [7]:
# show the five rows the Hugging Face model scores as most negative
sentiment_df.sort_values('Sentiment_HF').head()

Unnamed: 0,title,info,vader,Label_HF,Score_HF,Sentiment_HF
22,Braid,"Two wanted women decide to rob their wealthy yet mentally unstable friend who lives in a fantasy world they all created as children. To take her money, the girls must take part in a deadly and perverse game of make believe throughout a sprawling yet decaying estate. As things become increasingly violent and hallucinatory, they realize that obtaining the money may be the least of their concerns.",-0.8316,NEGATIVE,0.999203,-0.999203
103,Spider-Man: Far From Home,"Peter Parker returns in Spider-Man: Far From Home, the next chapter of the Spider-Man: Homecoming series! Our friendly neighborhood Super Hero decides to join his best friends Ned, MJ, and the rest of the gang on a European vacation. However, Peter's plan to leave super heroics behind for a few weeks are quickly scrapped when he begrudgingly agrees to help Nick Fury uncover the mystery of several elemental creature attacks, creating havoc across the continent!",0.9722,NEGATIVE,0.998805,-0.998805
34,Dragged Across Concrete,"DRAGGED ACROSS CONCRETE follows two police detectives who find themselves suspended when a video of their strong-arm tactics is leaked to the media. With little money and no options, the embittered policemen descend into the criminal underworld and find more than they wanted waiting in the shadows.",-0.9015,NEGATIVE,0.998734,-0.998734
165,Yesterday,"Jack Malik (Himesh Patel, BBC's Eastenders) is a struggling singer-songwriter in a tiny English seaside town whose dreams of fame are rapidly fading, despite the fierce devotion and support of his childhood best friend, Ellie (Lily James, Mamma Mia! Here We Go Again). Then, after a freak bus accident during a mysterious global blackout, Jack wakes up to discover that The Beatles have never existed... and he finds himself with a very complicated problem, indeed.",0.1365,NEGATIVE,0.998447,-0.998447
102,Skin,"A white supremacist reforms his life after falling in love but saying goodbye to his skinhead life isn't a clean process. He must betray his former gang and work alongside the FBI in order to remove the body ink that has represented his identity for so long, as well as the burden of the gang's crimes he has carried.",-0.8377,NEGATIVE,0.996846,-0.996846
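
Some rows above receive opposite signs from the two methods, for example Spider-Man: Far From Home (VADER 0.9722, Hugging Face -0.998805). Below is a minimal sketch for flagging such rows, using the five rows shown above as plain tuples. `signs_disagree` is a hypothetical helper, and defining a disagreement as "the product of the two scores is negative" is an assumption of this sketch, not something the notebook defines.

```python
def signs_disagree(vader: float, hf: float) -> bool:
    """True when one score is positive and the other is negative."""
    return vader * hf < 0


# (title, vader, Sentiment_HF) copied from the output above
rows = [
    ("Braid", -0.8316, -0.999203),
    ("Spider-Man: Far From Home", 0.9722, -0.998805),
    ("Dragged Across Concrete", -0.9015, -0.998734),
    ("Yesterday", 0.1365, -0.998447),
    ("Skin", -0.8377, -0.996846),
]

disagreements = [title for title, vader, hf in rows if signs_disagree(vader, hf)]
print(disagreements)  # ['Spider-Man: Far From Home', 'Yesterday']
```

The same filter on the full DataFrame could be written as `sentiment_df[sentiment_df['vader'] * sentiment_df['Sentiment_HF'] < 0]`.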
