# Text Translation and Sentiment Analysis using Transformers

## Project Overview:

The objective of this project is to analyze the sentiment of movie reviews in three different languages - English, French, and Spanish. There are a total of 30 movies, 10 in each language, along with their reviews and synopses. The data was provided in 3 separate CSV files named `movie_reviews_eng.csv`, `movie_reviews_fr.csv`, and `movie_reviews_sp.csv`.

The pipeline takes a dataframe of reviews in any one of these languages, translates the text into English (if it is not already in English), and then labels the sentiment of each review.

In [9]:
# imports
from pathlib import Path
import sys

ROOT_DIR = Path.cwd()
# sys.path entries should be strings, so convert the Path before appending
sys.path.append(str(ROOT_DIR))

import pandas as pd
from translator import Translator
# pipeline is needed later to build the sentiment classifier
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

import warnings
warnings.filterwarnings("ignore")

### Creating Translator Class

### Get data from `.csv` files and then preprocess data

In [10]:
sp_translator = Translator("data/movie_reviews_sp.csv")
translated_sp = sp_translator.translate()

Original Language: ES


In [6]:
sp_translator.original_df

Unnamed: 0,Title,Year,Synopsis,Review
0,Roma,2018,Cleo (Yalitza Aparicio) es una joven empleada ...,"""Roma es una película hermosa y conmovedora qu..."
1,La Casa de Papel,(2017-2021),Esta serie de televisión española sigue a un g...,"""La Casa de Papel es una serie emocionante y a..."
2,Y tu mamá también,2001,Dos amigos adolescentes (Gael García Bernal y ...,"""Y tu mamá también es una película que se qued..."
3,El Laberinto del Fauno,2006,"Durante la posguerra española, Ofelia (Ivana B...","""El Laberinto del Fauno es una película fascin..."
4,Amores perros,2000,Tres historias se entrelazan en esta película ...,"""Amores perros es una película intensa y conmo..."
5,Águila Roja,(2009-2016),Esta serie de televisión española sigue las av...,"""Águila Roja es una serie aburrida y poco inte..."
6,Toc Toc,2017,"En esta comedia española, un grupo de personas...","""Toc Toc es una película aburrida y poco origi..."
7,El Bar,2017,Un grupo de personas quedan atrapadas en un ba...,"""El Bar es una película ridícula y sin sentido..."
8,Torrente: El brazo tonto de la ley,1998,"En esta comedia española, un policía corrupto ...","""Torrente es una película vulgar y ofensiva qu..."
9,El Incidente,2014,"En esta película de terror mexicana, un grupo ...","""El Incidente es una película aburrida y sin s..."


In [7]:
translated_sp

Unnamed: 0,Title,Year,Synopsis,Review
0,Roma,2018,Cleo (Yalitza Aparicio) is a young domestic wo...,"""Rome is a beautiful and moving film that pays..."
1,La Casa de Papel,(2017-2021),This Spanish television series follows a group...,"""The Paper House is an exciting and addictive ..."
2,Y tu mamá también,2001,Two teenage friends (Gael García Bernal and Di...,"""And your mom is also a movie that stays with ..."
3,El Laberinto del Fauno,2006,"During the Spanish postwar period, Ofelia (Iva...","""The Labyrinth of Fauno is a fascinating and e..."
4,Amores perros,2000,Three stories intertwine in this Mexican film:...,"""Amores dogs is an intense and moving film tha..."
5,Águila Roja,(2009-2016),This Spanish television series follows the adv...,"""Red Eagle is a boring and uninteresting serie..."
6,Toc Toc,2017,"In this Spanish comedy, a group of people with...","""Toc Toc is a boring and unoriginal film that ..."
7,El Bar,2017,A group of people are trapped in a bar after M...,"""The Bar is a ridiculous and meaningless film ..."
8,Torrente: El brazo tonto de la ley,1998,"In this Spanish comedy, a corrupt cop (played ...","""Torrente is a vulgar and offensive film that ..."
9,El Incidente,2014,"In this Mexican horror film, a group of people...","""The Incident is a boring and frightless film ..."


In [32]:
fr_translator = Translator("data/movie_reviews_fr.csv")
translated_fr = fr_translator.translate()

In [33]:
fr_translator.original_df

Unnamed: 0,Title,Year,Synopsis,Review
0,La La Land,2016,Cette comédie musicale raconte l'histoire d'un...,"""La La Land est un film absolument magnifique ..."
1,Intouchables,2011,Ce film raconte l'histoire de l'amitié improba...,"""Intouchables est un film incroyablement touch..."
2,Amélie,2001,Cette comédie romantique raconte l'histoire d'...,"""Amélie est un film absolument charmant qui vo..."
3,Les Choristes,2004,Ce film raconte l'histoire d'un professeur de ...,"""Les Choristes est un film magnifique qui vous..."
4,Le Fabuleux Destin d'Amélie Poulain,2001,Cette comédie romantique raconte l'histoire d'...,"""Le Fabuleux Destin d'Amélie Poulain est un fi..."
5,Le Dîner de Cons,1998,Le film suit l'histoire d'un groupe d'amis ric...,"""Je n'ai pas aimé ce film du tout. Le concept ..."
6,La Tour Montparnasse Infernale,2001,Deux employés de bureau incompétents se retrou...,"""Je ne peux pas croire que j'ai perdu du temps..."
7,Astérix aux Jeux Olympiques,2008,Dans cette adaptation cinématographique de la ...,"""Ce film est une déception totale. Les blagues..."
8,Les Visiteurs en Amérique,2000,Dans cette suite de la comédie française Les V...,"""Le film est une perte de temps totale. Les bl..."
9,Babylon A.D.,2008,"Dans un futur lointain, un mercenaire doit esc...","""Ce film est un gâchis complet. Les personnage..."


In [6]:
translated_fr

Unnamed: 0,Title,Year,Synopsis,Review
0,La La Land,2016,This musical tells the story of a budding actr...,"""The Land is an absolutely beautiful film with..."
1,Intouchables,2011,This film tells the story of the unlikely frie...,"""Untouchables is an incredibly touching film w..."
2,Amélie,2001,This romantic comedy tells the story of Amélie...,"""Amélie is an absolutely charming film that wi..."
3,Les Choristes,2004,This film tells the story of a music teacher w...,"""The Choristes are a beautiful film that will ..."
4,Le Fabuleux Destin d'Amélie Poulain,2001,This romantic comedy tells the story of Amélie...,"""The Fabulous Destiny of Amélie Poulain is an ..."
5,Le Dîner de Cons,1998,The film follows the story of a group of rich ...,"""I didn't like this movie at all. The concept ..."
6,La Tour Montparnasse Infernale,2001,Two incompetent office workers find themselves...,"""I can't believe I've wasted time watching thi..."
7,Astérix aux Jeux Olympiques,2008,In this film adaptation of the popular comic s...,"""This film is a complete disappointment. The j..."
8,Les Visiteurs en Amérique,2000,In this continuation of the French comedy The ...,"""The film is a total waste of time. The jokes ..."
9,Babylon A.D.,2008,"In the distant future, a mercenary has to esco...","""This film is a complete mess. The characters ..."


### Text translation

Translate the **Review** and **Synopsis** column values to English.

**MarianMT** was chosen as the translation model. Its pretrained checkpoints follow the naming format **Helsinki-NLP/opus-mt-{src}-{tgt}**, where `{src}` and `{tgt}` are the source and target language codes.
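
As a rough illustration of that naming scheme, the sketch below builds a checkpoint name from a language pair and uses it to translate a list of strings. It is not the project's `Translator` class, whose implementation is not shown here; `marian_model_name` and `translate_texts` are hypothetical helper names, and the target language is assumed to default to English.

```python
def marian_model_name(src: str, tgt: str = "en") -> str:
    """Build a MarianMT checkpoint name from language codes (hypothetical helper)."""
    return f"Helsinki-NLP/opus-mt-{src.lower()}-{tgt.lower()}"


def translate_texts(texts, src: str, tgt: str = "en"):
    """Translate a list of strings with a MarianMT checkpoint (hypothetical helper).

    Downloads the checkpoint on first use, so it needs network access,
    plus the transformers, torch and sentencepiece packages.
    """
    # Imported lazily so the naming helper works without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    name = marian_model_name(src, tgt)
    tokenizer = MarianTokenizer.from_pretrained(name)
    model = MarianMTModel.from_pretrained(name)

    # Pad to a common length and truncate inputs that exceed the model limit.
    batch = tokenizer(list(texts), return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


print(marian_model_name("fr"))  # Helsinki-NLP/opus-mt-fr-en
```

Calling `translate_texts(df["Review"].tolist(), src="es")` would then return the English strings in the same order, ready to be written back to the column.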

### Sentiment Analysis

Use a pretrained HuggingFace model to analyse the sentiment of the reviews, storing the result (**Positive** or **Negative**) in a new column titled **Sentiment**.

AutoClasses can automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary:

In [None]:
# load sentiment analysis model
model_name = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

Pipelines are objects that offer a simple API dedicated to several tasks, including Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction and Question Answering.

In [8]:
sentiment_classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

In [9]:
def analyze_sentiment(text, classifier):
    """
    function to perform sentiment analysis on a text using a model
    """
    return classifier(text)

In [10]:
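def analyze_movie_review(df: pd.DataFrame) -> pd.DataFrame:
    """
    add a "Sentiment" column (Positive/Negative) based on the "Review" column
    note: the input dataframe is modified in place and also returned
    """
    # the classifier returns a list with one dict per input; keep only the label
    df["Sentiment"] = df["Review"].apply(lambda x: analyze_sentiment(x, sentiment_classifier)[0]['label'].title())

    return df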
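
To make the label extraction above easier to follow, here is a minimal sketch that runs the same logic against a stand-in classifier. `fake_classifier` and `label_reviews` are hypothetical names introduced only for illustration; the stand-in mimics the list-of-dicts output shape of a sentiment pipeline (one `{"label": ..., "score": ...}` dict per input) but uses a trivial keyword rule instead of a model.

```python
def fake_classifier(texts):
    """Hypothetical stand-in mimicking the output shape of a sentiment pipeline."""
    if isinstance(texts, str):
        texts = [texts]
    return [
        # Toy rule for illustration only: "boring" means negative.
        {"label": "NEGATIVE" if "boring" in text.lower() else "POSITIVE", "score": 1.0}
        for text in texts
    ]


def label_reviews(reviews, classifier):
    """Classify all reviews in one call and return title-cased labels."""
    results = classifier(list(reviews))
    return [result["label"].title() for result in results]


reviews = ["A beautiful and moving film.", "A boring and unoriginal film."]
labels = label_reviews(reviews, fake_classifier)
print(labels)  # ['Positive', 'Negative']
```

Passing the whole list at once, rather than one review per call, is a design choice that lets a real pipeline batch its inputs; the result could be assigned with `df["Sentiment"] = label_reviews(df["Review"], sentiment_classifier)`.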

In [11]:
analyzed_fr = analyze_movie_review(translated_fr)
analyzed_fr

Unnamed: 0,Title,Year,Synopsis,Review,Sentiment
0,La La Land,2016,This musical tells the story of a budding actr...,"""The Land is an absolutely beautiful film with...",Positive
1,Intouchables,2011,This film tells the story of the unlikely frie...,"""Untouchables is an incredibly touching film w...",Positive
2,Amélie,2001,This romantic comedy tells the story of Amélie...,"""Amélie is an absolutely charming film that wi...",Positive
3,Les Choristes,2004,This film tells the story of a music teacher w...,"""The Choristes are a beautiful film that will ...",Positive
4,Le Fabuleux Destin d'Amélie Poulain,2001,This romantic comedy tells the story of Amélie...,"""The Fabulous Destiny of Amélie Poulain is an ...",Positive
5,Le Dîner de Cons,1998,The film follows the story of a group of rich ...,"""I didn't like this movie at all. The concept ...",Negative
6,La Tour Montparnasse Infernale,2001,Two incompetent office workers find themselves...,"""I can't believe I've wasted time watching thi...",Negative
7,Astérix aux Jeux Olympiques,2008,In this film adaptation of the popular comic s...,"""This film is a complete disappointment. The j...",Negative
8,Les Visiteurs en Amérique,2000,In this continuation of the French comedy The ...,"""The film is a total waste of time. The jokes ...",Negative
9,Babylon A.D.,2008,"In the distant future, a mercenary has to esco...","""This film is a complete mess. The characters ...",Negative


In [12]:
analyzed_sp = analyze_movie_review(translated_sp)
analyzed_sp

Unnamed: 0,Title,Year,Synopsis,Review,Sentiment
0,Roma,2018,Cleo (Yalitza Aparicio) is a young domestic wo...,"""Rome is a beautiful and moving film that pays...",Positive
1,La Casa de Papel,(2017-2021),This Spanish television series follows a group...,"""The Paper House is an exciting and addictive ...",Positive
2,Y tu mamá también,2001,Two teenage friends (Gael García Bernal and Di...,"""And your mom is also a movie that stays with ...",Positive
3,El Laberinto del Fauno,2006,"During the Spanish postwar period, Ofelia (Iva...","""The Labyrinth of Fauno is a fascinating and e...",Positive
4,Amores perros,2000,Three stories intertwine in this Mexican film:...,"""Amores dogs is an intense and moving film tha...",Positive
5,Águila Roja,(2009-2016),This Spanish television series follows the adv...,"""Red Eagle is a boring and uninteresting serie...",Negative
6,Toc Toc,2017,"In this Spanish comedy, a group of people with...","""Toc Toc is a boring and unoriginal film that ...",Negative
7,El Bar,2017,A group of people are trapped in a bar after M...,"""The Bar is a ridiculous and meaningless film ...",Negative
8,Torrente: El brazo tonto de la ley,1998,"In this Spanish comedy, a corrupt cop (played ...","""Torrente is a vulgar and offensive film that ...",Negative
9,El Incidente,2014,"In this Mexican horror film, a group of people...","""The Incident is a boring and frightless film ...",Negative


### Future work 
- Translate column names as well, so that datasets with differently named features can be passed to the pipeline.
- Use a more fine-grained sentiment analyzer, e.g. one that gives a star rating.
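
One possible direction for the star-rating idea is sketched below. It assumes a checkpoint such as `nlptown/bert-base-multilingual-uncased-sentiment`, whose labels take the form `"1 star"` to `"5 stars"`; that model choice is an assumption and not part of this project. `stars_from_label` and `rate_reviews` are hypothetical helpers, and the model-loading lines are left commented out because they download weights.

```python
def stars_from_label(label: str) -> int:
    """Convert a label such as '4 stars' into the integer 4 (hypothetical helper)."""
    return int(label.split()[0])


def rate_reviews(reviews, classifier):
    """Return one integer star rating per review (hypothetical helper)."""
    return [stars_from_label(result["label"]) for result in classifier(list(reviews))]


# Assumed usage with a real model (requires transformers and network access):
# from transformers import pipeline
# star_classifier = pipeline(
#     "sentiment-analysis",
#     model="nlptown/bert-base-multilingual-uncased-sentiment",
# )
# df["Stars"] = rate_reviews(df["Review"], star_classifier)

print(stars_from_label("4 stars"))  # 4
```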