<a href="https://colab.research.google.com/github/chillyssa/NLP-with-Deep-Learning-Project/blob/main/project_part2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Recipe Review Sentiment Analysis: A Baseline Model 

In my previous notebook, [Recipe Review Analysis Part I](https://colab.research.google.com/github/chillyssa/NLP-with-Deep-Learning-Project/blob/main/project_part1.ipynb), I introduced an experiment that uses sentiment analysis to examine a set of [recipe review data](https://www.kaggle.com/datasets/shuyangli94/food-com-recipes-and-user-interactions) from Food.com's online recipe generator. In that notebook I preprocessed and tokenized the reviews and presented some exploratory data analysis.

In this notebook I will import the tokenized data set and implement a baseline machine learning model for recipe review sentiment analysis! 

The original data set comes from Kaggle and was gathered for the research cited below.

Bodhisattwa Prasad Majumder\*, Shuyang Li\*, Jianmo Ni, Julian McAuley. Generating Personalized Recipes from Historical User Preferences. EMNLP, 2019. https://www.aclweb.org/anthology/D19-1613/

## Loading Pre-processed Training Data

In [None]:
# mount Google Drive to access the data files - this only needs to run once
from google.colab import drive
drive.mount('/content/drive')
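
The mount step above only works inside Colab. As a minimal sketch for running the notebook elsewhere, the data folder could be chosen based on whether `google.colab` is importable. The helper names `running_in_colab` and `resolve_data_dir`, and the local fallback folder `data`, are assumptions for illustration and not part of the original project.

```python
from pathlib import Path


def running_in_colab() -> bool:
    """Return True when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401  (only available inside Colab)
    except ImportError:
        return False
    return True


def resolve_data_dir(local_dir: str = "data") -> Path:
    """Pick the Drive folder in Colab, or a local folder elsewhere.

    `local_dir` is a hypothetical fallback location, not part of the project.
    """
    if running_in_colab():
        from google.colab import drive

        # Same mount point as the cell above.
        drive.mount("/content/drive")
        return Path("/content/drive/MyDrive/NLP-F22/data")
    return Path(local_dir)


data_dir = resolve_data_dir()
print(data_dir / "pp_train.csv")
```

With this in place, the CSV path could be built from `data_dir` instead of being hard-coded to the Drive location.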

In [9]:
# import the Python packages used in this notebook
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import (
    confusion_matrix,
    ConfusionMatrixDisplay,
    accuracy_score,
    log_loss,
)

# importing data set with pandas
pd.set_option('display.max_colwidth', 0)
path = '/content/drive/MyDrive/NLP-F22/data/pp_train.csv'
df_train = pd.read_csv(path)

# add a 'tokens' column: split each pre-processed review (p_review) on whitespace into a list of tokens
df_train['tokens'] = df_train['p_review'].str.lower().str.split()

df_train.head()

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,Unnamed: 0.1.1,user_id,recipe_id,date,rating,review,review_len,word_count,p_review,tokens
0,0,0,396288,384043,19023,2008-06-20,5,Oh So good!!!!!!! Thank you for the great recipe. It was the first time I've ever made my own sauce and I was so proud.,122,25,oh good thank great recipe first time ive ever made sauce proud,"[oh, good, thank, great, recipe, first, time, ive, ever, made, sauce, proud]"
1,1,1,1110400,1127316,51803,2009-01-12,0,"This is a great soup. Only, watch out for the flavor packet that it comes with. It contains MSG, under the name of hydrolyzed soy protein (anything hydrolyzed or autolyzed contains free glutamate). It is a great soup without the flavor packet. Yum!",252,43,great soup watch flavor packet come contains msg name hydrolyzed soy protein anything hydrolyzed autolyzed contains free glutamate great soup without flavor packet yum,"[great, soup, watch, flavor, packet, come, contains, msg, name, hydrolyzed, soy, protein, anything, hydrolyzed, autolyzed, contains, free, glutamate, great, soup, without, flavor, packet, yum]"
2,2,2,510362,1872570,89207,2011-04-02,0,Haven't made this yet but just wanted to let HeyLillie know that I printed the recipe and it was correct. No weird mixup on the measurements. Perhaps your printer had a glitch. That said I am anxious to try this frosting on a ho ho cake. Will post after that.,264,50,havent made yet wanted let know printed recipe correct weird mixup measurement perhaps printer glitch said anxious try frosting ho ho cake post,"[havent, made, yet, wanted, let, know, printed, recipe, correct, weird, mixup, measurement, perhaps, printer, glitch, said, anxious, try, frosting, ho, ho, cake, post]"
3,3,3,563666,278578,106627,2008-01-25,5,great recipe! very easy and tasty. my very picky toddler could not get enough of it so that makes it a keeper!,110,22,great recipe easy tasty picky toddler could get enough make keeper,"[great, recipe, easy, tasty, picky, toddler, could, get, enough, make, keeper]"
4,4,4,242052,235493,28559,2005-09-30,5,"Once again, another fantastic recipe from this cook. My lasagna never quite turns out right but this was perfect! Even my picky hubby loved it. The only difference was I used ricotta instead of cottage cheese but actually I think next time I might go ahead and use cottage cheese. This was great! I can't wait to make it again. Thank you so much for posting this. You are a fantastic cook!",396,72,another fantastic recipe cook lasagna never quite turn right perfect even picky hubby loved difference used ricotta instead cottage cheese actually think next time might go ahead use cottage cheese great cant wait make thank much posting fantastic cook,"[another, fantastic, recipe, cook, lasagna, never, quite, turn, right, perfect, even, picky, hubby, loved, difference, used, ricotta, instead, cottage, cheese, actually, think, next, time, might, go, ahead, use, cottage, cheese, great, cant, wait, make, thank, much, posting, fantastic, cook]"
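
The `Unnamed: ...` columns in the output above look like index columns left over from earlier saves of the CSV, and `.str.split()` leaves `NaN` rather than a list wherever `p_review` is missing. The following is a minimal sketch of one way to tidy both, using a small made-up frame in place of `pp_train.csv`. The toy rows are illustrative only, and whether the real file contains empty `p_review` values is an assumption.

```python
import io

import pandas as pd

# Toy CSV standing in for pp_train.csv (made-up rows, same column pattern).
raw = io.StringIO(
    "Unnamed: 0.1,Unnamed: 0,rating,p_review\n"
    "0,10,5,great recipe easy tasty\n"
    "1,11,0,\n"
    "2,12,4,Good soup\n"
)
df = pd.read_csv(raw)

# Drop leftover index columns that pandas labels "Unnamed: ...".
df = df.loc[:, ~df.columns.str.startswith("Unnamed")].copy()

# Empty reviews are read as NaN; replace them so every row gets a list.
df["tokens"] = df["p_review"].fillna("").str.lower().str.split()

print(df.columns.tolist())
print(df["tokens"].tolist())
```

Writing the file with `df.to_csv(path, index=False)` in the preprocessing notebook would keep these extra index columns from appearing in the first place.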
