In [1]:
import pandas as pd
file_path = "Amazon_Fine_Food_Reviews.csv"
df = pd.read_csv(file_path)

# Display column content without truncation
pd.set_option('display.max_colwidth', None)  # Set to None for unlimited width
print(df)

            Id   ProductId          UserId                      ProfileName  \
0            1  B001E4KFG0  A3SGXH7AUHU8GW                       delmartian   
1            2  B00813GRG4  A1D87F6ZCVE5NK                           dll pa   
2            3  B000LQOCH0   ABXLMWJIXXAIN  Natalia Corres "Natalia Corres"   
3            4  B000UA0QIQ  A395BORC6FGVXV                             Karl   
4            5  B006K2ZZ7K  A1UQRSCLF8GW1T    Michael D. Bigham "M. Wassir"   
...        ...         ...             ...                              ...   
568449  568450  B001EO7N10  A28KG5XORO54AY                 Lettie D. Carter   
568450  568451  B003S1WTCU  A3I8AFVPEE8KI5                        R. Sawyer   
568451  568452  B004I613EE  A121AA1GQV751Z                    pksd "pk_007"   
568452  568453  B004I613EE   A3IBEVCTXKNOH          Kathy A. Welch "katwel"   
568453  568454  B001LR2CU2  A3LGQPJCZVL9UC                         srfell17   

        HelpfulnessNumerator  HelpfulnessDenominato

In [2]:
# Lowercase conversion
def convert_to_lowercase(text):
    return text.lower()

df["lowercased"] = df["Text"].apply(convert_to_lowercase)

# Display column content without truncation
pd.set_option('display.max_colwidth', None) # Set to None for unlimited width
df["lowercased"]

0                                                                                                                                                                                                                                                               i have bought several of the vitality canned dog food products and have found them all to be of good quality. the product looks more like a stew than a processed meat and it smells better. my labrador is finicky and she appreciates this product better than  most.
1                                                                                                                                                                                                                                                                                                                                        product arrived labeled as jumbo salted peanuts...the peanuts were actually small sized unsalted. not sure if this was an error or if the vendor intend

In [3]:
# Removal of URLs
import re

# Remove URLs that start with "http://", "https://", or "www."
def remove_urls(text):
    return re.sub(r'https?://\S+|\bwww\.\S+', '', text)

df["urls_removed"] = df["lowercased"].apply(remove_urls)

# Display column content without truncation
pd.set_option('display.max_colwidth', None) # Set to None for unlimited width
df["urls_removed"]

0                                                                                                                                                                                                                                                               i have bought several of the vitality canned dog food products and have found them all to be of good quality. the product looks more like a stew than a processed meat and it smells better. my labrador is finicky and she appreciates this product better than  most.
1                                                                                                                                                                                                                                                                                                                                        product arrived labeled as jumbo salted peanuts...the peanuts were actually small sized unsalted. not sure if this was an error or if the vendor intend
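
A quick way to sanity-check a URL pattern is to run it on a few hand-written strings before applying it to the whole column. The sketch below assumes a pattern anchored on "://" and "www."; the helper name strip_urls and the sample strings are made up for illustration and are not part of the dataset.

```python
import re

# Assumed pattern: match http:// or https:// links, or tokens that begin with "www."
URL_PATTERN = re.compile(r'https?://\S+|\bwww\.\S+')

def strip_urls(text):
    """Remove URL-like tokens from a string (hypothetical helper for illustration)."""
    return URL_PATTERN.sub('', text)

# Hand-written samples, not taken from the reviews
samples = [
    "see http://example.com/page for details",
    "visit www.example.com today",
    "awwww so cute",  # no URL here, so the text should come back unchanged
]

for sample in samples:
    print(repr(strip_urls(sample)))
```

Removing a URL leaves a double space behind, which the whitespace-collapsing step in the next cell cleans up.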

In [4]:
# Removal of HTML tags and Expressions
from bs4 import BeautifulSoup

def remove_html_tags(text):
    if isinstance(text, str) and ('<' in text and '>' in text):
        return BeautifulSoup(text, "html.parser").get_text()
    return text  # Return as-is if not HTML

# Apply the function
df["html_removed"] = df["urls_removed"].apply(remove_html_tags)

# Remove \n and strip extra whitespace
df["html_removed"] = df["html_removed"].str.replace(r'\n', ' ', regex=True).str.strip()

# Replace multiple spaces with a single space
df["html_removed"] = df["html_removed"].str.replace(r'\s+', ' ', regex=True)

# Display column content without truncation
pd.set_option('display.max_colwidth', None)
df["html_removed"]

0                                                                                                                                                                                                                                                           i have bought several of the vitality canned dog food products and have found them all to be of good quality. the product looks more like a stew than a processed meat and it smells better. my labrador is finicky and she appreciates this product better than most.
1                                                                                                                                                                                                                                                                                                                                   product arrived labeled as jumbo salted peanuts...the peanuts were actually small sized unsalted. not sure if this was an error or if the vendor intended to repr

In [5]:
# Removal of emojis (if any)
import emoji

# replace emoji with ''
def remove_emojis(text):
    return emoji.replace_emoji(text, replace='')

df["emojis_removed"] = df["html_removed"].apply(remove_emojis)

# Display column content without truncation
pd.set_option('display.max_colwidth', None) # Set to None for unlimited width
df["emojis_removed"]

0                                                                                                                                                                                                                                                           i have bought several of the vitality canned dog food products and have found them all to be of good quality. the product looks more like a stew than a processed meat and it smells better. my labrador is finicky and she appreciates this product better than most.
1                                                                                                                                                                                                                                                                                                                                   product arrived labeled as jumbo salted peanuts...the peanuts were actually small sized unsalted. not sure if this was an error or if the vendor intended to repr

In [6]:
# Replace internet slang/chat words
# Dictionary of slang words and their replacements
slang_dict = {
    "tbh": "to be honest",
    "omg": "oh my god",
    "lol": "laugh out loud",
    "idk": "I don't know",
    "brb": "be right back",
    "btw": "by the way",
    "imo": "in my opinion",
    "smh": "shaking my head",
    "fyi": "for your information",
    "np": "no problem",
    "ikr": "I know right",
    "asap": "as soon as possible",
    "bff": "best friend forever",
    "gg": "good game",
    "hmu": "hit me up",
    "rofl": "rolling on the floor laughing",
    "sop" : "standard operating procedure",
    "mins" : "minutes",
    "h" : "hours",
    "hq" : "headquarters",
    "u" : "you",
    "qr" : "quick response",
    "meh" : "bad",
    "af" : "as hell",
    "zzz" : "bored",
    "n" : "and",
    "ppl" : "people",
    "na" : "no"
}

# Build the slang pattern once, outside the function, so it is not rebuilt for every row
escaped_slang_words = [re.escape(word) for word in slang_dict]  # Escape any special characters
slang_pattern = re.compile(r'\b(' + '|'.join(escaped_slang_words) + r')\b', flags=re.IGNORECASE)

# Function to replace slang words
def replace_slang(text):
    # Look up each matched slang word (lowercased) and return its full form
    def replace_match(match):
        return slang_dict[match.group(0).lower()]

    # Use regex to replace slang words with full forms
    return slang_pattern.sub(replace_match, text)

# Apply the function to the column
df["slangs_replaced"] = df["emojis_removed"].apply(replace_slang)

# Display column content without truncation
pd.set_option('display.max_colwidth', None) # Set to None for unlimited width
df["slangs_replaced"]

0                                                                                                                                                                                                                                                           i have bought several of the vitality canned dog food products and have found them all to be of good quality. the product looks more like a stew than a processed meat and it smells better. my labrador is finicky and she appreciates this product better than most.
1                                                                                                                                                                                                                                                                                                                                   product arrived labeled as jumbo salted peanuts...the peanuts were actually small sized unsalted. not sure if this was an error or if the vendor intended to repr
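
The word boundaries (\b) in the slang pattern are what keep a short key such as "lol" from matching inside a longer word. A minimal sketch of that behaviour, using a three-entry sample mapping and a helper name (expand_slang) that are assumptions for illustration only:

```python
import re

# Small sample mapping for illustration only (a subset of the notebook's slang_dict)
sample_slang = {"omg": "oh my god", "btw": "by the way", "lol": "laugh out loud"}

# Compile once; \b keeps "lol" from matching inside "lollipop"
sample_pattern = re.compile(
    r'\b(' + '|'.join(re.escape(word) for word in sample_slang) + r')\b',
    flags=re.IGNORECASE,
)

def expand_slang(text):
    """Replace whole-word slang terms with their full forms."""
    return sample_pattern.sub(lambda m: sample_slang[m.group(0).lower()], text)

print(expand_slang("omg the lollipop is great btw"))
# oh my god the lollipop is great by the way
```

Single-letter keys such as "h", "n", and "u" are also whole-word matches, so any standalone letter left in the text would be expanded too.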

In [7]:
# Replace Contractions
contractions_dict = {
    "ain't": "is not",
    "wasn't": "was not",
    "isn't": "is not",
    "aren't": "are not",
    "weren't": "were not",
    "doesn't": "does not",
    "don't": "do not",
    "didn't": "did not",
    "can't": "cannot",
    "can't've": "cannot have",
    "couldn't": "could not",
    "shouldn't": "should not",
    "wouldn't": "would not",
    "won't": "will not",
    "haven't": "have not",
    "hasn't": "has not",
    "hadn't": "had not",
    "needn't": "need not",
    "shan't": "shall not",
    "couldn't've": "could not have",
    "hadn't've": "had not have",
    "might've": "might have",
    "mightn't": "might not",
    "must've": "must have",
    "mustn't": "must not",
    "i'm": "i am",
    "you're": "you are",
    "he's": "he is",
    "she's": "she is",
    "it's": "it is",
    "we're": "we are",
    "they're": "they are",
    "i've": "i have",
    "you've": "you have",
    "we've": "we have",
    "they've": "they have",
    "could've": "could have",
    "i'd": "i would",
    "I'd've": "I would have",
    "you'd": "you would",
    "he'd": "he would",
    "he'd've": "he would have",
    "she'd": "she would",
    "we'd": "we would",
    "they'd": "they would",
    "should've": "should have",
    "shouldn't": "should not",
    "that'd": "that would",
    "that's": "that is",
    "there's": "there is",
    "i'll": "i will",
    "you'll": "you will",
    "he'll": "he will",
    "she'll": "she will",
    "we'll": "we will",
    "they'll": "they will",
    "let's": "let us",
    "that's": "that is",
    "who's": "who is",
    "what's": "what is",
    "where's": "where is",
    "when's": "when is",
    "why's": "why is",
    "cause": "because",
    "how'd": "how did",
    "how'd'y": "how do you",
    "how'll": "how will",
    "how's": "how is",
    "let's": "let us",
    "ma'am": "madam"
}

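# Lowercase the keys so every case-insensitive match can be looked up after .lower()
contractions_dict = {key.lower(): value for key, value in contractions_dict.items()}

# Build the regex pattern for contractions
escaped_contractions = []  # List to store escaped contractions

# Try longer contractions first so that "can't've" is matched before "can't"
for contraction in sorted(contractions_dict.keys(), key=len, reverse=True):
    escaped_contraction = re.escape(contraction)  # Escape special characters (e.g., apostrophes)
    escaped_contractions.append(escaped_contraction)  # Add to list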

# Join the escaped contractions with '|'
joined_contractions = "|".join(escaped_contractions)

# Create a regex pattern with word boundaries (\b)
contractions_pattern = r'\b(' + joined_contractions + r')\b'

# Compile the regex
compiled_pattern = re.compile(contractions_pattern, flags=re.IGNORECASE)

# Define a function to replace contractions
def replace_contractions(text):
    # Function to handle each match found
    def replace_match(match):
        matched_word = match.group(0)  # Extract matched contraction
        lower_matched_word = matched_word.lower()  # Convert to lowercase
        expanded_form = contractions_dict[lower_matched_word]  # Get full form from dictionary
        return expanded_form  # Return the expanded form

    # Apply regex substitution
    expanded_text = compiled_pattern.sub(replace_match, text)

    return expanded_text  # Return modified text

# Apply the function to a DataFrame column
df["contractions_replaced"] = df["slangs_replaced"].apply(replace_contractions)

# Display column content without truncation
pd.set_option('display.max_colwidth', None)  # Set to None for unlimited width
df["contractions_replaced"]

0                                                                                                                                                                                                                                                           i have bought several of the vitality canned dog food products and have found them all to be of good quality. the product looks more like a stew than a processed meat and it smells better. my labrador is finicky and she appreciates this product better than most.
1                                                                                                                                                                                                                                                                                                                                   product arrived labeled as jumbo salted peanuts...the peanuts were actually small sized unsalted. not sure if this was an error or if the vendor intended to repr
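
In a regex alternation the first alternative that matches wins, so the order of the keys matters when one contraction is a prefix of another. The sketch below compares insertion order with longest-first order on a four-entry sample mapping; the mapping and the helper names (build_pattern, expand) are assumptions for illustration.

```python
import re

# Sample mapping for illustration: two pairs where one key is a prefix of the other
sample_contractions = {
    "can't": "cannot",
    "can't've": "cannot have",
    "he'd": "he would",
    "he'd've": "he would have",
}

def build_pattern(keys):
    """Join the keys into one whole-word, case-insensitive alternation."""
    return re.compile(r'\b(' + '|'.join(re.escape(k) for k in keys) + r')\b',
                      flags=re.IGNORECASE)

def expand(text, pattern):
    """Replace every matched contraction with its expanded form."""
    return pattern.sub(lambda m: sample_contractions[m.group(0).lower()], text)

insertion_order = build_pattern(sample_contractions)
longest_first = build_pattern(sorted(sample_contractions, key=len, reverse=True))

print(expand("he can't've known", insertion_order))  # he cannot've known
print(expand("he can't've known", longest_first))    # he cannot have known
```

With insertion order, "can't" matches first and the trailing "'ve" is left behind; sorting the keys by length avoids that.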

In [9]:
# Remove punctuations and special characters
import string

# Function to remove punctuation
def remove_punctuation(text):
    return text.translate(str.maketrans('', '', string.punctuation))

# Apply the function to the column
df["punctuations_removed"] = df["contractions_replaced"].apply(remove_punctuation)

# Display column content without truncation
pd.set_option('display.max_colwidth', None)  # Set to None for unlimited width
df["punctuations_removed"]

0                                                                                                                                                                                                                                            i have bought several of the vitality canned dog food products and have found them all to be of good quality the product looks more like a stew than a processed meat and it smells better my labrador is finicky and she appreciates this product better than most
1                                                                                                                                                                                                                                                                                                                        product arrived labeled as jumbo salted peanutsthe peanuts were actually small sized unsalted not sure if this was an error or if the vendor intended to represent the product as jumbo
2     
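
Deleting punctuation outright fuses words that were separated only by punctuation, which is why row 1 above contains "peanutsthe". One possible alternative, sketched here under the assumption that each punctuation character should become a space that is then collapsed, is shown below; the helper name punctuation_to_space is made up for illustration.

```python
import string

# Map every ASCII punctuation character to a space instead of deleting it
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, ' ' * len(string.punctuation))

def punctuation_to_space(text):
    """Replace punctuation with spaces, then collapse runs of whitespace."""
    return ' '.join(text.translate(_PUNCT_TO_SPACE).split())

print(punctuation_to_space("jumbo salted peanuts...the peanuts were small"))
# jumbo salted peanuts the peanuts were small
```

The trade-off is that any apostrophe still in the text also becomes a space, so "dog's" turns into "dog s".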

In [11]:
# Remove numbers
def remove_numbers(text):
    return re.sub(r'\d+', '', text)  # Removes all numeric characters

# Apply the function to the column
df["numbers_removed"] = df["punctuations_removed"].apply(remove_numbers)

# Display column content without truncation
pd.set_option('display.max_colwidth', None)  # Set to None for unlimited width
df["numbers_removed"]

0                                                                                                                                                                                                                                            i have bought several of the vitality canned dog food products and have found them all to be of good quality the product looks more like a stew than a processed meat and it smells better my labrador is finicky and she appreciates this product better than most
1                                                                                                                                                                                                                                                                                                                        product arrived labeled as jumbo salted peanutsthe peanuts were actually small sized unsalted not sure if this was an error or if the vendor intended to represent the product as jumbo
2     

In [18]:
!pip install pyspellchecker

Collecting pyspellchecker
  Downloading pyspellchecker-0.8.2-py3-none-any.whl.metadata (9.4 kB)
Downloading pyspellchecker-0.8.2-py3-none-any.whl (7.1 MB)
   ---------------------------------------- 0.0/7.1 MB ? eta -:--:--
   ----- ---------------------------------- 1.0/7.1 MB 6.3 MB/s eta 0:00:01
   ------------- -------------------------- 2.4/7.1 MB 6.1 MB/s eta 0:00:01
   -------------------- ------------------- 3.7/7.1 MB 6.4 MB/s eta 0:00:01
   --------------------------------- ------ 6.0/7.1 MB 7.5 MB/s eta 0:00:01
   ---------------------------------------- 7.1/7.1 MB 7.3 MB/s eta 0:00:00
Installing collected packages: pyspellchecker
Successfully installed pyspellchecker-0.8.2


In [None]:
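# Correct spelling mistakes (uses the pyspellchecker package installed above)
from spellchecker import SpellChecker

# Initialize spell checker
spell = SpellChecker(language='en')

# Function to correct spelling word by word
def correct_spelling(text):
    corrected_words = []
    for word in text.split():
        # correction() can return None when no candidate is found; keep the original word then
        corrected_words.append(spell.correction(word) or word)
    return " ".join(corrected_words)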

# Apply the function to the column
df["spelling_corrected"] = df["numbers_removed"].apply(correct_spelling)

# Display column content without truncation
pd.set_option('display.max_colwidth', None)  # Set to None for unlimited width
df["spelling_corrected"]

In [17]:
# Remove stopwords
import nltk
from nltk.corpus import stopwords

# Download stopwords if not already downloaded
nltk.download('stopwords')

# Define stopwords list
stop_words = set(stopwords.words('english'))

# Function to remove stopwords
def remove_stopwords(text):
    words = text.split()  # Split text into words
    filtered_words = []  # Create an empty list to store words after stopword removal

    for word in words:  # Loop through each word in the list of words
        lower_word = word.lower()  # Convert the word to lowercase for uniform comparison
    
        if lower_word not in stop_words:  # Check if the lowercase word is NOT in the stopwords list
            filtered_words.append(word)  # If it's not a stopword, add it to the filtered list

    return " ".join(filtered_words)  # Join words back into a sentence

# Apply the function to the column
df["stopwords_removed"] = df["numbers_removed"].apply(remove_stopwords)

# Display column content without truncation
pd.set_option('display.max_colwidth', None)  # Set to None for unlimited width
df["stopwords_removed"]

[nltk_data] Downloading package stopwords to C:\Users\Afiq
[nltk_data]     Fikri\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


0                                                                                                                                  bought several vitality canned dog food products found good quality product looks like stew processed meat smells better labrador finicky appreciates product better
1                                                                                                                                                              product arrived labeled jumbo salted peanutsthe peanuts actually small sized unsalted sure error vendor intended represent product jumbo
2         confection around centuries light pillowy citrus gelatin nuts case filberts cut tiny squares liberally coated powdered sugar tiny mouthful heaven chewy flavorful highly recommend yummy treat familiar story cs lewis lion witch wardrobe treat seduces edmund selling brother sisters witch
3                                                                                                               
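
NLTK's English stopword list includes negations such as "not", "no", and "nor", so removing stopwords also removes negation from the reviews. If negation matters downstream (for example, for sentiment), one option is to subtract a keep-list from the stopword set first. The sketch below uses a small hand-picked stopword set as a stand-in for the NLTK list so that it runs without downloads; the set contents, the keep-list, and the helper name are all assumptions for illustration.

```python
# Small hand-picked stand-in for NLTK's English stopword list (illustration only)
SAMPLE_STOPWORDS = {"this", "is", "a", "as", "if", "not", "than", "i", "have", "at"}

# Words to keep even though they appear in the stopword list (assumed keep-list)
NEGATIONS = {"not", "no", "nor"}

def remove_stopwords_keep_negations(text, stop_words=SAMPLE_STOPWORDS, keep=NEGATIONS):
    """Drop stopwords, except those listed in `keep`."""
    effective_stop_words = stop_words - keep
    return " ".join(word for word in text.split()
                    if word.lower() not in effective_stop_words)

print(remove_stopwords_keep_negations("this is as good if not better than restaurants"))
# good not better restaurants
```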

In [19]:
# Stemming - reduces words to their base root by chopping off suffixes
from nltk.stem import PorterStemmer

# Initialize the stemmer
stemmer = PorterStemmer()

# Function to apply stemming
def stem_text(text):
    if not isinstance(text, str):
        return ""

    words = text.split()
    stemmed_words = [stemmer.stem(word) for word in words]  # Apply stemming
    return " ".join(stemmed_words)

# Apply the function
df["stemmed_words"] = df["stopwords_removed"].apply(stem_text)

# Display column content without truncation
pd.set_option('display.max_colwidth', None)  # Set to None for unlimited width
df["stemmed_words"]

0                                                                                                                   bought sever vital can dog food product found good qualiti product look like stew process meat smell better labrador finicki appreci product better
1                                                                                                                                                product arriv label jumbo salt peanutsth peanut actual small size unsalt sure error vendor intend repres product jumbo
2         confect around centuri light pillowi citru gelatin nut case filbert cut tini squar liber coat powder sugar tini mouth heaven chewi flavor highli recommend yummi treat familiar stori cs lewi lion witch wardrob treat seduc edmund sell brother sister witch
3                                                                                                                                                    look secret ingredi robitussin believ found got addit root 

In [20]:
import nltk

# Download the required resources
nltk.download('wordnet')                         # WordNet lexical database, used by the lemmatizer
nltk.download('omw-1.4')                         # Open Multilingual Wordnet data
nltk.download('averaged_perceptron_tagger_eng')  # For POS tagging
nltk.download('punkt_tab')                       # For tokenization

[nltk_data] Downloading package wordnet to C:\Users\Afiq
[nltk_data]     Fikri\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package omw-1.4 to C:\Users\Afiq
[nltk_data]     Fikri\AppData\Roaming\nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger_eng to
[nltk_data]     C:\Users\Afiq Fikri\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger_eng is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package punkt_tab to C:\Users\Afiq
[nltk_data]     Fikri\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!


True

In [27]:
# Lemmatization - reduces words to their base dictionary form (lemma)
from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize
from nltk import pos_tag

# Initialize the lemmatizer
lemmatizer = WordNetLemmatizer()

# Function to map NLTK POS tags to WordNet POS tags
def get_wordnet_pos(nltk_tag):
    if nltk_tag.startswith('J'):  # Adjective
        return wordnet.ADJ
    elif nltk_tag.startswith('V'):  # Verb
        return wordnet.VERB
    elif nltk_tag.startswith('N'):  # Noun
        return wordnet.NOUN
    elif nltk_tag.startswith('R'):  # Adverb
        return wordnet.ADV
    else:
        return wordnet.NOUN  # Default to noun

# Function to lemmatize text with POS tagging
def lemmatize_text(text):
    if not isinstance(text, str):  # Ensure input is a string
        return ""

    words = word_tokenize(text)  # Tokenize text into words
    pos_tags = pos_tag(words)  # Get POS tags
    
    # Lemmatize each word with its correct POS tag
    lemmatized_words = [lemmatizer.lemmatize(word, get_wordnet_pos(tag)) for word, tag in pos_tags]
    
    return " ".join(lemmatized_words)  # Join words back into a sentence

# Apply the function to the column
df["lemmatized"] = df["stopwords_removed"].apply(lemmatize_text)

# Display column content without truncation
pd.set_option('display.max_colwidth', None)  # Set to None for unlimited width
print(df["lemmatized"])

0                                                                                                                                        buy several vitality can dog food product find good quality product look like stew process meat smell well labrador finicky appreciate product well
1                                                                                                                                                          product arrive labeled jumbo salt peanutsthe peanut actually small size unsalted sure error vendor intend represent product jumbo
2         confection around century light pillowy citrus gelatin nut case filberts cut tiny square liberally coat powdered sugar tiny mouthful heaven chewy flavorful highly recommend yummy treat familiar story c lewis lion witch wardrobe treat seduces edmund sell brother sister witch
3                                                                                                                                                

In [28]:
# Tokenization - splits the lemmatized text into a list of word tokens
import nltk
from nltk.tokenize import word_tokenize

# Download tokenizer if not already available
nltk.download('punkt')

# Function to tokenize text
def tokenize_text(text):
    if not isinstance(text, str):  # Ensure the input is a string
        return []
    return word_tokenize(text)  # Tokenize text into words

# Apply tokenization to the column
df["tokenized"] = df["lemmatized"].apply(tokenize_text)

# Display column content without truncation
pd.set_option('display.max_colwidth', None)  # Set to None for unlimited width
print(df["tokenized"])

[nltk_data] Downloading package punkt to C:\Users\Afiq
[nltk_data]     Fikri\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


0                                                                                                                                                         [buy, several, vitality, can, dog, food, product, find, good, quality, product, look, like, stew, process, meat, smell, well, labrador, finicky, appreciate, product, well]
1                                                                                                                                                                                [product, arrive, labeled, jumbo, salt, peanutsthe, peanut, actually, small, size, unsalted, sure, error, vendor, intend, represent, product, jumbo]
2         [confection, around, century, light, pillowy, citrus, gelatin, nut, case, filberts, cut, tiny, square, liberally, coat, powdered, sugar, tiny, mouthful, heaven, chewy, flavorful, highly, recommend, yummy, treat, familiar, story, c, lewis, lion, witch, wardrobe, treat, seduces, edmund, sell, brother, sister, witch]
3                     

In [29]:
# Keep only the columns needed for the final CSV export
df_final = df[['ProductId', 'ProfileName', 'Time', 'Score', 'Text', 'lemmatized', 'tokenized']].copy()

# Display column content without truncation
pd.set_option('display.max_colwidth', None)  # Set to None for unlimited width
df_final

Unnamed: 0,ProductId,ProfileName,Time,Score,Text,lemmatized,tokenized
0,B001E4KFG0,delmartian,1303862400,5,I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most.,buy several vitality can dog food product find good quality product look like stew process meat smell well labrador finicky appreciate product well,"[buy, several, vitality, can, dog, food, product, find, good, quality, product, look, like, stew, process, meat, smell, well, labrador, finicky, appreciate, product, well]"
1,B00813GRG4,dll pa,1346976000,1,"Product arrived labeled as Jumbo Salted Peanuts...the peanuts were actually small sized unsalted. Not sure if this was an error or if the vendor intended to represent the product as ""Jumbo"".",product arrive labeled jumbo salt peanutsthe peanut actually small size unsalted sure error vendor intend represent product jumbo,"[product, arrive, labeled, jumbo, salt, peanutsthe, peanut, actually, small, size, unsalted, sure, error, vendor, intend, represent, product, jumbo]"
2,B000LQOCH0,"Natalia Corres ""Natalia Corres""",1219017600,4,"This is a confection that has been around a few centuries. It is a light, pillowy citrus gelatin with nuts - in this case Filberts. And it is cut into tiny squares and then liberally coated with powdered sugar. And it is a tiny mouthful of heaven. Not too chewy, and very flavorful. I highly recommend this yummy treat. If you are familiar with the story of C.S. Lewis' ""The Lion, The Witch, and The Wardrobe"" - this is the treat that seduces Edmund into selling out his Brother and Sisters to the Witch.",confection around century light pillowy citrus gelatin nut case filberts cut tiny square liberally coat powdered sugar tiny mouthful heaven chewy flavorful highly recommend yummy treat familiar story c lewis lion witch wardrobe treat seduces edmund sell brother sister witch,"[confection, around, century, light, pillowy, citrus, gelatin, nut, case, filberts, cut, tiny, square, liberally, coat, powdered, sugar, tiny, mouthful, heaven, chewy, flavorful, highly, recommend, yummy, treat, familiar, story, c, lewis, lion, witch, wardrobe, treat, seduces, edmund, sell, brother, sister, witch]"
3,B000UA0QIQ,Karl,1307923200,2,If you are looking for the secret ingredient in Robitussin I believe I have found it. I got this in addition to the Root Beer Extract I ordered (which was good) and made some cherry soda. The flavor is very medicinal.,look secret ingredient robitussin believe find get addition root beer extract order good make cherry soda flavor medicinal,"[look, secret, ingredient, robitussin, believe, find, get, addition, root, beer, extract, order, good, make, cherry, soda, flavor, medicinal]"
4,B006K2ZZ7K,"Michael D. Bigham ""M. Wassir""",1350777600,5,"Great taffy at a great price. There was a wide assortment of yummy taffy. Delivery was very quick. If your a taffy lover, this is a deal.",great taffy great price wide assortment yummy taffy delivery quick taffy lover deal,"[great, taffy, great, price, wide, assortment, yummy, taffy, delivery, quick, taffy, lover, deal]"
...,...,...,...,...,...,...,...
568449,B001EO7N10,Lettie D. Carter,1299628800,5,Great for sesame chicken..this is a good if not better than resturants I have eaten at..My husband loved it..will find other recipes to use this in..,great sesame chickenthis good good resturants eat atmy husband love itwill find recipe use,"[great, sesame, chickenthis, good, good, resturants, eat, atmy, husband, love, itwill, find, recipe, use]"
568450,B003S1WTCU,R. Sawyer,1331251200,2,"I'm disappointed with the flavor. The chocolate notes are especially weak. Milk thickens it but the flavor still disappoints. This was worth a try but I'll never buy again. I will use what's left, which will be gone in no time thanks to the small cans.",disappointed flavor chocolate note especially weak milk thickens flavor still disappoints worth try never buy use leave go time thanks small can,"[disappointed, flavor, chocolate, note, especially, weak, milk, thickens, flavor, still, disappoints, worth, try, never, buy, use, leave, go, time, thanks, small, can]"
568451,B004I613EE,"pksd ""pk_007""",1329782400,5,"These stars are small, so you can give 10-15 of those in one training session. I tried to train our dog with ""Ceaser dog treats"", it just made our puppy hyper. If you compare the ingredients, you will know why. Little stars has just basic food ingredients without any preservatives and food coloring. Sweet potato flavor also did not make my hand smell like dog food.",star small give one training session try train dog ceaser dog treat make puppy hyper compare ingredient know little star basic food ingredient without preservative food color sweet potato flavor also make hand smell like dog food,"[star, small, give, one, training, session, try, train, dog, ceaser, dog, treat, make, puppy, hyper, compare, ingredient, know, little, star, basic, food, ingredient, without, preservative, food, color, sweet, potato, flavor, also, make, hand, smell, like, dog, food]"
568452,B004I613EE,"Kathy A. Welch ""katwel""",1331596800,5,These are the BEST treats for training and rewarding your dog for being good while grooming. Lower in calories and loved by all the doggies. Sweet potatoes seem to be their favorite Wet Noses treat!,best treat train reward dog good groom low calorie love doggy sweet potato seem favorite wet nose treat,"[best, treat, train, reward, dog, good, groom, low, calorie, love, doggy, sweet, potato, seem, favorite, wet, nose, treat]"


In [32]:
df_final.to_csv("preprocessed_reviews_final.csv", index=False)  # Saves without the index column
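
A CSV file stores every cell as text, so the lists in the tokenized column are written as their string form (for example "['great', 'taffy']") and come back as strings when the file is read. The sketch below shows the round trip with the standard library only, on a one-row sample that mimics the exported columns; the sample row and variable names are assumptions for illustration.

```python
import ast
import csv
import io

# One hypothetical row mimicking the 'lemmatized' and 'tokenized' columns
rows = [
    {"lemmatized": "great taffy great price",
     "tokenized": ["great", "taffy", "great", "price"]},
]

# Write to an in-memory CSV; the list is stored as its string representation
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["lemmatized", "tokenized"])
writer.writeheader()
writer.writerows(rows)

# Read it back and turn the string form into a real list again
buffer.seek(0)
restored = []
for row in csv.DictReader(buffer):
    row["tokenized"] = ast.literal_eval(row["tokenized"])
    restored.append(row)

print(restored[0]["tokenized"])
# ['great', 'taffy', 'great', 'price']
```

When reading the file with pandas instead, passing converters={"tokenized": ast.literal_eval} to pd.read_csv should have the same effect.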