# Ice Cream Flavor Analysis

The ice cream dataset consists of ice cream names, descriptions, ratings, ingredients, and reviews for four ice cream brands: Ben & Jerry's, Häagen-Dazs, Breyers, and Talenti. The data was collected from each brand's website and is publicly available on Kaggle.

As an ice cream enthusiast, I care a great deal about getting the best grocery store ice cream five dollars can buy. Without the time or money to taste every flavor from every brand, I often fall back on my old favorites. In this project, I aim to discover which flavors and brands are most popular and why, in the hope of spending less time deliberating in the freezer section of the grocery store.

In this notebook, I will examine how popularity differs between flavors by searching for keywords in the ice cream descriptions.
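
As a rough sketch of what such a keyword search could look like, the snippet below flags descriptions that mention a keyword and compares the mean rating of the two groups. The tiny `toy` frame, the `mentions_keyword` helper, and the keyword itself are made up for illustration; only the `description` and `rating` column names come from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for the products table; all values are invented.
toy = pd.DataFrame({
    "description": [
        "Rich caramel swirled into sweet cream",
        "Dark chocolate with fudge chunks",
        "Salted CARAMEL and toffee pieces",
        None,  # some products have no description
    ],
    "rating": [4.0, 3.0, 5.0, 4.5],
})

def mentions_keyword(descriptions: pd.Series, keyword: str) -> pd.Series:
    """Return a boolean Series: True where the description contains the keyword."""
    # case=False ignores capitalisation, na=False treats missing text as "no match",
    # and regex=False matches the keyword literally.
    return descriptions.str.contains(keyword, case=False, na=False, regex=False)

has_caramel = mentions_keyword(toy["description"], "caramel")

# Mean rating for products with and without the keyword.
mean_with = toy.loc[has_caramel, "rating"].mean()
mean_without = toy.loc[~has_caramel, "rating"].mean()
print(mean_with, mean_without)
```

With only a handful of rows the comparison means little; on the full table the same mask could feed a plot or a significance test.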

In [3]:
#import packages
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

In [4]:
data = pd.read_csv("./icecream_data/combined/products.csv")
data.head()

Unnamed: 0,brand,key,name,subhead,description,rating,rating_count,ingredients
0,bj,0_bj,Salted Caramel Core,Sweet Cream Ice Cream with Blonde Brownies & a...,Find your way to the ultimate ice cream experi...,3.7,208,"CREAM, SKIM MILK, LIQUID SUGAR (SUGAR, WATER),..."
1,bj,1_bj,Netflix & Chilll'd™,Peanut Butter Ice Cream with Sweet & Salty Pre...,There’s something for everyone to watch on Net...,4.0,127,"CREAM, SKIM MILK, LIQUID SUGAR (SUGAR, WATER),..."
2,bj,2_bj,Chip Happens,A Cold Mess of Chocolate Ice Cream with Fudge ...,Sometimes “chip” happens and everything’s a me...,4.7,130,"CREAM, LIQUID SUGAR (SUGAR, WATER), SKIM MILK,..."
3,bj,3_bj,Cannoli,Mascarpone Ice Cream with Fudge-Covered Pastry...,As a Limited Batch that captured the rapture o...,3.6,70,"CREAM, SKIM MILK, LIQUID SUGAR (SUGAR, WATER),..."
4,bj,4_bj,Gimme S’more!™,Toasted Marshmallow Ice Cream with Chocolate C...,It’s a gimme: there’s always room for s’more. ...,4.5,281,"CREAM, SKIM MILK, WATER, LIQUID SUGAR (SUGAR, ..."


In [97]:
data.tail()

Unnamed: 0,brand,key,name,subhead,description,rating,rating_count,ingredients
139,talenti,12_talenti,COCONUT CHOCOLATE COOKIE,,,4.3,29,"WATER, SUGAR, DESICCATED COCONUT, COCONUT OIL,..."


In [13]:
data['description'][202].lower()

'savor the flavors of vanilla and rich salted caramel and pieces of toffee swirled throughout breyers® no sugar added salted caramel swirl. all the rich, creamy taste—with no added sugar!\n\nbreyers® no sugar added salted caramel swirl is a great way to enjoy dessert time, without all the guilt. it’s the vanilla and caramel you love and expect from breyers® plus chunks of toffee pieces, without sugar – what can be sweeter? your sweet tooth can indulge in the smooth, creamy vanilla and caramel that has just the perfect amount of sweet.\n\nour breyers® no sugar added salted caramel swirl uses 100% grade a milk and fresh cream from american cows not treated with artificial growth hormones* and vanilla from sustainably sourced farms in madagascar, in partnership with the rainforest alliance.\n\ncheck out to snag some near you and leave us a review to share your love!\n\n*the fda states that no significant difference has been shown between dairy derived from rbst-treated and non-rbst-treate

In [118]:
#import packages for data cleaning
import re

#select just the description column (a pandas Series)
flavors = data['description']
cleaned_text = []

#text cleaning: lowercase, strip punctuation, turn newlines into spaces
for text in flavors:
    if not isinstance(text, str):
        #missing descriptions (NaN) become empty strings
        cleaned_text.append("")
        continue
    text = text.lower()
    #the hyphen is escaped so that '-? is not read as a character range
    #(unescaped, the range would also strip digits and parentheses)
    text = re.sub(r'[.,"\'\-?:!;—®–%]', '', text)
    text = re.sub(r'\n', ' ', text)
    cleaned_text.append(text)

cleaned_text = pd.DataFrame(cleaned_text, columns = ['text'])
df = pd.concat([data, cleaned_text], axis=1)
df.head()

Unnamed: 0,brand,key,name,subhead,description,rating,rating_count,ingredients,text
0,bj,0_bj,Salted Caramel Core,Sweet Cream Ice Cream with Blonde Brownies & a...,Find your way to the ultimate ice cream experi...,3.7,208,"CREAM, SKIM MILK, LIQUID SUGAR (SUGAR, WATER),...",find your way to the ultimate ice cream experi...
1,bj,1_bj,Netflix & Chilll'd™,Peanut Butter Ice Cream with Sweet & Salty Pre...,There’s something for everyone to watch on Net...,4.0,127,"CREAM, SKIM MILK, LIQUID SUGAR (SUGAR, WATER),...",there’s something for everyone to watch on net...
2,bj,2_bj,Chip Happens,A Cold Mess of Chocolate Ice Cream with Fudge ...,Sometimes “chip” happens and everything’s a me...,4.7,130,"CREAM, LIQUID SUGAR (SUGAR, WATER), SKIM MILK,...",sometimes “chip” happens and everything’s a me...
3,bj,3_bj,Cannoli,Mascarpone Ice Cream with Fudge-Covered Pastry...,As a Limited Batch that captured the rapture o...,3.6,70,"CREAM, SKIM MILK, LIQUID SUGAR (SUGAR, WATER),...",as a limited batch that captured the rapture o...
4,bj,4_bj,Gimme S’more!™,Toasted Marshmallow Ice Cream with Chocolate C...,It’s a gimme: there’s always room for s’more. ...,4.5,281,"CREAM, SKIM MILK, WATER, LIQUID SUGAR (SUGAR, ...",it’s a gimme there’s always room for s’more an...
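
A note on punctuation patterns like the one used in the cleaning step: inside a regex character class, a hyphen placed between two characters defines a range, so `'-?` covers every character from `'` to `?` in ASCII order, digits included. The short check below, using a made-up sample string and a shortened pattern, shows the difference between an unescaped and an escaped hyphen.

```python
import re

# Invented sample text, loosely modelled on a product description.
sample = "100% grade a milk (no rbst) - what's sweeter?"

# Unescaped: '-? is read as the range from ' (0x27) to ? (0x3F),
# which also contains ( ) * + , - . / 0-9 : ; < = >
unescaped = re.sub(r"[.,'-?:!]", "", sample)

# Escaped: \- is a literal hyphen, so digits and parentheses survive.
escaped = re.sub(r"[.,'\-?:!]", "", sample)

print(unescaped)
print(escaped)
```

Whether digits should stay in the cleaned text is a judgment call; the point is only that the choice should be explicit rather than a side effect of hyphen placement.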


## Possible Analyses

- One-way ANOVA (ratings between brands)
- One-way ANOVA (number of reviews between brands)
- Flavor popularity (by ice cream base?)
- Regression of rating on the number of words in the flavor name
- Visualization of the words used in the descriptions for each brand
- Regression of rating on the number of popular words in the flavor description
- Use of charged words (e.g. "scrumptious", "love") and rating
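
To make the first item concrete, here is a minimal sketch of the one-way ANOVA F statistic computed by hand for ratings grouped by brand. The ratings and brand labels are invented, and `one_way_f` is a hypothetical helper written in plain Python so the arithmetic stays visible; in the notebook itself a library routine such as `scipy.stats.f_oneway` would be the more usual choice, since it also returns a p-value.

```python
from statistics import mean

# Invented ratings per brand, for illustration only (not taken from the dataset).
ratings_by_brand = {
    "brand_a": [3.0, 4.0, 5.0],
    "brand_b": [2.0, 3.0, 4.0],
    "brand_c": [1.0, 2.0, 3.0],
}

def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of the values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_b, df_w = one_way_f(list(ratings_by_brand.values()))
print(f_stat, df_b, df_w)
```

A large F relative to the F distribution with `(df_b, df_w)` degrees of freedom would suggest that mean ratings differ between brands; the usual ANOVA assumptions (roughly normal residuals, similar variances) would still need checking on the real data.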