### Introduction to the Data
The dataset contains 1,058 responses to an online survey about what Americans eat for Thanksgiving dinner. Each respondent answered questions about what they typically eat for Thanksgiving, along with demographic questions about their gender, income, and location. The dataset lets us look for regional and income-based patterns in what Americans eat for Thanksgiving dinner.

In [215]:
import pandas as pd
import numpy as np
import re

In [87]:
file = "data/thanksgiving-2015-poll-data.csv"
data = pd.read_csv(file, encoding="Latin-1")
data.head()

Unnamed: 0,RespondentID,Do you celebrate Thanksgiving?,What is typically the main dish at your Thanksgiving dinner?,What is typically the main dish at your Thanksgiving dinner? - Other (please specify),How is the main dish typically cooked?,How is the main dish typically cooked? - Other (please specify),What kind of stuffing/dressing do you typically have?,What kind of stuffing/dressing do you typically have? - Other (please specify),What type of cranberry saucedo you typically have?,What type of cranberry saucedo you typically have? - Other (please specify),...,Have you ever tried to meet up with hometown friends on Thanksgiving night?,"Have you ever attended a ""Friendsgiving?""",Will you shop any Black Friday sales on Thanksgiving Day?,Do you work in retail?,Will you employer make you work on Black Friday?,How would you describe where you live?,Age,What is your gender?,How much total combined money did all members of your HOUSEHOLD earn last year?,US Region
0,4337954960,Yes,Turkey,,Baked,,Bread-based,,,,...,Yes,No,No,No,,Suburban,18 - 29,Male,"$75,000 to $99,999",Middle Atlantic
1,4337951949,Yes,Turkey,,Baked,,Bread-based,,Other (please specify),Homemade cranberry gelatin ring,...,No,No,Yes,No,,Rural,18 - 29,Female,"$50,000 to $74,999",East South Central
2,4337935621,Yes,Turkey,,Roasted,,Rice-based,,Homemade,,...,Yes,Yes,Yes,No,,Suburban,18 - 29,Male,"$0 to $9,999",Mountain
3,4337933040,Yes,Turkey,,Baked,,Bread-based,,Homemade,,...,Yes,No,No,No,,Urban,30 - 44,Male,"$200,000 and up",Pacific
4,4337931983,Yes,Tofurkey,,Baked,,Bread-based,,Canned,,...,Yes,No,No,No,,Urban,30 - 44,Male,"$100,000 to $124,999",Pacific


### Filtering Out Rows From A DataFrame

In [88]:
data["Do you celebrate Thanksgiving?"].value_counts()

Yes    980
No      78
Name: Do you celebrate Thanksgiving?, dtype: int64

In [89]:
data = data[data['Do you celebrate Thanksgiving?'] == 'Yes']
data.head()

Unnamed: 0,RespondentID,Do you celebrate Thanksgiving?,What is typically the main dish at your Thanksgiving dinner?,What is typically the main dish at your Thanksgiving dinner? - Other (please specify),How is the main dish typically cooked?,How is the main dish typically cooked? - Other (please specify),What kind of stuffing/dressing do you typically have?,What kind of stuffing/dressing do you typically have? - Other (please specify),What type of cranberry saucedo you typically have?,What type of cranberry saucedo you typically have? - Other (please specify),...,Have you ever tried to meet up with hometown friends on Thanksgiving night?,"Have you ever attended a ""Friendsgiving?""",Will you shop any Black Friday sales on Thanksgiving Day?,Do you work in retail?,Will you employer make you work on Black Friday?,How would you describe where you live?,Age,What is your gender?,How much total combined money did all members of your HOUSEHOLD earn last year?,US Region
0,4337954960,Yes,Turkey,,Baked,,Bread-based,,,,...,Yes,No,No,No,,Suburban,18 - 29,Male,"$75,000 to $99,999",Middle Atlantic
1,4337951949,Yes,Turkey,,Baked,,Bread-based,,Other (please specify),Homemade cranberry gelatin ring,...,No,No,Yes,No,,Rural,18 - 29,Female,"$50,000 to $74,999",East South Central
2,4337935621,Yes,Turkey,,Roasted,,Rice-based,,Homemade,,...,Yes,Yes,Yes,No,,Suburban,18 - 29,Male,"$0 to $9,999",Mountain
3,4337933040,Yes,Turkey,,Baked,,Bread-based,,Homemade,,...,Yes,No,No,No,,Urban,30 - 44,Male,"$200,000 and up",Pacific
4,4337931983,Yes,Tofurkey,,Baked,,Bread-based,,Canned,,...,Yes,No,No,No,,Urban,30 - 44,Male,"$100,000 to $124,999",Pacific
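
The filtering step above relies on a boolean mask. Here is a minimal sketch of the same idea on a toy DataFrame. The column name matches the survey, but the rows and the names `toy`, `celebrates`, and `filtered` are invented for illustration.

```python
import pandas as pd

# Toy stand-in for the survey; the rows are invented for illustration.
toy = pd.DataFrame({
    "RespondentID": [1, 2, 3, 4],
    "Do you celebrate Thanksgiving?": ["Yes", "No", "Yes", "Yes"],
})

# Comparing a column to a value yields a boolean Series (the mask).
celebrates = toy["Do you celebrate Thanksgiving?"] == "Yes"

# Indexing with the mask keeps only the rows where it is True.
# .copy() makes the result an independent DataFrame, which avoids
# SettingWithCopyWarning when new columns are added later.
filtered = toy[celebrates].copy()
print(filtered)
```

Taking a `.copy()` is optional here, but it can help when later cells add columns to the filtered frame.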


In [90]:
data['What is typically the main dish at your Thanksgiving dinner?'].value_counts()

Turkey                    859
Other (please specify)     35
Ham/Pork                   29
Tofurkey                   20
Chicken                    12
Roast beef                 11
I don't know                5
Turducken                   3
Name: What is typically the main dish at your Thanksgiving dinner?, dtype: int64

### Using value_counts To Explore Main Dishes

In [91]:
main_dish = "What is typically the main dish at your Thanksgiving dinner?"
gravy = "Do you typically have gravy?"

data[main_dish].value_counts()

Turkey                    859
Other (please specify)     35
Ham/Pork                   29
Tofurkey                   20
Chicken                    12
Roast beef                 11
I don't know                5
Turducken                   3
Name: What is typically the main dish at your Thanksgiving dinner?, dtype: int64

In [92]:
data[data[main_dish] == "Tofurkey"][gravy]

4      Yes
33     Yes
69      No
72      No
77     Yes
145    Yes
175    Yes
218     No
243    Yes
275     No
393    Yes
399    Yes
571    Yes
594    Yes
628     No
774     No
820     No
837    Yes
860     No
953    Yes
Name: Do you typically have gravy?, dtype: object
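
A long list of individual answers is hard to read at a glance. One option is to chain `value_counts()` onto the filtered column to get a tally instead. This is a sketch on invented toy data; `toy`, `tofurkey_gravy`, and `gravy_counts` are hypothetical names, and the short column names stand in for the full survey questions.

```python
import pandas as pd

# Invented rows; "main_dish" and "gravy" stand in for the long survey column names.
toy = pd.DataFrame({
    "main_dish": ["Tofurkey", "Turkey", "Tofurkey", "Tofurkey", "Ham/Pork"],
    "gravy": ["Yes", "Yes", "No", "Yes", "No"],
})

# Select the gravy answers of Tofurkey eaters in one .loc call,
# then count how often each answer occurs.
tofurkey_gravy = toy.loc[toy["main_dish"] == "Tofurkey", "gravy"]
gravy_counts = tofurkey_gravy.value_counts()
print(gravy_counts)
```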

### 4. Figuring Out What Pies People Eat

In [93]:
apple = 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Apple'
pumpkin = 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Pumpkin'
pecan = 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Pecan'
print(data[apple].value_counts())
print("****")
print(data[pumpkin].value_counts())
print("****")
print(data[pecan].value_counts())

Apple    514
Name: Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Apple, dtype: int64
****
Pumpkin    729
Name: Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Pumpkin, dtype: int64
****
Pecan    342
Name: Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Pecan, dtype: int64


In [94]:
apple_isnull = data[apple].isnull()
pumpkin_isnull = data[pumpkin].isnull()
pecan_isnull = data[pecan].isnull()

no_pies = apple_isnull & pumpkin_isnull & pecan_isnull

no_pies.value_counts()

False    876
True     104
dtype: int64
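
Each pie column holds the pie name when the respondent selected it and is null otherwise, so a row with nulls in all three columns means none of these pies was chosen. The same mask can be sketched without naming each column separately, by calling `isnull().all(axis=1)` on the selected columns. The toy data and the name `no_pies_toy` below are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Invented rows; short column names stand in for the long survey questions.
toy = pd.DataFrame({
    "apple": ["Apple", np.nan, np.nan],
    "pumpkin": ["Pumpkin", "Pumpkin", np.nan],
    "pecan": [np.nan, np.nan, np.nan],
})

# isnull() gives a True/False table; all(axis=1) is True only for rows
# where every selected column is null, i.e. no pie was chosen.
no_pies_toy = toy[["apple", "pumpkin", "pecan"]].isnull().all(axis=1)
print(no_pies_toy.value_counts())
```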

### 5. Converting Age To Numeric

In [95]:
def age_category(x):
    if pd.isnull(x):
        return None
    category = x.split(' ')[0]
    category = category.replace('+', '')
    return int(category)

data['int_age'] = data['Age'].apply(age_category)
data['int_age'].describe()

count    947.000000
mean      40.089757
std       15.352014
min       18.000000
25%       30.000000
50%       45.000000
75%       60.000000
max       60.000000
Name: int_age, dtype: float64
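
Note that `age_category` keeps only the lower bound of each age bracket, so the summary statistics above understate actual ages. As an alternative to `apply`, the same lower bound can be pulled out with a vectorized string method. This is a sketch on a toy Series, not the notebook's own approach; `ages_raw` and `ages` are hypothetical names.

```python
import pandas as pd

# A few age brackets in the same format as the survey's Age column, plus a missing value.
ages_raw = pd.Series(["18 - 29", "60+", None])

# str.extract returns the first run of digits in each string (NaN where nothing matches);
# pd.to_numeric then converts the digit strings to numbers.
ages = pd.to_numeric(ages_raw.str.extract(r"(\d+)")[0])
print(ages)
```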

### 6. Converting Income To Numeric

In [96]:
def income_category(x):
    if pd.isnull(x):
        return None
    category = x.split(' ')[0]
    if category == 'Prefer':
        return None
    category = category.replace('$', '')
    category = category.replace(',', '')
    return int(category)

income = 'How much total combined money did all members of your HOUSEHOLD earn last year?'
data['int_income'] = data[income].apply(income_category)
data['int_income'].describe()

count       829.000000
mean      75965.018094
std       59068.636748
min           0.000000
25%       25000.000000
50%       75000.000000
75%      100000.000000
max      200000.000000
Name: int_income, dtype: float64
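
To make the parsing rule concrete, the sketch below runs the same logic on a few sample strings. The two bracket strings appear in the data above; the exact wording "Prefer not to answer" is an assumption, since the function only checks that the first word is `Prefer`.

```python
import pandas as pd

def income_category(x):
    # Same logic as the cell above: keep the first token, strip "$" and ",".
    if pd.isnull(x):
        return None
    category = x.split(' ')[0]
    if category == 'Prefer':
        return None
    return int(category.replace('$', '').replace(',', ''))

# "Prefer not to answer" is an assumed wording for the opt-out answer.
samples = ["$75,000 to $99,999", "$200,000 and up", "Prefer not to answer", None]
parsed = [income_category(s) for s in samples]
print(parsed)
```

Each bracket collapses to its lower bound, which is why the maximum in the summary above is 200000.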

### 7. Correlating Travel Distance And Income

In [97]:
travel = 'How far will you travel for Thanksgiving?'
income = 'int_income'
data[travel].value_counts()

Thanksgiving is happening at my home--I won't travel at all                         396
Thanksgiving is local--it will take place in the town I live in                     276
Thanksgiving is out of town but not too far--it's a drive of a few hours or less    197
Thanksgiving is out of town and far away--I have to drive several hours or fly       82
Name: How far will you travel for Thanksgiving?, dtype: int64

In [98]:
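# int_income holds the lower bound of each income bracket, so this splits
# respondents whose bracket starts below $15,000 from everyone else.
poor = data[data[income] < 15000][travel]
rich = data[data[income] >= 15000][travel]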
print(poor.value_counts())
print("****")
print(rich.value_counts())

Thanksgiving is happening at my home--I won't travel at all                         46
Thanksgiving is local--it will take place in the town I live in                     38
Thanksgiving is out of town but not too far--it's a drive of a few hours or less    22
Thanksgiving is out of town and far away--I have to drive several hours or fly       6
Name: How far will you travel for Thanksgiving?, dtype: int64
****
Thanksgiving is happening at my home--I won't travel at all                         301
Thanksgiving is local--it will take place in the town I live in                     199
Thanksgiving is out of town but not too far--it's a drive of a few hours or less    153
Thanksgiving is out of town and far away--I have to drive several hours or fly       64
Name: How far will you travel for Thanksgiving?, dtype: int64
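
The two groups differ a lot in size (112 versus 717 respondents), so raw counts are hard to compare. A minimal sketch of turning the counts printed above into shares follows. The short labels and the names `low`, `high`, and `shares` are abbreviations introduced here for readability.

```python
import pandas as pd

# Short labels standing in for the four long travel answers, in the order printed above.
labels = ["at home", "local", "few hours away", "far away"]

# Counts copied from the two value_counts() outputs above.
low = pd.Series([46, 38, 22, 6], index=labels)
high = pd.Series([301, 199, 153, 64], index=labels)

# Divide each count by its group total to get the share of each answer.
shares = pd.DataFrame({
    "under_15000": low / low.sum(),
    "over_15000": high / high.sum(),
}).round(3)
print(shares)
```

On the real data, `value_counts(normalize=True)` would produce the same shares directly.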


### 8. Linking Friendship And Age

In [99]:
hometown_friend = 'Have you ever tried to meet up with hometown friends on Thanksgiving night?'
friendsgiving = 'Have you ever attended a "Friendsgiving?"'

data.pivot_table(index=hometown_friend, columns=friendsgiving, values='int_age')

"Have you ever attended a ""Friendsgiving?""",No,Yes
Have you ever tried to meet up with hometown friends on Thanksgiving night?,,
No,42.283702,37.010526
Yes,41.47541,33.976744
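
By default, `pivot_table` aggregates with the mean, so each cell above is the average `int_age` for one combination of answers. A small sketch on invented data makes this visible; `toy`, `table`, and the short column names are hypothetical.

```python
import pandas as pd

# Invented rows; short column names stand in for the two survey questions.
toy = pd.DataFrame({
    "met_friends": ["No", "No", "Yes", "Yes", "Yes"],
    "friendsgiving": ["No", "Yes", "No", "Yes", "Yes"],
    "age": [45, 30, 45, 18, 30],
})

# With no aggfunc given, each cell is the mean of "age" for that
# (met_friends, friendsgiving) combination.
table = toy.pivot_table(index="met_friends", columns="friendsgiving", values="age")
print(table)
```

In the toy table, the `Yes`/`Yes` cell is the mean of 18 and 30.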


In [100]:
data.pivot_table(index=hometown_friend, columns=friendsgiving, values=income)

"Have you ever attended a ""Friendsgiving?""",No,Yes
Have you ever tried to meet up with hometown friends on Thanksgiving night?,,
No,78914.549654,72894.736842
Yes,78750.0,66019.736842


### 9. Next Steps
Here are some potential next steps:

- Figure out the most common dessert people eat.
- Figure out the most common complete meal people eat.
- Identify how many people work on Thanksgiving.
- Find regional patterns in the dinner menus.
- Find age, gender, and income based patterns in dinner menus.

In [101]:
data.columns

Index(['RespondentID', 'Do you celebrate Thanksgiving?',
       'What is typically the main dish at your Thanksgiving dinner?',
       'What is typically the main dish at your Thanksgiving dinner? - Other (please specify)',
       'How is the main dish typically cooked?',
       'How is the main dish typically cooked? - Other (please specify)',
       'What kind of stuffing/dressing do you typically have?',
       'What kind of stuffing/dressing do you typically have? - Other (please specify)',
       'What type of cranberry saucedo you typically have?',
       'What type of cranberry saucedo you typically have? - Other (please specify)',
       'Do you typically have gravy?',
       'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Brussel sprouts',
       'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Carrots',
       'Which of these side dishes aretypically served

#### 9.1 Figure out the most common dessert people eat.

In [347]:
dessert_columns = []
for i in data.columns:
    if "desserts" in i:
        dessert_columns.append(i)

desert_data = data.loc[:,dessert_columns]

In [348]:
#for col in desert_data:
#    desert_data.loc['Total', col] = desert_data[col].count()

In [349]:
#popular_desrt = np.argmax(desert_data.loc['Total'].values)

In [350]:
#desert_data.iloc[[980],[popular_desrt]]

In [353]:
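# Caution: assigning the scalar `i` to desert_data[col] sets that value for
# every row, so each "Disert N" column ends up holding one value repeated
# down the whole column rather than per-respondent data.
for index, row in desert_data.iterrows():
    count = 1
    for i in row.dropna().tolist():
        col = "Disert %s" %count
        desert_data[col] = i
        count += 1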

In [354]:
desert_data.head()

Unnamed: 0,Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Apple cobbler,Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Blondies,Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Brownies,Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Carrot cake,Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Cheesecake,Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Cookies,Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Fudge,Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Ice cream,Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Peach cobbler,Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - None,...,Disert 12,Disert 13,Disert 14,Disert 15,Disert 16,Disert 17,Disert 18,Disert 19,Disert 20,Disert 21
0,,,,,Cheesecake,Cookies,,Ice cream,,,...,,Brownies,Cookies,Peach cobbler,Cookies,Peach cobbler,"Chocolate trifle, bread pudding",Ice cream,Peach cobbler,Peach cobbler
1,,,,,Cheesecake,Cookies,,,,,...,,Brownies,Cookies,Peach cobbler,Cookies,Peach cobbler,"Chocolate trifle, bread pudding",Ice cream,Peach cobbler,Peach cobbler
2,,,Brownies,Carrot cake,,Cookies,Fudge,Ice cream,,,...,,Brownies,Cookies,Peach cobbler,Cookies,Peach cobbler,"Chocolate trifle, bread pudding",Ice cream,Peach cobbler,Peach cobbler
3,,,,,,,,,,,...,,Brownies,Cookies,Peach cobbler,Cookies,Peach cobbler,"Chocolate trifle, bread pudding",Ice cream,Peach cobbler,Peach cobbler
4,,,,,,,,,,,...,,Brownies,Cookies,Peach cobbler,Cookies,Peach cobbler,"Chocolate trifle, bread pudding",Ice cream,Peach cobbler,Peach cobbler
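
Because each dessert column is null unless the respondent selected that dessert, counting non-null values per column is one way to find the most common dessert. The sketch below uses invented toy data; `toy`, `dessert_counts`, and `most_common` are hypothetical names, and the short column names stand in for the long survey questions.

```python
import numpy as np
import pandas as pd

# Invented rows; one column per dessert, null when not selected.
toy = pd.DataFrame({
    "Cheesecake": ["Cheesecake", np.nan, "Cheesecake"],
    "Cookies": ["Cookies", "Cookies", "Cookies"],
    "Fudge": [np.nan, np.nan, "Fudge"],
})

# count() returns the number of non-null entries in each column.
dessert_counts = toy.count().sort_values(ascending=False)

# idxmax() returns the column label with the highest count.
most_common = dessert_counts.idxmax()
print(dessert_counts)
print(most_common)
```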


#### 9.2 Figure out the most common complete meal people eat.

In [71]:
complete_dinner = []
# main dish: must be non-null
main_dish = "What is typically the main dish at your Thanksgiving dinner?"
# stuffing: must be non-null
stuffing = "What kind of stuffing/dressing do you typically have?"
# cranberry sauce: must be non-null
cranberry = "What type of cranberry saucedo you typically have?"
# gravy: only count "Yes" answers
gravy = "Do you typically have gravy?"
# Still to define: side_dishes, pie, dessert
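
One possible way to find the most common combination of single-answer courses is to group by those columns and count the rows in each group. This is a sketch on invented data, not a finished analysis; `toy`, `combo_counts`, `top_combo`, and the short column names are all assumptions.

```python
import pandas as pd

# Invented rows; short column names stand in for the four survey questions.
toy = pd.DataFrame({
    "main": ["Turkey", "Turkey", "Turkey", "Ham/Pork"],
    "stuffing": ["Bread-based", "Bread-based", "Rice-based", "Bread-based"],
    "cranberry": ["Canned", "Canned", "Homemade", "Canned"],
    "gravy": ["Yes", "Yes", "Yes", "No"],
})

# size() counts the rows in each unique combination of the grouped columns.
combo_counts = (
    toy.groupby(["main", "stuffing", "cranberry", "gravy"])
       .size()
       .sort_values(ascending=False)
)

# After sorting, the first index entry is the most frequent combination.
top_combo = combo_counts.index[0]
print(top_combo, combo_counts.iloc[0])
```

Multi-select courses such as side dishes, pies, and desserts span many columns each, so they would need to be reduced to a single value per respondent before being added to the grouping.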

In [75]:
side_dishes_columns = []
for i in data.columns:
    if "side" in i:
        side_dishes_columns.append(i)

side_dish_data = data.loc[:,side_dishes_columns]
side_dish_data.head()

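# Flag respondents who selected at least one side dish.
# Assumption: respondents with no side dish selected are labeled "No".
has_side = side_dish_data.notnull().any(axis=1)
data["Had a side dish"] = has_side.map({True: "Yes", False: "No"})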

In [73]:
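pie_columns = []
for i in data.columns:
    # NOTE: this still filters on "side" (copied from the cell above), so the
    # result holds the side-dish columns; filtering on "pie" would select pies.
    if "side" in i:
        pie_columns.append(i)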

pie_data = data.loc[:,pie_columns]
pie_data.describe()

Unnamed: 0,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Brussel sprouts,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Carrots,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Cauliflower,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Corn,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Cornbread,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Fruit salad,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Green beans/green bean casserole,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Macaroni and cheese,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Mashed potatoes,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Rolls/biscuits,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Squash,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Vegetable salad,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Yams/sweet potato casserole,Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Other (please specify),Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Other (please specify).1
count,155,242,88,464,235,215,686,206,817,766,171,209,631,111,111
unique,1,1,1,1,1,1,1,1,1,1,1,1,1,1,91
top,Brussel sprouts,Carrots,Cauliflower,Corn,Cornbread,Fruit salad,Green beans/green bean casserole,Macaroni and cheese,Mashed potatoes,Rolls/biscuits,Squash,Vegetable salad,Yams/sweet potato casserole,Other (please specify),broccoli
freq,155,242,88,464,235,215,686,206,817,766,171,209,631,111,7
