## Thanksgiving Dinner Survey Analysis

This analysis examines responses to a survey about Thanksgiving dinner and how they relate to demographic factors such as age and household income. The raw data is supplied by FiveThirtyEight in the `thanksgiving.csv` file. See the CSV file for the list of questions used in the survey.

The survey had 1,058 participants, 92.6% of whom celebrate Thanksgiving.

The noteworthy results of this investigation are:

- 89.4% of respondents who celebrate Thanksgiving have at least one of pumpkin, pecan, or apple pie for dessert. Note that in the output of the cell for this result, the `False` entries describe respondents who did in fact have pie: the mask is `True` only when all three pie columns are empty.

- 27.0% of all respondents, including those who do not celebrate Thanksgiving, are in the 45 - 59 age group, making it the largest age group in the survey.

- 17.0% of all respondents fall in the income range of \$25,000 to \$49,999, making it the most common income range among survey respondents. However, respondents who declined to answer may have affected this result. Note also that respondents who chose "Prefer not to answer" are treated the same as those who gave no income response at all.

- 59.2% of respondents with a reported household income under \$150,000 (408 of 689) have Thanksgiving somewhere other than their own home, compared with 52.9% of respondents with a household income of \$150,000 or more (74 of 140). Respondents who did not report an income are excluded from both groups, because a comparison against a missing value evaluates to false. Household income therefore appears to be associated with travel in this survey, although the difference is modest.

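- The mean age of respondents who have both attended a "Friendsgiving" and tried to meet up with hometown friends on Thanksgiving night is 34.0, while the other three combinations range from 37.0 to 42.3. Because `int_age` records the lower bound of each age bracket, these figures are best read as relative rather than actual ages. The result is consistent with the common assumption that younger people are more likely to spend Thanksgiving with friends.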

There are many diffe

In [1]:
import pandas as pd
data = pd.read_csv("thanksgiving.csv", encoding="Latin-1")
data.head()

Unnamed: 0,RespondentID,Do you celebrate Thanksgiving?,What is typically the main dish at your Thanksgiving dinner?,What is typically the main dish at your Thanksgiving dinner? - Other (please specify),How is the main dish typically cooked?,How is the main dish typically cooked? - Other (please specify),What kind of stuffing/dressing do you typically have?,What kind of stuffing/dressing do you typically have? - Other (please specify),What type of cranberry saucedo you typically have?,What type of cranberry saucedo you typically have? - Other (please specify),...,Have you ever tried to meet up with hometown friends on Thanksgiving night?,"Have you ever attended a ""Friendsgiving?""",Will you shop any Black Friday sales on Thanksgiving Day?,Do you work in retail?,Will you employer make you work on Black Friday?,How would you describe where you live?,Age,What is your gender?,How much total combined money did all members of your HOUSEHOLD earn last year?,US Region
0,4337954960,Yes,Turkey,,Baked,,Bread-based,,,,...,Yes,No,No,No,,Suburban,18 - 29,Male,"$75,000 to $99,999",Middle Atlantic
1,4337951949,Yes,Turkey,,Baked,,Bread-based,,Other (please specify),Homemade cranberry gelatin ring,...,No,No,Yes,No,,Rural,18 - 29,Female,"$50,000 to $74,999",East South Central
2,4337935621,Yes,Turkey,,Roasted,,Rice-based,,Homemade,,...,Yes,Yes,Yes,No,,Suburban,18 - 29,Male,"$0 to $9,999",Mountain
3,4337933040,Yes,Turkey,,Baked,,Bread-based,,Homemade,,...,Yes,No,No,No,,Urban,30 - 44,Male,"$200,000 and up",Pacific
4,4337931983,Yes,Tofurkey,,Baked,,Bread-based,,Canned,,...,Yes,No,No,No,,Urban,30 - 44,Male,"$100,000 to $124,999",Pacific


In [2]:
response_values = data["Do you celebrate Thanksgiving?"].value_counts(normalize=True)
data_celebrate = (data["Do you celebrate Thanksgiving?"] == "Yes")
data_true = data[data_celebrate]
response_values

Yes    0.926276
No     0.073724
Name: Do you celebrate Thanksgiving?, dtype: float64

In [3]:
main_dish_counts = data_true["What is typically the main dish at your Thanksgiving dinner?"].value_counts()
tofurkey_true = (data_true["What is typically the main dish at your Thanksgiving dinner?"] == "Tofurkey")
tofurkey = data_true[tofurkey_true]
gravy = data_true["Do you typically have gravy?"]

In [4]:
apple_data = data_true["Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Apple"]
pumpkin_data = data_true["Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Pumpkin"]
pecan_data = data_true["Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Pecan"]
apple_isnull = pd.isnull(apple_data)
pumpkin_isnull = pd.isnull(pumpkin_data)
pecan_isnull = pd.isnull(pecan_data)
# True only when all three pie columns are empty, i.e. none of these pies was served.
no_pies = apple_isnull & pumpkin_isnull & pecan_isnull
no_pies_values = no_pies.value_counts(normalize=True)
no_pies_values

False    0.893878
True     0.106122
dtype: float64
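The mask above is `True` for respondents with none of the three pies, which makes the output easy to misread. A minimal sketch of the inverted form, on a hypothetical toy frame (the column names `apple`, `pumpkin`, `pecan` and the rows are illustrative assumptions, not the survey data):

```python
import pandas as pd

# Hypothetical toy data: a pie name means the box was ticked, None means it was not.
toy = pd.DataFrame({
    "apple": ["Apple", None, None, "Apple"],
    "pumpkin": ["Pumpkin", "Pumpkin", None, None],
    "pecan": [None, None, None, "Pecan"],
})

# True when at least one of the three pie columns is filled in.
had_pie = toy[["apple", "pumpkin", "pecan"]].notnull().any(axis=1)

# The mean of a boolean Series is the share of True values.
had_pie_share = had_pie.mean()
print(had_pie.value_counts(normalize=True))
print(had_pie_share)
```

With this orientation, the `True` share reads directly as the share of respondents who had pie.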

In [29]:
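def age(value):
    """Return the lower bound of an age bracket such as "30 - 44" or "60+" as an int."""
    if pd.isnull(value):
        return None
    # Keep the first token ("30" or "60+") and drop the trailing plus sign.
    age_split = value.split(' ')[0]
    age_split = age_split.replace('+', '')
    return int(age_split)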

data["int_age"] = data["Age"].apply(age)
data["int_age"].describe()

count    1025.000000
mean       39.383415
std        15.398493
min        18.000000
25%        30.000000
50%        45.000000
75%        60.000000
max        60.000000
Name: int_age, dtype: float64
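Since `int_age` holds only the lower bound of each bracket, the mean of 39.4 understates respondents' ages. The sketch below shows a vectorized way to extract the lower bound and, as an alternative that is not part of the original analysis, a bracket midpoint. The sample values are assumed bracket labels, and treating "60+" as 60 is an assumption because that bracket has no upper bound.

```python
import pandas as pd

# Assumed bracket labels, plus one missing answer.
ages = pd.Series(["18 - 29", "30 - 44", "45 - 59", "60+", None])

# Lower bound: the leading run of digits.
lower = ages.str.extract(r"^(\d+)", expand=False).astype(float)

# Upper bound: the digits after the dash; NaN for the open-ended "60+".
upper = ages.str.extract(r"-\s*(\d+)$", expand=False).astype(float)

# Midpoint where both bounds exist; fall back to the lower bound otherwise (assumption).
midpoint = ((lower + upper) / 2).fillna(lower)

print(pd.DataFrame({"age": ages, "lower": lower, "midpoint": midpoint}))
```

Missing answers stay missing in both columns, so `describe()` and `mean()` skip them as before.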

In [46]:
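def income(value):
    """Return the lower bound of an income bracket as an int, or None if unanswered."""
    if pd.isnull(value):
        return None
    income_split = value.split(' ')[0]
    # "Prefer not to answer" carries no amount, so treat it as missing.
    if income_split == "Prefer":
        return None
    income_split = income_split.replace('$', '')
    income_split = income_split.replace(',', '')
    return int(income_split)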

income_str = "How much total combined money did all members of your HOUSEHOLD earn last year?"
data["int_income"] = data[income_str].apply(income)
data["int_income"].describe()

count       889.000000
mean      74077.615298
std       59360.742902
min           0.000000
25%       25000.000000
50%       50000.000000
75%      100000.000000
max      200000.000000
Name: int_income, dtype: float64
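Only 889 respondents have a usable income, so the share attributed to any income range depends on whether non-answers are counted in the denominator. A small sketch of the difference, using hypothetical responses rather than the survey data:

```python
import pandas as pd

# Hypothetical responses: two blanks and one explicit refusal among eight rows.
incomes = pd.Series([
    "$25,000 to $49,999", "$25,000 to $49,999", "$50,000 to $74,999",
    "Prefer not to answer", None, None, "$0 to $9,999", "$200,000 and up",
])

# Shares over all rows, keeping missing values as their own category.
all_share = incomes.value_counts(normalize=True, dropna=False)

# Shares over rows with a usable income only.
answered = incomes[incomes.notnull() & (incomes != "Prefer not to answer")]
answered_share = answered.value_counts(normalize=True)

print(all_share["$25,000 to $49,999"])       # share of all respondents
print(answered_share["$25,000 to $49,999"])  # share of those who reported an income
```

Stating which denominator a percentage uses avoids ambiguity when a sizeable group gave no answer.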

In [49]:
data[data["int_income"] < 150000]["How far will you travel for Thanksgiving?"].value_counts()


Thanksgiving is happening at my home--I won't travel at all                         281
Thanksgiving is local--it will take place in the town I live in                     203
Thanksgiving is out of town but not too far--it's a drive of a few hours or less    150
Thanksgiving is out of town and far away--I have to drive several hours or fly       55
Name: How far will you travel for Thanksgiving?, dtype: int64

In [50]:
data[data["int_income"] >= 150000]["How far will you travel for Thanksgiving?"].value_counts()

Thanksgiving is happening at my home--I won't travel at all                         66
Thanksgiving is local--it will take place in the town I live in                     34
Thanksgiving is out of town but not too far--it's a drive of a few hours or less    25
Thanksgiving is out of town and far away--I have to drive several hours or fly      15
Name: How far will you travel for Thanksgiving?, dtype: int64
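The share of each income group that has Thanksgiving away from home follows directly from the two sets of counts above. A minimal sketch of that arithmetic, with the counts copied from the outputs and short dictionary keys chosen here for readability:

```python
# Counts copied from the two value_counts() outputs above.
under_150k = {"home": 281, "local": 203, "few_hours": 150, "far": 55}
over_150k = {"home": 66, "local": 34, "few_hours": 25, "far": 15}


def share_away_from_home(counts):
    """Return the fraction of respondents whose Thanksgiving is not at their own home."""
    total = sum(counts.values())
    return (total - counts["home"]) / total


under_share = share_away_from_home(under_150k)
over_share = share_away_from_home(over_150k)

print(f"under $150,000:  {under_share:.1%}")  # 408 of 689
print(f"$150,000 and up: {over_share:.1%}")   # 74 of 140
```

Passing `normalize=True` to `value_counts()` in the cells above would give the per-category shares directly.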

In [56]:
ind = "Have you ever tried to meet up with hometown friends on Thanksgiving night?"
col = 'Have you ever attended a "Friendsgiving?"'
friend_age_pivot = pd.pivot_table(data, values="int_age", index=ind, columns=col)
friend_income_pivot = data.pivot_table(values="int_income", index=ind, columns=col)
friend_age_pivot

Have you ever attended a "Friendsgiving?"                                           No        Yes
Have you ever tried to meet up with hometown friends on Thanksgiving night?
No                                                                           42.283702  37.010526
Yes                                                                          41.475410  33.976744


In [57]:
friend_income_pivot

Have you ever attended a "Friendsgiving?"                                              No           Yes
Have you ever tried to meet up with hometown friends on Thanksgiving night?
No                                                                           78914.549654  72894.736842
Yes                                                                          78750.000000  66019.736842
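`pivot_table` aggregates with the mean by default, so each cell above is a group average with no indication of how many respondents it rests on. The sketch below pairs the mean with a count on a hypothetical toy frame (the column names `meetup` and `friendsgiving` and the rows are illustrative assumptions):

```python
import pandas as pd

# Hypothetical toy data standing in for the two yes/no questions and int_age.
toy = pd.DataFrame({
    "meetup": ["Yes", "Yes", "No", "No", "Yes", "No"],
    "friendsgiving": ["Yes", "Yes", "No", "Yes", "No", "No"],
    "int_age": [18, 30, 45, 30, 60, 60],
})

# Mean age per combination (the default aggregation, spelled out here).
mean_age = toy.pivot_table(
    values="int_age", index="meetup", columns="friendsgiving", aggfunc="mean"
)

# Number of non-missing ages behind each mean.
group_size = toy.pivot_table(
    values="int_age", index="meetup", columns="friendsgiving", aggfunc="count"
)

print(mean_age)
print(group_size)
```

A mean computed from only a few respondents deserves less weight than one computed from several hundred, so reporting the counts alongside the means helps when comparing groups.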
