<h1>College Students Food Choices and Preferences</h1>

<h2>About Data</h2>

College students' food preferences, dietary habits, inclinations, childhood favorites, eating habits, preferred cuisines and other details are all included in this dataset. There are 125 responses from Mercyhurst University students.

<h2>Data Source</h2>

This data was referred from Borapajo in kaggle

<i>link to access data: https://www.kaggle.com/datasets/borapajo/food-choices?resource=download</i>

<h2>Purpose</h2>

The intent to work on this analysis is to find answers for these below questions.

<ul>
    <li>Can people maintain their physical fitness and well-being based on their dietary choices and lifestyle choices?</li>
    <li>Apart from being aware of the unhealthy food, is there any explanation for why students continue to have it?</li>
    <li>Do things like financial standing, family history, and childhood dietary preferences still have an impact on the foods students choose to eat now?</li>
    <li>Can we learn anything from the student survey replies to figure out how to make healthy food taste good?</li>
</ul>
    

<h2>1. Importing Data</h2>

In [102]:
# importing neccessary packages.
import pandas as pd
import numpy as np

In [103]:
# reading data from excel using pandas
Food_Choices = pd.read_excel('food_coded.xlsx')

In [104]:
Food_Choices.head()

Unnamed: 0,Gender,breakfast,calories_day,calories_scone,coffee,comfort_food,comfort_food_reasons,comfort_food_reasons_coded,cook,cuisine,...,marital_status,mother_profession,parents_cook,pay_meal_out,soup,sports,type_sports,veggies_day,vitamins,weight
0,Male,Cereal,,315.0,creamy frapuccino,none,we dont have comfort,none,A couple of times a week,,...,Single,unemployed,Almost everyday,$5.01 to $10.00,veggie soup,Yes,car racing,very likely,yes,187
1,Female,Cereal,it is moderately important,420.0,espresso shown,"chocolate, chips, ice cream","Stress, bored, anger",stress,"Whenever I can, but that is not very often",American,...,In a relationship,Nurse RN,Almost everyday,$20.01 to $30.00,veggie soup,Yes,Basketball,likely,no,155
2,Female,Cereal,it is very important,420.0,espresso shown,"frozen yogurt, pizza, fast food","stress, sadness",stress,Every day,Korean/Asian,...,In a relationship,owns business,Almost everyday,$10.01 to $20.00,veggie soup,No,none,very likely,yes,I'm not answering this.
3,Female,Cereal,it is moderately important,420.0,espresso shown,"Pizza, Mac and cheese, ice cream",Boredom,boredom,A couple of times a week,Mexican/Spanish,...,In a relationship,Special Education Teacher,Almost everyday,$5.01 to $10.00,veggie soup,No,,neutral,yes,"Not sure, 240"
4,Female,Cereal,it is not at all important,420.0,espresso shown,"Ice cream, chocolate, chips","Stress, boredom, cravings",stress,Every day,Mexican/Spanish,...,Single,Substance Abuse Conselor,Almost everyday,$20.01 to $30.00,veggie soup,Yes,Softball,likely,no,190


<h3> Renaming few columns for better readability and Understanding </h3>

In [105]:
Food_Choices.rename(columns = {'breakfast':'Breakfast Choice', 'calories_day':'Importance_of_calories', 'coffee':'coffee_preference','cook':'frequency_of_cooking','soup':'soup_preference','vitamins':'vitamins_supplements','weight':'weight(in pounds)'},inplace = True)

In [106]:
Food_Choices.head()

Unnamed: 0,Gender,Breakfast Choice,Importance_of_calories,calories_scone,coffee_preference,comfort_food,comfort_food_reasons,comfort_food_reasons_coded,frequency_of_cooking,cuisine,...,marital_status,mother_profession,parents_cook,pay_meal_out,soup_preference,sports,type_sports,veggies_day,vitamins_supplements,weight(in pounds)
0,Male,Cereal,,315.0,creamy frapuccino,none,we dont have comfort,none,A couple of times a week,,...,Single,unemployed,Almost everyday,$5.01 to $10.00,veggie soup,Yes,car racing,very likely,yes,187
1,Female,Cereal,it is moderately important,420.0,espresso shown,"chocolate, chips, ice cream","Stress, bored, anger",stress,"Whenever I can, but that is not very often",American,...,In a relationship,Nurse RN,Almost everyday,$20.01 to $30.00,veggie soup,Yes,Basketball,likely,no,155
2,Female,Cereal,it is very important,420.0,espresso shown,"frozen yogurt, pizza, fast food","stress, sadness",stress,Every day,Korean/Asian,...,In a relationship,owns business,Almost everyday,$10.01 to $20.00,veggie soup,No,none,very likely,yes,I'm not answering this.
3,Female,Cereal,it is moderately important,420.0,espresso shown,"Pizza, Mac and cheese, ice cream",Boredom,boredom,A couple of times a week,Mexican/Spanish,...,In a relationship,Special Education Teacher,Almost everyday,$5.01 to $10.00,veggie soup,No,,neutral,yes,"Not sure, 240"
4,Female,Cereal,it is not at all important,420.0,espresso shown,"Ice cream, chocolate, chips","Stress, boredom, cravings",stress,Every day,Mexican/Spanish,...,Single,Substance Abuse Conselor,Almost everyday,$20.01 to $30.00,veggie soup,Yes,Softball,likely,no,190


<h2>2. Handling Missing Data</h2>

Check if there are any missing values or NAN values in each column of data set

In [113]:
# handling missing values in 'comfort_food_reasons_coded' column
Food_Choices['comfort_food_reasons_coded'] = Food_Choices['comfort_food_reasons_coded'].astype('str')
comfort_reasons = Food_Choices['comfort_food_reasons']
comfort_reasons_coding = Food_Choices['comfort_food_reasons_coded']
length = len(comfort_reasons)

print(type(comfort_reasons_coding[i]))


for i in range(length):
    if str(comfort_reasons_coding[i]) == 'nan':
        if 'Stress'.lower() in comfort_reasons[i].lower():
            comfort_reasons_coding[i].replace('nan','stress')
        elif('sadness'.lower() in comfort_reasons[i].lower() or 'sad'.lower() in comfort_reasons[i].lower()):
            comfort_reasons_coding[i].replace('nan','depression/sadness')
        elif('boredom'.lower() in comfort_reasons[i].lower()):
            comfort_reasons_coding[i].replace('nan','boredom')
        elif('happiness'.lower() in comfort_reasons[i].lower()):
            comfort_reasons_coding[i].replace('nan','happiness')
        elif('hormones'.lower() in comfort_reasons[i].lower()):
            comfort_reasons_coding[i].replace('nan','hormones')
        elif('tired'.lower() in comfort_reasons[i].lower()):
            comfort_reasons_coding[i].replace('nan','Tired')
Food_Choices['comfort_food_reasons_coded'] = comfort_reasons_coding
Food_Choices['comfort_food_reasons_coded'].unique()

<class 'str'>


array(['none', 'stress', 'boredom', 'hunger', 'depression/sadness',
       'happiness', 'cold weather', 'laziness', 'watching tv', 'nan'],
      dtype=object)

In [114]:
# checking if there are still any missing values
Food_Choices['comfort_food_reasons_coded'].isna()

0      False
1      False
2      False
3      False
4      False
       ...  
120    False
121    False
122    False
123    False
124    False
Name: comfort_food_reasons_coded, Length: 125, dtype: bool

In [115]:
Food_Choices.head()

Unnamed: 0,Gender,Breakfast Choice,Importance_of_calories,calories_scone,coffee_preference,comfort_food,comfort_food_reasons,comfort_food_reasons_coded,frequency_of_cooking,cuisine,...,marital_status,mother_profession,parents_cook,pay_meal_out,soup_preference,sports,type_sports,veggies_day,vitamins_supplements,weight(in pounds)
0,Male,Cereal,,315.0,creamy frapuccino,none,we dont have comfort,none,A couple of times a week,,...,Single,unemployed,Almost everyday,$5.01 to $10.00,veggie soup,Yes,car racing,very likely,yes,187
1,Female,Cereal,it is moderately important,420.0,espresso shown,"chocolate, chips, ice cream","Stress, bored, anger",stress,"Whenever I can, but that is not very often",American,...,In a relationship,Nurse RN,Almost everyday,$20.01 to $30.00,veggie soup,Yes,Basketball,likely,no,155
2,Female,Cereal,it is very important,420.0,espresso shown,"frozen yogurt, pizza, fast food","stress, sadness",stress,Every day,Korean/Asian,...,In a relationship,owns business,Almost everyday,$10.01 to $20.00,veggie soup,No,none,very likely,yes,I'm not answering this.
3,Female,Cereal,it is moderately important,420.0,espresso shown,"Pizza, Mac and cheese, ice cream",Boredom,boredom,A couple of times a week,Mexican/Spanish,...,In a relationship,Special Education Teacher,Almost everyday,$5.01 to $10.00,veggie soup,No,,neutral,yes,"Not sure, 240"
4,Female,Cereal,it is not at all important,420.0,espresso shown,"Ice cream, chocolate, chips","Stress, boredom, cravings",stress,Every day,Mexican/Spanish,...,Single,Substance Abuse Conselor,Almost everyday,$20.01 to $30.00,veggie soup,Yes,Softball,likely,no,190
