**Food Prices and their Influence on Eating Behavior**

Food prices are a prominent and straightforward influence on eating behavior. When certain foods are cheaper, consumers are more likely to buy them than more expensive alternatives.
The dataset used in this research, 'Food Prices in Turkey', can help investigate the extent of this influence. Unlike our second dataset, its data does not come from the United States of America, but it can still be preprocessed to fit our research as closely as possible.

In [1]:
import pandas as pd
import plotly.graph_objs as go
import plotly.express as px
import plotly.subplots as sp
from plotly.subplots import make_subplots
import numpy as np
import string

In [2]:
# Load the raw price data and keep only the 2015 records.
db_raw = pd.read_csv("train.csv")
db = db_raw[db_raw["Year"] == 2015].copy()

The preprocessed dataset now contains only data from 2015, just like our main dataset. Assigning each food in the dataset to its food category yields the following list of categories:

- Grains & Carbs (Rice, Wheat flour, Pasta, Bulgur, Bread (common), Bread (pita), Potatoes) --> carbohydrates

- Legumes (Beans (white), Lentils, Chickpeas, Peas (green, dry)) --> carbohydrates, proteins

- Fruits & Vegetables (Apples (red), Bananas, Oranges, Tomatoes, Garlic, Onions, Cabbage, Cauliflower, Cucumbers (greenhouse), Spinach, Eggplants) --> vitamins, fiber

- Meats (Meat (chicken), Meat (mutton), Meat (veal), Fish (fresh)) --> proteins

- Dairy (Milk (pasteurized), Yogurt, Cheese) --> fats, proteins

- Sugar --> carbohydrates
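The category assignment above can be sketched in pandas with a lookup dictionary and `Series.map`. This is a minimal illustration, not the notebook's actual preprocessing: the column names `ProductName` and `Price` are assumptions (the real columns of `train.csv` may differ), the mapping is abbreviated to a few products from the list, and the prices in the sample frame are made up for the example.

```python
import pandas as pd

# Hypothetical column names; the real columns of train.csv may differ.
PRODUCT_COL = "ProductName"
PRICE_COL = "Price"

# Product-to-category lookup, abbreviated to a few entries from the list above.
CATEGORY_OF = {
    "Lentils": "Legumes",
    "Chickpeas": "Legumes",
    "Meat (chicken)": "Meats",
    "Meat (veal)": "Meats",
    "Cheese": "Dairy",
    "Sugar": "Sugar",
}


def add_category(df, product_col=PRODUCT_COL):
    """Return a copy of df with a 'Category' column.

    Products missing from CATEGORY_OF get NaN, so they are easy to spot.
    """
    out = df.copy()
    out["Category"] = out[product_col].map(CATEGORY_OF)
    return out


# Small stand-in for the filtered 2015 data (illustrative values only).
sample = pd.DataFrame({
    PRODUCT_COL: ["Lentils", "Chickpeas", "Meat (chicken)", "Meat (veal)", "Cheese", "Tea"],
    PRICE_COL: [5.0, 6.0, 7.0, 33.0, 15.0, 30.0],
})

labelled = add_category(sample)

# groupby drops rows whose category is NaN, so unmapped products are excluded.
category_means = labelled.groupby("Category")[PRICE_COL].mean()
print(category_means)
```

Products that are not in the lookup (here the made-up "Tea" row) end up without a category, which makes it straightforward to check that every food in the dataset has been assigned before averaging.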

The dataset evidently covers only whole (unprocessed) foods, which should be kept in mind when interpreting our data. The data is represented in the following diagram:

In [3]:
data_barchart = {
    "Categories": ["Grains & Carbs", "Grains & Carbs", "Grains & Carbs", "Grains & Carbs", "Grains & Carbs", "Grains & Carbs", "Grains & Carbs",
                   "Legumes", "Legumes", "Legumes", "Legumes",
                   "Fruits & Vegetables", "Fruits & Vegetables", "Fruits & Vegetables", "Fruits & Vegetables", "Fruits & Vegetables", "Fruits & Vegetables", "Fruits & Vegetables", "Fruits & Vegetables", "Fruits & Vegetables", "Fruits & Vegetables", "Fruits & Vegetables",
                   "Meats", "Meats", "Meats", "Meats",
                   "Dairy", "Dairy", "Dairy", "Sugar"],
    "Subcategories": ["Rice", "Wheat flour", "Pasta", "Bulgur", "Bread (common)", "Bread (pita)", "Potatoes",
                      "Beans (white)", "Lentils", "Chickpeas", "Peas (green, dry)",
                      "Apples (red)", "Bananas", "Oranges", "Tomatoes", "Garlic", "Onions", "Cabbage", "Cauliflower", "Cucumbers (greenhouse)", "Spinach", "Eggplants",
                      "Meat (chicken)", "Meat (mutton)", "Meat (veal)", "Fish (fresh)",
                      "Milk (pasteurized)", "Yogurt", "Cheese", "Sugar"],
    "Prices": [6.43, 2.67, 2.94, 2.68, 2.98, 2.31, 1.54,
               7.29, 5.79, 5.67, 3.17,
               2.4, 5.37, 1.71, 1.93, 10.94, 1.38, 1.2, 2.77, 1.98, 2.66, 1.92,
               7.02, 27.0, 33.12, 14.09,
               2.82, 3.98, 15.42, 3.85]
}

data_piechart = {
    "Categories": ["Grains & Carbs", "Legumes", "Fruits & Vegetables", "Meats & Other", "Dairy", "Sugar"],
    "Total averages": [3.24, 6.01, 3.45, 16.32, 6.53, 3.85],
    "Ideal Percentages": [20, 20, 50, 15, 15, 5],
    "Colors": ["yellow", "brown", "green", "red", "lightblue", "orange"]
}

df_bar = pd.DataFrame(data_barchart)
df_pie = pd.DataFrame(data_piechart)

color_map1 = {
    "Grains & Carbs": "rgb(230, 230, 0)",
    "Legumes": "rgb(140, 35, 35)",
    "Fruits & Vegetables": "rgb(0, 100, 0)",
    "Meats": "rgb(200, 0, 0)",
    "Dairy": "rgb(150, 190, 200)",
    "Sugar": "rgb(200, 130, 0)"
}

color_map2 = {
    "Grains & Carbs": "rgb(255, 255, 0)",
    "Legumes": "rgb(165, 42, 42)",
    "Fruits & Vegetables": "rgb(0, 128, 0)",
    "Meats": "rgb(255, 0, 0)",
    "Dairy": "rgb(173, 216, 230)",
    "Sugar": "rgb(255, 165, 0)"
}

df_mean = df_bar.groupby("Categories")["Prices"].mean().reset_index()

scatter_fig = go.Figure()

for category, color in color_map1.items():
    category_data = df_bar[df_bar["Categories"] == category]
    scatter_fig.add_trace(
        go.Scatter(
            x = category_data["Categories"],
            y = category_data["Prices"],
            mode = "markers",
            marker = dict(
                color = color,
                symbol = "circle",
                size = 8,
                opacity = 0.7
            ),
            name = category,
            hovertemplate = "%{text}<extra></extra>",
            text = category_data["Subcategories"]
        )
    )

bar_fig = go.Figure()

for category, color in color_map2.items():
    category_mean = df_mean[df_mean["Categories"] == category]
    bar_fig.add_trace(
        go.Bar(
            x = category_mean["Categories"],
            y = category_mean["Prices"],
            marker = dict(
                color = color
            ),
            name = f"{category} (Mean)",
        )
    )

combined_fig = go.Figure(data = scatter_fig.data + bar_fig.data)

combined_fig.update_layout(
    title = "Average Prices of Different Foods in Turkey, 2015",
    barmode = "overlay",
    xaxis = {"title": "Food Categories"},
    yaxis = {"title": "Average Price in 2015 ($ per kg)"},
    height = 400,
    width = 750,
    showlegend = False
)

combined_fig.show()

pie_fig = go.Figure()

pie_fig.add_trace(
    go.Pie(labels = df_pie["Categories"], values = df_pie["Ideal Percentages"], marker = dict(colors = df_pie["Colors"]))
)

pie_fig.update_layout(
    title = "Recommended Nutrition per Food Category",
    height = 400,
    width = 750
)

pie_fig.show()

The most notable result in this visualization is the high cost of meat, which is a main source of protein for many people. Other sources of protein and healthy fats, such as dairy and legumes, are also more expensive than most sources of carbohydrates. Carbohydrate-rich foods like pasta, potatoes and sugar are often eaten in excess as part of an unhealthy diet. The recommended nutrition per food category shows that carbohydrates, and especially sugars, should be eaten in moderation. Prices of vegetables and fruits are fortunately relatively low, but as mentioned before, this data covers whole foods only. Particularly in the USA, processed foods tend to be more expensive when they carry a healthier 'image', whereas junk food prices remain the same.
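To put numbers behind this comparison, the sketch below computes the mean price per category from the values entered for the bar chart and places it next to the recommended share from the pie chart. It is a rough illustration under two assumptions: the recommended values are rescaled to sum to 100 (as entered they sum to 125, and a pie chart shows each slice as a fraction of the total), and the pie chart's "Meats & Other" slice is matched to the "Meats" category. The helper name `summarize` is introduced here for the example only.

```python
from statistics import mean

# Prices per category, copied from the bar-chart data above (2015 values).
prices = {
    "Grains & Carbs": [6.43, 2.67, 2.94, 2.68, 2.98, 2.31, 1.54],
    "Legumes": [7.29, 5.79, 5.67, 3.17],
    "Fruits & Vegetables": [2.4, 5.37, 1.71, 1.93, 10.94, 1.38,
                            1.2, 2.77, 1.98, 2.66, 1.92],
    "Meats": [7.02, 27.0, 33.12, 14.09],
    "Dairy": [2.82, 3.98, 15.42],
    "Sugar": [3.85],
}

# Recommended values as passed to the pie chart; they sum to 125, not 100.
# "Meats" corresponds to the "Meats & Other" slice (an assumption).
recommended = {
    "Grains & Carbs": 20,
    "Legumes": 20,
    "Fruits & Vegetables": 50,
    "Meats": 15,
    "Dairy": 15,
    "Sugar": 5,
}


def summarize(prices, recommended):
    """Return one row per category, sorted from most to least expensive."""
    total = sum(recommended.values())
    rows = [
        {
            "category": category,
            "mean_price": mean(values),
            # Rescale so that the shares add up to 100 %.
            "share": 100 * recommended[category] / total,
        }
        for category, values in prices.items()
    ]
    return sorted(rows, key=lambda row: row["mean_price"], reverse=True)


rows = summarize(prices, recommended)
for row in rows:
    print(f"{row['category']:<20} {row['mean_price']:6.2f}  {row['share']:5.1f} %")
```

Sorting by mean price makes the ordering discussed above explicit, and the rescaled shares are the percentages a pie chart would display for these inputs.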