# Grocery ingredients #

### 1. Import Necessary Libraries ###

The code begins by importing the required Python libraries:
- `pandas as pd`: Used for data manipulation and analysis, particularly for handling the CSV dataset.
- `numpy as np`: Provides numerical operations, such as handling arrays and mathematical computations.
- `Dash` components (`Dash`, `dcc`, `html`, `Input`, `Output`, `State`): These are part of the Dash framework for building the web application:
    - `Dash`: The core class to initialize the app.
    - `dcc` (Dash Core Components): Provides interactive components like dropdowns and graphs.
    - `html`: Allows the creation of HTML elements for the app’s layout.
    - `Input`, `Output`, `State`: Used to define interactivity through callbacks.
- `dash_ag_grid as dag`: Adds an interactive data grid to display the dataset in a tabular format.
- `plotly.graph_objs as go`: Enables the creation of interactive visualizations like radar charts, scatter plots, bar charts, and histograms.

### 2. Data Loading and Preprocessing ###

This section loads the dataset and prepares it for analysis.

**Loading the Data**
- The dataset is loaded from a file named `GroceryDB_foods.csv` using `pd.read_csv()`.
- Debugging information is printed:
    - `Initial dataset shape:` shows the number of rows and columns (here, `(50468, 19)`).
    - `Columns:` lists all column names for verification.

**Renaming Columns**
- Columns are renamed for consistency and clarity using `df.rename()`:
    - `price percal` → `price_per_cal`
    - `Total Fat` → `total_fat`
    - `Carbohydrate` → `carbohydrate`
    - `Sugars, total` → `total_sugars`
    - `Fiber, total dietary` → `total_fiber`
    - `Fatty acids, total saturated` → `total_saturated_fat`
    - `Total Vitamin A` → `total_vitamin_A`
- This gives the renamed columns snake_case-style names; columns not listed above, such as `Protein`, `Calcium`, and `Vitamin C`, keep their original names.
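
A rename map silently skips keys that do not match an existing column, so a typo in a source column name can go unnoticed. The sketch below uses a toy frame (not the real dataset) to show how the `errors="raise"` option of `DataFrame.rename` turns such a mismatch into a `KeyError`. The notebook itself does not use this option; it is shown only as a possible safeguard.

```python
import pandas as pd

# Toy frame with two of the original column names (illustrative data only).
toy = pd.DataFrame({"price percal": [0.04], "Total Fat": [3.0]})

rename_map = {
    "price percal": "price_per_cal",
    "Total Fat": "total_fat",
}

# errors="raise" makes pandas raise KeyError when a key in the mapping
# is not an existing column, instead of silently ignoring it.
renamed = toy.rename(columns=rename_map, errors="raise")
print(renamed.columns.tolist())

# A misspelled source name is now detected.
try:
    toy.rename(columns={"Total Fats": "total_fat"}, errors="raise")
    typo_detected = False
except KeyError:
    typo_detected = True
print(typo_detected)
```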

**Handling Missing Values**
- ***Vitamins and Minerals:***
    - A list `vitamins_minerals` defines columns like `Vitamin C`, `total_vitamin_A`, `Calcium`, and `Iron`.
    - Missing values in these columns are filled with `0` using `fillna(0)`.
    - If a column is missing, a warning is printed (e.g., "Warning: Column Vitamin C not found in the dataset.").
- ***Other Numeric Columns:***
    - For all numeric columns (detected with `df.select_dtypes(include=[np.number])`):
        - If negative values exist in critical columns (`price` or `package_weight`), they are replaced with the column’s median, and a message is printed (e.g., "Corrected negative values in column price to median value 2.5.").
        - For other numeric columns with negative values, a warning is printed without correction.
        - Missing values are filled with the column’s median using `fillna(df[col].median())`.
- ***Categorical Columns:***
    - Missing values in text columns (detected with `df.select_dtypes(include=['object'])`) are filled with 'Unknown'.
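
The three rules above can be traced on a small toy frame. This is a minimal sketch with made-up values, not the notebook's data. Note that, as in the notebook, the median used to replace a negative price is computed while the negative value is still in the column.

```python
import numpy as np
import pandas as pd

# Toy data: one missing brand, one negative and one missing price,
# and one missing mineral value.
toy = pd.DataFrame({
    "brand": ["A", None, "C"],
    "price": [2.0, -1.0, np.nan],
    "Calcium": [0.1, np.nan, 0.3],
})

# 1) Vitamins/minerals: a missing value is treated as "none present".
toy["Calcium"] = toy["Calcium"].fillna(0)

# 2) Critical numeric column: replace negatives with the median,
#    then fill the remaining gaps with the (updated) median.
median_price = toy["price"].median()  # median of [2.0, -1.0] is 0.5
toy.loc[toy["price"] < 0, "price"] = median_price
toy["price"] = toy["price"].fillna(toy["price"].median())

# 3) Text columns: label missing entries explicitly.
toy["brand"] = toy["brand"].fillna("Unknown")

print(toy)
```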

### 3. Feature Engineering ###

This section creates new features and defines nutrient categories for analysis.

**Price per Weight**
- A new column `price_per_weight` is calculated as `price / package_weight` (assuming `package_weight` is in grams).
- This feature represents the cost per unit of weight, so products sold in different package sizes can be compared directly.

**Nutrient Groups**
- Two lists categorize nutrients for later nutrient density calculations:
    - `beneficial`: Nutrients that are good in higher amounts:
        - `Protein`, `total_fiber`, `Vitamin C`, `total_vitamin_A`, `Calcium`, `Iron`
    - `negative`: Nutrients that are harmful in excess:
        - `total_fat`, `total_saturated_fat`, `total_sugars`, `Sodium`, `Cholesterol`
- These groups are used to evaluate the nutritional quality of foods.

### 4. Dash App Setup and Layout ###

This section initializes the Dash application and defines its user interface.

**App Initialization**
- A Dash app is created with `app = Dash(__name__)` and a server is assigned (`server = app.server`) for deployment purposes.

**Dropdown Preparation**
- ***Categories***: Unique values from `harmonized single category` are sorted into `categories` for a dropdown menu.
- ***Brands***: Unique values from `brand` are sorted into `brands` for another dropdown menu.

**Nutrient and Axis Options**
- `nutrient_options`: A list of dictionaries for nutrient selection in visualizations, e.g., `{'label': 'Protein', 'value': 'Protein'}` for nutrients like `Protein`, `total_fat`, etc.
- `axis_options`: Options for scatter plot axes, e.g., `{'label': 'Price', 'value': 'price'}` for `price`, `price_per_cal`, `nutrient_density`, and `price_per_weight`.
- `color_options`: Same as `axis_options`, used for color-coding in the scatter plot.

**Recommended Daily Values**
- A dictionary `recommended_values` provides sample daily intake values for nutrients (used in percentage calculations):
    - `Protein`: 50g
    - `total_fat`: 70g
    - `total_fiber`: 28g
    - etc.
- These are estimates and can be adjusted.

**App Layout**
- The layout is an `html.Div` with a dark theme (`'backgroundColor': '#1a1a1a', 'color': 'white'`), containing:
    - ***Title:*** `Enhanced Grocery Ingredients Dashboard`.
    - ***Category Dropdown:*** Multi-select with all categories as default.
    - ***Brand Dropdown:*** Multi-select, defaulting to `DANNON`.
    - ***Nutrient Dropdown:*** Multi-select for radar chart nutrients, defaulting to `Protein` and `total_fat`.
    - ***Scatter Plot Axes:*** Dropdowns for X-axis (default: `price`) and Y-axis (default: `nutrient_density`).
    - ***Scatter Color Dropdown:*** Default: `price_per_weight`.
    - ***Histogram Nutrient Dropdown:*** Single-select, default: `Protein`.
    - ***Weight Inputs:*** Numeric inputs for beneficial (default: 1) and negative (default: 1) nutrient weights.
    - ***Radar Metric Dropdown:*** Options: `zscore`, `absolute`, `percentage`, default: `zscore`.
    - ***Graphs:*** Four `dcc.Graph` components for radar, scatter, bar, and histogram charts.
    - ***Data Grid:*** A `dag.AgGrid` displaying the dataset with filtering, sorting, and pagination.

### 5. Callbacks for Interactive Updates ###

This section defines a callback function to update the dashboard based on user inputs.

**Callback Definition**
- ***Outputs:*** Updates the data grid (`rowData`) and figures for the radar, scatter, bar, and histogram charts.
- ***Inputs:*** Values from all dropdowns, weight inputs, and radar metric selection.

**Input Handling**
- Ensures `selected_categories` and `selected_brands` are lists (converts strings to single-item lists or uses empty lists if None).
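
The same normalization can be written once as a small helper. `as_list` is a hypothetical name that does not appear in the notebook; the sketch only illustrates the rule described above.

```python
def as_list(value):
    """Normalize a dropdown value to a list (hypothetical helper)."""
    if value is None:
        return []            # nothing selected
    if isinstance(value, str):
        return [value]       # a single selection passed as a plain string
    return list(value)       # already a list-like of selections


print(as_list("DANNON"))
print(as_list(None))
print(as_list(["baby-food", "spread-squeeze"]))
```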

**Data Filtering**
- Filters the dataset (`dff`) to include only rows where `harmonized single category` and `brand` match the selected values.
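
A minimal sketch of this filter on a toy frame (illustrative rows only): a row is kept only when both its category and its brand are in the selected lists, so an empty selection in either dropdown yields an empty result.

```python
import pandas as pd

toy = pd.DataFrame({
    "harmonized single category": ["baby-food", "baby-food", "spread-squeeze"],
    "brand": ["Stonyfield", "DANNON", "Sabra"],
})

selected_categories = ["baby-food"]
selected_brands = ["DANNON", "Sabra"]

# Both conditions must hold for a row to be kept.
mask = (
    toy["harmonized single category"].isin(selected_categories)
    & toy["brand"].isin(selected_brands)
)
dff = toy[mask].copy()
print(dff)

# An empty selection matches nothing.
no_brand = toy[toy["brand"].isin([])]
print(no_brand.empty)
```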

**Nutrient Density Calculation**
- ***Z-Scores:*** For each nutrient in `beneficial` and `negative` present in `dff`, z-scores are computed on the filtered subset using `(value - mean) / std`.
- ***Density:***
    - Beneficial z-scores are multiplied by `beneficial_weight` and summed.
    - Negative z-scores are multiplied by `negative_weight` and summed.
    - `nutrient_density` = beneficial sum - negative sum (or 0 if no nutrients are present).
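
The calculation can be checked by hand on a three-row toy frame. The sketch below assumes one beneficial and one negative nutrient with made-up values, and a negative weight of 2; the `zscore` helper is illustrative and falls back to zeros when the standard deviation is zero or undefined.

```python
import pandas as pd

toy = pd.DataFrame({
    "Protein": [0.0, 10.0, 20.0],
    "total_sugars": [30.0, 20.0, 10.0],
})
beneficial_cols = ["Protein"]
negative_cols = ["total_sugars"]
beneficial_weight, negative_weight = 1.0, 2.0


def zscore(s):
    """Z-score with the sample standard deviation (pandas default, ddof=1)."""
    std = s.std()
    if pd.isna(std) or std == 0:
        return pd.Series(0.0, index=s.index)
    return (s - s.mean()) / std


# Column-wise z-scores: Protein -> [-1, 0, 1], total_sugars -> [1, 0, -1]
z = toy[beneficial_cols + negative_cols].apply(zscore)

# Weighted beneficial sum minus weighted negative sum, per row.
density = (
    z[beneficial_cols].mul(beneficial_weight).sum(axis=1)
    - z[negative_cols].mul(negative_weight).sum(axis=1)
)
print(density.tolist())
```

With these inputs the high-protein, low-sugar row scores highest, and doubling the negative weight doubles the sugar penalty.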

**Radar Chart**
- Groups data by category and calculates mean values for selected nutrients.
- Applies the selected `radar_metric`:
    - `zscore`: Standardizes each nutrient's category means to z-scores across the selected categories.
    - `percentage`: Converts to percentage of `recommended_values`.
    - `absolute`: Uses raw values.
- Creates a `go.Scatterpolar` trace for each nutrient, with a dark-themed layout.
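
The three metrics can be sketched as one function applied to per-category means. `apply_radar_metric` is a hypothetical helper, the category means are made-up numbers, and only two entries of `recommended_values` are reproduced here.

```python
import pandas as pd

# Subset of the notebook's recommended daily values (grams).
recommended_values = {"Protein": 50, "total_fat": 70}

# Made-up per-category means standing in for the groupby result.
category_means = pd.DataFrame({
    "harmonized single category": ["baby-food", "spread-squeeze"],
    "Protein": [5.0, 25.0],
    "total_fat": [3.5, 49.0],
})


def apply_radar_metric(means, nutrients, metric):
    """Return a copy of `means` transformed by the chosen radar metric."""
    out = means.copy()
    for nutrient in nutrients:
        if metric == "zscore":
            std = out[nutrient].std()
            if pd.isna(std) or std == 0:
                out[nutrient] = 0.0
            else:
                out[nutrient] = (out[nutrient] - out[nutrient].mean()) / std
        elif metric == "percentage":
            # Unknown nutrients fall back to 100, as in the notebook.
            out[nutrient] = out[nutrient] / recommended_values.get(nutrient, 100) * 100
        # "absolute" leaves the raw means unchanged.
    return out


pct = apply_radar_metric(category_means, ["Protein", "total_fat"], "percentage")
z = apply_radar_metric(category_means, ["Protein", "total_fat"], "zscore")
print(pct)
print(z)
```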

**Scatter Plot**
- Plots `scatter_x` vs. `scatter_y` with `scatter_color` as the color variable, using a dark theme and tooltips showing product names.

**Bar Chart**
- Shows average `price_per_cal` by category, with a dark theme and salmon-colored bars.

**Histogram**
- Displays the distribution of the selected nutrient (`hist_nutrient`) with at most 20 bins (`nbinsx=20`), using a dark theme and sky-blue bars.

**Return Values**
- Returns the updated grid data (`dff.to_dict('records')`) and the figures for all four graphs.

### 6. Run the App ###

- The app runs in debug mode with `app.run(debug=True)` when executed as the main script.

### 1. Import necessary libraries ###

In [4]:
import pandas as pd
import numpy as np
from dash import Dash, dcc, html, Input, Output, State
import dash_ag_grid as dag
import plotly.graph_objs as go

### 2. Data Loading and Preprocessing ###

In [6]:
df = pd.read_csv('GroceryDB_foods.csv')
df

Unnamed: 0,name,store,harmonized single category,brand,price,price percal,package_weight,Protein,Total Fat,Carbohydrate,"Sugars, total","Fiber, total dietary",Calcium,Iron,Sodium,Vitamin C,Cholesterol,"Fatty acids, total saturated",Total Vitamin A
0,Stonyfield Organic Whole Milk Strawberry Beet ...,Target,baby-food,Stonyfield,5.29,0.043984,396.893000,5.050505,3.030303,12.121212,9.090909,0.000000,0.262626,0.000000,0.080808,0.000000,0.010101,2.020202,0.000018
1,Stonyfield Organic Whole Milk Pear Spinach Man...,Target,baby-food,Stonyfield,5.29,0.043984,396.893000,5.050505,3.030303,12.121212,9.090909,0.000000,0.262626,0.000000,0.080808,0.000000,0.010101,2.020202,0.000018
2,Once Upon a Farm Organic Mama Blueberry Fruit ...,Target,baby-food,Once Upon a Farm,2.79,0.055973,90.718400,1.098901,0.549451,13.186813,7.692308,2.197802,0.009890,0.000000,0.010989,,0.000000,0.000000,0.000101
3,Once Upon a Farm Organic Strawberry Kids&#39; ...,Target,baby-food,Once Upon a Farm,2.49,0.019213,90.718400,5.494505,7.692308,15.384615,8.791209,3.296703,0.020879,0.002198,0.000000,0.024176,0.000000,1.098901,
4,Horizon Organic Growing Years Strawberry Kids&...,Target,baby-food,DANNON,4.99,0.017781,396.893000,3.030303,1.010101,14.141414,6.060606,2.020202,0.131313,0.000000,0.050505,,0.005051,0.505051,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
50463,Fischer and Wieser Whole Lemon Fig Marmalade 1...,Walmart,spread-squeeze,Fischer & Wieser,3.97,,309.009550,,,,,,,,,,,,
50464,"Sabra Dark Chocolate Dessert Dip & Spread, 8 oz",Walmart,spread-squeeze,Sabra,,,226.796000,3.571429,16.071429,35.714286,21.428571,3.571429,0.000000,0.001286,0.142857,,0.000000,3.571429,
50465,"MaraNatha, No Stir Peanut Butter, 1.15 oz Packets",Walmart,spread-squeeze,MaraNatha,0.78,0.003828,32.601925,25.000000,53.125000,15.625000,3.125000,6.250000,0.000000,0.002250,0.203125,,0.000000,9.375000,
50466,Great Value No Stir Creamy Natural Peanut Butt...,Walmart,spread-squeeze,Great Value,3.76,0.000589,1133.980000,21.875000,46.875000,25.000000,3.125000,6.250000,0.081250,0.001125,0.375000,,0.000000,9.375000,


In [7]:
# Print basic info for debugging purposes
print("Initial dataset shape:", df.shape)
print("Columns:", df.columns.tolist())

Initial dataset shape: (50468, 19)
Columns: ['name', 'store', 'harmonized single category', 'brand', 'price', 'price percal', 'package_weight', 'Protein', 'Total Fat', 'Carbohydrate', 'Sugars, total', 'Fiber, total dietary', 'Calcium', 'Iron', 'Sodium', 'Vitamin C', 'Cholesterol', 'Fatty acids, total saturated', 'Total Vitamin A']


In [8]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50468 entries, 0 to 50467
Data columns (total 19 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   name                          50467 non-null  object 
 1   store                         50468 non-null  object 
 2   harmonized single category    50468 non-null  object 
 3   brand                         50078 non-null  object 
 4   price                         43684 non-null  float64
 5   price percal                  25921 non-null  float64
 6   package_weight                47529 non-null  float64
 7   Protein                       34515 non-null  float64
 8   Total Fat                     34645 non-null  float64
 9   Carbohydrate                  34633 non-null  float64
 10  Sugars, total                 32370 non-null  float64
 11  Fiber, total dietary          28945 non-null  float64
 12  Calcium                       28504 non-null  float64
 13  I

In [9]:
df.rename(columns={
    'price percal': 'price_per_cal',
    'Total Fat': 'total_fat',
    'Carbohydrate': 'carbohydrate',
    'Sugars, total': 'total_sugars',
    'Fiber, total dietary': 'total_fiber',
    'Fatty acids, total saturated': 'total_saturated_fat',
    'Total Vitamin A': 'total_vitamin_A'
}, inplace=True)

In [10]:
# Define vitamins/minerals that should have missing values replaced with 0
vitamins_minerals = ['Vitamin C', 'total_vitamin_A', 'Calcium', 'Iron']
for col in vitamins_minerals:
    if col in df.columns:
        df[col] = df[col].fillna(0)
    else:
        print(f"Warning: Column {col} not found in the dataset.")

In [11]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50468 entries, 0 to 50467
Data columns (total 19 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   name                        50467 non-null  object 
 1   store                       50468 non-null  object 
 2   harmonized single category  50468 non-null  object 
 3   brand                       50078 non-null  object 
 4   price                       43684 non-null  float64
 5   price_per_cal               25921 non-null  float64
 6   package_weight              47529 non-null  float64
 7   Protein                     34515 non-null  float64
 8   total_fat                   34645 non-null  float64
 9   carbohydrate                34633 non-null  float64
 10  total_sugars                32370 non-null  float64
 11  total_fiber                 28945 non-null  float64
 12  Calcium                     50468 non-null  float64
 13  Iron                        504

In [12]:
# Fill missing values for other numeric columns with the median
# Also validate critical columns (price and package_weight)
for col in df.select_dtypes(include=[np.number]).columns:
    if col not in vitamins_minerals:
        if (df[col] < 0).any():
            # If negative values are found in price or package_weight, replace them with the median
            if col in ['price', 'package_weight']:
                median_val = df[col].median()
                df.loc[df[col] < 0, col] = median_val
                print(f"Corrected negative values in column {col} to median value {median_val}.")
            else:
                print(f"Warning: Negative values detected in column {col}.")
        df[col] = df[col].fillna(df[col].median())

In [13]:
# Fill missing values for categorical columns with "Unknown"
for col in df.select_dtypes(include=['object']).columns:
    df[col] = df[col].fillna('Unknown')

### 3. Feature Engineering ###

In [15]:
# Create a new feature: price per weight (assuming package_weight is in grams)
df['price_per_weight'] = df['price'] / df['package_weight']
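
If any `package_weight` is zero, this division produces `inf`, which can distort a color scale or axis range. The notebook output does not show whether the dataset contains zero weights, so the following is a precautionary sketch on toy values rather than a fix for a known problem: zeros are treated as unknown so that the ratio becomes `NaN`.

```python
import numpy as np
import pandas as pd

# Toy values; the second row has a zero weight.
toy = pd.DataFrame({
    "price": [5.29, 2.79, 3.97],
    "package_weight": [396.893, 0.0, 309.00955],
})

# Treat a zero weight as unknown so the ratio is NaN rather than inf.
safe_weight = toy["package_weight"].replace(0, np.nan)
toy["price_per_weight"] = toy["price"] / safe_weight
print(toy)
```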

In [16]:
# Define nutrient groups for nutrient density calculation.
# Beneficial nutrients (the higher, the better):
beneficial = ['Protein', 'total_fiber', 'Vitamin C', 'total_vitamin_A', 'Calcium', 'Iron']
# Negative nutrients (excess amounts reduce nutritional quality):
negative = ['total_fat', 'total_saturated_fat', 'total_sugars', 'Sodium', 'Cholesterol']

### 4. Dash App Setup and Layout ###

In [18]:
app = Dash(__name__)
server = app.server

In [19]:
# Prepare lists for dropdown menus
categories = sorted(df['harmonized single category'].unique())
brands = sorted(df['brand'].unique())

In [20]:
# Nutrient options for visualizations (using the column names after renaming)
nutrient_options = [{'label': col, 'value': col} for col in 
                    ['Protein', 'total_fat', 'total_fiber', 'total_saturated_fat', 'total_sugars', 
                     'Vitamin C', 'total_vitamin_A', 'Calcium', 'Iron']]

In [21]:
# Axis options for scatter plot
axis_options = [
    {'label': 'Price', 'value': 'price'},
    {'label': 'Price per Calorie', 'value': 'price_per_cal'},
    {'label': 'Nutrient Density', 'value': 'nutrient_density'},
    {'label': 'Price per Weight', 'value': 'price_per_weight'}
]

In [22]:
# Color coding options for scatter plot (same as axis options)
color_options = axis_options

In [23]:
# Recommended daily values for some nutrients (for percentage calculation)
# These values are sample estimates and can be adjusted.
recommended_values = {
    'Protein': 50,               # grams
    'total_fat': 70,             # grams
    'total_fiber': 28,           # grams
    'total_saturated_fat': 20,   # grams
    'total_sugars': 90,          # grams
    'Vitamin C': 90,             # mg
    'total_vitamin_A': 900,      # mcg
    'Calcium': 1300,             # mg
    'Iron': 18                   # mg
}

In [24]:
# Layout of the app with dark theme and additional controls for radar chart metric selection
app.layout = html.Div(style={'backgroundColor': '#1a1a1a', 'color': 'white', 'padding': '10px'}, children=[
    html.H1('Enhanced Grocery Ingredients Dashboard', style={'textAlign': 'center'}),
    # Dropdown for category selection
    html.Div([
        html.Label('Select Category:', style={'margin-right': '10px'}),
        dcc.Dropdown(
            id='category-dropdown',
            options=[{'label': cat, 'value': cat} for cat in categories],
            value=categories,
            multi=True,
            style={'color': 'black', 'width': '300px'}
        )
    ], style={'display': 'inline-block', 'padding': '10px'}),
    # Dropdown for brand selection
    html.Div([
        html.Label('Select Brand:', style={'margin-right': '10px'}),
        dcc.Dropdown(
            id='brand-dropdown',
            options=[{'label': brand, 'value': brand} for brand in brands],
            value=['DANNON'],
            multi=True,
            style={'color': 'black', 'width': '300px'}
        )
    ], style={'display': 'inline-block', 'padding': '10px'}),
    # Dropdown for selecting nutrients for the radar chart
    html.Div([
        html.Label('Select Nutrients for Radar Chart:', style={'margin-right': '10px'}),
        dcc.Dropdown(
            id='nutrient-dropdown',
            options=nutrient_options,
            value=['Protein', 'total_fat'],
            multi=True,
            style={'color': 'black', 'width': '500px'}
        )
    ], style={'padding': '10px'}),
    # Dropdowns to select X and Y axes for the scatter plot
    html.Div([
        html.Label('Select X-axis for Scatter Plot:', style={'margin-right': '10px'}),
        dcc.Dropdown(
            id='scatter-x-dropdown',
            options=axis_options,
            value='price',
            style={'color': 'black', 'width': '250px'}
        ),
        html.Label('Select Y-axis for Scatter Plot:', style={'margin-left': '10px', 'margin-right': '10px'}),
        dcc.Dropdown(
            id='scatter-y-dropdown',
            options=axis_options,
            value='nutrient_density',
            style={'color': 'black', 'width': '250px'}
        )
    ], style={'padding': '10px'}),
    # Dropdown for selecting the color variable for the scatter plot
    html.Div([
        html.Label('Select Color Variable for Scatter Plot:', style={'margin-right': '10px'}),
        dcc.Dropdown(
            id='scatter-color-dropdown',
            options=color_options,
            value='price_per_weight',
            style={'color': 'black', 'width': '300px'}
        )
    ], style={'padding': '10px'}),
    # Dropdown to select the nutrient for the histogram
    html.Div([
        html.Label('Select Nutrient for Histogram:', style={'margin-right': '10px'}),
        dcc.Dropdown(
            id='histogram-nutrient-dropdown',
            options=nutrient_options,
            value='Protein',
            multi=False,
            style={'color': 'black', 'width': '300px'}
        )
    ], style={'padding': '10px'}),
    # Input fields for setting custom weight factors for nutrient density calculation
    html.Div([
        html.Label('Set Beneficial Weight Factor (default=1):', style={'margin-right': '10px'}),
        dcc.Input(
            id='beneficial-weight-input',
            type='number',
            value=1,
            min=0,
            step=0.1,
            style={'color': 'black', 'width': '100px'}
        ),
        html.Label('Set Negative Weight Factor (default=1):', style={'margin-left': '20px', 'margin-right': '10px'}),
        dcc.Input(
            id='negative-weight-input',
            type='number',
            value=1,
            min=0,
            step=0.1,
            style={'color': 'black', 'width': '100px'}
        ),
    ], style={'padding': '10px'}),
    # Dropdown to choose the radar chart normalization method
    html.Div([
        html.Label('Select Radar Chart Metric:', style={'margin-right': '10px'}),
        dcc.Dropdown(
            id='radar-metric-dropdown',
            options=[
                {'label': 'Z-Score', 'value': 'zscore'},
                {'label': 'Absolute', 'value': 'absolute'},
                {'label': 'Percentage of Daily Value', 'value': 'percentage'}
            ],
            value='zscore',
            multi=False,
            style={'color': 'black', 'width': '300px'}
        )
    ], style={'padding': '10px'}),
    # Graphs: Radar Chart, Scatter Plot, Bar Chart, and Histogram
    html.Div([
        dcc.Graph(id='radar-chart'),
    ], style={'padding': '20px'}),
    html.Div([
        dcc.Graph(id='scatter-plot'),
    ], style={'padding': '20px'}),
    html.Div([
        dcc.Graph(id='bar-chart'),
    ], style={'padding': '20px'}),
    html.Div([
        dcc.Graph(id='histogram-chart'),
    ], style={'padding': '20px'}),
    # Data Grid: Display raw data
    html.H2('Data Table', style={'textAlign': 'center'}),
    dag.AgGrid(
        id='data-grid',
        rowData=df.to_dict('records'),
        columnDefs=[{"field": i, 'filter': True, 'sortable': True} for i in df.columns],
        dashGridOptions={'pagination': True},
        columnSize='sizeToFit'
    ),
])

### 5. Callbacks for Interactive Updates ###

In [26]:
@app.callback(
    [Output('data-grid', 'rowData'),
     Output('radar-chart', 'figure'),
     Output('scatter-plot', 'figure'),
     Output('bar-chart', 'figure'),
     Output('histogram-chart', 'figure')],
    [Input('category-dropdown', 'value'),
     Input('brand-dropdown', 'value'),
     Input('nutrient-dropdown', 'value'),
     Input('scatter-x-dropdown', 'value'),
     Input('scatter-y-dropdown', 'value'),
     Input('scatter-color-dropdown', 'value'),
     Input('histogram-nutrient-dropdown', 'value'),
     Input('beneficial-weight-input', 'value'),
     Input('negative-weight-input', 'value'),
     Input('radar-metric-dropdown', 'value')]
)
def update_dashboard(selected_categories, selected_brands, selected_nutrients, scatter_x, scatter_y, scatter_color, hist_nutrient, beneficial_weight, negative_weight, radar_metric):
    # Ensure multi-select inputs are lists, default to empty list if None
    if isinstance(selected_categories, str):
        selected_categories = [selected_categories]
    elif not selected_categories:
        selected_categories = []
    if isinstance(selected_brands, str):
        selected_brands = [selected_brands]
    elif not selected_brands:
        selected_brands = []
    if isinstance(selected_nutrients, str):
        selected_nutrients = [selected_nutrients]
    elif not selected_nutrients:
        selected_nutrients = []
    
    # Filter data based on selected categories and brands.
    dff = df[(df['harmonized single category'].isin(selected_categories)) & (df['brand'].isin(selected_brands))].copy()

    # Recalculate nutrient_density using user-defined weights on the filtered data
    # For each nutrient in beneficial/negative lists that exists in the data,
    # compute z-scores on the filtered subset
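    def compute_subset_z(s):
        std = s.std()
        if pd.isna(std) or std == 0:
            # No spread to normalize by: return zeros aligned with the subset's index
            return pd.Series(0.0, index=s.index)
        return (s - s.mean()) / std

    # Create temporary DataFrame for z-scores, sharing the subset's index
    z_sub = pd.DataFrame(index=dff.index)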
    for nutrient in beneficial + negative:
        if nutrient in dff.columns:
            z_sub[nutrient] = compute_subset_z(dff[nutrient])

    # Keep only the nutrients that exist in the subset for each group
    beneficial_cols = [nutrient for nutrient in beneficial if nutrient in z_sub.columns]
    negative_cols = [nutrient for nutrient in negative if nutrient in z_sub.columns]

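    # Compute nutrient_density as sum(weighted beneficial z-scores) minus sum(weighted negative z-scores)
    # A cleared numeric input arrives as None, so fall back to the default weight of 1
    beneficial_weight = 1 if beneficial_weight is None else beneficial_weight
    negative_weight = 1 if negative_weight is None else negative_weight
    if not z_sub.empty:
        density_beneficial = z_sub[beneficial_cols].mul(beneficial_weight).sum(axis=1)
        density_negative = z_sub[negative_cols].mul(negative_weight).sum(axis=1)
        dff['nutrient_density'] = density_beneficial - density_negative
    else:
        dff['nutrient_density'] = 0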

    # Radar Chart: Compute metric based on user selection
    # Group data by category and calculate mean values for the selected nutrients
    df_radar = dff.groupby('harmonized single category')[selected_nutrients].mean().reset_index()

    # Depending on radar_metric choice, process the data:
    # - "zscore": normalize each nutrient via z-score (for the subset of data)
    # - "absolute": use raw absolute values
    # - "percentage": calculate percentage of recommended daily value
    if radar_metric == 'zscore':
        for nutrient in selected_nutrients:
            std = df_radar[nutrient].std()  # NaN when only one category is present
            centered = df_radar[nutrient] - df_radar[nutrient].mean()
            df_radar[nutrient] = centered / std if pd.notna(std) and std != 0 else 0.0
    elif radar_metric == 'percentage':
        for nutrient in selected_nutrients:
            rec_value = recommended_values.get(nutrient, 100)
            df_radar[nutrient] = (df_radar[nutrient] / rec_value) * 100
    radar_fig = go.Figure()
    # Add a trace for each selected nutrient
    for nutrient in selected_nutrients:
        radar_fig.add_trace(go.Scatterpolar(
            r=df_radar[nutrient],
            theta=df_radar['harmonized single category'],
            fill='toself',
            name=nutrient
        ))
    # Set layout for the radar chart with dark theme
    radar_fig.update_layout(
        title=f"Average Nutrients by Category ({'Z-Score' if radar_metric=='zscore' else ('Percentage' if radar_metric=='percentage' else 'Absolute')})",
        polar=dict(
            radialaxis=dict(
                visible=True,
                color='white'
            )
        ),
        template='plotly_dark',
        font=dict(color='white')
    )
    # Scatter Plot: Dynamic axes and color coding based on user selections
    scatter_fig = go.Figure(data=go.Scatter(
        x=dff[scatter_x],
        y=dff[scatter_y],
        mode='markers',
        marker=dict(
            size=10,
            color=dff[scatter_color],
            colorscale='Viridis',
            showscale=True
        ),
        text=dff['name']
    ))
    scatter_fig.update_layout(
        title=f"Scatter Plot: {scatter_x} vs {scatter_y} (Color: {scatter_color})",
        xaxis_title=scatter_x,
        yaxis_title=scatter_y,
        template='plotly_dark',
        font=dict(color='white')
    )
    # Bar Chart: Compare average price per calorie by category
    df_bar = dff.groupby('harmonized single category')['price_per_cal'].mean().reset_index()
    bar_fig = go.Figure(data=go.Bar(
        x=df_bar['harmonized single category'],
        y=df_bar['price_per_cal'],
        marker_color='lightsalmon'
    ))
    bar_fig.update_layout(
        title='Average Price per Calorie by Category',
        xaxis_title='Category',
        yaxis_title='Price per Calorie',
        template='plotly_dark',
        font=dict(color='white')
    )
    # Histogram: Distribution of selected nutrient
    hist_fig = go.Figure(data=go.Histogram(
        x=dff[hist_nutrient],
        nbinsx=20,
        marker_color='lightskyblue'
    ))
    hist_fig.update_layout(
        title=f"Distribution of {hist_nutrient}",
        xaxis_title=hist_nutrient,
        yaxis_title='Count',
        template='plotly_dark',
        font=dict(color='white')
    )

    return dff.to_dict("records"), radar_fig, scatter_fig, bar_fig, hist_fig

### 6. Run the App ###

In [28]:
if __name__ == "__main__":
    app.run(debug=True)