# Module 1: Can Food Models Have Bias?
### _A rigorous evaluation of the impact of data on model performance_

Model performance can vary significantly with different groups or types of data, especially if a model has high bias. Models have bias in part because training data has bias, and training data has bias because people curate training data. A prerequisite to evaluating model bias is understanding how your model performs on different types of data. To achieve a comprehensive understanding of model performance, you can use established evaluation metrics. In this module, you will learn what those metrics are, how they work, and how to interpret them. 


<img src="model_bias.png" alt="drawing" width="450" height="450"/>

This module takes place over three in-class periods and one assignment. The schedule for the module is as follows:

## Module Schedule

### Day 1: Intro to the Blackbox

#### _Agenda:_

* Discussion: impacts of machine learning (15 min)
* Lecture: The life cycle of an ML model (15 min)
* In-class activity: Playing with input data (40 min)
* Reflection (15 min)
* Brief course overview and assignment intro (10 min)
* Fill out course entrance survey and ask remaining questions (5 min)

#### _Homework:_ 

* Read [this article](https://www.datacamp.com/blog/machine-learning-lifecycle-explained) on the ML model development cycle

### Day 2: Model Evaluation

#### _Agenda:_

* Discussion: Debriefing the homework reading (10 min)
* Lecture: Evaluation Metrics (20 min)
* In-class activity: Interpreting metrics practice (40 min)
* Reflection (5 min)
* Work time: Assignment 1 (25 min)

#### _Homework:_ 

* Finish through section x of Assignment 1 (in this notebook!)
* Read [this article](https://spotintelligence.com/2023/04/07/data-quality-machine-learning/) on evaluating data quality

### Day 3: Dataset Evaluation

#### _Agenda:_

* Discussion: How might you use evaluation metrics to assess bias? (15 min)
* Lecture: 
    - Dataset splitting (10 min)
    - In-dataset vs. out-of-dataset evaluation (10 min)
* In-class activity: Playing with datasets and dataset splitting (40 min)
* Reflection (5 min)
* Work time: Assignment 1 (20 min)

#### _Homework:_

* Finish Assignment 1

In [6]:
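import json

# Load the saved results from disk.
with open("results.json", "r") as file:
    data = json.load(file)

# Mapping between class names and class indices, as stored in the results file.
class_mappings = data["class_indices"]
# Each entry of "predictions_and_labels" is a (prediction, ground-truth label) pair.
predictions = [x[0] for x in data["predictions_and_labels"]]
gt_labels = [x[1] for x in data["predictions_and_labels"]]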
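
Before computing any metrics, it can help to check how many examples each class contributes, since an imbalanced set of labels changes how the metrics should be read. The following is a minimal sketch on toy data shaped like the two lists above. The values are made up, and `label_counts` is a hypothetical helper rather than part of the assignment code.

```python
from collections import Counter

# Toy stand-ins for the lists built above; the real values come from results.json.
toy_predictions = [0, 2, 1, 1, 0, 2, 2]
toy_gt_labels = [0, 1, 1, 1, 0, 2, 0]


def label_counts(labels):
    """Return a dict mapping each label to the number of times it occurs."""
    return dict(Counter(labels))


# Predictions and labels should line up one-to-one.
assert len(toy_predictions) == len(toy_gt_labels)

counts = label_counts(toy_gt_labels)
print(counts)
```

Running the same helper on `gt_labels` would show whether some foods are much rarer than others in the results.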

# Assignment 1
Welcome to the first assignment of Machine Learning! Pretend that you are the head of the R&D department at FoodNetwork, a new FoodTech start-up that aims to help members of the blind and visually impaired community gain more autonomy in dining situations (see https://www.jsr.org/hs/index.php/path/article/view/2341 for more information on this topic!). As one of your first products, you are working on a machine-learning-based approach to identifying foods in a given image. Three of your employees have each developed a classification model for this task (`model1`, `model2`, and `model3`), and you need to recommend whether any of the three is ready to be released to your customers.

## Metrics Overview
Before you can evaluate your employees' models, you need to implement metrics that quantitatively assess each model's validation results. The metrics you will implement in this assignment are:

* Confusion Matrix
* Classification Accuracy
* Precision
* Recall

_Consult [this article](https://arxiv.org/abs/2008.05756) to learn about the metrics and inform your implementation_

## Confusion Matrix

In [1]:
## Because of how graphical the confusion matrix is, we will implement most of it for you: 
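
The counting step behind a confusion matrix, before any plotting, could be sketched as below. This is a minimal illustration under stated assumptions: classes are encoded as integers `0` to `num_classes - 1`, rows index the ground-truth class, and columns index the predicted class. The function name and the toy data are hypothetical.

```python
def confusion_matrix(gt_labels, predictions, num_classes):
    """Tally a num_classes x num_classes matrix of counts.

    Assumed convention: rows index the ground-truth class and
    columns index the predicted class.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for truth, predicted in zip(gt_labels, predictions):
        matrix[truth][predicted] += 1
    return matrix


# Toy data (made up for illustration), with classes encoded as 0, 1, 2.
toy_gt_labels = [0, 0, 1, 1, 2, 2]
toy_predictions = [0, 1, 1, 1, 2, 0]

toy_matrix = confusion_matrix(toy_gt_labels, toy_predictions, num_classes=3)
for row in toy_matrix:
    print(row)
```

Entries on the diagonal count correct predictions. Every off-diagonal entry counts one specific kind of mistake, which is what makes the matrix useful for spotting classes that a model tends to confuse.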


## Classification Accuracy
First, read [this article](https://www.sharpsightlabs.com/blog/classification-accuracy-explained/) about classification accuracy for a brief intro. 

In other words, "classification accuracy is the ratio of the number of correct predictions to the total number of input samples" (Yalug et al., [source](https://www.sciencedirect.com/science/article/abs/pii/B9780128228289000058)).
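
To make the ratio concrete, here is a minimal sketch on toy data. The function name `classification_accuracy` and the toy lists are assumptions made for illustration.

```python
def classification_accuracy(gt_labels, predictions):
    """Fraction of predictions that match the ground-truth label."""
    if len(gt_labels) != len(predictions):
        raise ValueError("gt_labels and predictions must have the same length")
    if not gt_labels:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(
        1 for truth, predicted in zip(gt_labels, predictions) if truth == predicted
    )
    return correct / len(gt_labels)


# Toy data (made up for illustration): 4 of the 6 predictions are correct.
toy_gt_labels = [0, 0, 1, 1, 2, 2]
toy_predictions = [0, 1, 1, 1, 2, 0]

toy_accuracy = classification_accuracy(toy_gt_labels, toy_predictions)
print(toy_accuracy)
```

Accuracy is a single number for the whole dataset, so it can hide the fact that a model does well on common classes and poorly on rare ones.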

## Precision
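
As a reference point, the standard per-class definition of precision can be written as follows. The symbols $TP_k$ (true positives: samples predicted as class $k$ that really are class $k$) and $FP_k$ (false positives: samples predicted as class $k$ that belong to another class) are notation introduced here for illustration, not taken from the assignment code.

$$
\text{Precision}_k = \frac{TP_k}{TP_k + FP_k}
$$

For instance, if a model predicts a hypothetical class "pizza" for 10 images and 8 of them really show pizza, then $TP = 8$, $FP = 2$, and the precision for that class is $8 / (8 + 2) = 0.8$.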

## Recall
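
As a reference point, the standard per-class definition of recall can be written as follows. The symbols $TP_k$ (true positives: samples of class $k$ that the model predicts as class $k$) and $FN_k$ (false negatives: samples of class $k$ that the model predicts as another class) are notation introduced here for illustration, not taken from the assignment code.

$$
\text{Recall}_k = \frac{TP_k}{TP_k + FN_k}
$$

For instance, if 10 images really show a hypothetical class "pizza" and the model labels 6 of them as pizza, then $TP = 6$, $FN = 4$, and the recall for that class is $6 / (6 + 4) = 0.6$.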