![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)
 
<a href="https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Fartificial-intelligence-for-educators&branch=main&urlpath=notebooks/artificial-intelligence-for-educators/ai-for-educators.ipynb&depth=1" target="_parent"><img src="https://raw.githubusercontent.com/callysto/curriculum-notebooks/master/open-in-callysto-button.svg?sanitize=true" width="123" height="24" alt="Open in Callysto"></a>

<h1 align='center'>The Importance of Ethics in AI</h1>

### What is a Jupyter notebook?

A Jupyter notebook is an online document that can include both text and code in different “cells” or parts of the document. It allows code to be shared alongside text that explains and contextualizes the data, which makes it a popular resource for both data science and instruction.

These documents run on Callysto Hub as well as Google Colab, IBM Watson Studio, and other places.

This presentation is a Jupyter notebook!

### Callysto notebooks ready for you to use!

On our website, [callysto.ca](https://www.callysto.ca/), you will find lesson plans, courses and learning modules that support you in incorporating coding into your lessons. 

Some examples focusing on statistics:



| |
|-|
|<img src="./images/samplenotebooks.png" width="600">|

## Example: Comparing data on people's preferred season

#### Introduction to Python and simple datasets

Here we'll create a simple table and show how it can be visualized. You don't need to modify any of the code cells, but if you'd like to play around with the code please do!

To run a cell, do either of the following:

- Click the `Run` button up above
- Click within the cell and press `Shift+Enter`

In [None]:
# Creating data set
total_participants = 30
prefer_spring = 5
prefer_summer = 10
prefer_fall = 10
prefer_winter = 5
no_answer = total_participants - (prefer_spring + prefer_summer + prefer_fall + prefer_winter)
print('Data created')

In [None]:
import pandas as pd
answer = {"Season": ["Spring", "Summer", "Fall", "Winter", "No answer"],
           "Count": [prefer_spring, prefer_summer, prefer_fall, prefer_winter, no_answer]}

answer_table = pd.DataFrame(answer)
answer_table

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt
sns.barplot(x = answer_table["Season"], y =  answer_table['Count'])
plt.title("Comparison of season preference")
plt.ylabel("Count")
plt.xlabel("Season")
plt.show()

# Exercise: Using machine learning to predict sports enrolment
Nowadays, machine learning is really accessible and easy to implement. The knowledge and tools are freely available (like this notebook!), but there are far fewer resources on how to ensure it's being used in a fair and equitable way. We'll go through an example of how it can be used and potential problems you may face when applying machine learning as a solution.

Data has been collected from 150 male students (all 18 years old) in three groups: 
1. 50 students excelled in team sports (football, basketball, hockey)
1. 50 students excelled in individual sports (swimming, cycling, snowboarding)
1. 50 students did not excel in any sports.

The students were scored by the same coach in the same school. The following parameters were collected:

1. Teamwork skills (coach score)
1. Speed (coach score)
1. Strength (coach score)
1. Agility (coach score)

### Using this (hypothetical) training data, can we build a model that predicts which sports students will excel in?

#### Process

1. Get familiar with the data (table, summary statistics, plots)

1. Train the model

1. Evaluate model performance

1. Report findings

### Manage & Clean Data

In [None]:
# Load and visualize the data
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import datasets

# Machine learning
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Evaluate model
from sklearn.metrics import accuracy_score
from sklearn.metrics import ConfusionMatrixDisplay as cm
from sklearn.metrics import classification_report
print('Libraries imported')

In [None]:
# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
df = pd.DataFrame(X, columns = iris.feature_names)
df['y'] = y

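# Assign the result back: an in-place replace on a selected column
# may not update df in recent pandas versions
df['y'] = df['y'].map({0: "Individual sport",
                       1: "No sport",
                       2: "Team sport"})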

df.rename(columns={"sepal length (cm)":"Strength",
                   "sepal width (cm)": "Speed",
                   "petal length (cm)": "Teamwork",
                   "petal width (cm)": "Agility",
                   "y":"SelectedSport"}, 
                   inplace= True)
print('Dataset loaded')

## Explore Data

Randomly sample 10 of the students from the dataset:

In [None]:
df.sample(10)

Verify that all our features have 150 entries, and that there is no missing (i.e. null) data:

In [None]:
df.info()

List the unique values in the dataset for our target variable `SelectedSport`:

In [None]:
df['SelectedSport'].unique()

### Descriptive Statistics

In [None]:
df.describe()

Broken down by sport:

In [None]:
individual = df[df['SelectedSport']=='Individual sport']
no_sport = df[df['SelectedSport']=='No sport']
team_sport = df[df['SelectedSport']=='Team sport']

print('Individual sport')
display(individual.describe())
print('No sport')
display(no_sport.describe())
print('Team sport')
display(team_sport.describe())

### Generating visualization from summary stats

In [None]:
import ipywidgets as widgets
from IPython.display import display
dropdowna = widgets.Dropdown(
    options=['Strength', 'Speed', 'Teamwork','Agility'],
    value='Strength',
    description='Item:',
    disabled=False,
)
print('Dropdown created')

Change the feature in the dropdown box and re-run the cell to view the box plot for each variable:

In [None]:
# Box plots
display(dropdowna)
sns.catplot(x="SelectedSport", y=dropdowna.value, data=df,hue='SelectedSport',kind='box').set(title=f'Selected variable: {dropdowna.value}')
plt.show()

### Generating distribution visualization

In [None]:
dropdownb = widgets.Dropdown(
    options=['Strength', 'Speed', 'Teamwork','Agility'],
    value='Strength',
    description='Item:',
    disabled=False,
)
print('Dropdown created')

In [None]:
display(dropdownb)
print("Histogram for various measurements (per class): ",dropdownb.value)
sns.displot(df, x=dropdownb.value, col="SelectedSport", multiple="dodge");

### Insights

| Activity | Teamwork | Speed | Strength | Agility |
| - | - | - | - | - |
| Individual sport | Lowest | Highest | Lowest | Lowest |
| No sport | Medium | Lowest | Medium | Medium |
| Team sport | Highest | Medium | Highest | Highest |
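
The rankings in the table above can be checked with group means. The following is a minimal sketch, assuming the same scikit-learn dataset and the same column renaming used earlier in this notebook; the names `scores`, `sport_labels`, and `group_means` are introduced here for illustration only.

```python
import pandas as pd
from sklearn import datasets

# Rebuild the table with the same column order used earlier in the notebook
iris = datasets.load_iris()
scores = pd.DataFrame(iris.data, columns=["Strength", "Speed", "Teamwork", "Agility"])
sport_labels = {0: "Individual sport", 1: "No sport", 2: "Team sport"}
scores["SelectedSport"] = pd.Series(iris.target).map(sport_labels)

# Mean score per feature for each group
group_means = scores.groupby("SelectedSport").mean()
print(group_means.round(2))

# Rank the groups within each feature (1 = lowest mean, 3 = highest mean)
print(group_means.rank().astype(int))
```

Means are only one way to summarize each group; the box plots and histograms above show how much the groups overlap, which a single average hides.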


# Hands-on Machine Learning

After getting familiar with our data through descriptive statistics and plots, we can use machine learning to help us recommend a sport to future students.

#### Independent variables (or variables we use to predict):

1. Teamwork
2. Speed
3. Strength
4. Agility

#### Dependent variable (or the variable we want to predict):

Type of activity (the `SelectedSport` column)

#### Goal:

Use an algorithm to predict what sport *new* students would be best at, given their scores in the four categories.

## Machine learning approach

Randomly split dataset into **training** (80%) and **testing** (20%) data.

Training data is used to, well, train the model. Training sets the parameters in the model that allow it to make predictions.

After training, we evaluate the algorithm by using the model to predict the outcomes for the testing data (named `X_validation` and `Y_validation` in the code below) and comparing those predictions to the known values.


In [None]:
array = df.values
# All measurements
X = array[:,0:4]
# All classes
y = array[:,4]
# Split-out validation dataset
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y, 
                                                                test_size=0.20, 
                                                                random_state=1, 
                                                                shuffle=True)
print('First 10 sets of training data: ')
display(X_train[:10])
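
To see what an 80/20 split means in numbers, the sketch below repeats the split on the same underlying dataset and counts the rows in each part. It uses its own variable names (`X_all`, `X_tr`, `X_te`, and so on, all introduced here for illustration) so it does not overwrite anything defined above. The second split uses the `stratify` option of `train_test_split`, which is not used elsewhere in this notebook, to show one way of keeping the three groups equally represented in the test set.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

X_all, y_all = datasets.load_iris(return_X_y=True)

# Same settings as the notebook's split: 20% held out, fixed random seed
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.20, random_state=1, shuffle=True)
print("Training rows:", X_tr.shape[0], "| Testing rows:", X_te.shape[0])

# A purely random split does not guarantee equal group sizes in the test set
_, random_counts = np.unique(y_te, return_counts=True)
print("Test rows per group (random split):", random_counts)

# Optional: stratifying on the labels keeps the group proportions the same
X_tr_s, X_te_s, y_tr_s, y_te_s = train_test_split(
    X_all, y_all, test_size=0.20, random_state=1, stratify=y_all)
_, stratified_counts = np.unique(y_te_s, return_counts=True)
print("Test rows per group (stratified split):", stratified_counts)
```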

The problem we're attempting to solve falls into the ML category of **classification**, where our outcome of interest is membership in a group, or class. This is a common problem to solve, and there are many approaches we could take. The previous step of exploring our data can help guide the choice of ML technique, and there's rarely only one appropriate method.

We will use a [Support Vector Machine (SVM) classifier](https://www.kdnuggets.com/2016/07/support-vector-machines-simple-explanation.html) - a type of algorithm that separates classes with a decision boundary and, with a suitable kernel, can capture non-linear relationships:

<center><img src="https://miro.medium.com/max/1400/1*ZpkLQf2FNfzfH4HXeMw4MQ.png" width="800"></center>
<p> 
<center> <a href="https://towardsdatascience.com/support-vector-machine-introduction-to-machine-learning-algorithms-934a444fca47">Towards Data Science</a></center>
</p>
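
The shape of the boundary an SVM can draw depends on its *kernel*. As a rough sketch, the code below trains scikit-learn's `SVC` twice on the same split, once with a straight-line (`linear`) kernel and once with the default `rbf` kernel that allows curved boundaries, and compares their accuracy on the test data. The variable names (`X_all`, `kernel_scores`, and so on) are introduced here for illustration, and the comparison is only meant to show that the kernel is a choice you make, not to recommend one.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X_all, y_all = datasets.load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.20, random_state=1)

kernel_scores = {}
for kernel in ("linear", "rbf"):
    # gamma only affects the rbf kernel; the linear kernel ignores it
    clf = SVC(kernel=kernel, gamma="auto")
    clf.fit(X_tr, y_tr)
    # score() returns the fraction of test rows classified correctly
    kernel_scores[kernel] = clf.score(X_te, y_te)

print(kernel_scores)
```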



In [None]:
# Make predictions on validation dataset
model = SVC(gamma='auto')
model.fit(X_train, Y_train) # Fit the model
predictions = model.predict(X_validation) # Make predictions
print('Predictions made')

In [None]:
print(f'Accuracy score: {round(accuracy_score(Y_validation, predictions),5)}')

We can see that the accuracy is 0.96667 or about 97% on the test data. However, accuracy on its own is not sufficient to measure model performance. We would also like to know *where* the model gets the classification wrong.

In [None]:
print(classification_report(Y_validation, predictions))

The metrics are calculated by using true and false positives, and true and false negatives, generating what's known as a **Confusion Matrix**:

<center><img src="https://www.nbshare.io/static/snapshots/cm_colored_1-min.png" width="600"></center>
<p> 
<center> <a href="https://www.nbshare.io/notebook/626706996/Learn-And-Code-Confusion-Matrix-With-Python/">NBShare</a></center>
</p>




**Precision** is a measure of a model's ability to correctly discern positive cases. Mathematically, it is the number of correctly classified positive cases (True Positives) over the total number of positive classifications (True Positive + False Positive). Another way to look at precision is to consider that for all the cases labelled positive, how many were truly positive.

**Recall** is the ability of a classifier to find all positive instances. It can be calculated by dividing the number of True Positives by the sum of True Positives and False Negatives. Simply labelling every sample as positive would result in a recall of 100% (or 1) for that class, but would drastically reduce precision.

Precision and recall often behave like different sides of a see-saw; for the same model and dataset, pushing one up tends to pull the other down. **F1 Score** is the harmonic mean of the two, which balances them in a single number, and is often reported when evaluating a model's performance.

For all three measurements, a value of 1.0 is perfectly accurate, and 0.0 is perfectly useless. Support is the number of samples belonging to each class.
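
The three definitions above can be worked through by hand. The sketch below uses made-up counts (8 true positives, 2 false positives, 4 false negatives) that are not taken from our model; the function name `precision_recall_f1` is introduced here for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw counts.

    Assumes at least one true positive, so no division by zero occurs.
    """
    precision = tp / (tp + fp)   # of everything labelled positive, how much was right
    recall = tp / (tp + fn)      # of everything truly positive, how much was found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return precision, recall, f1

# Made-up example: 8 correct positives, 2 false alarms, 4 missed positives
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"Precision: {precision:.3f}")  # 8 / 10
print(f"Recall:    {recall:.3f}")     # 8 / 12
print(f"F1 score:  {f1:.3f}")         # 16 / 22
```

In a three-class problem like ours, `classification_report` computes these values once per class, treating that class as "positive" and the other two as "negative".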

In [None]:
classifier = model.fit(X_train, Y_train)
fig, ax = plt.subplots(figsize=(8, 8))
cm.from_estimator(classifier, X_validation, Y_validation,display_labels=['Individual sport', 'No sport', 'Team sport'],cmap=plt.cm.Blues,
                      normalize=None,ax=ax)
plt.title('Confusion Matrix for Selected Sport', fontsize=20)
plt.show()

The diagonal elements represent the number of points for which the predicted label is equal to the true label, while off-diagonal elements are those that are mislabeled by the classifier. As we can see here, there's only one mislabeled observation (out of 30).
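
To make the link between the diagonal and accuracy concrete, here is a small sketch with seven made-up labels (not our model's predictions). It builds the matrix with `confusion_matrix` from scikit-learn and recovers accuracy as the diagonal total divided by the overall total; the names `toy_true`, `toy_pred`, and `matrix` are introduced here for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

class_names = ["Individual sport", "No sport", "Team sport"]

# Made-up example: seven students, one of them predicted incorrectly
toy_true = ["Individual sport", "Individual sport", "No sport", "No sport",
            "No sport", "Team sport", "Team sport"]
toy_pred = ["Individual sport", "Individual sport", "No sport", "No sport",
            "Team sport", "Team sport", "Team sport"]

# Rows are the true classes, columns are the predicted classes
matrix = confusion_matrix(toy_true, toy_pred, labels=class_names)
print(matrix)

correct = np.trace(matrix)   # sum of the diagonal = correct predictions
total = matrix.sum()         # every prediction made
print(f"Accuracy: {correct} / {total} = {correct / total:.3f}")
```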

<h2 align='center'>Final Analysis</h2>

<h3 align='center'>Let's take a look at the predicted values</h3>

In [None]:
pred_df = pd.DataFrame(X_validation,columns=df.columns[0:4])

pred_df['Predicted Class'] = predictions

dropdownc = widgets.Dropdown(
    options=['Individual sport', 'No sport', 'Team sport'],
    value='Individual sport',
    description='Class:',
    disabled=False,
)
print('Run the cell below, select from the dropdown, and rerun the cell to change the displayed class')

In [None]:
## 11 individual, 12 no sport, 7 team sport
display(dropdownc)
pred_df[pred_df['Predicted Class']==dropdownc.value]

### Which one did it get wrong? 


In [None]:
import numpy as np
y_test = np.asarray(Y_validation)
misclassified = np.where(y_test != model.predict(X_validation))

print(misclassified)
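
The output above is a row position within the test data, which is hard to read on its own. The sketch below shows one way to turn such positions into a small table of actual versus predicted labels. It uses five made-up labels rather than our model's output, and the names `toy_actual`, `toy_predicted`, and `errors` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Made-up example: five students, the fourth one predicted incorrectly
toy_actual = np.array(["Team sport", "No sport", "Individual sport",
                       "No sport", "Team sport"])
toy_predicted = np.array(["Team sport", "No sport", "Individual sport",
                          "Team sport", "Team sport"])

# Positions where the prediction differs from the known value
wrong_positions = np.flatnonzero(toy_actual != toy_predicted)

errors = pd.DataFrame({
    "Row": wrong_positions,
    "Actual": toy_actual[wrong_positions],
    "Predicted": toy_predicted[wrong_positions],
})
print(errors)
```

In the notebook itself, the same idea would apply to `Y_validation` and `predictions` in place of the two made-up arrays.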

<h2 align='center'>Reporting</h2>

As the exercise shows, machine learning is relatively simple to implement. With only a few collected variables, we can estimate our outcome of interest (Selected Sport) with high accuracy. In our example, after we trained the model, only one of the 30 students whose data we ran through the model was misclassified. We can only call a prediction misclassified because the data we used to *test* the model already had known values (ground truths) for the sport.

In reality, once you've trained and validated your model and put it into use, it's much more difficult to know how accurate your results are. Even the most accurate machine learning models can only be as accurate as the data that was used to construct them, and if that data is messy (or **biased**), that can affect the conclusions that are drawn from them. There's a saying in computer science that's been around for decades: *'Garbage in, garbage out'*. This is the most important role of anyone creating data science or algorithmic content: ensuring that the data entering the model is as fair and equitable as possible.
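
One way to see how a model depends on the data it was built from is to test it on data collected slightly differently. The sketch below is a thought experiment, not part of the original exercise: it assumes a hypothetical second coach who scores every skill one point higher than the coach who produced the training data, and compares the model's accuracy on both sets of scores. The names `X_other_coach`, `same_coach_accuracy`, and `other_coach_accuracy` are introduced here for illustration.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X_all, y_all = datasets.load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.20, random_state=1)

# Train on scores from the original coach only
clf = SVC(gamma="auto").fit(X_tr, y_tr)

# Hypothetical: the same students, scored one point higher on every skill
X_other_coach = X_te + 1.0

same_coach_accuracy = clf.score(X_te, y_te)
other_coach_accuracy = clf.score(X_other_coach, y_te)
print(f"Accuracy, original coach's scores: {same_coach_accuracy:.3f}")
print(f"Accuracy, other coach's scores:    {other_coach_accuracy:.3f}")
```

The students' abilities are identical in both cases; only the way they were measured has changed. Any gap between the two numbers comes from the data collection, not from the students.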

## Discussion: Identifying bias

1. What issues can you identify in the initial problem statement?

1. What biases in the data can you identify?

1. What are the consequences of those biases when the algorithm is in action?

<h2 align='center'>Real examples</h2>

Amazon ditches AI recruiting tool that was only hiring men [(Reuters)](https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G)


Can Racist Algorithms Be Fixed? [(The Marshall Project)](https://www.themarshallproject.org/2019/07/01/can-racist-algorithms-be-fixed)


Black and Asian faces misidentified more often by facial recognition software [(CBC)](https://www.cbc.ca/news/technology/facial-recognition-race-1.5403899)


UK ditches exam results generated by biased algorithm after student protests [(The Verge)](https://www.theverge.com/2020/8/17/21372045/uk-a-level-results-algorithm-biased-coronavirus-covid-19-pandemic-university-applications)

<h1 align='center'> What can we do? </h1>


<h2 align='center'>Getting Started with Callysto</h2>

- Feedback form https://tinyurl.com/y2a3uhdt
- Online self-paced courses (courses.callysto.ca)  
- Preview our learning modules https://callysto.github.io/curriculum-jbook/intro.html
- Contact us for “in-class” workshops, teacher PD, virtual hackathons, and more

Email: contact@callysto.ca

On Twitter: @callysto_canada

Site: https://www.callysto.ca

YouTube: https://www.youtube.com/channel/UCPdq1SYKA42EZBvUlNQUAng

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)