## To use this slideshow:
- Run All, using the menu item: Kernel/Restart & Run All
- Return to this top cell
- Click on the "Slideshow" menu item above, which looks like this:
![](images/SlideIcon.png)

![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)


<h1 align='center'>Artificial Intelligence for Educators</h1>

<h2 align='center'>with Laura G Funderburk</h2>

<h4 align='center'>On Twitter: @LGFunderburk </h4>

![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)

<h2 align='center'> About the Callysto Program</h2>

- Provide open educational infrastructure and learning resources
- Focus on making computational thinking and data science/literacy available
- Use Jupyter notebooks as main platform
- Host teacher and student training workshops

<h3 align='center'>Brought to you by</h3>


| | | |
|-|-|-|
|<img src="./images/Cybera_Logo_RBG_Colour.png" alt="Drawing" style="width: 300px;"/>|<img src="./images/PIMS_Logos_Web_PIMS_Logo_Colour.png" alt="Drawing" style="width: 400px;"/>| <img src="./images/With_Funding_Canada_Wordmark-colour_BIL-EN.png" alt="Drawing" style="width: 400px;"/>|

<h2 align='center'>What is a Jupyter notebook?</h2>

A Jupyter notebook is an online document that can include both text and (Python) code in different “cells” or parts of the document.

These documents run on Callysto Hub as well as Google Colab, IBM Watson Studio, and other places.


This presentation is a Jupyter notebook!

<h2 align='center'>What is Callysto?</h2>

Callysto is a free, online learning tool that helps students and teachers learn and apply in-demand data science skills including data analysis, visualization, coding, and computational thinking. The online tool’s interactive learning modules are available in a variety of subjects – from math to history – and are aligned with existing curriculum.

Callysto’s learning modules are built using Jupyter notebooks.

<h2 align='center'> Callysto notebooks ready for you to use</h2>

On our website, callysto.ca, you will find lesson plans, courses and learning modules that help you incorporate coding into your stats lessons.

#### Objective: see how we can use Callysto to explore machine learning in the classroom


| |
|-|
|<img src="./images/samplenotebooks.png" width="600">|



<h2 align='center'>What is Data?</h2>

Data is a collection of information, usually obtained (or collected) to address a specific issue.

Examples of data:

- Daily number of COVID-19 cases in Canada.
- The grades of your class. 
- Census data.

<center><img src="https://img2.pngio.com/download-free-png-19-data-graph-icon-packs-vector-icon-packs-data-graph-png-600_564.png" width="400"></center>

<h2 align='center'>What is Data Science?</h2>

Data science involves <b>obtaining</b> and <b>communicating</b> information from (usually large) sets of observations.


| |
|-|
| <img src="./images/what-is-data-science-workflow.jpg" alt="Drawing" style="width:800px;"/> |



<h2 align='center'>What is Artificial Intelligence?</h2>

Artificial Intelligence (AI) is a blanket term describing all efforts to make computers “think”.

Machine Learning (ML) algorithms are programs that improve, or “learn”, through exposure to data/experience.

ML is based on the idea that machines should be able to learn and adapt through experience. AI refers to a broader idea where machines can execute tasks "smartly."

AI applies ML and other techniques to solve actual problems.



<h2 align='center'>Let's work through an example</h2>


Let's take a look at a problem involving biology. Can we "train" a machine to predict what type of Iris flower we are studying, based on its petal and sepal length and/or width?

![Iris flower](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Machine+Learning+R/iris-machinelearning.png)

<h3 align='center'>This data set has been collected and cleaned</h3>

<h3 align='center'>Our focus: exploratory analysis</h3>

In [None]:
# load and visualize the data
from sklearn import datasets
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# machine learning
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
# evaluate the model
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn.metrics import ConfusionMatrixDisplay
try:
    # plot_confusion_matrix was removed in scikit-learn 1.2
    from sklearn.metrics import plot_confusion_matrix
except ImportError:
    plot_confusion_matrix = None

In [None]:
# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
df = pd.DataFrame(X, columns = iris.feature_names)
df['y'] = y

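# Assign the result back instead of using inplace=True on a column,
# which newer pandas versions warn about or ignore
df['y'] = df['y'].replace({0: "Iris-setosa",
                           1: "Iris-versicolor",
                           2: "Iris-virginica"})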

df.head()

### Getting summary stats for all Iris flowers

In [None]:
df.describe()

### Getting summary stats for specific types of Iris flower

In [None]:
setosa = df[df['y']=='Iris-setosa']
versicolor = df[df['y']=='Iris-versicolor']
virginica = df[df['y']=='Iris-virginica']

virginica.describe()

### Generating visualization from summary stats

In [None]:
# box and whisker plots
sns.boxplot(x="y", y='sepal length (cm)', data=df,hue='y');
sns.swarmplot(x="y", y='sepal length (cm)', data=df);

### Generating distribution visualization

In [None]:
# histograms
df.hist(figsize=(10,6))
plt.show()

### More visualizations

In [None]:
sns.pairplot(df,hue='y');

## Can we predict the class of an Iris flower based on these four measurements?

Yes! 

### Machine learning technique

We split the data set into training and testing data.

We give the algorithm a random subset of the data points (the training data) so it can "learn".

After it trains, we test how well the model does by asking it to classify the testing data, which it has not seen before.
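
As a minimal sketch of this split, the following code shows how many flowers land in each set. It uses the same `test_size` and `random_state` as the cell below; the names `X_demo`, `y_demo`, `X_tr` and `X_te` are placeholders introduced only for this illustration.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Illustrative variable names; they are not used elsewhere in the notebook
X_demo, y_demo = datasets.load_iris(return_X_y=True)

# Hold back 20% of the 150 flowers for testing
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.20, random_state=1, shuffle=True)

print("training samples:", len(X_tr))
print("testing samples:", len(X_te))
```

With 150 flowers and `test_size=0.20`, this gives 120 training samples and 30 testing samples.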


In [None]:
array = df.values
# All measurements (df.values is an object array, so cast to float)
X = array[:,0:4].astype(float)
# All classes
y = array[:,4]
# Split-out validation dataset
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y, test_size=0.20, random_state=1, shuffle=True)

In [None]:
# Make predictions on validation dataset
model = SVC(gamma='auto')
model.fit(X_train, Y_train)
predictions = model.predict(X_validation)

In [None]:
print(accuracy_score(Y_validation, predictions))

We can see that the accuracy is about 0.967, or roughly 97%, on the hold-out dataset.
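
Accuracy is the fraction of validation flowers that were labelled correctly. Here is a quick check of the arithmetic, assuming 30 validation flowers (20% of 150) with one of them misclassified, as this notebook finds further down:

```python
# Assumed counts: 30 validation flowers, 1 of them misclassified
n_validation = 30
n_wrong = 1

accuracy = (n_validation - n_wrong) / n_validation
print(round(accuracy, 3))
```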

In [None]:
print(classification_report(Y_validation, predictions))

The classification report breaks down each class by precision, recall, f1-score and support. The results are excellent, although the validation dataset is small.
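
To make these terms concrete, here is a small sketch on made-up labels (not the Iris results above). It computes precision and recall for one class by hand and compares them with scikit-learn. The `true_labels` and `predicted_labels` lists are hypothetical.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels for six flowers, made up for illustration
true_labels      = ["versicolor", "versicolor", "versicolor",
                    "virginica", "virginica", "virginica"]
predicted_labels = ["versicolor", "versicolor", "virginica",
                    "virginica", "virginica", "virginica"]

target = "virginica"
pairs = list(zip(true_labels, predicted_labels))
true_pos  = sum(t == target and p == target for t, p in pairs)
false_pos = sum(t != target and p == target for t, p in pairs)
false_neg = sum(t == target and p != target for t, p in pairs)

# Precision: of the flowers predicted as virginica, how many really were?
precision = true_pos / (true_pos + false_pos)
# Recall: of the real virginica flowers, how many did we find?
recall = true_pos / (true_pos + false_neg)
print(precision, recall)

# scikit-learn computes the same quantities
sk_precision = precision_score(true_labels, predicted_labels, pos_label=target)
sk_recall = recall_score(true_labels, predicted_labels, pos_label=target)
print(sk_precision, sk_recall)
```

Support is the number of true samples in each class, which is three per class in this made-up example.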

In [None]:
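from sklearn.metrics import ConfusionMatrixDisplay

# plot_confusion_matrix was removed in scikit-learn 1.2;
# ConfusionMatrixDisplay.from_estimator (scikit-learn 1.0+) replaces it
classifier = model.fit(X_train, Y_train)
class_names = iris.target_names
print("Confusion matrix")
ConfusionMatrixDisplay.from_estimator(classifier, X_validation, Y_validation,
                                      display_labels=class_names,
                                      cmap=plt.cm.Blues,
                                      normalize=None);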

In the confusion matrix, the diagonal elements represent the number of points for which the predicted label is equal to the true label, while off-diagonal elements are those that are mislabeled by the classifier.
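
A short sketch of this idea on made-up labels (hypothetical, not this notebook's results): summing the diagonal of the confusion matrix gives the number of correct predictions, and everything else is a mistake.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels, made up for illustration
actual    = ["setosa", "setosa", "versicolor",
             "versicolor", "virginica", "virginica"]
predicted = ["setosa", "setosa", "versicolor",
             "virginica", "virginica", "virginica"]

labels = ["setosa", "versicolor", "virginica"]
# Rows are true classes, columns are predicted classes
cm = confusion_matrix(actual, predicted, labels=labels)
print(cm)

correct = np.trace(cm)           # sum of the diagonal
mislabeled = cm.sum() - correct  # everything off the diagonal
print("correct:", correct, "mislabeled:", mislabeled)
```

Here one versicolor flower is predicted as virginica, so it shows up in the versicolor row and the virginica column.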

### Let's take a look at the predicted values

In [None]:
pred_df = pd.DataFrame(X_validation,columns=df.columns[0:4])

pred_df['Predicted Class'] = predictions

In [None]:
## 11 setosa, 12 versicolor, 7 virginica
pred_df[pred_df['Predicted Class']=='Iris-virginica']

### Which one did it get wrong? 

The machine classified one sample as virginica, when it was versicolor.

This entry is in the row with index 22.

In [None]:
import numpy as np
y_test = np.asarray(Y_validation)
misclassified = np.where(y_test != model.predict(X_validation))

print(misclassified)

<h2 align='center'> What are the potential impacts on education and society, and how do we talk to students about all of this? </h2>

- What happens when we apply ML to problems involving human choices? 

- Examples: who gets approved for a mortgage, who gets admitted to high school, who is selected for a job interview

The impact of errors in an algorithm must be considered.

Predictions are based on the training data that is provided $\Rightarrow$ bias in the training data increases the probability of bias in the predicted outcomes.

#### To reduce this risk: bias must be addressed from the moment a scientist asks a question through the data collection stage of the data science process.
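
The link between biased training data and biased predictions can be sketched with synthetic data. Everything in this example is made up for illustration: the `skill` score, the two groups, and the historical approval rule that held group 1 to a higher bar. The model is never told about that rule, yet it learns it from the past decisions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic applicants: a skill score in [0, 1] and a group label (0 or 1)
n = 1000
skill = rng.uniform(0, 1, n)
group = rng.integers(0, 2, n)

# Hypothetical biased history: group 1 needed a higher score to be approved
approved = np.where(group == 0, skill > 0.5, skill > 0.8).astype(int)

# Train on the biased past decisions
features = np.column_stack([skill, group])
model_demo = DecisionTreeClassifier(random_state=0).fit(features, approved)

# Two applicants with the same skill who differ only by group
same_skill = np.array([[0.65, 0], [0.65, 1]])
print(model_demo.predict(same_skill))
```

With this synthetic history, the model should approve the group 0 applicant and reject the group 1 applicant, even though their skill scores are identical.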

<h2 align='center'>Real examples</h2>

Amazon ditches AI recruiting tool that didn’t like women (Reuters) [link](https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G)


Can Racist Algorithms Be Fixed? (The Marshall Project) [link](https://www.themarshallproject.org/2019/07/01/can-racist-algorithms-be-fixed)


Black and Asian faces misidentified more often by facial recognition software (CBC) [link](https://www.cbc.ca/news/technology/facial-recognition-race-1.5403899)


UK ditches exam results generated by biased algorithm after student protests (The Verge) [link](https://www.theverge.com/2020/8/17/21372045/uk-a-level-results-algorithm-biased-coronavirus-covid-19-pandemic-university-applications)

<h2 align='center'> What can we do? </h2>

- Work towards addressing our own biases in the classroom and daily life

- Identify how our biases play a role in our decision making

- Identify how our biases affect the machines we program 

- Collaborate with people offering diverse points of view

![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

<h2 align='center'>Getting Started with Callysto</h2>

- Feedback form https://tinyurl.com/y2a3uhdt
- Online self-paced courses (courses.callysto.ca)  
- Preview our learning modules https://callysto.github.io/curriculum-jbook/intro.html
- Contact us for “in-class” workshops, teacher PD, virtual hackathons, and more

Email: contact@callysto.ca

On Twitter: @callysto_canada

Site: https://www.callysto.ca

YouTube: https://www.youtube.com/channel/UCPdq1SYKA42EZBvUlNQUAng

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)