In [1]:
# Setup Reveal.JS
from traitlets.config.manager import BaseJSONConfigManager
from pathlib import Path
path = Path.home() / ".jupyter" / "nbconfig"
cm = BaseJSONConfigManager(config_dir=str(path))
# ipyleaflet hack to load full map in Reveal.js
# These settings are also injected into the notebook metadata
# (Edit -> Edit Notebook Metadata), which is the preferred method
cm.update(
    "rise",
    {"minScale": 1.25,
     "width": "80%",
     "transition": "none"}
)

import numpy as np
np.random.seed(42)

# Computer says "I Don't Know"

## The case for Honest AI

<img width=35% align="right" src="PeterCartoon-square.jpg">

Peter Flach, University of Bristol and Alan Turing Institute, UK

[flach.github.io](https://flach.github.io)

Would you consider it newsworthy if a human passes a multiple-choice test? 

<img width=50% align="right" src="mc.jpg" alt="credit: https://www.learningscientists.org/blog/2017/10/10-1">

**Probably not.**


Yet multiple-choice tests are behind many AI successes reported in the media, leading to recent headlines such as 

- Researchers taught an AI to recognize smells!

- AI Trained on Old Scientific Papers Makes Discoveries Humans Missed!

- AI learns to recognize nerve cells!

We are told that "AI passed the test" or "the algorithm worked" --

<img width=30% align="right" src="pass.jpg" alt="credit: TODO">

- but what exactly does that mean?

**Who sets the exam, and what is the passing grade?**

# The case for Honest AI

In this talk I will discuss why performance evaluation is not something that can be easily summarised in a catchy headline -- neither for humans nor for machines. 

I will also argue that it is imperative for AI algorithms to become more *honest* about their own abilities.

<img width=25% align="right" src="honest.png" alt="credit: TODO">

# Owning up to uncertainty

- Quantifying the uncertainty in predictions would be a good start.
  - E.g., saying "the chance of rain is 70%" rather than "it will rain".
  
<img width=35% align="right" src="weather.jpg" alt="credit: Met Office">

- Quantifying the uncertainty in that chance estimate would be even better. 
  - Am I confident that 70% is close to the right number, or did I just guess that on the basis of the last three days?
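
To make the second point concrete, here is a minimal sketch of how the same 70% estimate can carry very different amounts of uncertainty. It uses the Wilson score interval for a proportion; the counts (7 rainy days out of 10, and 700 out of 1000) are hypothetical and chosen only for illustration, as is the helper name `wilson_interval`.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)
    ) / denom
    return centre - half_width, centre + half_width

# Hypothetical counts: the same 70% estimate from little and from much evidence
few_low, few_high = wilson_interval(7, 10)
many_low, many_high = wilson_interval(700, 1000)
print(f"7 of 10 days:     {few_low:.2f} to {few_high:.2f}")
print(f"700 of 1000 days: {many_low:.2f} to {many_high:.2f}")
```

Both intervals contain 70%, but the one based on ten days is far wider, which is one way of expressing "I just guessed that from a handful of days".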

# Computer says "I don't know"

But what would really demonstrate an AI algorithm's awareness of its own strengths *and* limitations is for it to occasionally say **"I don't know"** --
- something that few contemporary AI algorithms and machine-learned classifiers do;
- an omission that often leads to problems with "adversarial examples": inputs doctored to mislead the algorithm.
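
As a rough illustration of what saying "I don't know" could look like in code, the sketch below implements a simple reject option: the classifier abstains whenever its top class probability falls below a threshold. The function name `predict_or_abstain`, the threshold of 0.8, and the probabilities are all hypothetical, and the rule only helps to the extent that the probabilities themselves can be trusted.

```python
import numpy as np

def predict_or_abstain(probabilities, threshold=0.8):
    """Return the most probable class per row, or -1 ("I don't know")
    when the top probability falls below the threshold."""
    probabilities = np.asarray(probabilities)
    top_class = probabilities.argmax(axis=1)
    top_prob = probabilities.max(axis=1)
    return np.where(top_prob >= threshold, top_class, -1)

# Hypothetical class probabilities for three inputs (columns: down, up)
probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]]
decisions = predict_or_abstain(probs)
print(decisions)
```

Here the second input, with a near even split between the two classes, is the one that gets the abstain label `-1`.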

# In this talk...
I will discuss in an accessible way how this arises due to a focus on *discriminative learning*, and how recent research has developed ways to overcome this, 

<img width=50% align="right" src="lb.jpg">

allowing AI and machine learning to become more **honest and aware of their own limitations**.  

In [2]:
from utils import AIUKSlides
bristol_center = (51.4545, -2.5879)

slides = AIUKSlides(local_center=bristol_center, isolines=[.0, .2, .4, .6, .8, 1.0],
                    width='600px', height='400px', grid_density=500)

<img height=100% align="center" src="PedroBandero.png" alt="credit: TODO">

# Some recent COVID-19 data

In [9]:
display(slides.map_covid_uk())

Map(center=[53, -2], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom_out_tex…

Case numbers going up/down in <span style="color:red">red</span>/<span style="color:blue">blue</span>.

# Zooming in on Bristol

In [10]:
display(slides.map_covid_local())

Map(center=[51.4545, -2.5879], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zo…

In [11]:
from utils import KDE
clf = KDE(bandwidth=0.005)

slides.train_local_classifier(clf)
slides.train_local_foreground()

# AI can distinguish between up/down areas

In [12]:
display(slides.map_local_classifier_foreground())

Map(center=[51.4545, -2.5879], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zo…

# What actually happens with discriminative models

In [13]:
display(slides.map_local_classifier(fillopacity=0.4, lineopacity=1.0))

Map(center=[51.4545, -2.5879], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zo…
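
The behaviour at issue can be reproduced without the map. The following is a minimal sketch on synthetic data (not the COVID-19 data, and not the `KDE` classifier from `utils`), assuming a scikit-learn `LogisticRegression` as a stand-in discriminative model. It compares the predicted probability at a point inside the training data with one at a point nowhere near it.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
# Two small, well-separated synthetic clusters standing in for "down" and "up"
down = rng.normal(loc=[-1.0, 0.0], scale=0.3, size=(100, 2))
up = rng.normal(loc=[1.0, 0.0], scale=0.3, size=(100, 2))
X = np.vstack([down, up])
y = np.array([0] * 100 + [1] * 100)

model = LogisticRegression().fit(X, y)

near = np.array([[1.0, 0.0]])    # inside the "up" cluster
far = np.array([[100.0, 0.0]])   # nowhere near any training point
p_near = model.predict_proba(near)[0, 1]
p_far = model.predict_proba(far)[0, 1]
print(f"P(up) near the data: {p_near:.3f}")
print(f"P(up) far from data: {p_far:.3f}")
```

The model is at least as confident about the far point as about the near one, even though it has seen no data anywhere in that region: it only models which side of the boundary a point lies on.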

# Different model, similar issue

In [14]:
from sklearn.ensemble import RandomForestClassifier
clf2 = RandomForestClassifier()
slides.train_local_classifier(clf2)
display(slides.map_local_classifier(fillopacity=0.4, lineopacity=1.0))

Map(center=[51.4545, -2.5879], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zo…
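
One possible way to let a model say "I don't know" is to estimate how dense the training data is around a query point and to abstain where it is sparse. The sketch below is an assumption about how this could be done, not the method used by `slides.train_local_foreground()`: it uses scikit-learn's `KernelDensity` on synthetic data, with an illustrative bandwidth and a hypothetical rule that flags anything less dense than the sparsest 1% of training points.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(42)
# Synthetic training locations: two clusters, as a stand-in for real data
X_train = np.vstack([
    rng.normal(loc=[-1.0, 0.0], scale=0.3, size=(100, 2)),
    rng.normal(loc=[1.0, 0.0], scale=0.3, size=(100, 2)),
])

density = KernelDensity(kernel="gaussian", bandwidth=0.3).fit(X_train)

# Assumed rule: anything below the 1st percentile of training log-density is unfamiliar
threshold = np.percentile(density.score_samples(X_train), 1)

def is_familiar(points):
    """True where the log-density of the training data reaches the threshold."""
    return density.score_samples(np.asarray(points)) >= threshold

familiar = is_familiar([[1.0, 0.0], [100.0, 0.0]])
print(familiar)
```

A classifier's prediction could then be reported only where `is_familiar` is true, with "I don't know" returned everywhere else.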