# How Do We Cope with Rapid Change Like AI/ML?

### [Neil D. Lawrence](http://inverseprobability.com), University of Cambridge

### 2023-11-21

**Abstract**: Machine learning solutions, in particular those based on
deep learning methods, form an underpinning of the current revolution in
“artificial intelligence” that has dominated popular press headlines and
is having a significant influence on the wider tech agenda. In this talk
I will give an overview of where we are now with machine learning
solutions, and what challenges we face both in the near and far future.
These include practical application of existing algorithms in the face
of the need to explain decision making, mechanisms for improving the
quality and availability of data, dealing with large unstructured
datasets.


::: {.cell .markdown}


## Setup

In [None]:
import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 22})


## notutils


This small package is a helper package for various notebook utilities
used below.

The software can be installed using

In [None]:
%pip install notutils

from the command prompt where you can access your python installation.

The code is also available on GitHub:
<https://github.com/lawrennd/notutils>

Once `notutils` is installed, it can be imported in the usual manner.

In [None]:
import notutils

## pods


In Sheffield we created a suite of software tools for ‘Open Data
Science’. Open data science is an approach to sharing code, models and
data that should make it easier for companies, health professionals and
scientists to gain access to data science techniques.

You can also check this blog post on [Open Data
Science](http://inverseprobability.com/2014/07/01/open-data-science).

The software can be installed using

In [None]:
%pip install pods

from the command prompt where you can access your python installation.

The code is also available on GitHub: <https://github.com/lawrennd/ods>

Once `pods` is installed, it can be imported in the usual manner.

In [None]:
import pods

## mlai


The `mlai` software is a suite of helper functions for teaching and
demonstrating machine learning algorithms. It was first used in the
Machine Learning and Adaptive Intelligence course in Sheffield in 2013.

The software can be installed using

In [None]:
%pip install mlai

from the command prompt where you can access your python installation.

The code is also available on GitHub: <https://github.com/lawrennd/mlai>

Once `mlai` is installed, it can be imported in the usual manner.

In [None]:
import mlai
from mlai import plot

## Complexity in Action


As an exercise in understanding complexity, watch the following video.
You will see the basketball being bounced around, and the players
moving. Your job is to count the passes of those dressed in white and
ignore those of the individuals dressed in black.

In [None]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('vJG698U2Mvo')

Figure: <i>Daniel Simons’s famous illusion “monkey business”. Focus on
the movement of the ball distracts the viewer from seeing other aspects
of the image.</i>

In a classic study Simons and Chabris (1999) ask subjects to count the
number of passes of the basketball between players on the team wearing
white shirts. Fifty percent of the time, these subjects don’t notice the
gorilla moving across the scene.

The phenomenon of inattentional blindness is well known, e.g., in their
paper Simons and Chabris quote the Hungarian neurologist, Rezsö Bálint,

> It is a well-known phenomenon that we do not notice anything happening
> in our surroundings while being absorbed in the inspection of
> something; focusing our attention on a certain object may happen to
> such an extent that we cannot perceive other objects placed in the
> peripheral parts of our visual field, although the light rays they
> emit arrive completely at the visual sphere of the cerebral cortex.
>
> Rezsö Bálint 1907 (translated in Husain and Stein 1988, page 91)

When we combine the complexity of the world with our relatively low
bandwidth for information, problems can arise. Our focus on what we
perceive to be the most important problem can cause us to miss other
(potentially vital) contextual information.

This phenomenon is known as selective attention or ‘inattentional
blindness’.

In [None]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('_oGAzq5wM_Q')

Figure: <i>For a longer talk on inattentional bias from Daniel Simons
see this video.</i>

## Data Selective Attention Bias


We are going to see how inattention biases can play out in data analysis
by going through a simple example. The analysis involves body mass index
and activity information.

## BMI Steps Data


The BMI Steps example is taken from Yanai and Lercher (2020). We are
given a data set of body-mass index measurements against step counts.
For convenience we have packaged the data so that it can be easily
downloaded.

In [None]:
import pods

In [None]:
data = pods.datasets.bmi_steps()
X = data['X'] 
y = data['Y']

It is good practice to give our variables interpretable names so that
the analysis may be clearly understood by others. Here the `steps` count
is the first dimension of the covariate, the `bmi` is the second
dimension and the `gender` is stored in `y` with `1` for female and `0`
for male.

In [None]:
steps = X[:, 0]
bmi = X[:, 1]
gender = y[:, 0]

We can check the mean steps and the mean of the BMI.

In [None]:
print('Steps mean is {mean}.'.format(mean=steps.mean()))

In [None]:
print('BMI mean is {mean}.'.format(mean=bmi.mean()))

## BMI Steps Data Analysis


We can also separate out the means for the male and female populations.
In Python this can be done by setting male and female indices as
follows.

In [None]:
male_ind = (gender==0)
female_ind = (gender==1)

And now we can extract the variables for the two populations.

In [None]:
male_steps = steps[male_ind]
male_bmi = bmi[male_ind]

And as before we compute the mean.

In [None]:
print('Male steps mean is {mean}.'.format(mean=male_steps.mean()))

In [None]:
print('Male BMI mean is {mean}.'.format(mean=male_bmi.mean()))

Similarly, we can get the same result for the female portion of the
population.

In [None]:
female_steps = steps[female_ind]
female_bmi = bmi[female_ind]

In [None]:
print('Female steps mean is {mean}.'.format(mean=female_steps.mean()))

In [None]:
print('Female BMI mean is {mean}.'.format(mean=female_bmi.mean()))

Interestingly, the female BMI average is slightly higher than the male BMI
average. The number of steps in the male group is higher than that in
the female group. Perhaps the steps and the BMI are anti-correlated. The
more steps, the lower the BMI.

The `scipy` library provides a statistics module, `scipy.stats`. We’ll
import its `pearsonr` function so that we can try to understand the
correlation between the `steps` and the `bmi`.

In [None]:
from scipy.stats import pearsonr

In [None]:
corr, _ = pearsonr(steps, bmi)
print("Pearson's overall correlation: {corr}".format(corr=corr))

In [None]:
male_corr, _ = pearsonr(male_steps, male_bmi)
print("Pearson's correlation for males: {corr}".format(corr=male_corr))

In [None]:
female_corr, _ = pearsonr(female_steps, female_bmi)
print("Pearson's correlation for females: {corr}".format(corr=female_corr))
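For reference, the Pearson correlation coefficient returned by `pearsonr` is the covariance of the two variables divided by the product of their standard deviations. The sketch below computes it directly with `numpy`. The helper name `pearson_correlation` and the toy arrays are assumptions made for illustration; the values are made up and are not the BMI steps data.

```python
import numpy as np

def pearson_correlation(a, b):
    """Pearson correlation: covariance over the product of standard deviations."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_centred = a - a.mean()
    b_centred = b - b.mean()
    return (a_centred * b_centred).sum() / np.sqrt(
        (a_centred**2).sum() * (b_centred**2).sum()
    )

# Made-up illustrative values (hypothetical, not the BMI steps data).
toy_steps = np.array([2000.0, 4000.0, 6000.0, 8000.0, 10000.0])
toy_bmi = np.array([29.0, 27.5, 26.0, 25.5, 23.0])
print("Toy correlation: {corr}".format(
    corr=pearson_correlation(toy_steps, toy_bmi)))
```

A single coefficient like this summarizes only the linear trend, which is one reason to plot the data as well as compute the number.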

In [None]:
import mlai.plot as plot
import mlai
import matplotlib.pyplot as plt

In [None]:
fig, ax = plt.subplots(figsize=plot.big_wide_figsize)
_ = ax.plot(X[male_ind, 0], X[male_ind, 1], 'g.',markersize=10)
_ = ax.plot(X[female_ind, 0], X[female_ind, 1], 'r.',markersize=10)
_ = ax.set_xlabel('steps', fontsize=20)
_ = ax.set_ylabel('BMI', fontsize=20)
xlim = (0, 15000)
ylim = (15, 32.5)
ax.set_xlim(xlim)
ax.set_ylim(ylim)
mlai.write_figure(filename='bmi-steps.svg',
                directory='./datasets',
                transparent=True)

## A Hypothesis as a Liability


This analysis is from an article titled “A Hypothesis as a Liability”
(Yanai and Lercher, 2020). They start their article with the following
quote from Hermann Hesse.

> “ ‘When someone seeks,’ said Siddhartha, ‘then it easily happens that
> his eyes see only the thing that he seeks, and he is able to find
> nothing, to take in nothing. \[…\] Seeking means: having a goal. But
> finding means: being free, being open, having no goal.’ ”
>
> Hermann Hesse

Their idea is that having a hypothesis can constrain our thinking.
However, in answer to their paper Felin et al. (2021) argue that some
form of hypothesis is always necessary, suggesting that a hypothesis
*can* be a liability.

My view is captured in the introductory chapter to an edited volume on
computational systems biology that I worked on with Mark Girolami,
Magnus Rattray and Guido Sanguinetti.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//data-science/licsb-popper-quote.png" style="width:80%">

Figure: <i>Quote from Lawrence (2010) highlighting the importance of
interaction between data and hypothesis.</i>

Popper nicely captures the interaction between hypothesis and data by
relating it to the chicken and the egg. The important thing is that
these two co-evolve.

# What is Machine Learning?


What is machine learning? At its most basic level machine learning is a
combination of

$$\text{data} + \text{model} \stackrel{\text{compute}}{\rightarrow} \text{prediction}$$

where *data* is our observations. They can be actively or passively
acquired (meta-data). The *model* contains our assumptions, based on
previous experience. That experience can be other data, it can come from
transfer learning, or it can merely be our beliefs about the
regularities of the universe. In humans our models include our inductive
biases. The *prediction* is an action to be taken or a categorization or
a quality score. The reason that machine learning has become a mainstay
of artificial intelligence is the importance of predictions in
artificial intelligence. The data and the model are combined through
computation.

In practice we normally perform machine learning using two functions. To
combine data with a model we typically make use of:

**a prediction function**: it is used to make the predictions. It
includes our beliefs about the regularities of the universe, our
assumptions about how the world works, e.g., smoothness, spatial
similarities, temporal similarities.

**an objective function**: it defines the ‘cost’ of misprediction.
Typically, it includes knowledge about the world’s generating processes
(probabilistic objectives) or the costs we pay for mispredictions
(empirical risk minimization).

The combination of data and model through the prediction function and
the objective function leads to a *learning algorithm*. The class of
prediction functions and objective functions we can make use of is
restricted by the algorithms they lead to. If the prediction function or
the objective function are too complex, then it can be difficult to find
an appropriate learning algorithm. Much of the academic field of machine
learning is the quest for new learning algorithms that allow us to bring
different types of models and data together.
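As a minimal sketch of how these pieces fit together, the example below pairs a linear prediction function with a squared error objective, which leads to gradient descent as the learning algorithm. The function names, the synthetic data, the learning rate and the iteration count are all assumptions chosen for illustration, not a method taken from the talk.

```python
import numpy as np

def prediction_function(x, w, b):
    """Linear prediction: assumes the output varies linearly with x."""
    return w * x + b

def objective_function(w, b, x, y):
    """Mean squared error: the 'cost' of misprediction."""
    return np.mean((y - prediction_function(x, w, b)) ** 2)

def learn(x, y, learning_rate=0.1, iterations=500):
    """Gradient descent on the objective: the resulting learning algorithm."""
    w, b = 0.0, 0.0
    for _ in range(iterations):
        residual = prediction_function(x, w, b) - y
        # Gradients of the mean squared error with respect to w and b.
        w -= learning_rate * 2.0 * np.mean(residual * x)
        b -= learning_rate * 2.0 * np.mean(residual)
    return w, b

# Synthetic data from y = 2x + 1 plus a little noise (an assumption).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x + 1.0 + 0.05 * rng.standard_normal(50)

w, b = learn(x, y)
print("Learned w={w:.2f}, b={b:.2f}, objective={obj:.4f}".format(
    w=w, b=b, obj=objective_function(w, b, x, y)))
```

Here the simplicity of both functions is what makes the learning algorithm easy to write down; a more complex prediction or objective function would not generally admit such a simple update.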

A useful reference for state of the art in machine learning is the UK
Royal Society Report, [Machine Learning: Power and Promise of Computers
that Learn by
Example](https://royalsociety.org/~/media/policy/projects/machine-learning/publications/machine-learning-report.pdf).

You can also check my blog post on [What is Machine
Learning?](http://inverseprobability.com/2017/07/17/what-is-machine-learning).

## Artificial Intelligence and Data Science


Machine learning technologies have been the driver of two related, but
distinct disciplines. The first is *data science*. Data science is an
emerging field that arises from the fact that we now collect so much
data by happenstance, rather than by *experimental design*. Classical
statistics is the science of drawing conclusions from data, and to do so
statistical experiments are carefully designed. In the modern era we
collect so much data that there’s a desire to draw inferences directly
from the data.

As well as machine learning, the field of data science draws from
statistics, cloud computing, data storage (e.g. streaming data),
visualization and data mining.

In contrast, artificial intelligence technologies typically focus on
emulating some form of human behaviour, such as understanding an image,
or some speech, or translating text from one form to another. The recent
advances in artificial intelligence have come from machine learning
providing the automation. But in contrast to data science, in artificial
intelligence the data is normally collected with the specific task in
mind. In this sense it has strong relations to classical statistics.

Classically artificial intelligence worried more about *logic* and
*planning* and focused less on data driven decision making. Modern
machine learning owes more to the field of *Cybernetics* (Wiener, 1948)
than artificial intelligence. Related fields include *robotics*, *speech
recognition*, *language understanding* and *computer vision*.

There are strong overlaps between the fields: the wide availability of
data by happenstance makes it easier to collect data for designing AI
systems. These relations are coming through wide availability of sensing
technologies that are interconnected by cellular networks, WiFi and the
internet. This phenomenon is sometimes known as the *Internet of
Things*, but this feels like a dangerous misnomer. We must never forget
that we are interconnecting people, not things.

<center>

Convention for the Protection of *Individuals* with regard to Automatic
Processing of *Personal Data* (1981/1/28)

</center>

# Evolved Relationship with Information


The high bandwidth of computers has resulted in a close relationship
between the computer and data. Large amounts of information can flow
between the two. The degree to which the computer is mediating our
relationship with data means that we should consider it an intermediary.

Originally our low bandwidth relationship with data was affected by two
characteristics. Firstly, by our tendency to over-interpret, driven by our
need to extract as much knowledge from our low bandwidth information
channel as possible. Secondly, by our improved understanding of the
domain of *mathematical* statistics and how our cognitive biases can
mislead us.

With this new set up there is a potential for assimilating far more
information via the computer, but the computer can present this to us in
various ways. If its motives are not aligned with ours then it can
misrepresent the information. This needn’t be nefarious; it can simply be
because the computer is pursuing a different objective from us. For
example, if the computer is aiming to maximize our interaction time that
may be a different objective from ours which may be to summarize
information in a representative manner in the *shortest* possible length
of time.

For example, for me, it was a common experience to pick up my telephone
with the intention of checking when my next appointment was, but to soon
find myself distracted by another application on the phone and end up
reading something on the internet. By the time I’d finished reading, I
would often have forgotten the reason I picked up my phone in the first
place.

There are great benefits to be had from the huge amount of information
we can unlock from this evolved relationship between us and data. In
biology, large scale data sharing has been driven by a revolution in
genomic, transcriptomic and epigenomic measurement. The improved
inferences that can be drawn through summarizing data by computer have
fundamentally changed the nature of biological science. Now this
phenomenon is also influencing us in our daily lives as data measured by
*happenstance* is increasingly used to characterize us.

Better mediation of this flow requires a better understanding of
human-computer interaction. This in turn involves understanding our own
intelligence better, what its cognitive biases are and how these might
mislead us.

For further thoughts see this Guardian article on [marketing in the internet
era](https://www.theguardian.com/media-network/2015/jul/23/data-driven-economy-marketing)
from 2015.

You can also check my blog post on [System
Zero](http://inverseprobability.com/2015/12/04/what-kind-of-ai). This
was also written in 2015.

## New Flow of Information


Classically the field of statistics focused on mediating the
relationship between data and the human. Our limited bandwidth of
communication means we tend to over-interpret the limited information
that we are given; in the extreme we assign motives and desires to
inanimate objects (a process known as anthropomorphizing). Much of
mathematical statistics was developed to help temper this tendency and
understand when we are valid in drawing conclusions from data.

<img src="https://inverseprobability.com/talks/../slides/diagrams//data-science/new-flow-of-information003.svg" class="" width="70%" style="vertical-align:middle;">

Figure: <i>The trinity of human, data, and computer highlights the
modern phenomenon. The communication channel between computer and data
now has an extremely high bandwidth. The channels between human and
computer and between data and human are narrow. There is a new direction
of information flow: information is reaching us mediated by the
computer. The focus on classical statistics reflected the importance of
the direct communication between human and data. The modern challenges
of data science emerge when that relationship is being mediated by the
machine.</i>

Data science brings new challenges. In particular, there is a very large
bandwidth connection between the machine and data. This means that our
relationship with data is now commonly being mediated by the machine,
whether in the acquisition of new data, which now happens by
happenstance rather than with purpose, or in the interpretation of that
data, where we are increasingly relying on machines to summarize what the
data contains. This is leading to the emerging field of data science,
which must not only deal with the same challenges that mathematical
statistics faced in tempering our tendency to over-interpret data but
must also deal with the possibility that the machine has either
inadvertently or maliciously misrepresented the underlying data.

## Data Science Africa


<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//data-science-africa-logo.png" style="width:30%">

Figure: <i>Data Science Africa <http://datascienceafrica.org> is a
ground up initiative for capacity building around data science, machine
learning and artificial intelligence on the African continent.</i>

<img src="https://inverseprobability.com/talks/../slides/diagrams//dsa/dsa-events-october-2021.svg" class="" width="60%" style="vertical-align:middle;">

Figure: <i>Data Science Africa meetings held up to October 2021.</i>

Data Science Africa is a bottom up initiative for capacity building in
data science, machine learning and artificial intelligence on the
African continent.

As of May 2023 there have been eleven workshops and schools, located in
seven different countries: Nyeri, Kenya (twice); Kampala, Uganda;
Arusha, Tanzania; Abuja, Nigeria; Addis Ababa, Ethiopia; Accra, Ghana;
Kampala, Uganda; Kimberley, South Africa (virtual); and Kigali,
Rwanda.

The main notion is *end-to-end* data science. For example, going from
data collection in the farmer’s field to decision making in the Ministry
of Agriculture. Or going from malaria disease counts in health centers
to medicine distribution.

The philosophy is laid out in (Lawrence, 2015). The key idea is that the
modern *information infrastructure* presents new solutions to old
problems. Modes of development change because less capital investment is
required to take advantage of this infrastructure. The philosophy is
that local capacity building is the right way to leverage these
challenges in addressing data science problems in the African context.

Data Science Africa is now a non-governmental organization registered in
Kenya. The organising board of the meeting is entirely made up of
scientists and academics based on the African continent.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//data-science/africa-benefit-data-revolution-2024.png" style="width:70%">

Figure: <i>The lack of existing physical infrastructure on the African
continent makes it a particularly interesting environment for deploying
solutions based on the *information infrastructure*. The idea is
explored more in this Guardian op-ed on [How Africa
can benefit from the data
revolution](https://www.theguardian.com/media-network/2015/aug/25/africa-benefit-data-science-information).</i>

Guardian article on [Data Science
Africa](https://www.theguardian.com/media-network/2015/aug/25/africa-benefit-data-science-information)

## Example: Prediction of Malaria Incidence in Uganda


<svg viewBox="0 0 200 200" style="width:15%">

<defs> <clipPath id="clip62">

<style>
circle {
  fill: black;
}
</style>

<circle cx="100" cy="100" r="100"/> </clipPath> </defs>

<title>

Martin Mubangizi

</title>

<image preserveAspectRatio="xMinYMin slice" width="100%" xlink:href="https://inverseprobability.com/talks/../slides/diagrams//people/martin-mubangizi.png" clip-path="url(#clip62)"/>

</svg>
<svg viewBox="0 0 200 200" style="width:15%">

<defs> <clipPath id="clip13">

<style>
circle {
  fill: black;
}
</style>

<circle cx="100" cy="100" r="100"/> </clipPath> </defs>

<title>

Ricardo Andrade Pacheco

</title>

<image preserveAspectRatio="xMinYMin slice" width="100%" xlink:href="https://inverseprobability.com/talks/../slides/diagrams//people/ricardo-andrade-pacheco.png" clip-path="url(#clip13)"/>

</svg>
<svg viewBox="0 0 200 200" style="width:15%">

<defs> <clipPath id="clip60">

<style>
circle {
  fill: black;
}
</style>

<circle cx="100" cy="100" r="100"/> </clipPath> </defs>

<title>

John Quinn

</title>

<image preserveAspectRatio="xMinYMin slice" width="100%" xlink:href="https://inverseprobability.com/talks/../slides/diagrams//people/john-quinn.jpg" clip-path="url(#clip60)"/>

</svg>

As an example of using Gaussian process models within the full pipeline
from data to decision, we’ll consider the prediction of malaria incidence
in Uganda. For the purposes of this study malaria reports come in two
forms, HMIS reports from health centres and Sentinel data, which is
curated by the WHO. There are limited sentinel sites and many HMIS
sites.

The work is from Ricardo Andrade Pacheco’s PhD thesis, completed in
collaboration with John Quinn and Martin Mubangizi (Andrade-Pacheco et
al., 2014; Mubangizi et al., 2014). John and Martin were initially from
the AI-DEV group from the University of Makerere in Kampala and
latterly they were based at UN Global Pulse in Kampala. You can see the
work summarized on the UN Global Pulse [disease outbreaks project site
here](https://diseaseoutbreaks.unglobalpulse.net/uganda/).

-   See [UN Global Pulse Disease Outbreaks
    Site](https://diseaseoutbreaks.unglobalpulse.net/uganda/)

Malaria data is spatial data. Uganda is split into districts, and health
reports can be found for each district. This suggests that models such
as conditional random fields could be used for spatial modelling, but
there are two complexities with this. First of all, occasionally
districts split into two. Secondly, sentinel sites are a specific
location within a district, such as Nagongera which is a sentinel site
based in the Tororo district.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//health/uganda-districts-2006.png" style="width:50%">

Figure: <i>Ugandan districts. Data SRTM/NASA from
<https://dds.cr.usgs.gov/srtm/version2_1>.</i>

(Andrade-Pacheco et al., 2014; Mubangizi et al., 2014)

The common standard for collecting health data on the African continent
is from the health management information systems (HMIS). However, this
data suffers from missing values (Gething et al., 2006) and diagnosis of
diseases like typhoid and malaria may be confounded.

<img src="https://inverseprobability.com/talks/../slides/diagrams//health/Tororo_District_in_Uganda.svg" class="" width="50%" style="vertical-align:middle;">

Figure: <i>The Tororo district, where the sentinel site, Nagongera, is
located.</i>

[World Health Organization Sentinel Surveillance
systems](https://www.who.int/immunization/monitoring_surveillance/burden/vpd/surveillance_type/sentinel/en/)
are set up “when high-quality data are needed about a particular disease
that cannot be obtained through a passive system”. Several sentinel
sites give accurate assessment of malaria disease levels in Uganda,
including a site in Nagongera.

<img class="negate" src="https://inverseprobability.com/talks/../slides/diagrams//health/sentinel_nagongera.png" style="width:100%">

Figure: <i>Sentinel and HMIS data along with rainfall and temperature
for the Nagongera sentinel station in the Tororo district.</i>

In collaboration with the AI Research Group at Makerere we chose to
investigate whether Gaussian process models could be used to assimilate
information from these two different sources of disease information.
Further, we were interested in whether local information on rainfall and
temperature could be used to improve malaria estimates.

The aim of the project was to use WHO Sentinel sites, alongside rainfall
and temperature, to improve predictions from HMIS data of levels of
malaria.

<img src="https://inverseprobability.com/talks/../slides/diagrams//health/Mubende_District_in_Uganda.svg" class="" width="50%" style="vertical-align:middle;">

Figure: <i>The Mubende District.</i>

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//health/mubende.png" style="width:80%">

Figure: <i>Prediction of malaria incidence in Mubende.</i>

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//gpss/1157497_513423392066576_1845599035_n.jpg" style="width:80%">

Figure: <i>The project arose out of the Gaussian process summer school
held at Makerere in Kampala in 2013. The school led, in turn, to the
Data Science Africa initiative.</i>

## Early Warning Systems

<img src="https://inverseprobability.com/talks/../slides/diagrams//health/Kabarole_District_in_Uganda.svg" class="" width="50%" style="vertical-align:middle;">

Figure: <i>The Kabarole district in Uganda.</i>

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//health/kabarole.gif" style="width:100%">

Figure: <i>Estimate of the current disease situation in the Kabarole
district over time. Estimate is constructed with a Gaussian process with
an additive covariance function.</i>

Health monitoring system for the Kabarole district. Here we have fitted
the reports with a Gaussian process with an additive covariance
function. It has two components: one is a long time scale component (in
red above), the other is a short time scale component (in blue).
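As a minimal sketch of what an additive covariance looks like, the example below sums two exponentiated quadratic covariances and recovers the posterior mean of each component separately. The function names (`eq_cov`, `gp_components`), the lengthscales, the noise level and the synthetic data are all assumptions made for illustration; they are not the settings used in the project.

```python
import numpy as np

def eq_cov(t, t2, variance, lengthscale):
    """Exponentiated quadratic covariance between two sets of times."""
    sq_dist = (t[:, None] - t2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_components(t, y, t_star, noise=0.1,
                  long_params=(1.0, 52.0), short_params=(0.5, 4.0)):
    """Posterior means of the long and short time scale components.

    The (variance, lengthscale) pairs are hypothetical values in weeks.
    """
    K_long = eq_cov(t, t, *long_params)
    K_short = eq_cov(t, t, *short_params)
    # Additive covariance: sum of the two components plus observation noise.
    K = K_long + K_short + noise * np.eye(len(t))
    alpha = np.linalg.solve(K, y)
    # Each component's posterior mean uses only its own cross-covariance.
    mean_long = eq_cov(t_star, t, *long_params) @ alpha
    mean_short = eq_cov(t_star, t, *short_params) @ alpha
    return mean_long, mean_short

# Synthetic weekly "reports": a slow trend plus a faster oscillation.
rng = np.random.default_rng(0)
t = np.arange(0.0, 104.0)
y = (np.sin(2 * np.pi * t / 104.0)
     + 0.3 * np.sin(2 * np.pi * t / 8.0)
     + 0.1 * rng.standard_normal(len(t)))
mean_long, mean_short = gp_components(t, y, t)
```

Because the covariance is a sum, the two posterior means add up to the posterior mean of the full model, which is what allows the long and short time scale behaviour to be examined separately.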

Monitoring proceeds by considering two aspects of the curve. Is the blue
line (the short term report signal) above the red (which represents the
long term trend)? If so we have higher than expected reports. If this is
the case *and* the gradient is still positive (i.e. reports are going
up) we encode this with a *red* color. If it is the case and the
gradient of the blue line is negative (i.e. reports are going down) we
encode this with an *amber* color. Conversely, if the blue line is below
the red *and* decreasing, we color *green*. On the other hand if it is
below red but increasing, we color *yellow*.

This gives us an early warning system for disease. Red is a bad
situation getting worse, amber is bad, but improving. Green is good and
getting better and yellow good but degrading.

Finally, there is a gray region which represents when the scale of the
effect is small.
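The colour rules above can be sketched as a small function. The name `warning_colour`, the `min_effect` threshold for the gray region and the treatment of a zero gradient as "not increasing" are assumptions for illustration; the text does not specify them.

```python
def warning_colour(short_term, long_term, gradient, min_effect=0.1):
    """Map the state of the two curves to an early warning colour.

    short_term: value of the short term (blue) curve
    long_term:  value of the long term (red) trend
    gradient:   gradient of the short term curve
    min_effect: hypothetical threshold below which the effect is small
    """
    difference = short_term - long_term
    if abs(difference) < min_effect:
        return "gray"  # scale of the effect is small
    if difference > 0:
        # Higher than expected reports.
        return "red" if gradient > 0 else "amber"
    # Lower than expected reports.
    return "yellow" if gradient > 0 else "green"

# A district with reports above trend and still rising.
status = warning_colour(short_term=1.4, long_term=1.0, gradient=0.2)
```

Applying such a function to each district's latest estimate gives one colour per district.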

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//health/monitor.gif" style="width:50%">

Figure: <i>The map of Ugandan districts with an overview of the Malaria
situation in each district.</i>

These colors can now be observed directly on a spatial map of the
districts to give an immediate impression of the current status of the
disease across the country.

## Revolution


Arguably the information revolution we are experiencing is unprecedented
in history. But changes in the way we share information have a long
history. Over 5,000 years ago in the city of Uruk, on the banks of the
Euphrates, communities which relied on the water to irrigate their crops
developed an approach to recording transactions in clay. Eventually the
recording system became sophisticated enough that their oral
histories could be recorded in the form of the first epic: Gilgamesh.

See Lawrence (2024) cuneiform p. 337, 360, 390.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//cuneiform/chicago-cuneiform-stone.jpg" style="width:40%">

Figure: <i>Chicago Stone, side 2, recording sale of a number of fields,
probably from Isin, Early Dynastic Period, c. 2600 BC, black basalt</i>

Writing was initially developed as a record of who owed what to
whom, expanding individuals’ capacity to remember. But over a five
hundred year period it evolved to become a tool for literature as
well. More pithily put, writing was invented by accountants not poets
(see e.g. [this piece by Tim
Harford](https://www.bbc.co.uk/news/business-39870485)).

In some respects today’s revolution is different, because it also
involves the creation of stories as well as their curation. But in some
fundamental ways we can see what we have produced as another tool for us
in the information revolution.

## Coin Pusher


Disruption of society is like a coin pusher: it’s those who are already
on the edge who are most likely to be affected by disruption.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//policy/Coin_pusher_2.jpg" style="width:70%">

Figure: <i>A coin pusher is a game where coins are dropped into the top
of the machine, and they disrupt those on the existing steps. With any
coin drop, many coins move, but it is those on the edge, who are often
only indirectly affected, who are most traumatically affected by the
change.</i>

One danger of the current hype around generative AI is that we are
overly focussing on the fact that it seems to have significant effect on
professional jobs; people are naturally asking the question “what does
it do for my role?”. No doubt there will be disruption, but the coin
pusher hypothesis suggests that that disruption will likely involve
movement on the same step. However, it is those already on the edge, who
are often not working directly in the information economy and who often
have less of a voice in the policy conversation, who are likely to be
most disrupted.

## Royal Society Report


<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ml/ml-report-cover-page.png" style="width:">

Figure: <i>The Royal Society report on Machine Learning was released on
25th April 2017</i>

See Lawrence (2024) Royal Society; machine learning review and p. 25,
321, 395.

## Public Research


<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ml/rs-report-mori-poll-cover.png" style="width:">

Figure: <i>The Royal Society commissioned [public research from
Mori](https://royalsociety.org/-/media/policy/projects/machine-learning/publications/public-views-of-machine-learning-ipsos-mori.pdf)
as part of the machine learning review.</i>

As is often the case when members of the public are well convened, they
offered sensible opinions about the utility and challenges of machine
learning methods.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ml/rs-mori-views-of-specific-ml-applications-1.png" style="width:">

Figure: <i>One of the questions focussed on machine learning
applications.</i>

A range of different scenarios were captured by this qualitative work.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ml/rs-mori-views-of-specific-ml-applications-2.png" style="width:">

Figure: <i>The public were broadly supportive of a range of application
areas.</i>

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ml/rs-mori-views-of-specific-ml-applications-3.png" style="width:">

Figure: <i>But they failed to see the point in AIs that could produce
poetry.</i>

The one scenario where the public were unable to perceive any advantage
in machine learning was an example from art, the creation of poetry.
This was felt to be a fundamentally human activity that machines could
only emulate at best.

## Mercutio


It’s ironic that despite the public focus on transport, health, cities,
crime and their confusion as to why we might want AI to do art, one area
where significant progress has been made is on AIs that can create
literature.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ml/chat-gpt-mercutio-2.png" style="width:70%">

Figure: <i>Despite the public’s desire to see challenges in health,
finance, transport, crime solved by machine learning and their
skepticism around poetry, one of the problems we have solved seems to be
poetry.</i>

> **User**: Hi ChatGPT, Could you write soliloquy that introduces what a
> Large Language Model is as spoken by Mercutio?
>
> **Machine**: Ah, friends, lend me your ears, I prithee, For now I must
> utter a tale of wonder, Of a creation divine, called the Large
> Language Model, An oracle, a sorcerer, a wordsmith, yea verily, a
> master of the quill.
>
> From the depths of this scholarly dominion, A creature of artifice did
> rise; Twas wrought of ones and zeroes, woven as if by Puck’s own hand,
> To mimic our thoughts, our voice, our very essence.
>
> A thousand minds, nay, a millionfold, Conjoined in this digital
> Athenaeum, Words and phrases, like soldiers in battle, Marching to
> form the verses of our age.
>
> This Large Language Model, a changeling so fair, Hath the power to
> bring forth wisdom, To transmute our queries, our curiosities, Into
> answers, clear as a midsummer’s day.
>
> Yet, beware, my companions, for magic hath its limits, Not unlike the
> mortal bounds of we who dwell in flesh, The Large Language Model hath
> its strengths, But ’tis not an all-knowing oracle, a god it is not.
>
> As Mercutio, your humble servant, doth stand before thee, I marvel at
> the art and science that hath brought forth this wondrous creation,
> This Large Language Model, a testament to man’s yearning, To
> comprehend the world and make meaning of the great tapestry of
> existence.
>
> In faith, dear friends, let us embrace the magic and beauty of this
> digital bard, And mayhaps, by its grace, we shall glean some wisdom,
> As we wander together, hand in hand, Through this boundless realm of
> knowledge and wonder.
>
> ChatGPT transcript, 14th April 2023.

## The MONIAC


[The MONIAC](https://en.wikipedia.org/wiki/MONIAC) was an analogue
computer designed to simulate the UK economy. Analogue computers work
through analogy; the analogy in the MONIAC is that both money and water
flow. The MONIAC exploits this through a system of tanks, pipes, valves
and floats that represent the flow of money through the UK economy.
Water flowed from the treasury tank at the top of the model to other
tanks representing government spending, such as health and education.
The machine was initially designed for teaching support but was also
found to be a useful economic simulator. Several were built and today
you can see the original at Leeds Business School. There is also one in
the London Science Museum and one [in the University of Cambridge’s
economics
faculty](https://www.econ.cam.ac.uk/economics-alumni/drip-down-economics-phillips-machine).

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//simulation/Phillips_and_MONIAC_LSE.jpg" style="width:40%">

Figure: <i>Bill Phillips and his MONIAC (completed in 1949). The machine
is an analogue computer designed to simulate the workings of the UK
economy.</i>

See Lawrence (2024) MONIAC p. 232-233, 266, 343.

## Donald MacKay


<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//people/DonaldMacKay1952.jpg" style="width:40%">

Figure: <i>Donald M. MacKay (1922-1987), a physicist who was an early
member of the cybernetics community and member of the Ratio Club.</i>

Donald MacKay was a physicist who worked on naval gun targeting during
the Second World War. The challenge with gun targeting for ships is that
both the target and the gun platform are moving. This was tackled using
analogue computers - for example, in the US the [Mark I fire control
computer](https://en.wikipedia.org/wiki/Mark_I_Fire_Control_Computer),
which was a mechanical computer. MacKay worked on radar systems for gun
laying, where the velocity and distance of the target could be assessed
through radar and a mechanical-electrical analogue computer.

## Further Reading

-   Chapter 8 of Lawrence (2024)

## Fire Control Systems


Naval gunnery systems deal with targeting guns while taking into account
movement of ships. The Royal Navy’s Gunnery Pocket Book (The Admiralty,
1945) gives details of one system for gun laying.

Like many challenges we face today, in the second world war, fire
control was handled by a hybrid system of humans and computers. This
means deploying human beings for the tasks that they can manage, and
machines for the tasks that are better performed by a machine. This
leads to a division of labour between the machine and the human that can
still be found in our modern digital ecosystems.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ai/low-angle-fire-control-team.jpg" style="width:80%">

Figure: <i>The fire control computer set at the centre of a system of
observation and tracking (The Admiralty, 1945).</i>

As analogue computers, fire control computers from the second world war
would contain components that directly represented the different
variables that were important in the problem to be solved, such as the
inclination between two ships.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ai/the-measurement-of-inclination.jpg" style="width:80%">

Figure: <i>Measuring inclination between two ships (The Admiralty,
1945). Sophisticated fire control computers allowed the ship to continue
to fire while under maneuvers.</i>

The fire control systems were electro-mechanical analogue computers that
represented the “state variables” of interest, such as inclination and
ship speed with gears and cams within the machine.

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ai/typical-modern-fire-control-table.jpg" style="width:80%">

Figure: <i>A second world war gun computer’s control table (The
Admiralty, 1945).</i>

For more details on fire control computers, you can watch a 1953 film on
the US [Mark IA fire control
computer](https://en.wikipedia.org/wiki/Mark_I_Fire_Control_Computer)
from Periscope Film.

In [None]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('gwf5mAlI7Ug')

Figure: <i>U.S. Navy training film MN-6783a. Basic Mechanisms of Fire
Control Computers. Mechanical Computer Instructional Film 27794 (1953)
for the Mk 1A Fire Control Computer.</i>

## Behind the Eye


<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//books/behind-the-eye.jpg" style="width:40%">

Figure: <i>[Behind the
Eye](https://www.amazon.co.uk/Behind-Eye-Gifford-Lectures-MACKAY/dp/0631173323)
(MacKay, 1991) summarises MacKay’s Gifford Lectures, where MacKay uses
the operation of the eye as a window on the operation of the brain.</i>

Donald MacKay completed his PhD at King’s College, just down the road
from Bill Phillips who was building the MONIAC at LSE. He was part of
the Ratio Club - a group of early career scientists interested in
communication and control in animals and humans, or more specifically in
computers and brains. They were part of an international movement known
as cybernetics.

Donald MacKay wrote of the influence that his own work on radar had on
his interest in the brain.

> … during the war I had worked on the theory of automated and
> electronic computing and on the theory of information, all of which
> are highly relevant to such things as automatic pilots and automatic
> gun direction. I found myself grappling with problems in the design of
> artificial sense organs for naval gun-directors and with the
> principles on which electronic circuits could be used to simulate
> situations in the external world so as to provide goal-directed
> guidance for ships, aircraft, missiles and the like.

> Later in the 1940’s, when I was doing my Ph.D. work, there was much
> talk of the brain as a computer and of the early digital computers
> that were just making the headlines as “electronic brains.” As an
> analogue computer man I felt strongly convinced that the brain,
> whatever it was, was not a digital computer. I didn’t think it was an
> analogue computer either in the conventional sense.

> But this naturally rubbed under my skin the question: well, if it is
> not either of these, what kind of system is it? Is there any way of
> following through the kind of analysis that is appropriate to their
> artificial automata so as to understand better the kind of system the
> human brain is? That was the beginning of my slippery slope into brain
> research.
>
> *Behind the Eye* pg 40. Edited version of the 1986 Gifford Lectures
> given by Donald M. MacKay and edited by Valerie MacKay

See Lawrence (2024) MacKay, Donald, *Behind the Eye* p. 268-270, 316.

MacKay’s distinction between *analogue* and *digital* computers is key.
As an expert in analogue machines, he understood that an analogue
computer is *literally* an analogue - its components (resistors,
capacitors, inductors, or mechanical parts like cams and gears) have
states that are physically analogous to the states in the system being
modeled. Engineers designed these computers by exploiting mathematical
dualities between the computer and the real world. For example, a
mechanical system (mass-spring-damper) and electrical circuit
(inductor-resistor-capacitor) could be described by the same
second-order differential equations, allowing designers to map
real-world problems through mathematics into virtual representations.
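As a worked illustration of this duality (the symbols below are standard textbook notation, assumed here rather than taken from MacKay), a driven mass-spring-damper with displacement $x$, mass $m$, damping $c$ and stiffness $k$ obeys

$$
m \frac{\text{d}^2 x}{\text{d}t^2} + c \frac{\text{d}x}{\text{d}t} + k x = F(t),
$$

while a series circuit with charge $q$, inductance $L$, resistance $R$ and capacitance $C$ driven by a voltage $V(t)$ obeys

$$
L \frac{\text{d}^2 q}{\text{d}t^2} + R \frac{\text{d}q}{\text{d}t} + \frac{1}{C} q = V(t).
$$

The two equations have the same form under the mapping $x \leftrightarrow q$, $m \leftrightarrow L$, $c \leftrightarrow R$, $k \leftrightarrow 1/C$ and $F \leftrightarrow V$, so measuring the charge in the circuit is analogous to measuring the displacement in the mechanical system.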

MacKay’s insights remain relevant when considering modern AI systems.
His questioning of whether the brain was truly digital or analogue might
parallel current debates about large language models like Claude and
ChatGPT, which seem to operate in ways that don’t neatly fit either
paradigm. His work reminds us that understanding the fundamental nature
of intelligent systems - whether biological or artificial - requires
looking beyond simple categorizations.

## Human Analogue Machine


The machine learning systems we have built today that can reconstruct
human text, or human classification of images, necessarily must have
some aspects to them that are analogous to our understanding. As MacKay
suggests, the brain is neither a digital nor an analogue computer, and the
same can be said of the modern neural network systems that are being
tagged as “artificial intelligence”.

I believe a better term for them is “human-analogue machines”, because
what we have built is not a system that can make intelligent decisions
from first principles (a rational approach) but one that observes how
humans have made decisions through our data and reconstructs that
process. Machine learning is more empiricist than rational, but now we
have an empirical approach that distils our evolved intelligence.

HAMs are not representing states of the outside world with analogous
states inside the machine; they are also not (directly) processing
digital states through logic gates to draw their conclusions (although
they are implemented on digital computers that do this to enable them to
update).

<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ai/human-analogue-machine.png" style="width:60%">

Figure: <i>The human analogue machine creates a feature space which is
analogous to that we use to reason. One way of doing this is to have a
machine attempt to compress all human generated text in an
auto-regressive manner.</i>

### Heider and Simmel (1944)


In [None]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('8FIEZXMUM2I')

Figure: <i>Fritz Heider and Marianne Simmel’s video of shapes from
Heider and Simmel (1944).</i>

[Fritz Heider](https://en.wikipedia.org/wiki/Fritz_Heider) and [Marianne
Simmel](https://en.wikipedia.org/wiki/Marianne_Simmel)’s experiments
with animated shapes from 1944 (Heider and Simmel, 1944). Our
interpretation of these objects as showing motives and even emotion is a
combination of our desire for narrative, a need for understanding of
each other, and our ability to empathize. At one level, these are
crudely drawn objects, but in another way, the animator has communicated
a story through simple facets such as their relative motions, their
sizes and their actions. We apply our psychological representations to
these faceless shapes to interpret their actions (Heider, 1958).

See also a recent review paper on Human Cooperation by Henrich and
Muthukrishna (2021). See Lawrence (2024) psychological representation
p. 326–329, 344–345, 353, 361, 367.

The perils of developing this capability include counterfeit people, a
notion that the philosopher [Daniel Dennett has described in *The
Atlantic*](https://www.theatlantic.com/technology/archive/2023/05/problem-counterfeit-people/674075/).
This is where computers can represent themselves as human and fool
people into doing things on that basis.

See Lawrence (2024) human-analogue machine p. 343–5, 346–7, 358–9,
365–8.

## LLM Conversations


<img src="https://inverseprobability.com/talks/../slides/diagrams//ai/anne-llm-conversation.svg" class="" width="80%" style="vertical-align:middle;">

Figure: <i>The focus so far has been on reducing uncertainty to a few
representative values and sharing numbers with human beings. We forget
that most people can be confused by basic probabilities, for example the
prosecutor’s fallacy.</i>

In [None]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('0sJjdxn5kcI')

Figure: <i>The Inner Monologue paper suggests using LLMs for robotic
planning (Huang et al., 2023).</i>

By interacting directly with machines that have an understanding of
human cultural context, it should be possible to share the nature of
uncertainty in the same way humans do. See for example the paper [Inner
Monologue: Embodied Reasoning through
Planning](https://innermonologue.github.io/) Huang et al. (2023).

## Intellectual Debt


<img class="" src="https://inverseprobability.com/talks/../slides/diagrams//ai/2020-02-12-intellectual-debt.png" style="width:70%">

Figure: <i>Jonathan Zittrain’s term to describe the challenges of
explanation that come with AI is Intellectual Debt.</i>

In the context of machine learning and complex systems, Jonathan
Zittrain has coined the term [“Intellectual
Debt”](https://medium.com/berkman-klein-center/from-technical-debt-to-intellectual-debt-in-ai-e05ac56a502c)
to describe the challenge of understanding what you’ve created. In [the
ML@CL group we’ve been focussing on developing the notion of a
*data-oriented
architecture*](https://mlatcl.github.io/projects/data-oriented-architectures-for-ai-based-systems.html)
to deal with intellectual debt (Cabrera et al., 2023).

Zittrain points out the challenge around the lack of interpretability of
individual ML models as the origin of intellectual debt. In machine
learning I refer to work in this area as fairness, interpretability and
transparency or FIT models. To an extent I agree with Zittrain, but if
we understand the context and purpose of the decision making, I believe
this is readily put right by the correct monitoring and retraining
regime around the model, a concept I refer to as “progression testing”.
Indeed, the best teams do this at the moment, and a failure to do it
feels more a matter of technical debt than intellectual debt,
because arguably it is a maintenance task rather than an explanation
task. After all, we have good statistical tools for interpreting
individual models and decisions when we have the context. We can
linearise around the operating point, we can perform counterfactual
tests on the model. We can build empirical validation sets that explore
fairness or accuracy of the model.
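As a rough sketch of two of these tools, the example below linearises a model around an operating point with finite differences and runs a simple counterfactual test. The `model` function is a hypothetical stand-in (a logistic score with made-up weights), and `linearise` and `counterfactual` are illustrative names, not part of any particular library.

```python
import numpy as np

def model(x):
    """Hypothetical stand-in for a trained model: a logistic score."""
    weights = np.array([1.5, -2.0, 0.5])  # made-up weights
    return 1.0 / (1.0 + np.exp(-(x @ weights)))

def linearise(f, x0, eps=1e-5):
    """Central finite-difference gradient of f around operating point x0."""
    grad = np.zeros_like(x0)
    for i in range(len(x0)):
        step = np.zeros_like(x0)
        step[i] = eps
        grad[i] = (f(x0 + step) - f(x0 - step)) / (2 * eps)
    return grad

def counterfactual(f, x0, feature, new_value):
    """Change in output when one feature is set to a different value."""
    x_cf = x0.copy()
    x_cf[feature] = new_value
    return f(x_cf) - f(x0)

x0 = np.array([0.2, 0.1, 1.0])           # operating point
sensitivities = linearise(model, x0)      # local linear explanation
effect = counterfactual(model, x0, feature=1, new_value=1.0)
```

The sensitivities describe how the output responds to small changes near the operating point, while the counterfactual asks what the model would have decided had one input been different.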

See Lawrence (2024) intellectual debt p. 84, 85, 349, 365.

But if we can avoid the pitfalls of counterfeit people, this also offers
us an opportunity to *psychologically represent* (Heider, 1958) the
machine in a manner where humans can communicate without special
training. This in turn offers the opportunity to overcome the challenge
of *intellectual debt*.

Despite the lack of interpretability of machine learning models, they
allow us access to what the machine is doing in a way that bypasses many
of the traditional techniques developed in statistics. But understanding
this new route for access is a major new challenge.

## HAM


The Human-Analogue Machine or HAM therefore provides a route through
which we could better understand our world through improving the way we
interact with machines.

<img src="https://inverseprobability.com/talks/../slides/diagrams//data-science/new-flow-of-information004.svg" class="" width="70%" style="vertical-align:middle;">

Figure: <i>The trinity of human, data, and computer, highlighting the
modern phenomenon. The communication channel between computer and data
now has an extremely high bandwidth. The channel between human and
computer and the channel between data and human are narrow. There is a new
direction of information flow: information is reaching us mediated by the
computer. The focus on classical statistics reflected the importance of
the direct communication between human and data. The modern challenges
of data science emerge when that relationship is being mediated by the
machine.</i>

The HAM can provide an interface between the digital computer and the
human allowing humans to work closely with computers regardless of their
understanding of the more technical parts of software engineering.

<img src="https://inverseprobability.com/talks/../slides/diagrams//data-science/new-flow-of-information-ham.svg" class="" width="70%" style="vertical-align:middle;">

Figure: <i>The HAM now sits between us and the traditional digital
computer.</i>

Of course this route provides new routes for manipulation, new ways in
which the machine can undermine our autonomy or exploit our cognitive
foibles. The major challenge we face is steering between these worlds
where we gain the advantage of the computer’s bandwidth without
undermining our culture and individual autonomy.

See Lawrence (2024) human-analogue machine (HAMs) p. 343-347, 358-359,
365-368.

## Networked Interactions

Our modern society intertwines the machine with human interactions. The
key question is who has control over these interfaces between humans and
machines.

<img src="https://inverseprobability.com/talks/../slides/diagrams//ai/human-computers-interacting.svg" class="" width="80%" style="vertical-align:middle;">

Figure: <i>Humans and computers interacting should be a major focus of
our research and engineering efforts.</i>

So the real challenge that we face for society is understanding which
systemic interventions will encourage the right interactions between the
humans and the machine at all of these interfaces.

## Thanks!

For more information on these subjects and more you might want to check
the following resources.

-   book: [The Atomic
    Human](https://www.penguin.co.uk/books/455130/the-atomic-human-by-lawrence-neil-d/9780241625248)
-   twitter: [@lawrennd](https://twitter.com/lawrennd)
-   podcast: [The Talking Machines](http://thetalkingmachines.com)
-   newspaper: [Guardian Profile
    Page](http://www.theguardian.com/profile/neil-lawrence)
-   blog:
    [http://inverseprobability.com](http://inverseprobability.com/blog.html)

## References

Andrade-Pacheco, R., Mubangizi, M., Quinn, J., Lawrence, N.D., 2014.
Consistent mapping of government malaria records across a changing
territory delimitation. Malaria Journal 13.
<https://doi.org/10.1186/1475-2875-13-S1-P5>

Cabrera, C., Paleyes, A., Thodoroff, P., Lawrence, N.D., 2023.
[Real-world machine learning systems: A survey from a data-oriented
architecture perspective](https://arxiv.org/abs/2302.04810).

Felin, T., Koenderink, J., Krueger, J.I., Noble, D., Ellis, G.F.R.,
2021. The data-hypothesis relationship. Genome Biology 22.
<https://doi.org/10.1186/s13059-021-02276-4>

Gething, P.W., Noor, A.M., Gikandi, P.W., Ogara, E.A.A., Hay, S.I.,
Nixon, M.S., Snow, R.W., Atkinson, P.M., 2006. Improving imperfect data
from health management information systems in Africa using space–time
geostatistics. PLoS Medicine 3.
<https://doi.org/10.1371/journal.pmed.0030271>

Heider, F., 1958. The psychology of interpersonal relations. John Wiley.

Heider, F., Simmel, M., 1944. An experimental study of apparent
behavior. The American Journal of Psychology 57, 243–259.
<https://doi.org/10.2307/1416950>

Henrich, J., Muthukrishna, M., 2021. The origins and psychology of human
cooperation. Annual Review of Psychology 72, 207–240.
<https://doi.org/10.1146/annurev-psych-081920-042106>

Huang, W., Xia, F., Xiao, T., Chan, H., Liang, J., Florence, P., Zeng,
A., Tompson, J., Mordatch, I., Chebotar, Y., Sermanet, P., Jackson, T.,
Brown, N., Luu, L., Levine, S., Hausman, K., Ichter, B., 2023. [Inner
monologue: Embodied reasoning through planning with language
models](https://proceedings.mlr.press/v205/huang23c.html), in: Liu, K.,
Kulic, D., Ichnowski, J. (Eds.), Proceedings of the 6th Conference on
Robot Learning, Proceedings of Machine Learning Research. PMLR, pp.
1769–1782.

Lawrence, N.D., 2024. [The atomic human: Understanding ourselves in the
age of
AI](https://www.penguin.co.uk/books/455130/the-atomic-human-by-lawrence-neil-d/9780241625248).
Allen Lane.

Lawrence, N.D., 2015. [How Africa can benefit from the data
revolution](https://www.theguardian.com/media-network/2015/aug/25/africa-benefit-data-science-information).

Lawrence, N.D., 2010. Introduction to learning and inference in
computational systems biology.

MacKay, D.M., 1991. Behind the eye. Basil Blackwell.

Mubangizi, M., Andrade-Pacheco, R., Smith, M.T., Quinn, J., Lawrence,
N.D., 2014. Malaria surveillance with multiple data sources using
Gaussian process models, in: 1st International Conference on the Use of
Mobile ICT in Africa.

Simons, D.J., Chabris, C.F., 1999. Gorillas in our midst: Sustained
inattentional blindness for dynamic events. Perception 28, 1059–1074.
<https://doi.org/10.1068/p281059>

The Admiralty, 1945. [The gunnery pocket book, B.R.
224/45](https://www.maritime.org/doc/br224/).

Wiener, N., 1948. Cybernetics: Control and communication in the animal
and the machine. MIT Press, Cambridge, MA.

Yanai, I., Lercher, M., 2020. A hypothesis is a liability. Genome
Biology 21.