# Classifying hand-written digits

In this tutorial, we will try to classify hand-written digits using the tools seen in previous chapters.

## Loading packages


In [None]:
using MLDatasets
using Images, Makie, CairoMakie
using Distances
using Ripserer, PersistenceDiagrams
using StatsBase: mean

## The dataset

MNIST is a dataset consisting of 70,000 hand-written digits (60,000 for training and 10,000 for testing). Each digit is a 28x28 grayscale image, that is, a 28x28 matrix of values between 0 and 1. To get the training split, run


In [None]:
train_x, train_y = MNIST(split=:train)[:];

If the console asks you to download some data, just press `y`.

Next, we transpose the digits and plot some of them in a mosaic:


In [None]:
# store all digits in a figs variable
figs = [train_x[:, :, i]' |> Matrix for i ∈ 1:size(train_x)[3]]

In [None]:
n = 10

# convert to Gray so we can plot a mosaic
figs_plot = [fig .|> Gray for fig in figs[1:n^2]]
mosaicview(figs_plot, nrow = n, rowmajor = true)

## Preparing the topology weapons

What topological tools can be useful to distinguish between different numbers?

Persistent homology alone won't be of much help. All digits are connected, so the 0-dimensional persistence is useless. For the 1-dimensional persistence,

- 0, 2 (not always), 6, 8 and 9 contain holes;
- 1, 3, 4, 5, 7 do not contain holes.

What if we start chopping the digits with sublevel sets of some function? Two functions come readily to mind: the density and the eccentricity. The eccentricity is able to detect the "edges" of our figures, and the density will see where many points are close together.
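To make the idea of "chopping with sublevels" concrete, here is a minimal sketch in plain Julia. It assumes the digit is stored as a matrix with one point per row and that some function has already assigned a value to each point. The helper name `sublevel` and the numbers below are made up for illustration only.

```julia
# Hypothetical helper: keep the points whose function value is at most `t`.
sublevel(pts::AbstractMatrix, vals::AbstractVector, t) = pts[vals .<= t, :]

pts  = [1 1; 1 2; 2 2; 3 3]     # one point per row
vals = [0.2, 0.5, 0.9, 0.4]     # made-up function values, one per point

# Sublevel sets grow as the threshold increases.
small = sublevel(pts, vals, 0.4)
large = sublevel(pts, vals, 0.9)
```

Computing persistence on a growing family of such subsets, rather than on the whole digit at once, is what lets a function like the density or the eccentricity contribute information.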

### From matrices to points in the plane

In order to calculate the eccentricity, we need to convert the digits to points in $\mathbb{R}^2$. A simple function can do that:


In [None]:
function img_to_points(img, threshold = 0.5)
    ids = findall(x -> x >= threshold, img)
    pts = getindex.(ids, [1 2])
    pts
end

Notice that we had to define a threshold: pixels with values below the threshold are discarded.
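A small check on a toy matrix may help to see what the function returns. The 3x3 "image" below is invented for illustration; the function is repeated only so that the snippet runs on its own.

```julia
# Same function as above, repeated so the example is self-contained.
function img_to_points(img, threshold = 0.5)
    ids = findall(x -> x >= threshold, img)
    getindex.(ids, [1 2])
end

# Made-up 3x3 "image": only two pixels reach the default threshold of 0.5.
toy = [0.0 0.9 0.0;
       0.0 0.2 0.0;
       0.7 0.0 0.0]

pts = img_to_points(toy)            # one row per kept pixel: (row, column)
pts_low = img_to_points(toy, 0.1)   # a lower threshold keeps the 0.2 pixel too
```

Each row of the result holds the (row, column) indices of one pixel that passed the threshold, so lowering the threshold can only add rows.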

Let's also define a function to plot the digit in $\mathbb{R}^2$:


In [None]:
function plot_digit(pt, values = :black)
    f = Figure()
    ax = Makie.Axis(f[1, 1], autolimitaspect = 1)
    sc = scatter!(ax, pt; markersize = 40, marker = :rect, color = values)
    # with numeric values, add a colorbar tied to the scatter's color range
    if values isa Vector{<:Real}
        Colorbar(f[1, 2], sc)
    end
    f
end

We can see that it works as expected:


In [None]:
pt = img_to_points(figs[1]');
plot_digit(pt)

The eccentricity of a point in a metric space $(X, d)$ measures how far that point is from the "center". It is defined as follows for each $x \in X$:

$$
e(x) = \sum_{y \in X} \frac{d(x, y)}{N}
$$

where $N$ is the number of points in $X$.
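As a sanity check of the formula, here is a minimal sketch that computes it directly with the Euclidean distance. The name `eccentricity_sketch` is hypothetical, and the double loop is written for readability rather than speed; a pairwise distance matrix from Distances.jl would likely be a better fit for full-size digits.

```julia
using LinearAlgebra: norm

# Mean Euclidean distance from each point (a row of `pts`) to every point,
# following e(x) = sum over y of d(x, y) / N. Hypothetical helper name.
function eccentricity_sketch(pts::AbstractMatrix{<:Real})
    N = size(pts, 1)
    [sum(norm(pts[i, :] - pts[j, :]) for j in 1:N) / N for i in 1:N]
end

# Three collinear points: the middle one should be the most "central".
pts = [0 0; 1 0; 2 0]
e = eccentricity_sketch(pts)
```

In this toy example the two endpoints get the same value and the middle point gets the smallest one, which matches the intuition that low eccentricity means "close to the center".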
