# Mapper

In this notebook, we will explore the Mapper algorithm for graph-based dimension reduction and data visualization. A nice implementation of the Mapper algorithm is built into the `giotto-tda` library. The first part of the notebook is adapted from a tutorial in the `giotto-tda` [GitHub repository](https://github.com/giotto-ai/giotto-tda). As was mentioned last week, the repository has several other nice notebooks describing applications of TDA.

## Import libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from gtda.plotting import plot_point_cloud

from gtda.mapper import (
    CubicalCover,
    OneDimensionalCover,
    FirstSimpleGap,
    Eccentricity,
    Entropy,
    FirstHistogramGap,
    make_mapper_pipeline,
    Projection,
    plot_static_mapper_graph,
    plot_interactive_mapper_graph
)

from sklearn import datasets
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA

## Toy Data

We'll start with a basic example, two noisy concentric circles in the plane, which will allow us to easily interpret the results of the Mapper algorithm.

In [None]:
data, _ = datasets.make_circles(n_samples=5000, noise=0.05, factor=0.3, random_state=42)

plot_point_cloud(data)

## The Mapper Algorithm

Recall that a Mapper visualization of a finite metric space $(X,d_X)$ (such as a point cloud in $\mathbb{R}^n$) requires several parameters:

- A reference topological space $Y$ --- typically $\mathbb{R}$

- A function $f:X \to Y$ --- e.g., a data-driven function, a density estimate, eccentricity, etc.

- An open cover of $Y$ --- we'll generally pick from a built-in family of covers of $\mathbb{R}$

- A clustering algorithm --- our definition in class always used clusters coming from a Vietoris-Rips complex at a chosen scale $\epsilon > 0$; as was mentioned in class, we can choose from other clustering algorithms as well.

For each set $V_i$ in the open cover of $Y$, we pull back to an open set $U_i = f^{-1}(V_i)$ in $X$. The open set is divided into clusters $U_i^{(1)},\ldots,U_i^{(n_i)}$ via the clustering algorithm. The Mapper graph is the 1-skeleton of the *nerve* of the collection $\{U_i^{(j)}\}$: it has one node per cluster, and an edge between two nodes whenever the corresponding clusters share a data point.
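
The recipe above can be sketched from scratch in a few lines of `numpy`. This is only an illustrative toy, not how `giotto-tda` is implemented: the helper names `interval_cover`, `rips_clusters` and `toy_mapper` are hypothetical, the cover uses closed intervals of equal length, and clustering is done by taking connected components of the graph joining points at distance at most `eps` (the Vietoris-Rips-style clustering from class).

```python
import numpy as np
from itertools import combinations


def interval_cover(lo, hi, n_intervals, overlap_frac):
    """Cover [lo, hi] by equal-length intervals; neighbours share overlap_frac of their length."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap_frac)
    step = length * (1 - overlap_frac)
    intervals = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    intervals[-1] = (intervals[-1][0], hi)  # guard against floating-point round-off
    return intervals


def rips_clusters(points, eps):
    """Label the connected components of the graph joining points at distance <= eps."""
    n = len(points)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search through the eps-neighbourhood graph
            i = stack.pop()
            for j in np.flatnonzero((dists[i] <= eps) & (labels == -1)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels


def toy_mapper(X, filter_values, n_intervals, overlap_frac, eps):
    """Return Mapper nodes (sets of row indices of X) and edges (pairs of node indices)."""
    nodes = []
    cover = interval_cover(filter_values.min(), filter_values.max(), n_intervals, overlap_frac)
    for a, b in cover:
        idx = np.flatnonzero((filter_values >= a) & (filter_values <= b))  # pullback of [a, b]
        if idx.size == 0:
            continue
        labels = rips_clusters(X[idx], eps)
        for lab in np.unique(labels):
            nodes.append(set(idx[labels == lab].tolist()))
    # Nerve 1-skeleton: join two clusters whenever they share a data point.
    edges = [(i, j) for i, j in combinations(range(len(nodes)), 2) if nodes[i] & nodes[j]]
    return nodes, edges


# A single noiseless circle, filtered by its second coordinate.
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
nodes, edges = toy_mapper(circle, circle[:, 1], n_intervals=5, overlap_frac=0.3, eps=0.2)
print(len(nodes), "nodes,", len(edges), "edges")
```

For this circle, the top and bottom intervals each contribute one arc while the middle intervals contribute a left and a right arc, so the resulting graph should itself be a cycle.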

Let's choose various parameters now.

#### Filter Function

We can choose any `scikit-learn` [transformer](https://scikit-learn.org/stable/data_transforms.html) as the filter function. There are also some filter functions built into `giotto-tda`; see the [documentation](https://giotto-ai.github.io/gtda-docs/latest/modules/mapper.html).

Here we are using a function $f:X \to \mathbb{R}$ which projects onto the second coordinate. 

In [None]:
filter_func = Projection(columns=[1])

#### Cover

We'll choose from a built-in family of covers. Here we are just choosing a number of intervals and how much they overlap (similar to the picture we drew in lecture).

In [None]:
cover = OneDimensionalCover(n_intervals=10, overlap_frac=0.3)
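
To make the two cover parameters concrete, here is a short worked calculation under one natural convention. This is an assumption for illustration only; the exact construction in `giotto-tda` may differ, for instance in how the outermost intervals are treated. Suppose the filter values lie in $[a,b]$ and we want $n$ intervals of common length $\ell$, where consecutive intervals share a fraction $p$ of their length. Each interval after the first covers $(1-p)\ell$ of new ground, so

$$
\ell + (n-1)(1-p)\,\ell = b - a
\quad\Longrightarrow\quad
\ell = \frac{b-a}{n-(n-1)p}.
$$

With $n = 10$ and $p = 0.3$ this gives $\ell = (b-a)/7.3 \approx 0.137\,(b-a)$, and consecutive left endpoints are spaced $(1-p)\ell \approx 0.096\,(b-a)$ apart.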

#### Clustering

The default clustering algorithm is `DBSCAN`, a density-based method that is more refined than the Vietoris-Rips-based clustering described in class.

In [None]:
clusterer = DBSCAN()

#### Mapper Pipeline

Now we create a Mapper pipeline which takes in all of these parameters. There is also an `n_jobs` parameter, in case we wanted to run some things in parallel.

In [None]:
pipe = make_mapper_pipeline(
    filter_func=filter_func,
    cover=cover,
    clusterer=clusterer,
    verbose=False,
    n_jobs=1,
)

## Visualize the Mapper graph

Here is a plot of the resulting Mapper graph with default options.

In [None]:
fig = plot_static_mapper_graph(pipe, data)
fig.show(config={'scrollZoom': True})

Note that these choices of parameters aren't capturing the correct topology. The Mapper graph *is* observing the two holes in the data, but it's not seeing the two connected components. Let's play around with parameters to see if we can get a Mapper graph that looks closer to the original data.

First we can change around the number of intervals and overlap fraction.

In [None]:
filter_func = Projection(columns=[1])
cover = OneDimensionalCover(n_intervals=20, overlap_frac=0.1)
clusterer = DBSCAN()

pipe = make_mapper_pipeline(
    filter_func=filter_func,
    cover=cover,
    clusterer=clusterer,
    verbose=False,
    n_jobs=1,
)

fig = plot_static_mapper_graph(pipe, data)
fig.show(config={'scrollZoom': True})

I couldn't find any choice of these two parameters that recovers the two connected components.

We can also change the clustering algorithm. Here we use the `FirstSimpleGap` clusterer. This uses the Vietoris-Rips construction, but automatically chooses $\epsilon$ to be the height of the first 'significant gap' in the hierarchical clustering dendrogram.

In [None]:
filter_func = Projection(columns=[1])
cover = OneDimensionalCover(n_intervals=30, overlap_frac=0.3)
clusterer = FirstSimpleGap()

pipe = make_mapper_pipeline(
    filter_func=filter_func,
    cover=cover,
    clusterer=clusterer,
    verbose=False,
    n_jobs=1,
)

fig = plot_static_mapper_graph(pipe, data)
fig.show(config={'scrollZoom': True})

This seems to give the correct result: two connected components, each forming a loop.
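
The idea of cutting a dendrogram at the first significant gap can be sketched as follows. This is a rough illustration under stated assumptions, not the library's code: the helper names are hypothetical, "significant" is taken to mean "larger than a fixed fraction of the largest merge height", and the parameter name `relative_gap_size` with its value `0.3` is chosen here for illustration (consult the `giotto-tda` documentation for the actual rule). The sketch uses the fact that single-linkage merge heights are exactly the sorted edge lengths of a minimum spanning tree.

```python
import numpy as np


def single_linkage_heights(points):
    """Sorted single-linkage merge heights, computed as MST edge lengths (Prim's algorithm)."""
    n = len(points)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dists[0].copy()  # cheapest known connection from each point to the tree
    heights = []
    for _ in range(n - 1):
        best[in_tree] = np.inf  # never re-add a point that is already in the tree
        j = int(np.argmin(best))
        heights.append(best[j])
        in_tree[j] = True
        best = np.minimum(best, dists[j])
    return np.sort(np.array(heights))


def first_gap_scale(heights, relative_gap_size=0.3):
    """Return a scale inside the first large gap between merge heights, or None if there is none."""
    threshold = relative_gap_size * heights[-1]
    big = np.flatnonzero(np.diff(heights) > threshold)
    if big.size == 0:
        return None  # no significant gap: treat everything as one cluster
    i = big[0]
    return 0.5 * (heights[i] + heights[i + 1])


# Two tight groups of points that are far apart from each other.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 0.0], [5.1, 0.0], [5.2, 0.0]])
heights = single_linkage_heights(pts)
eps = first_gap_scale(heights)
n_clusters = int(np.sum(heights > eps)) + 1  # merges above eps are not performed
print(eps, n_clusters)
```

The appeal of this kind of rule is that $\epsilon$ adapts to each pullback set separately, rather than being fixed once for the whole dataset.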

We can also change the filter function to, say, $p$-eccentricity (as introduced in class). This picks up connected components, but not cycles.

In [None]:
filter_func = Eccentricity(exponent = 2)
cover = OneDimensionalCover(n_intervals=30, overlap_frac=0.1)
clusterer = FirstSimpleGap()

pipe = make_mapper_pipeline(
    filter_func=filter_func,
    cover=cover,
    clusterer=clusterer,
    verbose=False,
    n_jobs=1,
)

fig = plot_static_mapper_graph(pipe, data)
fig.show(config={'scrollZoom': True})
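
For reference, one common definition of $p$-eccentricity is $E_p(x) = \left(\frac{1}{|X|}\sum_{y \in X} d_X(x,y)^p\right)^{1/p}$, with $E_\infty(x) = \max_{y} d_X(x,y)$. The sketch below computes this directly with `numpy`. The function name `eccentricity` is hypothetical, and normalization conventions vary (some references and libraries omit the factor $1/|X|$), so the values need not match those produced by `Eccentricity` in `giotto-tda`; rescaling by a constant does not change which points are more or less eccentric.

```python
import numpy as np


def eccentricity(points, exponent=2):
    """p-eccentricity of each row of `points` with respect to Euclidean distance."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    if np.isinf(exponent):
        return dists.max(axis=1)  # distance to the farthest point
    return np.mean(dists ** exponent, axis=1) ** (1.0 / exponent)


# Three points on a line: the middle point is the least eccentric.
line = np.array([[0.0], [1.0], [2.0]])
ecc_2 = eccentricity(line, exponent=2)
ecc_inf = eccentricity(line, exponent=np.inf)
print(ecc_2, ecc_inf)
```

Points near the "center" of the data get small values and outlying points get large values, which is why this filter tends to separate components without tracing out cycles.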

Finally, we could change the target space $Y$ *and* the function $f:X \to Y$. We'll use $Y = \mathbb{R}^2$ and $f$ the identity function---this is not realistic for high-dimensional data, but we can use it for this low-dimensional example.

Since we are changing the target space, we also need to change the type of cover. We'll cover $Y$ by squares with a prescribed overlap percentage.

In [None]:
filter_func = Projection(columns=[0,1])
cover = CubicalCover(n_intervals=10, overlap_frac=0.2)
clusterer = DBSCAN()

pipe = make_mapper_pipeline(
    filter_func=filter_func,
    cover=cover,
    clusterer=clusterer,
    verbose=False,
    n_jobs=1,
)

fig = plot_static_mapper_graph(pipe, data)
fig.show(config={'scrollZoom': True})

This produces a very good result, which makes sense since the filter function is the identity and therefore loses no information about the data.

## Visualization Parameters

There are lots of options built into `giotto-tda` for visualizing Mapper graphs. We'll explore some options here.

#### Node Coloring

In [None]:
plotly_params = {"node_trace": {"marker_colorscale": "Blues"}}
fig = plot_static_mapper_graph(
    pipe, data, color_by_columns_dropdown=True, plotly_params=plotly_params
)
fig.show(config={'scrollZoom': True})

#### Graph Layout

In [None]:
fig = plot_static_mapper_graph(
    pipe, data, layout="fruchterman_reingold", color_by_columns_dropdown=True
)
fig.show(config={'scrollZoom': True})

#### Plotting Dimension

In [None]:
fig = plot_static_mapper_graph(pipe, data, layout_dim=3, color_by_columns_dropdown=True)
fig.show(config={'scrollZoom': True})

#### Node scale

In [None]:
node_scale = 50
fig = plot_static_mapper_graph(pipe, data, layout_dim=3, node_scale=node_scale)
fig.show(config={'scrollZoom': True})

## Run the Mapper pipeline

Behind the scenes of ``plot_static_mapper_graph`` is a ``MapperPipeline`` object ``pipe`` that can be used like a typical ``scikit-learn`` estimator. For example, to extract the underlying graph data structure we can do the following:

In [None]:
graph = pipe.fit_transform(data)

The resulting graph is a [python-igraph](https://igraph.org/python/) object which stores node metadata in the form of attributes. We can access this data as follows:

In [None]:
graph.vs.attributes()

Here ``'pullback_set_label'`` and ``'partial_cluster_label'`` refer to the interval and cluster sets described above. ``'node_elements'`` refers to the indices of our original data that belong to each node. For example, to find which points belong to the first node of the graph we can access the desired data as follows:

In [None]:
node_id = 0
node_elements = graph.vs["node_elements"]

print(f"""
Node ID: {node_id}
Node elements: {node_elements[node_id]}
Data points: {data[node_elements[node_id]]}
""")
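
To see how node membership determines the graph, here is a small pure-Python sketch on made-up data. The list `toy_node_elements` is hypothetical and simply stands in for the `node_elements` attribute above: two nodes are joined by an edge exactly when they share at least one data point.

```python
from itertools import combinations

# Hypothetical node memberships: each entry lists the data indices in one node.
toy_node_elements = [[0, 1, 2], [2, 3], [4, 5], [5, 6, 0]]

# Join two nodes whenever their sets of data indices intersect.
toy_edges = [
    (i, j)
    for i, j in combinations(range(len(toy_node_elements)), 2)
    if set(toy_node_elements[i]) & set(toy_node_elements[j])
]
print(toy_edges)
```

On the real `graph` object, the edge list should be available via `graph.get_edgelist()` from `python-igraph`, so it can be compared against the intersections of `node_elements`.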

## Creating custom filter functions

We can create custom filter functions. These will act row-wise on the input data.

In [None]:
filter_func = np.sum

pipe = make_mapper_pipeline(
    filter_func=filter_func,
    cover=cover,
    clusterer=clusterer,
    verbose=False,
    n_jobs=1,
)

In [None]:
fig = plot_static_mapper_graph(pipe, data)
fig.show(config={'scrollZoom': True})
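
To make "row-wise" concrete, the following sketch applies two functions to each row of a tiny array by hand. It only illustrates the semantics; how `giotto-tda` wraps a callable internally is not shown here, and the function `radius` is a hypothetical example of a custom filter that is not part of the notebook above.

```python
import numpy as np


def radius(row):
    """Hypothetical custom filter: distance of a point from the origin."""
    return np.linalg.norm(row)


# Two points on a circle of radius 1 and two on a circle of radius 0.3.
toy = np.array([[1.0, 0.0], [0.0, -1.0], [0.3, 0.0], [0.0, 0.3]])

sum_values = np.apply_along_axis(np.sum, 1, toy)     # one value per row
radius_values = np.apply_along_axis(radius, 1, toy)  # one value per row
print(sum_values, radius_values)
```

For data like the concentric circles, a radial filter of this kind assigns the same value to every point on a given circle, whereas the coordinate sum does not distinguish the two circles as cleanly.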

## Interactive Visualization

A useful feature is live visualization, which can be used to tune parameters and see the result in real time.

In [None]:
pipe = make_mapper_pipeline()

# Generate interactive widget
plot_interactive_mapper_graph(pipe, data, color_by_columns_dropdown=True)

## Testing Mapper on Some Datasets

Next let's try to produce Mapper graphs for some real data. The `sklearn` package has several classic toy datasets that can be loaded. The available datasets can be found [here](https://scikit-learn.org/stable/datasets/toy_dataset.html).

We'll start with the `iris` dataset, which contains measurements of flowers from three species of iris.

The syntax for loading the data looks like this:

In [None]:
from sklearn.datasets import load_iris
iris = load_iris()

This loads a dictionary-like object, and we want to collect the data matrix and labels for analysis.

In [None]:
iris = load_iris()
iris

In [None]:
X = iris['data']
y = iris['target']

Looking at the shape of the data matrix, we see that there are 150 samples, with 4 features per sample.

In [None]:
X.shape

There are 3 classes, indexed by the labels in `y`. The classes are balanced, with 50 samples each.

In [None]:
print(y)
plt.hist(y)
plt.show()

Let's take a look at the Mapper graph and play with parameters. Note that we can color the nodes by ground truth label, rather than function value.

In [None]:
filter_func = Projection(columns = [0])
clusterer = FirstSimpleGap()

pipe = make_mapper_pipeline(filter_func = filter_func, clusterer = clusterer)
plot_interactive_mapper_graph(pipe, X, color_by_columns_dropdown=True, color_variable = y)

We can plot 2D projections of the data. We see that, for certain choices of parameters, the Mapper graph does seem to reflect the shape of the data.

In [None]:
proj_coord1 = 0
proj_coord2 = 1

plt.scatter(X[:,proj_coord1],X[:,proj_coord2],c = y)

## Group Work: Other Datasets

In breakout groups, try exploring some other datasets. You can either load one of the other `sklearn` datasets, or use a dataset of your own that you'd like to test this on.

Feel free to work independently in your groups and compare results, or all work together on a single dataset. I will bounce between rooms and try to help troubleshoot as necessary.

Probably the most familiar of the datasets is `digits`, which contains low-resolution (8×8 pixel) images of handwritten digits, similar in spirit to the famous MNIST dataset.

**Hint:** It can be useful to do some simple linear dimension reduction via PCA before applying the topological dimension reduction. This step is not necessary, but I had some luck with it.
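
As a starting point for the hint, here is a minimal sketch of the PCA step on the `digits` data using `scikit-learn`. The choice of 10 components is arbitrary and only for illustration, and the variable names are hypothetical; the reduced array `X_reduced` could then be passed to a Mapper pipeline in place of the raw data.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

digits = load_digits()
X_digits, y_digits = digits.data, digits.target  # 1797 samples, 64 pixel features

# Keep the first 10 principal components (an arbitrary choice worth experimenting with).
pca = PCA(n_components=10, random_state=0)
X_reduced = pca.fit_transform(X_digits)

print(X_reduced.shape)
print("fraction of variance kept:", pca.explained_variance_ratio_.sum())
```

Printing the retained variance is a quick sanity check on how much information the linear reduction discards before the topological step.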

## Extra Time: Exploring Neural Network Activations

There is a cool application of Mapper in [this paper](http://www.sci.utah.edu/~beiwang/publications/TopoAct_BeiWang_2021.pdf) by Rathore et al. They use Mapper graphs to visualize the structural changes that occur as data is passed through the layers of a trained neural network. The paper comes with an interactive demo, available [here](https://tdavislab.github.io/TopoAct/). If you have extra time, explore this webtool with your group---try to understand what the Mapper graphs represent and see if you can find any interesting features as you explore the neural network.