# First time with ``giotto-tda`` Mapper

In [None]:
from IPython.display import SVG, display

import numpy as np
from sklearn.cluster import DBSCAN

# Data viz
from gtda.plotting import plot_point_cloud

from gtda.mapper import (
    Projection,
    CubicalCover,
    make_mapper_pipeline,
    MapperInteractivePlotter
    )

Create 1000 points sampled uniformly at random from the unit cube in 5 dimensions:

In [None]:
X = np.random.random((1000, 5))
plotly_params = {"trace": {"marker": {"size": 2}}}
plot_point_cloud(X, plotly_params=plotly_params)

## Define a `MapperPipeline`

In [None]:
display(SVG("https://giotto-ai.github.io/gtda-docs/latest/_images/mapper_pipeline.svg"))

1. **Filter function**: projection onto the first 2 coordinates
2. **Covering scheme**: a "uniform" cover by rectangles
3. **Clusterer**: ``DBSCAN`` with default values
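
To make the three steps above concrete, here is a minimal, dependency-free sketch of a Mapper-like construction on a toy dataset. It is not how ``giotto-tda`` is implemented: `interval_cover`, `gap_clusters` and `toy_mapper` are hypothetical helpers, the filter projects onto the first coordinate only, and a simple gap-based rule stands in for ``DBSCAN``.

```python
from itertools import combinations


def interval_cover(values, n_intervals, overlap_frac):
    """Cover [min, max] with equally sized intervals, each widened to overlap."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    pad = 0.5 * overlap_frac * width  # extension on each side (an assumption)
    return [(lo + k * width - pad, lo + (k + 1) * width + pad)
            for k in range(n_intervals)]


def gap_clusters(indices, coord, max_gap):
    """Toy clusterer: sort by one coordinate, split where the gap is too large."""
    ordered = sorted(indices, key=lambda i: coord[i])
    clusters = []
    for i in ordered:
        if clusters and coord[i] - coord[clusters[-1][-1]] <= max_gap:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters


def toy_mapper(points, n_intervals, overlap_frac, max_gap):
    xs = [p[0] for p in points]  # 1. filter: project onto the first coordinate
    ys = [p[1] for p in points]
    nodes = []
    for a, b in interval_cover(xs, n_intervals, overlap_frac):  # 2. cover
        pullback = [i for i, x in enumerate(xs) if a <= x <= b]
        # 3. cluster each pullback set separately; every cluster becomes a node
        nodes.extend(set(c) for c in gap_clusters(pullback, ys, max_gap))
    # Two nodes are joined when they share data points; weight = number shared
    edges = {(u, v): len(nodes[u] & nodes[v])
             for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges


# Two parallel strips of points at y = 0 and y = 1
points = [(x / 10, y) for x in range(11) for y in (0.0, 1.0)]
nodes, edges = toy_mapper(points, n_intervals=3, overlap_frac=0.5, max_gap=0.5)
print(len(nodes), "nodes,", len(edges), "edges")  # 6 nodes, 4 edges
```

Each of the three intervals yields two clusters (one per strip), and clusters from neighbouring intervals are linked because the overlap makes them share points. The result is two separate chains, mirroring the two strips in the data.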

In [None]:
# Filter function -- project on first two dimensions
filter_func = Projection(columns=[0, 1])

# Covering scheme -- uniform rectangular cover with 10 intervals per dimension
cover = CubicalCover(n_intervals=10,
                     overlap_frac=0.3)

# Clustering scheme -- DBSCAN from sklearn
clusterer = DBSCAN()

pipeline = make_mapper_pipeline(
    filter_func=filter_func,
    cover=cover,
    clusterer=clusterer,
#     store_edge_elements=True
    )
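
As a rough illustration of what a cover with `n_intervals=10` and `overlap_frac=0.3` looks like, the sketch below builds 10 overlapping intervals on [0, 1] and takes their product to get a grid of rectangles in the 2D filter space. The helper `interval_cover` and the way intervals are widened are assumptions made for illustration; the exact interval endpoints computed by ``CubicalCover`` may differ.

```python
from itertools import product


def interval_cover(lo, hi, n_intervals, overlap_frac):
    """Equal-width intervals on [lo, hi], each widened so neighbours overlap."""
    width = (hi - lo) / n_intervals
    pad = 0.5 * overlap_frac * width  # illustrative choice, not gtda's formula
    return [(lo + k * width - pad, lo + (k + 1) * width + pad)
            for k in range(n_intervals)]


intervals = interval_cover(0.0, 1.0, n_intervals=10, overlap_frac=0.3)

# A rectangle is a pair of intervals, one per filter dimension
rectangles = list(product(intervals, repeat=2))
print(len(rectangles))  # 100


def containing(point):
    """Return all rectangles that contain the given 2D filter value."""
    return [r for r in rectangles
            if all(a <= c <= b for c, (a, b) in zip(point, r))]


print(len(containing((0.05, 0.05))))  # 1: interior of a single rectangle
print(len(containing((0.1, 0.1))))    # 4: in the overlap along both axes
```

Points that fall in an overlap region belong to several rectangles, which is what allows clusters from different cover sets to share data points and be joined by an edge.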

## Fit-transform the pipeline on the data to obtain an `igraph.Graph` object

In [None]:
graph = pipeline.fit_transform(X)
graph

Node and edge information are stored in the ``vs`` and ``es`` attributes of the ``igraph.Graph`` object.

In [None]:
nodes = graph.vs
edges = graph.es

n_nodes = len(nodes)
n_edges = len(edges)
print(f"There are {n_nodes} nodes and {n_edges} edges.")

Nodes and edges contain metadata stored as attributes:

In [None]:
print(f"Available node attributes: {nodes.attributes()}")
print(f"Available edge attributes: {edges.attributes()}")

[*Note*: by passing ``store_edge_elements=True`` as an additional argument to ``make_mapper_pipeline``, we could store the intersections between nodes as an additional edge attribute ``'edge_elements'``. Just uncomment the corresponding argument in the call to ``make_mapper_pipeline`` above.]

We can retrieve an individual node (or edge) by its global index in the graph. For example, the first one:

In [None]:
i = 0
print(f"Node {i} has the following attributes:\n{nodes[i].attributes()}\n")
print(f"This node comes from pullback cover set {nodes[i]['pullback_set_label']} and represents {len(nodes[i]['node_elements'])} data points.")

In [None]:
j = 0
print(f"Edge {j} has the following attributes:\n{edges[j].attributes()}\n")
print(f"This edge has weight {edges[j]['weight']}, i.e. the two nodes it connects share {edges[j]['weight']} data points.")
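
The relationship between ``'node_elements'``, ``'weight'`` and the optional ``'edge_elements'`` can be sketched without any library. The dictionary below is a hypothetical stand-in for the ``'node_elements'`` attribute of three nodes, and `edge_info` is an illustrative helper, not part of ``giotto-tda``.

```python
# Hypothetical stand-in: node index -> indices of the data points it contains
node_elements = {
    0: [0, 1, 2, 3],
    1: [2, 3, 4],
    2: [5, 6],
}


def edge_info(a, b):
    """Describe the would-be edge between nodes a and b."""
    shared = sorted(set(node_elements[a]) & set(node_elements[b]))
    # 'weight' is the size of the intersection, 'edge_elements' its contents
    return {"weight": len(shared), "edge_elements": shared}


print(edge_info(0, 1))  # {'weight': 2, 'edge_elements': [2, 3]}
print(edge_info(0, 2))  # {'weight': 0, 'edge_elements': []}
```

A pair of nodes with an empty intersection, like nodes 0 and 2 here, would simply not be connected in the Mapper graph.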

## Plot the resulting Mapper graph *interactively*!

In [None]:
plotter = MapperInteractivePlotter(pipeline, X)
plotter.plot()

With this API, you can inspect the *current* state of the objects you changed interactively. It is exposed through attributes whose names end with an underscore:

In [None]:
print("Attributes:", [attr for attr in dir(plotter) if attr.endswith("_") and attr[0] != "_"])
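
The filter in the cell above keeps public names that end with an underscore, the naming convention scikit-learn style estimators use for state computed from data. The sketch below shows why both conditions are needed, using a hypothetical `ToyPlotter` class whose attributes are made up for illustration and do not reflect the full set exposed by ``MapperInteractivePlotter``.

```python
class ToyPlotter:
    """Hypothetical stand-in, not the giotto-tda class."""

    def __init__(self):
        self.n_intervals = 10               # ordinary parameter: no underscore
        self._cache = {}                    # private: leading underscore
        self.graph_ = "current graph"       # state: trailing underscore
        self.pipeline_ = "current pipeline"


toy = ToyPlotter()

# endswith("_") alone would also match dunder names such as __init__,
# so names starting with an underscore are excluded as well.
state = [attr for attr in dir(toy)
         if attr.endswith("_") and not attr.startswith("_")]
print(state)  # ['graph_', 'pipeline_']
```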

Now try running the cell below, then change a parameter in the widget above so that the graph changes, and run the cell below again!

In [None]:
current_graph = plotter.graph_
n_nodes = len(current_graph.vs)
print(f"There are {n_nodes} nodes in the currently displayed graph!")