## Plotly plot of the kmapper graph for the Fashion MNIST data set ##

Fashion MNIST is a newer dataset provided to the ML community by Zalando Research. See details [here](https://github.com/zalandoresearch/fashion-mnist).

In [None]:
import numpy as np
import pandas as pd
import sklearn
import sklearn.cluster  # load the submodule explicitly so sklearn.cluster.DBSCAN resolves
import umap
import kmapper as km
from kmapper import plotlyviz as pl

In [None]:
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)

In [None]:
pl_brewer=[[0.0, '#a50026'],
           [0.1, '#d73027'],
           [0.2, '#f46d43'],
           [0.3, '#fdae61'],
           [0.4, '#fee08b'],
           [0.5, '#ffffbf'],
           [0.6, '#d9ef8b'],
           [0.7, '#a6d96a'],
           [0.8, '#66bd63'],
           [0.9, '#1a9850'],
           [1.0, '#006837']]
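
The list above follows the Plotly colorscale layout: pairs of a stop in [0, 1] and a color. As a minimal sketch, the same structure can be generated from a plain list of hex colors; the helper name `evenly_spaced_colorscale` is hypothetical and not part of Plotly or kmapper.

```python
def evenly_spaced_colorscale(hex_colors):
    """Pair each color with an evenly spaced stop in [0, 1]."""
    if len(hex_colors) < 2:
        raise ValueError("need at least two colors")
    last = len(hex_colors) - 1
    # round() avoids stops such as 0.30000000000000004
    return [[round(i / last, 4), color] for i, color in enumerate(hex_colors)]


# Same hex values as in pl_brewer above
hex_colors = ['#a50026', '#d73027', '#f46d43', '#fdae61', '#fee08b', '#ffffbf',
              '#d9ef8b', '#a6d96a', '#66bd63', '#1a9850', '#006837']
scale = evenly_spaced_colorscale(hex_colors)
print(scale[0], scale[5], scale[-1])
```

With eleven colors the stops come out as 0.0, 0.1, ..., 1.0, matching the hand-written list.
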

Download the file `fashion-mnist_test.csv` from [Kaggle](https://www.kaggle.com/residentmario/dimensionality-reduction-and-pca-for-fashion-mnist/data),
save it in your working directory and read it. It consists of 10,000 grayscale images of 28x28 pixels and their associated labels.

In [None]:
df = pd.read_csv('fashion-mnist_test.csv')
X = df.iloc[:, 1:].values   # pixel columns: one flattened 28x28 image per row
y = df.iloc[:, 0].values    # first column: integer class label, as a 1-D array
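
To make the layout concrete, here is a small sketch on synthetic data rather than the real CSV. It assumes, as the code above does, that the first column holds the label and the remaining 784 columns hold the pixels. The row-major pixel order used to rebuild an image is an assumption, and the names `table`, `X_demo` and `y_demo` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, side = 4, 28

# Synthetic stand-in for the CSV: label first, then side*side pixel values
labels = np.array([9, 2, 1, 1])
pixels = rng.integers(0, 256, size=(n_images, side * side))
table = np.column_stack([labels, pixels])

X_demo = table[:, 1:]   # feature matrix, one flattened image per row
y_demo = table[:, 0]    # 1-D label vector

# Rebuild the first image as a 2-D array (assumes row-major pixel order)
first_image = X_demo[0].reshape(side, side)
print(X_demo.shape, y_demo.shape, first_image.shape)
```
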

Define the dictionary that maps each numeric label to its fashion item (clothes, shoes, bags):

In [None]:
fashion_dict={0: 'T-shirt/top',
              1: 'Trouser',
              2: 'Pullover',
              3: 'Dress',
              4: 'Coat',
              5: 'Sandal',
              6: 'Shirt',
              7: 'Sneaker',
              8: 'Bag',
              9: 'Ankle boot'}

In [None]:
mapper = km.KeplerMapper(verbose=0)


projected_data = mapper.fit_transform(X, projection=umap.UMAP(n_neighbors=5,
                                                              n_components=2,
                                                              min_dist=0.1,
                                                              random_state=123
                                                            )) 

In [None]:
scomplex = mapper.map(projected_data,
                      clusterer=sklearn.cluster.DBSCAN(eps=0.15, min_samples=6),#0.1 15
                      coverer=km.Cover(23, 0.15))#20
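
The cover splits the range of the projected data into overlapping pieces, and clustering is then run inside each piece. The sketch below illustrates only the idea, in one dimension. It is not kmapper's implementation: `km.Cover` may compute its intervals differently in detail, and the helper names are hypothetical.

```python
import numpy as np


def overlapping_intervals(lo, hi, n_cubes, perc_overlap):
    """Split [lo, hi] into n_cubes equal intervals, each widened by perc_overlap of its width."""
    edges = np.linspace(lo, hi, n_cubes + 1)
    width = edges[1] - edges[0]
    pad = 0.5 * perc_overlap * width   # half of the extra width on each side
    return [(a - pad, b + pad) for a, b in zip(edges[:-1], edges[1:])]


def members(value, intervals):
    """Indices of all intervals that contain value."""
    return [i for i, (a, b) in enumerate(intervals) if a <= value <= b]


intervals = overlapping_intervals(0.0, 1.0, n_cubes=4, perc_overlap=0.2)
print(members(0.26, intervals))   # lies in the overlap of two neighbouring intervals
print(members(0.40, intervals))   # lies in a single interval
```

Points that fall into an overlap can end up in clusters of both neighbouring pieces, and such shared points are what create the edges of the Mapper graph.
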

In [None]:
color_function=projected_data[:,0]-projected_data[:,0].min()
kmgraph,  meta=mapper.visualize(scomplex, custom_tooltips=y,  color_function=color_function, path_html=None)
#Comment out the line above and uncomment the next one to get the original Kepler-Mapper graph
#html=mapper.visualize(scomplex,  color_function=color_function, path_html='fashion-mnist.html')
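
The color function holds one value per data point: the first projected coordinate, shifted so that its minimum is 0. The toy sketch below reproduces that shift and then averages the values over the members of each node. The averaging step is an assumption about how per-point values could be summarized per node, not a statement about kmapper internals, and `toy_nodes` is made-up data.

```python
import numpy as np

# Toy 2-D projection: one row per data point
lens = np.array([[ 2.0, 0.5],
                 [-1.0, 1.5],
                 [ 0.5, 3.0],
                 [ 4.0, 2.0]])

# Shift the first coordinate so that the smallest value becomes 0
color_values = lens[:, 0] - lens[:, 0].min()

# Hypothetical nodes, each given as a list of member row indices
toy_nodes = {'node_a': [0, 1], 'node_b': [1, 2, 3]}
node_color = {name: float(color_values[idx].mean())
              for name, idx in toy_nodes.items()}
print(color_values, node_color)
```
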

In [None]:
plotly_graph_data=pl.plotly_graph(kmgraph, graph_layout='fr', colorscale=pl_brewer,  
                                  reversescale=True, factor_size=2, edge_linewidth=0.5)
title=('Topological network representing the Fashion MNIST dataset,<br>'
       'via Kepler-Mapper, with UMAP as a filter function')
layout=pl.plot_layout(title=title,  width=800, height=800,
                      annotation_text=meta,  
                      bgcolor='rgba(0,0,0, 1)')

fig_network=dict(data=plotly_graph_data, layout=layout)
iplot(fig_network)

Some node tooltips are rendered as overly long rectangles. To avoid this inconvenience, pass `keep_kmtooltips=False` to the function
`pl.plotly_graph()`:

In [None]:
new_plotly_graph_data=pl.plotly_graph(kmgraph, graph_layout='fr', colorscale=pl_brewer,  keep_kmtooltips=False,
                                  reversescale=True, factor_size=2, edge_linewidth=0.5)
fig_network=dict(data=new_plotly_graph_data, layout=layout)
iplot(fig_network)

To keep the initial information displayed for each node, we count the number of labels of each type associated with a graph node and update the initial tooltips:

In [None]:
tooltips=list(new_plotly_graph_data[1]['text'])  # mutable copy, in case the trace stores the text as a tuple

Define custom tooltips that show how many items of each fashion type make up a cluster (node):

In [None]:
for j, node in enumerate(kmgraph['nodes']):
    # numeric labels of the data points that belong to this node
    member_label_ids=y[scomplex['nodes'][node['name']]]
    member_labels=[fashion_dict[label_id] for label_id in member_label_ids]
    # count how many members carry each fashion label
    f_type, f_number=np.unique(member_labels, return_counts=True)
    for m in range(len(f_number)):
        tooltips[j]+='<br>'+str(f_type[m])+': '+ str(f_number[m])

new_plotly_graph_data[1].update(text=tooltips)
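
The counting step can be tried in isolation. The sketch below uses a toy label vector and one hypothetical node (a list of row indices), and builds the same `<br>`-separated text with `np.unique(..., return_counts=True)`. The names `toy_y`, `toy_node_members` and `count_lines` are illustrative only.

```python
import numpy as np

toy_names = {0: 'T-shirt/top', 5: 'Sandal', 7: 'Sneaker'}
toy_y = np.array([7, 5, 7, 0, 7, 5])   # one numeric label per data point
toy_node_members = [0, 1, 2, 4]        # row indices belonging to one node


def count_lines(member_ids, labels, names):
    """Return 'name: count' entries for one node, joined by HTML line breaks."""
    member_names = [names[int(k)] for k in labels[member_ids]]
    # np.unique returns the distinct names in sorted order with their counts
    kinds, counts = np.unique(member_names, return_counts=True)
    return '<br>'.join(f'{kind}: {count}' for kind, count in zip(kinds, counts))


text = count_lines(toy_node_members, toy_y, toy_names)
print(text)   # Sandal: 1<br>Sneaker: 3
```
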

In [None]:
new_plotly_graph_data[1]['marker']['colorbar'].update(title='proj-data<br>x-coord')
fign=dict(data=new_plotly_graph_data, layout=layout)

iplot(fign)