# Sheet 7

Johannes van de Locht, Finn Kalvelage, Anna Beckers

## Task 1

## Task 2

### a) What is the difference between geometric zoom and detail zoom in the system?

Geometric zoom maps a part of the adjacency matrix at the current detail level to the viewport. The user navigates it by zooming in and out (changing the number of cells displayed in the viewport) or by translating horizontally and vertically. Geometric zoom is therefore continuous.

The detail zoom switches through the different detail levels and is therefore discrete.
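The distinction can be illustrated with a minimal sketch. It assumes a pyramid in which each coarser level merges 2x2 blocks of cells and keeps their maximum; the functions `aggregate` and `viewport` are hypothetical names, and the integer window only approximates the continuous geometric zoom of the real system.

```python
def aggregate(matrix):
    """Build the next coarser detail level by merging 2x2 blocks.

    Taking the maximum is an assumption made for this sketch.
    """
    n = len(matrix) // 2
    return [
        [
            max(
                matrix[2 * i][2 * j], matrix[2 * i][2 * j + 1],
                matrix[2 * i + 1][2 * j], matrix[2 * i + 1][2 * j + 1],
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


def viewport(matrix, row, col, size):
    """Geometric zoom and pan: cut a size x size window out of ONE level."""
    return [r[col:col + size] for r in matrix[row:row + size]]


level0 = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
level1 = aggregate(level0)            # detail zoom: jump to a coarser level

print(viewport(level0, 2, 2, 2))      # [[0, 1], [1, 0]]
print(level1)                         # [[1, 0], [0, 1]]
```

Changing `row`, `col` or `size` moves within one level, while switching from `level0` to `level1` is the discrete step.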

### b) The paper suggests eight different ways to visualize aggregated edge information. Which of them are actually used in the screenshots in Figures 1 and 9? 

Both images in Figure 1 use *step histograms* for cells that contain edges. The left image of Figure 9 uses the *average* visualization, and the right image appears to use *color shading*, although the color shows no visible differences between cells.

### c) What is the role of tile management within the ZAME system? What is an LRU cache? 

The data structure can be too large for RAM or VRAM, especially for bigger datasets (at least at the time the paper was published), so the complete adjacency matrix, let alone every detail level, cannot be held in the cache (RAM or VRAM) at once. The tile manager loads the tiles of the required detail level and can temporarily present a lower detail level while loading is in progress. The caches keep tiles according to the least recently used (LRU) policy: when a new tile needs to be cached and the cache is already full, the tile that has gone unused for the longest time is discarded.
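A minimal sketch of the LRU policy, not the tile manager from the paper: the class `LRUTileCache`, the `(level, row, column)` keys and the string payloads are all assumptions made for illustration.

```python
from collections import OrderedDict


class LRUTileCache:
    """Tiny LRU cache; keys and payloads are hypothetical."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._tiles = OrderedDict()  # least recently used entry comes first

    def get(self, key):
        if key not in self._tiles:
            return None  # cache miss: the caller would have to load the tile
        self._tiles.move_to_end(key)  # mark as most recently used
        return self._tiles[key]

    def put(self, key, tile):
        self._tiles[key] = tile
        self._tiles.move_to_end(key)
        if len(self._tiles) > self.capacity:
            self._tiles.popitem(last=False)  # evict the least recently used

    def keys(self):
        return list(self._tiles)


cache = LRUTileCache(capacity=2)
cache.put((0, 0, 0), "tile A")   # key: (level, row, column)
cache.put((0, 0, 1), "tile B")
cache.get((0, 0, 0))             # touch A, so B is now the oldest entry
cache.put((0, 1, 0), "tile C")   # cache is full: B is evicted
print(cache.keys())              # [(0, 0, 0), (0, 1, 0)]
```

Tile B is discarded although it was inserted after tile A, because A was used more recently.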

### d) How does ZAME aggregate nominal attributes? Why is that problematic?

As there is no immediately obvious way to aggregate nominal attributes, ZAME uses the first label to represent the whole aggregate. This fails to capture the diversity within the aggregate and can be quite arbitrary, since the result depends on the ordering of the labels.
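The ordering problem can be seen in a small sketch. The labels and both helper functions are made up for illustration; `aggregate_counts` is only one possible alternative and is not taken from the paper.

```python
from collections import Counter


def aggregate_first(labels):
    """First-label aggregation as described above."""
    return labels[0]


def aggregate_counts(labels):
    """Hypothetical alternative that keeps the label distribution."""
    return Counter(labels)


block = ["protein", "gene", "gene", "gene"]
reordered = ["gene", "gene", "gene", "protein"]  # same labels, new order

print(aggregate_first(block))       # protein
print(aggregate_first(reordered))   # gene
print(aggregate_counts(block) == aggregate_counts(reordered))  # True
```

The same set of labels yields two different representatives, and the minority label can end up standing for the whole aggregate.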

### e) Why is the Traveling Salesman Problem relevant to adjacency matrix based graph visualization?

ZAME uses an approximation algorithm for the TSP to find good orderings of adjacency matrices for graphs with relatively dissimilar weighted edges. A short tour probably yields an ordering with good local structure because the shortest route has to visit all vertices of a highly interconnected cluster in quick succession.
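The connection can be sketched by treating each matrix row as a city and the dissimilarity of two rows as their distance. The greedy nearest-neighbour tour below is a stand-in chosen for brevity and is not claimed to be the heuristic used in ZAME; the distance measure and the example graph are assumptions as well.

```python
def row_distance(a, b):
    """Dissimilarity of two adjacency rows (sum of absolute differences)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbour_order(matrix, start=0):
    """Greedy TSP tour: always visit the closest unvisited row next."""
    order = [start]
    remaining = set(range(len(matrix))) - {start}
    while remaining:
        last = matrix[order[-1]]
        # Ties are broken by the smaller index to keep the result stable.
        nxt = min(remaining, key=lambda i: (row_distance(last, matrix[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def reorder(matrix, order):
    """Apply the same permutation to rows and columns."""
    return [[matrix[i][j] for j in order] for i in order]


# Two triangles, {0, 2, 4} and {1, 3, 5}, with interleaved vertex numbers.
adjacency = [
    [0, 0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0],
]

order = nearest_neighbour_order(adjacency)
print(order)  # [0, 2, 4, 1, 3, 5]
for row in reorder(adjacency, order):
    print(row)
```

After reordering, the two clusters appear as two blocks along the diagonal instead of a checkerboard pattern.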

### f) In the pseudocode listed in the paper’s Figure 4, some modifications are highlighted in boldface, on lines starting with a bar. What is the purpose of these modifications?

The bold-faced parts of the algorithm spread the pivot points evenly across the subtrees. Without the penalization, the algorithm would favor the two largest subtrees and therefore misrepresent the rest of the data.

## Task 3

In [1]:
from dash import Dash, html, State, dcc, callback, Output, Input
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, TSNE

In [2]:
df = pd.read_excel('Data_Cortex_Nuclear.xls').dropna()
df_filt = df[df['class'].isin(['t-CS-s', 'c-CS-s'])].reset_index(drop=True)
classes = df_filt['class']
df_filt = df_filt.drop(columns=['MouseID', 'Genotype', 'Treatment', 'Behavior', 'class'])

df_filt_classes = df_filt.copy()
df_filt_classes['class'] = classes

In [3]:
pca = PCA(n_components=2)
df_pca = pd.DataFrame(pca.fit_transform(df_filt))
df_pca['class'] = classes

iso = Isomap(n_neighbors=8)
df_iso = pd.DataFrame(iso.fit_transform(df_filt))
df_iso['class'] = classes

tsne = TSNE(perplexity=5.0)
df_tsne = pd.DataFrame(tsne.fit_transform(df_filt))
df_tsne['class'] = classes

In [4]:
app = Dash()

In [5]:
reduc_methods = html.Div([
                dcc.RadioItems(
                    options=[
                        {'label': 'PCA', 'value': 'PCA'},
                        {'label': 'ISOMAP', 'value': 'ISOMAP'},
                        {'label': 't-SNE', 'value': 't-SNE'}
                    ],
                    value='PCA',
                    id='dim_reduc_method',
                    labelStyle={'display': 'inline-block', 'margin-right': '10px'}
                )
            ], style={
                'display': 'flex',
                'justifyContent': 'center',
                'alignItems': 'center',
                'width': '50%'
            })

dropdown_1 = html.Div([
                    html.Div("x-Axis", style={'textAlign': 'center', 'fontWeight': 'bold', 'marginBottom': '2px'}),
                    dcc.Dropdown(
                        options=[{'label': col, 'value': col} for col in df_filt.columns],
                        value=df_filt.columns[0],
                        id='dropdown_1',
                        clearable=False,
                        style={'width': '200px'}
                    )
                ], style={'margin': '5px'})

dropdown_2 = html.Div([
                    html.Div("y-Axis", style={'textAlign': 'center', 'fontWeight': 'bold', 'marginBottom': '2px'}),
                    dcc.Dropdown(
                        options=[{'label': col, 'value': col} for col in df_filt.columns],
                        value=df_filt.columns[1],
                        id='dropdown_2',
                        clearable=False,
                        style={'width': '200px'}
                    )
                ], style={'margin': '5px'})

button = html.Button(
                    "Add",
                    id='add_button',
                    n_clicks=0,
                    style={'height': '35px', 'margin': '5px', 'alignSelf': 'flex-end'}
                )

options = html.Div([
            reduc_methods,
        
            html.Div([
                dropdown_1,
                dropdown_2,
                button
            ], style={
                'display': 'flex',
                'flexDirection': 'row',
                'justifyContent': 'center',
                'alignItems': 'flex-end',
                'width': '50%'
            })
        ], style={
            'display': 'flex',
            'flexDirection': 'row',
            'marginBottom': '20px'
        })

In [6]:
graphs_top = html.Div([dcc.Graph(figure={}, id='dim_reduc_graph', style={'width': '50%'}),
                       dcc.Graph(figure={}, id='map_ind_feats', style={'width': '50%'})], 
                      style={'display': 'flex', 'justifyContent': 'space-between', 'gap': '20px'})

In [7]:
@callback(
    Output(component_id='dim_reduc_graph', component_property='figure'),
    Input(component_id='dim_reduc_method', component_property='value')
)
def update_vis(meth_chosen):
    # Map each radio-button value to its precomputed 2D embedding.
    projections = {'PCA': df_pca, 'ISOMAP': df_iso, 't-SNE': df_tsne}
    if meth_chosen not in projections:
        return {}
    fig = px.scatter(projections[meth_chosen], x=0, y=1, color='class',
                     title=f"{meth_chosen} Projection")
    # Lock the aspect ratio so distances look the same along both axes.
    fig.update_layout(yaxis=dict(scaleanchor="x", scaleratio=1))
    return fig

In [8]:
@callback(
    Output(component_id='map_ind_feats', component_property='figure'),
    Input(component_id='dropdown_1', component_property='value'),
    Input(component_id='dropdown_2', component_property='value')
)
def map_features(col1, col2):
    fig = px.scatter(df_filt_classes, x=col1, y=col2, color='class')
    return fig

In [9]:
graph_bottom_storage = dcc.Store(id='plot_history', data=[])
graph_bottom_cont = html.Div(id='history_plots_container')

In [10]:
@callback(
    Output(component_id='plot_history', component_property='data'),
    Input(component_id='add_button', component_property='n_clicks'),
    State(component_id='dropdown_1', component_property='value'),
    State(component_id='dropdown_2', component_property='value'),
    State(component_id='plot_history', component_property='data')
)
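def add_plot(n_clicks, col1, col2, plot_hist):
    plot_hist = plot_hist or []  # guard against an empty or missing store
    if n_clicks > 0:
        # Snapshot the currently selected feature pair as a plain dict.
        fig = px.scatter(df_filt_classes, x=col1, y=col2, color='class')
        plot_hist.append(fig.to_dict())
        # Keep only the four most recent plots.
        plot_hist = plot_hist[-4:]
    return plot_hist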

In [11]:
@callback(
    Output('history_plots_container', 'children'),
    Input('plot_history', 'data')
)
def vis_plot_hist(plot_history):
    if not plot_history:
        return []

    plots = [
        dcc.Graph(
            figure=go.Figure(fig),
            style={
                'width': '22%',
                'aspectRatio': '1',
                'margin': '10px'
            }
        ) for fig in plot_history
    ]

    return html.Div(
        children=plots,
        style={
            'display': 'flex',
            'justifyContent': 'center',
            'alignItems': 'center',
            'gap': '10px',
            'overflowX': 'auto'
        }
    )

In [12]:
app.layout = html.Div([
    graph_bottom_storage,
    
    html.H2('Interactive Visualization with Dash', style={'textAlign': 'center'}),
    html.Hr(),
    options,
    graphs_top,
    graph_bottom_cont
])      

if __name__ == '__main__':
    app.run(jupyter_mode="external")

Dash app running on http://127.0.0.1:8050/
