<a href="https://colab.research.google.com/github/mtazike/Visualization_Design_Exercise/blob/main/Week_10.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Interactive Visualizations for Networks

Networks and hierarchies, such as social networks and geographic relationships, are often best stored as **graphs**: collections of nodes (entities) and edges (connections between nodes). By their very nature, graph data are structured completely differently from tabular data, which makes them difficult to visualize in the usual way.
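To make the node/edge idea concrete, here is a minimal sketch in plain Python. The edge list is illustrative (the third weight is invented), and `build_adjacency` is a hypothetical helper, not part of NetworkX or Plotly. It turns a tabular list of edges into an adjacency structure, which is roughly what a graph library maintains internally.

```python
from collections import defaultdict

# Illustrative edge list: (source, target, weight). Values are for demonstration only.
edges = [("NED", "ROBERT", 192), ("JON", "SAM", 121), ("NED", "JON", 50)]


def build_adjacency(edge_list):
    """Build an undirected adjacency mapping: node -> {neighbor: weight}."""
    adjacency = defaultdict(dict)
    for source, target, weight in edge_list:
        # Store the edge in both directions because the graph is undirected
        adjacency[source][target] = weight
        adjacency[target][source] = weight
    return dict(adjacency)


adjacency = build_adjacency(edges)
print(sorted(adjacency["NED"]))  # ['JON', 'ROBERT']
```

A table has one row per edge, while the adjacency view answers "who is connected to this node?" directly. That difference is why standard tabular charts do not map cleanly onto graph data.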

In this exercise, we'll explore a network visualization and a treemap using Plotly.

*Note: we will continue with Dash Fundamentals Chapters 3 and 4 next week.*

<font color='darkred'>Again, **please make sure to install dash first**.</font>

In [1]:
!pip install dash

Collecting dash
  Downloading dash-3.2.0-py3-none-any.whl.metadata (10 kB)
Collecting retrying (from dash)
  Downloading retrying-1.4.2-py3-none-any.whl.metadata (5.5 kB)
Downloading dash-3.2.0-py3-none-any.whl (7.9 MB)
Downloading retrying-1.4.2-py3-none-any.whl (10 kB)
Installing collected packages: retrying, dash
Successfully installed dash-3.2.0 retrying-1.4.2


In [2]:
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
import networkx as nx

from dash import Dash, html, dcc, Input, Output, callback
from networkx.algorithms import community

# Exercises

Unfortunately, building a nice interactive network in [Plotly](https://plotly.com/python/network-graphs/) can be a bit cumbersome. Even if you use their recommended Dash Cytoscape framework, you'll find yourself coding a lot more than usual. In practice, it is recommended that you use a tool like [Gephi](https://gephi.org/) to visualize network data, but learning how to use this software is outside the scope of this class. So, we will still explore an implementation of network visualization in Plotly, but we will do so as an exercise in learning how traces and callbacks work in Dash.

Trees and hierarchies, on the other hand, can be visualized relatively quickly using [treemaps](https://plotly.com/python/treemaps/) in Plotly, and we'll incorporate them here as well.

<font color='darkred'>**Your grade for this exercise will come from the app cell at the bottom of this notebook.**</font>


## EXERCISE 1 (SETUP)

For this exercise (and the next) we will use data from the Game of Thrones (GOT) [network data set](https://github.com/mathbeveridge/gameofthrones) originally compiled for the [Network of Thrones](https://networkofthrones.wordpress.com/) project. We will use NetworkX to capture our network. *Note: your focus for this exercise will be on Plotly, not NetworkX; all the network management code should be done for you.*

In [3]:
#@title \<capture network data>

# we import just the data from season 1
df_edges = pd.read_csv('https://raw.githubusercontent.com/mathbeveridge/gameofthrones/master/data/got-s1-edges.csv')
df_nodes = pd.read_csv('https://raw.githubusercontent.com/mathbeveridge/gameofthrones/master/data/got-s1-nodes.csv')
df_nodes.set_index('Id', inplace=True)

# build graph
G = nx.from_pandas_edgelist(df_edges, 'Source', 'Target', 'Weight')

# calculate communities
communities = community.greedy_modularity_communities(G, weight='Weight')

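df_nodes['Community'] = 0
for i, c in enumerate(communities):
    df_nodes.loc[list(c), 'Community'] = i + 1

# cast to string so Plotly treats communities as categories; assigning the
# whole column (rather than via `.loc[:, ...]`) avoids the dtype warning
df_nodes['Community'] = df_nodes['Community'].astype(str)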

# name a few popular characters according to Google
popular_chars = ['Ned', 'Tyrion', 'Daenerys', 'Arya', 'Jon', 'Eddard', 'Brienne', 'Jaime', 'Cersei', 'Sandor']
df_nodes['Popularity'] = df_nodes['Label'].apply(lambda s: 'Popular' if s in popular_chars else 'Normal')

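# calculate networkx layout positions, e.g., `spring_layout`
pos = nx.spring_layout(G)

# add locations to node dataset, looking each node up by its Id so the
# coordinates stay aligned with the rows of df_nodes (G.nodes() follows the
# order of the edge list, which need not match the df_nodes index)
df_nodes['x'] = [pos[node][0] for node in df_nodes.index]
df_nodes['y'] = [pos[node][1] for node in df_nodes.index]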

# calculate max weight for visualizing edges
max_weight = df_edges['Weight'].max()



Using NetworkX, we can build a table of edges (`df_edges`) and a table of nodes (`df_nodes`) which share a node ID mapping. Below, we use these tables to create a basic network in Plotly.

In [4]:
df_edges.head(3)

|   | Source | Target | Weight | Season |
|---|--------|--------|--------|--------|
| 0 | NED | ROBERT | 192 | 1 |
| 1 | DAENERYS | JORAH | 154 | 1 |
| 2 | JON | SAM | 121 | 1 |


- **Weight:** Strength of the connection between characters *(see docs linked above for more on this)*.
- **Season:** Season of Game of Thrones show.
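
In the figure code in this notebook, `Weight` is turned into an edge line width by dividing by the largest weight and multiplying by a scale factor. A minimal sketch of that normalization follows. `edge_width` is a hypothetical helper name, and the maximum here is taken over the three preview rows only, not over the full data set.

```python
def edge_width(weight, max_weight, scale=3.0):
    """Map an edge weight to a line width between 0 and `scale`."""
    if max_weight <= 0:
        raise ValueError("max_weight must be positive")
    return weight / max_weight * scale


# Weights from the three preview rows above
weights = [192, 154, 121]
max_weight = max(weights)

widths = [round(edge_width(w, max_weight), 2) for w in weights]
print(widths)  # [3.0, 2.41, 1.89]
```

Normalizing by the maximum keeps the thickest edge at exactly `scale` pixels, so a single number controls how heavy the network looks.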

In [5]:
df_nodes.head(3)

| Id | Label | Community | Popularity | x | y |
|----|-------|-----------|------------|---|---|
| ADDAM_MARBRAND | Addam | 5 | Normal | -0.044669 | -0.093975 |
| AEGON | Aegon | 3 | Normal | -0.042882 | 0.01756 |
| AERYS | Aerys | 1 | Normal | -0.021087 | 0.398984 |


- **Community:** divides the graph into reasonable sub-graphs.
- **Popularity:** most popular characters according to Google.
- **x/y:** the positions of nodes based on a graph layout.
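
Because the layout is keyed by node ID, each coordinate has to land on the row for the same ID. The sketch below uses made-up positions and plain Python containers (all names are hypothetical) to show how pairing coordinates by position can silently mislabel nodes when the two orderings differ, and how a lookup by key avoids that.

```python
# Hypothetical layout: node id -> (x, y), in the order nodes appear in the edge list
pos = {"NED": (0.1, 0.2), "ROBERT": (0.3, -0.1), "ARYA": (-0.4, 0.5)}

# Node table rows sorted alphabetically by id, a different order from `pos`
node_ids = ["ARYA", "NED", "ROBERT"]

# Positional pairing: ARYA silently receives NED's coordinates
wrong = dict(zip(node_ids, pos.values()))

# Key-based lookup: every node keeps its own coordinates
right = {node: pos[node] for node in node_ids}

print(wrong["ARYA"])  # (0.1, 0.2)
print(right["ARYA"])  # (-0.4, 0.5)
```

A quick sanity check on real data is to hover over a well-known node in the figure and confirm that its edges actually meet at that point.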

In [6]:
# scalar for linewidth of edges
c = 3

# Create a Plotly figure
fig = go.Figure()

# add edges
for i in range(df_edges.shape[0]):
    edge = df_edges.iloc[i]

    x0, y0 = df_nodes.loc[edge['Source']][['x', 'y']]
    x1, y1 = df_nodes.loc[edge['Target']][['x', 'y']]

    fig.add_trace(go.Scatter(x=[x0, x1],
                             y=[y0, y1],
                             mode='lines',
                             line=dict(width=(edge['Weight'] / max_weight) * c,
                                       color='gray'),
                             showlegend=False))

# add nodes
fig_nodes = px.scatter(df_nodes, x='x', y='y', color='Popularity',
                       hover_name='Label')

fig.add_traces(fig_nodes.data)

fig.update_layout(template='simple_white',
                  legend_title='Popularity',
                  xaxis_visible=False,
                  yaxis_visible=False)

# Show the figure
fig.show()

## EXERCISE 1 (TASK)

1. **Copy** the network figure code into the app cell, then rename the figure something like `fig_1`.
2. Add this figure into your web app using a [`dcc.Graph` object](https://dash.plotly.com/dash-core-components/graph).
3. Adjust the visualization so that the edges connected to "popular" characters share the same color as the popular character nodes.
4. Give the graph a title, and change the color map to something appropriate.
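
One possible way to approach step 3 is to decide each edge's color from its two endpoints, then group edges of the same color into a single set of coordinates. The sketch below is plain Python under stated assumptions: `edge_color` and `batch_segments` are hypothetical helpers, `popular_ids` is an assumed set of node IDs, and the positions are invented.

```python
def edge_color(source, target, popular_ids,
               popular_color="red", default_color="gray"):
    """Return the popular color if either endpoint is a popular node."""
    if source in popular_ids or target in popular_ids:
        return popular_color
    return default_color


def batch_segments(edges, pos, popular_ids):
    """Group edge segments by color, separating segments with None."""
    batches = {}
    for source, target in edges:
        color = edge_color(source, target, popular_ids)
        xs, ys = batches.setdefault(color, ([], []))
        (x0, y0), (x1, y1) = pos[source], pos[target]
        # None marks a break so consecutive segments are not joined
        xs.extend([x0, x1, None])
        ys.extend([y0, y1, None])
    return batches


# Invented positions and edges for illustration
pos = {"NED": (0.0, 0.0), "ROBERT": (1.0, 0.0),
       "SAM": (0.0, 1.0), "GILLY": (1.0, 1.0)}
edges = [("NED", "ROBERT"), ("SAM", "GILLY")]

batches = batch_segments(edges, pos, popular_ids={"NED"})
print(batches["red"])   # ([0.0, 1.0, None], [0.0, 0.0, None])
print(batches["gray"])  # ([0.0, 1.0, None], [1.0, 1.0, None])
```

Each batch could then be passed to one `go.Scatter(x=xs, y=ys, mode='lines')` trace, since Plotly leaves a gap wherever a coordinate is `None`. The trade-off is that a trace has a single line width, so per-edge widths based on `Weight` would be lost; adding one trace per edge keeps them at the cost of a heavier figure.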

<font color='darkblue'>**Use the space below to test your code**</font>. Update the app cell when you're ready.


In [8]:
# Exercise 1 (Task)

import plotly.graph_objects as go
import plotly.express as px
from dash import Dash, html, dcc

# Create a Plotly figure
fig_1 = go.Figure()
c = 3  # line width scale

# Add edges
for i in range(df_edges.shape[0]):
    edge = df_edges.iloc[i]
    x0, y0 = df_nodes.loc[edge['Source'], ['x', 'y']]
    x1, y1 = df_nodes.loc[edge['Target'], ['x', 'y']]

    # Color edges red if connected to a popular character
    if (df_nodes.loc[edge['Source'], 'Popularity'] == 'Popular' or
        df_nodes.loc[edge['Target'], 'Popularity'] == 'Popular'):
        edge_color = 'red'
    else:
        edge_color = 'gray'

    fig_1.add_trace(go.Scatter(
        x=[x0, x1],
        y=[y0, y1],
        mode='lines',
        # reuse `max_weight` from the setup cell instead of recomputing it per edge
        line=dict(width=(edge['Weight'] / max_weight) * c, color=edge_color),
        hoverinfo='none',
        showlegend=False
    ))

# Add nodes (characters)
fig_nodes = px.scatter(
    df_nodes, x='x', y='y',
    color='Popularity',
    hover_name='Label',
    color_discrete_map={'Popular': 'red', 'Normal': 'blue'}
)

fig_1.add_traces(fig_nodes.data)

# Update layout
fig_1.update_layout(
    title='Game of Thrones Character Network',
    template='simple_white',
    legend_title='Popularity',
    xaxis_visible=False,
    yaxis_visible=False
)

fig_1.show()


**What are the insights from this visualization?**

<font color='darkblue'>The network shows that popular characters (in red) are central and well connected, often linking different groups. Their red edges indicate frequent interactions with others. Normal characters (in blue) appear more on the edges, forming smaller, less connected clusters.</font>

## EXERCISE 2

1. Create a [treemap](https://plotly.com/python/treemaps/) from the `df_nodes` data. Give the figure a different name like `fig_2`.
2. Try a few different options for the `path` (e.g., you may have `Community` after `Popularity` or before, etc.).
3. Determine whether to remove a variable from the `path` and encode it as color instead. If you don't choose either, is color needed at all? Explain your reasoning.
4. Just as you did with the network, move this treemap to the app cell, add the visualization to the Dash app, and give it a title.
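
To see why the order of `path` matters, it can help to count how many characters fall under each sector of the hierarchy. The sketch below mimics that grouping in plain Python. The rows, community assignments, and `sector_counts` helper are all hypothetical, and it assumes sector sizes come from simple counts, which is my understanding of what `px.treemap` does when no `values` column is given.

```python
from collections import Counter

# Hypothetical rows: (Popularity, Community, Label)
rows = [
    ("Popular", "1", "Ned"),
    ("Normal", "1", "Robert"),
    ("Normal", "2", "Sam"),
    ("Popular", "2", "Jon"),
    ("Normal", "2", "Gilly"),
]
COLUMNS = {"Popularity": 0, "Community": 1, "Label": 2}


def sector_counts(rows, path):
    """Count the leaves under every sector of the hierarchy given by `path`."""
    counts = Counter()
    indices = [COLUMNS[name] for name in path]
    for row in rows:
        keys = tuple(row[i] for i in indices)
        # A row contributes to its sector at every depth of the hierarchy
        for depth in range(1, len(keys) + 1):
            counts[keys[:depth]] += 1
    return counts


by_popularity = sector_counts(rows, ["Popularity", "Community"])
by_community = sector_counts(rows, ["Community", "Popularity"])

print(by_popularity[("Normal",)])      # 3
print(by_popularity[("Normal", "2")])  # 2
print(by_community[("2",)])            # 3
```

Putting `Popularity` first makes the top-level split "popular vs. normal", while putting `Community` first makes communities the largest blocks. A variable with only two values often wastes a hierarchy level, which is one argument for encoding it as color instead.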

<font color='darkblue'>**Use the space below to test your code**</font>. Update the app cell when you're ready.

In [17]:
# Exercise 2, option 1

import plotly.express as px
from dash import Dash, html, dcc

# Create treemap
fig_2 = px.treemap(
    df_nodes,
    path=['Popularity', 'Community', 'Label'],
    color='Popularity',
    color_discrete_map={'Popular': 'red', 'Normal': 'blue'},
    title='Game of Thrones Treemap by Popularity and Community'
)


# Add to Dash app
app = Dash(__name__)
app.layout = html.Div([
    html.H3("Exercise 2 – GOT Treemap Visualization"),
    dcc.Graph(figure=fig_2)
])

fig_2.show()


In [16]:
# Exercise 2, option 2

import plotly.express as px
from dash import Dash, html, dcc

# Create treemap
fig_2 = px.treemap(
    df_nodes,
    path=['Community', 'Label'],
    color='Popularity',
    color_discrete_map={'Popular': 'red', 'Normal': 'blue'},
    title='GOT Treemap Colored by Popularity'
)

# Add to Dash app
app = Dash(__name__)
app.layout = html.Div([
    html.H3("Exercise 2 – GOT Treemap Visualization"),
    dcc.Graph(figure=fig_2)
])

fig_2.show()


**What are the insights from this visualization?**

<font color='darkblue'>The treemap shows how Game of Thrones characters are grouped into different communities. Most characters are shown in blue, meaning they are normal in popularity, while a few red boxes represent popular characters. These popular characters appear in different communities, showing that they are well connected across groups. The visualization helps highlight which parts of the network have more well-known or central characters.</font>

## DASH APP

---

<font color='darkblue'>**The cell below will be your "app cell".**</font>

- This is the cell that will be graded for this week's exercise.
- Any time you update code, re-run the cell to render changes in the app.
- Click the icon in the upper-left corner of the output, and select "View output fullscreen". *Press **Esc** to return to the notebook.*

In [25]:
# add visualization code here

app = Dash("Networks and Trees")

app.layout = html.Div(children=[
    html.H1(children='Networks and Trees'),

    html.H3(children="Below, we use two methods to visualize a network."),

    html.H2("Game of Thrones Character Network"),
    dcc.Graph(figure=fig_1),

    html.H2("Game of Thrones Treemap by Popularity and Community"),
    dcc.Graph(figure=fig_2)

])

# DO NOT EDIT BELOW THIS LINE (except to change `jupyter_height`)
if __name__ == '__main__':
    app.run(debug=True, jupyter_mode="inline", jupyter_height=1000)


*Note: If your cell output is stuck on "Loading ..." for more than a minute, you may need to reconnect/restart your Google Colab runtime.*

---