## Module 7 - Network Practice

In this notebook, we will see how to create network and tree visualizations using `igraph` and `networkD3`. 


You may have to **rerun** cells to get the network output for networkD3 examples. 


Here are some references for the libraries we will use:

- [igraph manual pages](http://igraph.org/r/doc/)

- [igraph reference](https://cran.r-project.org/web/packages/igraph/igraph.pdf)

- [networkD3 reference](https://cran.r-project.org/web/packages/networkD3/networkD3.pdf)

- [network data sources](http://www-personal.umich.edu/~mejn/netdata/)


## `igraph` Library


There are several different ways of defining graphs. Let's start with a simple network. Here, we will create the network **from scratch by giving the list of edges**. 


Edges are represented by vertex pairs: each consecutive pair of numbers in the vector below defines one edge (link) between two vertices (nodes).

In [None]:
library(igraph)

# we can create a graph by giving a list of edges that are represented as vertex pairs
edges <- c(1,2, 3,2, 2,4)

# now build a directed graph from the edge list
# (make_graph() is the current name for the older graph() function)
g <- make_graph(edges, n = max(edges), directed = TRUE)

# there are 3 pairs, so there will be three edges
plot(g)
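
To see how the flat vector is read, the sketch below reshapes it into a two-column matrix with one row per edge. It uses base R only; `edge_pairs` is an illustrative name, not part of `igraph`.

```r
# the same flat vector of vertex ids used above
edges <- c(1,2, 3,2, 2,4)

# read the vector two numbers at a time: each row is one edge (from -> to)
edge_pairs <- matrix(edges, ncol = 2, byrow = TRUE,
                     dimnames = list(NULL, c("from", "to")))
edge_pairs

# the number of edges is half the length of the vector
nrow(edge_pairs)
```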

Note that `g` is a graph, not a data frame. 

In [None]:
g

We can find out some properties and compute some basic **statistics** about a graph like in the following examples: 

In [None]:
# vertex count
vcount(g)

In [None]:
# edge count
ecount(g)

In [None]:
# neighbors of the first vertex (following outgoing edges)
neighbors(g, V(g)[1], mode = "out")
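
In a directed graph, the neighbors of a vertex depend on which edge directions are followed. A minimal sketch, rebuilding the same four-vertex graph so that it runs on its own (`g_demo` and the `nb_*` names are illustrative):

```r
library(igraph)

# same edges as above: 1 -> 2, 3 -> 2, 2 -> 4
g_demo <- make_graph(c(1,2, 3,2, 2,4), directed = TRUE)

# vertices reached by edges leaving vertex 2
nb_out <- neighbors(g_demo, 2, mode = "out")

# vertices with an edge pointing into vertex 2
nb_in <- neighbors(g_demo, 2, mode = "in")

# both directions combined
nb_all <- neighbors(g_demo, 2, mode = "all")

nb_out
nb_in
nb_all
```

Vertex 2 is a convenient choice here because it has both incoming and outgoing edges, so the three modes return different sets.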

In [None]:
# edges incident to the second vertex (both incoming and outgoing)
incident(g, V(g)[2], mode = "all")

In [None]:
# is there an edge from vertex 1 to vertex 3?
# (are_adjacent() is the current name for are.connected())
are_adjacent(g, V(g)[1], V(g)[3])

In [None]:
# get the graph edges as a two-column matrix
# (as_edgelist() is the current name for get.edgelist())
as_edgelist(g)

In [None]:
# List of vertices 
V(g)
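
Another basic statistic is the degree of each vertex. The sketch below computes it for the same small directed graph; it assumes the edge list used earlier, and `g_demo` and the `deg_*` names are illustrative.

```r
library(igraph)

# same edges as above: 1 -> 2, 3 -> 2, 2 -> 4
g_demo <- make_graph(c(1,2, 3,2, 2,4), directed = TRUE)

# number of incoming, outgoing, and total edges per vertex
deg_in  <- degree(g_demo, mode = "in")
deg_out <- degree(g_demo, mode = "out")
deg_all <- degree(g_demo, mode = "all")

# collect the results in one table, one row per vertex
data.frame(vertex = as.integer(V(g_demo)), deg_in, deg_out, deg_all)
```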

### YOUR TURN: 

**Create the first graph** from the lab notebook with five edges using the `igraph` functions. 

In [None]:
< YOUR CODE HERE >

**Can you also create the second graph with the directed edges?** 


--- 

## Reading Graphs from File

We can also read an **edge list from a plain text file** and convert it to a graph. 

In this example, the vertices (nodes) are labeled by letters, and each row represents an edge between two vertices. 

In [None]:
# This is the file content:
# A,B
# A,G
# A,Y
# G,Y

t <-read.csv("/dsa/data/all_datasets/networks/graph1.txt", header=FALSE)

# t is a data frame
head(t)

In [None]:
# g2 is a graph (directed by default)
# graph_from_data_frame() is the current name for graph.data.frame()
g2 <- graph_from_data_frame(t)
plot(g2)
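
If the data file is not available, the same steps can be tried on an inline copy of the edge list. This is a sketch under that assumption: the four rows are the ones shown in the comment above, and `edge_text`, `el`, and `g_el` are illustrative names. Here the graph is built as undirected to show the `directed` argument.

```r
library(igraph)

# the four edges from the file, written as one string
edge_text <- "A,B\nA,G\nA,Y\nG,Y"

# read.csv() can parse a string through its text argument
el <- read.csv(text = edge_text, header = FALSE)

# build an undirected graph; vertex names come from the two columns
g_el <- graph_from_data_frame(el, directed = FALSE)

vcount(g_el)
ecount(g_el)
V(g_el)$name
```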

### YOUR TURN: 

**Compute the same statistics** for `g2` that we computed for `g` above. 

In [None]:
< YOUR CODE HERE >


---


We can also read an **adjacency matrix** from a text file as opposed to a list of vertex pairs. 


Remember what an adjacency matrix is from the lab material. This is a sample row from the file showing the **connectivity of the first vertex** to the other vertices:

```
 0 1 1 1 0 0 0 0 1 0 0 1 0
```
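
As a reminder of how such rows are read, here is a small sketch with a made-up four-vertex adjacency matrix (base R only; `toy_adj` and its values are invented for illustration and are not taken from the file). A 1 in row `i`, column `j` means that vertices `i` and `j` are connected.

```r
# a hypothetical undirected graph on vertices A-D
toy_adj <- matrix(c(0, 1, 1, 0,
                    1, 0, 1, 0,
                    1, 1, 0, 1,
                    0, 0, 1, 0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(LETTERS[1:4], LETTERS[1:4]))

# an undirected graph has a symmetric adjacency matrix
isSymmetric(toy_adj)

# row sums give the degree of each vertex
rowSums(toy_adj)

# the neighbors of A are the columns holding a 1 in row A
names(which(toy_adj["A", ] == 1))
```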

In [None]:
adj_matrix <- as.matrix(read.table("/dsa/data/all_datasets/networks/SAcountries.txt",header=FALSE, sep=" "))
adj_matrix

We also need **names/labels** for vertices (nodes). The matrix above actually represents **neighborhood relationships between the countries in South America**. 

Let's read the country names from a different file:

In [None]:
SAnames <- read.csv("/dsa/data/all_datasets/networks/SAcountrylist.txt", header=FALSE)
head(SAnames)

Let's build an **undirected** simple (unweighted) graph from the above matrix. **Study the following code:**

In [None]:
gSA <-  graph_from_adjacency_matrix(adj_matrix, weighted = NULL,  mode = "undirected")

# remove loops and multiple edges
gSA <- simplify(gSA)
# set the vertex labels to the country names
gSA <- set_vertex_attr(gSA, "label", value = as.vector(SAnames$V1))

gSA

**You can create new attributes for each vertex in the graph like this:** 

In [None]:
# create an attribute to hold degree for each vertex
V(gSA)$degree <- degree(gSA)
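
A stored attribute can then drive the drawing. The sketch below is one possible use, shown on a small star graph so that it runs without the data files; `g_star` and the size formula are illustrative choices.

```r
library(igraph)

# a star with one center and four leaves
g_star <- make_star(5, mode = "undirected")

# store the degree of each vertex as an attribute
V(g_star)$degree <- degree(g_star)
V(g_star)$degree

# scale the vertex size by degree so that hubs are drawn larger
plot(g_star, vertex.size = 10 + 5 * V(g_star)$degree)
```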

In [None]:
# do a force-directed layout
# (layout_with_fr() is the current name for layout.fruchterman.reingold())
layout1 <- layout_with_fr(gSA)

# draw the network
plot(gSA, layout=layout1)

In [None]:
# another layout 
# (layout_with_kk() is the current name for layout.kamada.kawai())
plot(gSA, layout=layout_with_kk)


In [None]:
# this is an example of a more complicated graph, it'll take a while to draw:
# sample_pa() is the current name for barabasi.game()
gb <- sample_pa(1000, power=1)
# l1 is a matrix of precomputed coordinates
l1 <- layout_with_fr(gb)
# l2 is the layout function itself; plot() calls it on gb
l2 <- layout_with_kk
plot(gb, layout=l1, vertex.size=2, vertex.label=NA, edge.arrow.size=.2)
plot(gb, layout=l2, vertex.size=2, vertex.label=NA, edge.arrow.size=.2)

## networkD3 Library 

The `networkD3` library is an R interface to the **D3 JavaScript library**. 


Let's do similar layouts in networkD3. **This library produces interactive graphs; for example, you can click on a node and drag it.** 

To create a graph, it needs a source vertex and a target vertex for each edge, given as two columns of a data frame. 

In [None]:
library(networkD3)

# Create edge data
src <- c("A", "A", "A", "A",
        "B", "B", "C", "C", "D")

target <- c("B", "C", "D", "J",
            "E", "F", "G", "H", "I")

networkData <- data.frame(src, target)

# Plot - may need to run twice 
simpleNetwork(networkData, height=300)

**We can also convert the `igraph` graphs to `D3` graphs.**


In [None]:
gSA_d3 <- igraph_to_networkD3(gSA)

# Create force directed network plot - you can zoom into this graph with mouse 
forceNetwork(Links = gSA_d3$links, Nodes = gSA_d3$nodes, Source = 'source', Target = 'target', 
             NodeID = 'name', Group = 'name', height=500, zoom=TRUE, fontSize=20)

### IS THIS GRAPH PLANAR? MOVE THE VERTICES WITH THE MOUSE TO GET RID OF ALL EDGE CROSSINGS IF YOU CAN.


---




Let's look at a more complex data set. This dataset contains the **co-occurrence network of characters in Victor Hugo's novel *Les Misérables*.** 


**A vertex represents a character**, and an **edge between two vertices shows that these two characters appeared in the same chapter** of the book. The **weight** of each edge indicates how often such a co-appearance occurred.

In [None]:
# This is the Les Miserables data set that comes with the library 
data(MisLinks)
data(MisNodes)

Edges (links) have **weights** that are represented by the `value` column in the data frame as below: 

In [None]:
head(MisLinks) # EDGE DATA - has weights 

Besides `name`, vertices (nodes) have two attributes: `group` and `size`. 

In [None]:
head(MisNodes) # VERTEX DATA - has size and grouping 

**Can you identify how many different visual channels represent attributes about the data set below?** 

In [None]:
# Plot - Move mouse over VERTICES to see character names 

forceNetwork(Links = MisLinks, Nodes = MisNodes,
            Source = "source", Target = "target",
            Value = "value", NodeID = "name",
            Group = "group", opacity = 0.8, height=600, zoom=TRUE, fontSize=30)

**See [the documentation](https://www.rdocumentation.org/packages/networkD3/versions/0.4/topics/forceNetwork) for all the options you can use with the `forceNetwork()` function.**
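
The two-data-frame layout that `forceNetwork()` expects can be sketched on a made-up network. Everything in `toy_nodes` and `toy_links` is invented for illustration. The point to note is that `source` and `target` are **zero-based** row positions in the node data frame, so the first node is 0.

```r
library(networkD3)

# vertex data: one row per node
toy_nodes <- data.frame(name  = c("A", "B", "C", "D"),
                        group = c(1, 1, 2, 2),
                        size  = c(10, 5, 5, 1))

# edge data: source/target are zero-based indices into toy_nodes
toy_links <- data.frame(source = c(0, 0, 1, 2),
                        target = c(1, 2, 2, 3),
                        value  = c(1, 3, 2, 1))

toy_widget <- forceNetwork(Links = toy_links, Nodes = toy_nodes,
                           Source = "source", Target = "target",
                           Value = "value", NodeID = "name",
                           Group = "group", Nodesize = "size",
                           opacity = 0.8, zoom = TRUE)
toy_widget
```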

### Here are some links that show alternative visualizations for the same data set: 
 - https://bost.ocks.org/mike/miserables/
 - https://studentwork.prattsi.org/infovis/labs/character-networks-visualization-for-les-miserables/
 - https://studentwork.prattsi.org/infovis/visualization/les-miserables-character-network-visualization/

### YOUR TURN: 

**Create an interactive network** for the South America data from above. You'll need to create two data frames similar to the `Mis` example. 

In [None]:
< YOUR CODE HERE >

Now, **find neighbors of Bolivia** using the `neighbors()` function. For that, you'll need an `igraph` graph. 

In [None]:
< YOUR CODE HERE >


---


**We can also read graphs in GML format.** 

This example contains an undirected social network of frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand, as compiled by Lusseau et al. (2003). 

In [None]:
# read_graph() is the current name for read.graph()
gml_data <- read_graph("/dsa/data/all_datasets/networks/dolphins.gml", format="gml")
head(gml_data)

In [None]:
gml_data <- simplify(gml_data)

dol <- igraph_to_networkD3(gml_data)

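# Source and Target name the link columns created by igraph_to_networkD3()
forceNetwork(Links = dol$links, Nodes = dol$nodes, Source = "source", Target = "target",
             NodeID = "name", Group = "name", height=600, zoom=TRUE)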

**With this data set, we can experiment with community detection functions of `igraph` by clustering vertices like this:** 

In [None]:
cfg <- cluster_fast_greedy(as.undirected(gml_data))
plot(cfg, gml_data)

In [None]:
ceb <- cluster_edge_betweenness(gml_data)
plot(ceb, gml_data)

In [None]:
clp <- cluster_label_prop(gml_data)
plot(clp, gml_data)
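
Community memberships can also be carried over to an interactive plot by passing them as the `group` argument of `igraph_to_networkD3()`. The sketch below uses the Zachary karate club graph that ships with `igraph` instead of the dolphin file, so that it runs without the data directory; the variable names are illustrative.

```r
library(igraph)
library(networkD3)

# a small built-in social network with 34 vertices
karate <- make_graph("Zachary")

# detect communities and extract one group id per vertex
karate_cfg <- cluster_fast_greedy(karate)
members <- membership(karate_cfg)

# the group ids become a 'group' column in the node data frame
karate_d3 <- igraph_to_networkD3(karate, group = members)
head(karate_d3$nodes)

# color the nodes by detected community
karate_widget <- forceNetwork(Links = karate_d3$links, Nodes = karate_d3$nodes,
                              Source = "source", Target = "target",
                              NodeID = "name", Group = "group",
                              height = 500, zoom = TRUE)
karate_widget
```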

### Radial Networks

We can also read network data in JSON format like in the example below: 

In [None]:
# example of a radial network given in Javascript's JSON format.
library(jsonlite)
URL <- "https://raw.githubusercontent.com/christophergandrud/networkD3/master/JSONdata/flare.json"

## Convert the data to list format
Flare <- fromJSON(URL, simplifyDataFrame = FALSE)

In [None]:
# JSON format
Flare

In [None]:
# Use subset of data for a more readable diagram
Flare$children <- Flare$children[1:3]

radialNetwork(List = Flare, fontSize = 20, opacity = 1, height=600)
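
The list that `radialNetwork()` takes is a nested structure in which each node has a `name` and, optionally, a list of `children`. A minimal hand-built sketch (the tree and the name `toy_tree` are invented for illustration):

```r
library(networkD3)

# a root with two children; the first child has two children of its own
toy_tree <- list(
  name = "root",
  children = list(
    list(name = "A",
         children = list(list(name = "A1"),
                         list(name = "A2"))),
    list(name = "B")
  )
)

tree_widget <- radialNetwork(List = toy_tree, fontSize = 14, opacity = 1)
tree_widget
```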

We can **visualize clusters** with a radial network like this: 

In [None]:
hc <- hclust(dist(USArrests), "ave")
radialNetwork(as.radialNetwork(hc))
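
To turn the tree into explicit cluster labels, one option is base R's `cutree()`. This is a sketch only: the choice of four clusters is arbitrary, and `hc_demo` and `groups` are illustrative names.

```r
# same hierarchical clustering as above, with the method spelled out
hc_demo <- hclust(dist(USArrests), method = "average")

# cut the tree into 4 clusters; the result is one label per state
groups <- cutree(hc_demo, k = 4)

# how many states fall into each cluster
table(groups)

# the states assigned to the first cluster
names(groups)[groups == 1]
```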