## Roman Networks in R

This notebook shows how the R programming language handles the same tasks you did in [[roman-networks.ipynb]]. Compare how Python does things with how R does them. A good [igraph tutorial can be found here](https://robwiederstein.github.io/network_analysis/igraph.html). You might prefer R's output and syntax to Python's; whichever you choose to use, make sure you understand how it does things!

We've already got the edges.csv and nodes.csv data, so...

In [None]:
# install igraph; this might take a long time
# you only need to run this line once, the first time you use igraph:
install.packages('igraph')
# a lot of stuff gets downloaded and installed.

In [None]:
# now tell R you want to use the igraph package and its functions:
library('igraph')

# now let's load up the data by putting the csv files into nodes and links.
# we're keeping the first row as a 'header'

nodes <- read.csv("nodes.csv", header=TRUE, as.is=TRUE)
links <- read.csv("edges.csv", header=TRUE, as.is=TRUE)

In [None]:
#examine data
head(nodes)
head(links)

In [None]:
length(unique(nodes$id))
# which gives the number of nodes in our data

nrow(unique(links[,c("source", "target")]))
# which gives the number of unique source-target pairs: the number of distinct routes between two cities

# let's make a net

Notice that we are telling igraph that the network is directed. In a letter-writing network, for example, a letter from Alice to Bob is different from a letter from Bob to Alice (in the first, Alice is the _sender_ and Bob is the _receiver_). Likewise, in a patron-client relationship, Pompeii -> Rome is a different dynamic than Rome -> Pompeii. This isn't always a critical distinction to make; it depends on your dataset.
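To make the sender/receiver distinction concrete, here is a minimal sketch using a made-up three-person letter network (the `letters_df` data frame and the names in it are hypothetical, not part of the Roman dataset). It builds the same edge list as a directed and as an undirected graph and compares the degree counts.

```r
library(igraph)

# Hypothetical toy data: three letters between three correspondents
letters_df <- data.frame(
  source = c("Alice", "Alice", "Carol"),
  target = c("Bob",   "Carol", "Alice")
)

g_dir   <- graph_from_data_frame(letters_df, directed = TRUE)
g_undir <- graph_from_data_frame(letters_df, directed = FALSE)

# Directed: letters sent (out) and letters received (in) are counted separately
out_deg <- degree(g_dir, mode = "out")
in_deg  <- degree(g_dir, mode = "in")

# Undirected: there is only one kind of degree
all_deg <- degree(g_undir)

print(out_deg)
print(in_deg)
print(all_deg)
```

In the directed version Bob receives a letter but sends none, which is exactly the information the undirected version throws away.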

AND - we're going to do this _just_ from the edge data

Create network from edges only - igraph will infer the nodes

`net <- graph_from_data_frame(d=links, directed=T)`

If we wanted to include the node data explicitly, we could do this instead:

`net <- graph_from_data_frame(d=links, vertices=nodes, directed=T)`

See the difference?
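One way to see the difference is to try both calls on a tiny made-up dataset. In this sketch `toy_links` and `toy_nodes` are hypothetical stand-ins for `links` and `nodes`; node "d" is listed in the node table but never appears in an edge.

```r
library(igraph)

# Hypothetical toy data, not the Roman dataset
toy_links <- data.frame(source = c("a", "b"), target = c("b", "c"))
toy_nodes <- data.frame(id    = c("a", "b", "c", "d"),
                        label = c("Alpha", "Beta", "Gamma", "Delta"))

net_edges_only <- graph_from_data_frame(d = toy_links, directed = TRUE)
net_with_nodes <- graph_from_data_frame(d = toy_links, vertices = toy_nodes,
                                        directed = TRUE)

# Edges only: igraph knows about just the vertices that appear in an edge
print(vcount(net_edges_only))
print(vertex_attr_names(net_edges_only))

# With the node table: the unconnected node "d" is kept, and the extra
# column ("label") is attached as a vertex attribute
print(vcount(net_with_nodes))
print(vertex_attr_names(net_with_nodes))
```

Note that when you pass `vertices`, every name used in the edge list has to appear in the first column of the node table, otherwise igraph stops with an error. That is one reason to start from the edges alone when the node table might be incomplete.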

In [None]:
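# create the network from the edge list (igraph infers the nodes),
# then type 'net' and run the line to see how the network is represented.
net <- graph_from_data_frame(d=links, directed=TRUE)
net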

In [None]:
# let's visualize it

plot(net, vertex.label=NA)

That's hard to make sense of. Let's make it nicer:

In [None]:
# Nicer colors and layout
plot(net, 
     layout = layout_with_fr,           # Fruchterman-Reingold layout
     vertex.color = "lightblue",        # Node color
     vertex.size = 8,                   # Node size
     vertex.frame.color = "white",      # Node border color
     edge.color = "gray50",             # Edge color
     edge.arrow.size = 0.5,             # Arrow size
     vertex.label = NA)                 # No labels

In [None]:
# Create histogram of degree distribution
deg <- degree(net)  # total degree (in + out) of each node
hist(deg, 
     breaks = 20,
     main = "Distribution of Node Degrees",
     xlab = "Degree (Number of Connections)",
     ylab = "Frequency",
     col = "lightblue",
     border = "white")

# Add some summary stats
abline(v = mean(deg), col = "red", lwd = 2, lty = 2)
legend("topright", paste("Mean degree:", round(mean(deg), 2)), 
       col = "red", lty = 2, lwd = 2)

In [None]:
# Color nodes based on their degree (number of connections)
deg <- degree(net)
colors <- colorRampPalette(c("lightblue", "darkred"))(max(deg))
V(net)$color <- colors[deg]

plot(net, 
     layout = layout_with_fr,
     vertex.size = 8,
     vertex.frame.color = "white",
     edge.color = "gray50",
     edge.arrow.size = 0.5,
     vertex.label = NA)

In [None]:
# Scale node size by degree (add minimum size so small nodes are still visible)
node_sizes <- deg * 2 + 5  # Scale factor of 2, minimum size of 5

plot(net, 
     layout = layout_with_fr,
     vertex.size = node_sizes,          # Size scaled by degree
     vertex.frame.color = "white",
     edge.color = "gray50",
     edge.arrow.size = 0.5,
     vertex.label = NA,
     main = "Network: Node Size and Color by Degree")

# Optional: Add a legend for the color scale
legend("topright", 
       legend = c(paste("Min degree:", min(deg)), 
                  paste("Max degree:", max(deg))),
       fill = c("lightblue", "darkred"),
       title = "Degree")

Here are some options for better layouts:

+ Fruchterman-Reingold (most similar to Force Atlas)
`layout_with_fr(net)`

+ Kamada-Kawai
`layout_with_kk(net)`

+ GraphOpt (another force-directed algorithm)
`layout_with_graphopt(net)`

+ Large Graph Layout (good for big networks)
`layout_with_lgl(net)`

Try modifying some of those plot layouts. Remember, the best layout is the one that helps you interpret what you're looking at.
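As a sketch of how you might compare layouts side by side: each `layout_with_*` function returns a matrix of coordinates, so you can compute a layout once, store it, and pass the stored matrix to `plot()`. The example below uses a small stand-in graph (`toy_net`, made with `make_ring()`) so that it runs on its own; in the notebook you would use `net` instead.

```r
library(igraph)

set.seed(42)                 # Fruchterman-Reingold starts from random positions
toy_net <- make_ring(10)     # stand-in graph; swap in `net` in the notebook

# Each call returns a matrix with one row (x, y) per vertex
coords_fr <- layout_with_fr(toy_net)
coords_kk <- layout_with_kk(toy_net)

# Draw the two layouts next to each other
par(mfrow = c(1, 2))
plot(toy_net, layout = coords_fr, vertex.label = NA,
     main = "Fruchterman-Reingold")
plot(toy_net, layout = coords_kk, vertex.label = NA,
     main = "Kamada-Kawai")
par(mfrow = c(1, 1))         # reset to one plot per figure
```

Storing the coordinates also means you can reuse the same node positions across several plots (degree, closeness, betweenness), which makes the plots easier to compare than letting the layout be recomputed each time.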

In [None]:
# Calculate closeness centrality
closeness_cent <- closeness(net, normalized = TRUE)

# Histogram
hist(closeness_cent, 
     breaks = 20,
     main = "Distribution of Closeness Centrality",
     xlab = "Closeness Centrality",
     ylab = "Frequency",
     col = "lightgreen",
     border = "white")
abline(v = mean(closeness_cent), col = "red", lwd = 2, lty = 2)

# Network plot colored by closeness
close_colors <- colorRampPalette(c("lightblue", "darkgreen"))(100)
V(net)$color <- close_colors[as.numeric(cut(closeness_cent, breaks = 100))]

plot(net, 
     layout = layout_with_fr,
     vertex.size = closeness_cent * 50 + 5,  # Scale by closeness
     vertex.color = V(net)$color,
     vertex.frame.color = "white",
     edge.color = "gray50",
     edge.arrow.size = 0.5,
     vertex.label = NA,
     main = "Network: Closeness Centrality")

In [None]:
# Calculate betweenness centrality
betweenness_cent <- betweenness(net, normalized = TRUE)

# Histogram
hist(betweenness_cent, 
     breaks = 20,
     main = "Distribution of Betweenness Centrality",
     xlab = "Betweenness Centrality",
     ylab = "Frequency",
     col = "orange",
     border = "white")
abline(v = mean(betweenness_cent), col = "red", lwd = 2, lty = 2)

# Network plot colored by betweenness
between_colors <- colorRampPalette(c("lightblue", "darkorange"))(100)
V(net)$color <- between_colors[as.numeric(cut(betweenness_cent, breaks = 100))]

plot(net, 
     layout = layout_with_fr,
     vertex.size = sqrt(betweenness_cent) * 10 + 5,  # Square root scaling
     vertex.color = V(net)$color,
     vertex.frame.color = "white",
     edge.color = "gray50",
     edge.arrow.size = 0.5,
     vertex.label = NA,
     main = "Network: Betweenness Centrality")

In [None]:
# Detect communities using modularity
communities <- cluster_louvain(as.undirected(net))  # Convert to undirected for community detection
modularity_score <- modularity(communities)

# Print modularity score
cat("Modularity score:", modularity_score, "\n")
cat("Number of communities:", length(communities), "\n")

# Histogram of community sizes
community_sizes <- sizes(communities)
hist(community_sizes, 
     breaks = 10,
     main = paste("Distribution of Community Sizes\nModularity =", round(modularity_score, 3)),
     xlab = "Community Size",
     ylab = "Frequency",
     col = "purple",
     border = "white")

# Network plot colored by community
community_colors <- rainbow(length(communities))
V(net)$color <- community_colors[membership(communities)]

plot(net, 
     layout = layout_with_fr,
     vertex.size = 8,
     vertex.color = V(net)$color,
     vertex.frame.color = "white",
     edge.color = "gray50",
     edge.arrow.size = 0.5,
     vertex.label = NA,
     main = paste("Network: Communities (Modularity =", round(modularity_score, 3), ")"))

In [None]:
# Clean the nodes data (remove rows with missing values)
nodes_clean <- nodes[complete.cases(nodes), ]

# Match and add attributes to existing vertices
# Get the current vertex names in the network
current_vertices <- V(net)$name

# Match nodes data to current vertices
matched_indices <- match(current_vertices, nodes_clean$id)  # 'id' column in nodes

# Add labels (and other attributes) to vertices
V(net)$label <- nodes_clean$label[matched_indices]

# You can also add other node attributes if they exist
# V(net)$other_attribute <- nodes_clean$other_column[matched_indices]

In [None]:
# Calculate degree with labels
deg <- degree(net)
colors <- colorRampPalette(c("lightblue", "darkred"))(max(deg))
V(net)$color <- colors[deg]

plot(net, 
     layout = layout_with_fr,
     vertex.size = deg * 2 + 5,
     vertex.color = V(net)$color,
     vertex.frame.color = "white",
     edge.color = "gray50",
     edge.arrow.size = 0.5,
     vertex.label = V(net)$label,
     vertex.label.cex = 0.7,        # Label size
     vertex.label.color = "black",
     vertex.label.dist = 1,         # Distance from vertex
     main = "Network: Degree (with labels)")

In [None]:
# Closeness with labels
closeness_cent <- closeness(net, normalized = TRUE)
close_colors <- colorRampPalette(c("lightblue", "darkgreen"))(100)
V(net)$color <- close_colors[as.numeric(cut(closeness_cent, breaks = 100))]

plot(net, 
     layout = layout_with_fr,
     vertex.size = closeness_cent * 50 + 5,
     vertex.color = V(net)$color,
     vertex.frame.color = "white",
     edge.color = "gray50",
     edge.arrow.size = 0.5,
     vertex.label = V(net)$label,
     vertex.label.cex = 0.7,
     vertex.label.color = "black",
     vertex.label.dist = 1,
     main = "Network: Closeness Centrality (with labels)")

In [None]:
# Betweenness with labels
betweenness_cent <- betweenness(net, normalized = TRUE)
between_colors <- colorRampPalette(c("lightblue", "darkorange"))(100)
V(net)$color <- between_colors[as.numeric(cut(betweenness_cent, breaks = 100))]

plot(net, 
     layout = layout_with_fr,
     vertex.size = sqrt(betweenness_cent) * 10 + 5,
     vertex.color = V(net)$color,
     vertex.frame.color = "white",
     edge.color = "gray50",
     edge.arrow.size = 0.5,
     vertex.label = V(net)$label,
     vertex.label.cex = 0.7,
     vertex.label.color = "black",
     vertex.label.dist = 1,
     main = "Network: Betweenness Centrality (with labels)")

In [None]:
# Modularity with labels
communities <- cluster_louvain(as.undirected(net))
community_colors <- rainbow(length(communities))
V(net)$color <- community_colors[membership(communities)]

plot(net, 
     layout = layout_with_fr,
     vertex.size = 8,
     vertex.color = V(net)$color,
     vertex.frame.color = "white",
     edge.color = "gray50",
     edge.arrow.size = 0.5,
     vertex.label = V(net)$label,
     vertex.label.cex = 0.7,
     vertex.label.color = "black",
     vertex.label.dist = 1,
     main = "Network: Communities (with labels)")

If the labels are too crowded, you can:

+ Make labels smaller: `vertex.label.cex = 0.5`
+ Only show labels for high-centrality nodes: `vertex.label = ifelse(deg > quantile(deg, 0.9), V(net)$label, NA)`
+ Use different label positioning: `vertex.label.dist = 0` (on the vertex) or `vertex.label.dist = 2` (further away)
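Here is a minimal sketch that combines those three options. It uses a made-up star-shaped graph (`toy_net`) with placeholder labels so that it runs on its own; in the notebook you would apply the same idea to `net`, `deg`, and `V(net)$label`.

```r
library(igraph)

# Stand-in graph: one hub (vertex 1) connected to ten other vertices
toy_net <- make_star(11, mode = "undirected")
V(toy_net)$label <- paste("city", seq_len(vcount(toy_net)))  # placeholder labels

toy_deg <- degree(toy_net)

# Keep a label only where degree is above the 90th percentile; NA hides the rest
cutoff <- quantile(toy_deg, 0.9)
sparse_labels <- ifelse(toy_deg > cutoff, V(toy_net)$label, NA)

plot(toy_net,
     vertex.size = 8,
     vertex.label = sparse_labels,
     vertex.label.cex = 0.5,      # smaller labels
     vertex.label.dist = 2,       # push labels away from the vertex
     vertex.label.color = "black")
```

The 0.9 cutoff is an arbitrary choice; lower it if too few labels survive, or swap `toy_deg` for another centrality measure such as betweenness.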