# Genres and modularity

Modularity describes the structure of a network: it measures how strongly the network is divided into communities.
High modularity implies that the connections within communities are dense, while the connections between communities are sparse.
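
One common way to write this down, and the form that the manual calculation later in this notebook follows, is the sum below. Here $L_c$ is the number of links inside community $c$, $k_c$ is the total degree of the nodes in $c$, and $L$ is the total number of links in the network. The first term is the observed fraction of links inside the community, the second is the fraction expected if links were placed at random with the same degrees.

$$
M = \sum_{c} \left[ \frac{L_c}{L} - \left( \frac{k_c}{2L} \right)^2 \right]
$$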

## Genres

In [37]:
import networkx as nx
import numpy as np
import requests
import matplotlib.pyplot as plt
from collections import Counter
from io import BytesIO

url = "https://raw.githubusercontent.com/fridapfrandsen/network-data/main/rock_network.gexf"

response = requests.get(url)
response.raise_for_status()

G = nx.read_gexf(BytesIO(response.content))

True

Converting to an undirected graph and removing nodes without genre information

In [39]:
G = G.to_undirected()

nodes_without_genre = [node for node, data in G.nodes(data=True) if "genres" not in data or not data["genres"]]

G.remove_nodes_from(nodes_without_genre)

False

## Using genres to partition network into communities

In [40]:
genre_partition = {}
for node, data in G.nodes(data=True):
    genre_partition[node] = data["genres"][0]

communities = {}
for node, genre in genre_partition.items():
    communities.setdefault(genre, set()).add(node)
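
A caveat worth checking before trusting this partition: attributes read back from a GEXF file may arrive as plain strings rather than lists, in which case `data["genres"][0]` is the first character and not the first genre. The sketch below is a hypothetical helper (`parse_genres` and its `sep` parameter are not part of the notebook) that normalises either form to a list. The comma delimiter is an assumption; inspect `type(data["genres"])` and a few raw values to confirm the actual format.

```python
import ast


def parse_genres(value, sep=","):
    """Return a list of genre names from a list or a string (hypothetical helper)."""
    # Already a list or tuple: just clean up the entries.
    if isinstance(value, (list, tuple)):
        return [str(g).strip() for g in value if str(g).strip()]

    text = str(value).strip()

    # A stringified Python list such as "['rock', 'pop']".
    if text.startswith("["):
        try:
            parsed = ast.literal_eval(text)
            return [str(g).strip() for g in parsed if str(g).strip()]
        except (ValueError, SyntaxError):
            pass  # fall through to delimiter splitting

    # Assumed format: genres separated by `sep`.
    return [g.strip() for g in text.split(sep) if g.strip()]


print(parse_genres("rock, hard rock"))
print(parse_genres("['rock', 'pop']"))
print(parse_genres(["rock"]))
```

With such a helper, the first genre would be `parse_genres(data["genres"])[0]`, which is a whole genre name in both cases.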


Compute modularity

In [69]:
# Number of links in the network
L = G.number_of_edges()

# Function to calculate modularity
def Modularity(communities):
    M = 0
    for genre, nodes in communities.items():
        subG = G.subgraph(nodes)
        L_c = subG.number_of_edges()
        k_c = sum(G.degree(n) for n in nodes)
        M += (L_c / L) - (k_c / (2 * L)) ** 2
    return M
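
As a sanity check on the formula, the same calculation can be run on a toy graph where the answer is easy to work out by hand. This is a minimal sketch, not part of the original analysis: `manual_modularity` is a hypothetical variant of the function above that takes the graph as an argument, and the toy graph is two triangles joined by a single link. With $L = 7$, $L_c = 3$ and $k_c = 7$ for each triangle, the expected value is $2\,(3/7 - (7/14)^2) = 5/14$.

```python
import networkx as nx
from networkx.algorithms.community.quality import modularity


def manual_modularity(graph, parts):
    """Sum L_c / L - (k_c / 2L)^2 over all communities in `parts`."""
    total_links = graph.number_of_edges()
    score = 0.0
    for nodes in parts:
        links_inside = graph.subgraph(nodes).number_of_edges()
        degree_sum = sum(deg for _, deg in graph.degree(nodes))
        score += links_inside / total_links - (degree_sum / (2 * total_links)) ** 2
    return score


# Two triangles connected by the single link (2, 3)
toy = nx.Graph([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
toy_parts = [{0, 1, 2}, {3, 4, 5}]

q_manual = manual_modularity(toy, toy_parts)
q_networkx = modularity(toy, toy_parts)  # expects a list of node sets

print(f"manual: {q_manual:.4f}, networkx: {q_networkx:.4f}")
```

Both values should agree, which suggests the manual function and the library implement the same definition for unweighted, undirected graphs.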

In [81]:
# Calculating number of links and nodes in each community and finding the sum
M = Modularity(communities)

print(f"Modularity (manual calc): {M:.4f}")

from networkx.algorithms.community.quality import modularity
# modularity() expects a sequence of node sets, not a dict keyed by genre
M_check = modularity(G, list(communities.values()))

Modularity (manual calc): 0.0347

The modularity is only 0.0347, meaning that partitioning the network by the first genre in each artist's genre list does not give any clear community structure.
Instead, we try partitioning by a random genre from each artist's list:

In [78]:
import random

# random choice of genre from the list
partition_random = {}
for n, d in G.nodes(data=True):
    if "genres" in d and d["genres"]:
        partition_random[n] = random.choice(d["genres"])

communities_random = {}
for node, genre in partition_random.items():
    communities_random.setdefault(genre, set()).add(node)

M_random = Modularity(communities_random)

print(f"Modularity (first genre): {M:.4f}")
print(f"Modularity (random genre): {M_random:.4f}")

Modularity (first genre): 0.0347
Modularity (random genre): 0.0054


It turns out the modularity is actually lower when we partition using a random genre instead. A plausible explanation is that the random choice breaks up the large "rock" community, so fewer links fall inside a single community.

Both modularity values are so close to zero that the genre partitions are hardly better than what we would expect in a random network.
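
To make "close to zero" concrete, one option is to compare a partition's modularity with the values obtained after shuffling the community labels across nodes. The sketch below illustrates the idea on networkx's built-in karate club graph and its `club` attribute rather than on the rock network. The helper `partition_from_labels`, the 100 shuffles, and the fixed seed are all assumptions made for illustration.

```python
import random

import networkx as nx
from networkx.algorithms.community.quality import modularity


def partition_from_labels(labels):
    """Group nodes by label and return a list of node sets."""
    groups = {}
    for node, label in labels.items():
        groups.setdefault(label, set()).add(node)
    return list(groups.values())


demo = nx.karate_club_graph()
club = nx.get_node_attributes(demo, "club")
q_observed = modularity(demo, partition_from_labels(club))

# Null baseline: keep the community sizes, but assign labels to nodes at random.
rng = random.Random(0)
nodes = list(club)
values = list(club.values())
q_shuffled = []
for _ in range(100):
    rng.shuffle(values)
    shuffled_labels = dict(zip(nodes, values))
    q_shuffled.append(modularity(demo, partition_from_labels(shuffled_labels)))

q_mean = sum(q_shuffled) / len(q_shuffled)
print(f"observed: {q_observed:.4f}")
print(f"shuffled: mean {q_mean:.4f}, max {max(q_shuffled):.4f}")
```

A partition whose modularity falls inside the range of the shuffled values carries little more structural information than a random labelling.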

## Communities

Using the Louvain algorithm to partition the network into communities

In [67]:
from networkx.algorithms.community import louvain_communities

comm = louvain_communities(G, seed=42)

print(f"Number of communities: {len(comm)}")


Number of communities: 7
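
`louvain_communities` returns a list of node sets, which is the format that networkx's own `modularity` function accepts directly. The following is a small illustration on a toy graph (two 5-cliques joined by one link), not on the rock network. The graph and variable names are chosen only for this sketch.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities
from networkx.algorithms.community.quality import modularity

# Two complete graphs on 5 nodes, joined by a single link
demo = nx.barbell_graph(5, 0)

parts = louvain_communities(demo, seed=42)  # list of sets of nodes
covered = set().union(*parts)               # every node should appear exactly once

q_louvain = modularity(demo, parts)
print(f"communities: {len(parts)}, modularity: {q_louvain:.4f}")
```

Because the result is already a valid partition, no conversion is needed for the library function. A dictionary is only required for the manual `Modularity` function defined earlier.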


In [80]:
# Convert the list of Louvain communities to a dictionary
# (iterate over `comm`, the Louvain result, not the genre dict `communities`)
communities_dict = {i: nodes for i, nodes in enumerate(comm)}

M_louvain = Modularity(communities_dict)

print(M_louvain)