# **General Network Evolution Analysis: Anime and User Graphs (2006–2018)**

# **1. Overview**

This notebook analyzes how the community evolved from 2006 to 2018. We examine two distinct network topologies to understand how the ecosystem matured over time:

The Anime Graph: Represents the relationships between anime titles (nodes). Edges represent similarities, recommendations, or co-occurrence. The graph is weighted, with weights derived from user votes for the anime.

The User Graph: Represents the social structure of the community. Edges represent interactions (votes) or shared interests between users. The graph is weighted; each weight is the strength of the connection between two users who voted for the same anime.
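
The notebook loads prebuilt graphs and does not show how the edge weights were produced. As a rough illustration only, the sketch below assumes that the strength of a user-user connection is the number of anime both users voted for. The function `co_vote_edges`, its `min_shared` threshold, and the toy `votes` data are hypothetical and are not taken from the project code.

```python
from collections import Counter
from itertools import combinations


def co_vote_edges(votes, min_shared=1):
    """Return {(user_a, user_b): shared_count} for users who voted for the same anime.

    `votes` maps a user id to the set of anime ids that user voted for.
    `min_shared` is a hypothetical threshold: weaker pairs are dropped.
    """
    # Invert the mapping: anime id -> set of users who voted for it
    voters_by_anime = {}
    for user, titles in votes.items():
        for title in titles:
            voters_by_anime.setdefault(title, set()).add(user)

    # Each shared title adds one unit of weight to the pair of co-voters
    weights = Counter()
    for voters in voters_by_anime.values():
        for a, b in combinations(sorted(voters), 2):
            weights[(a, b)] += 1

    return {pair: w for pair, w in weights.items() if w >= min_shared}


# Toy data, invented for illustration only
votes = {
    "u1": {"A", "B", "C"},
    "u2": {"A", "B"},
    "u3": {"C"},
}
edges = co_vote_edges(votes)
strong_edges = co_vote_edges(votes, min_shared=2)
```

Here `u1` and `u2` share two titles, so their edge survives a threshold of 2, while the `u1`-`u3` edge (one shared title) does not.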

Necessary imports:

In [4]:
import os
import igraph as ig
import numpy as np
from pathlib import Path
from dotenv import load_dotenv

# Project-specific modules
from graph_metrics import GraphMetricsIG
from graph_actions_ig import GraphActionsIG
from graph_io import GraphIO

# 1. Load Environment Variables
load_dotenv(override=True)

# 2. Get the directories from .env
# We use Path() here so we can easily add filenames later with the / operator
ANIME_GRAPHS_DIR = Path(os.getenv("ANIME_GRAPHS_DIR"))
USER_GRAPHS_DIR = Path(os.getenv("USER_GRAPHS_DIR"))

# 3. Define Anime Paths (Using .env directory)
anime_jacth005_2006_pkl = str(ANIME_GRAPHS_DIR / "2006.pkl")
anime_jacth005_2007_pkl = str(ANIME_GRAPHS_DIR / "2007.pkl")
anime_jacth005_2008_pkl = str(ANIME_GRAPHS_DIR / "2008.pkl")
anime_jacth005_2009_pkl = str(ANIME_GRAPHS_DIR / "2009.pkl")
anime_jacth005_2010_pkl = str(ANIME_GRAPHS_DIR / "2010.pkl")
anime_jacth005_2011_pkl = str(ANIME_GRAPHS_DIR / "2011.pkl")
anime_jacth005_2012_pkl = str(ANIME_GRAPHS_DIR / "2012.pkl")
anime_jacth005_2013_pkl = str(ANIME_GRAPHS_DIR / "2013.pkl")
anime_jacth005_2014_pkl = str(ANIME_GRAPHS_DIR / "2014.pkl")
anime_jacth005_2015_pkl = str(ANIME_GRAPHS_DIR / "2015.pkl")
anime_jacth005_2016_pkl = str(ANIME_GRAPHS_DIR / "2016.pkl")
anime_jacth005_2017_pkl = str(ANIME_GRAPHS_DIR / "2017.pkl")
anime_jacth005_2018_pkl = str(ANIME_GRAPHS_DIR / "2018.pkl")

# 4. Define User Paths (Using .env directory)
user_raw_2006_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2006_thr_3_q_low5_cut_1900.gpickle")
user_raw_2007_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2007_thr_3_q_low5_cut_1900.gpickle")
user_raw_2008_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2008_thr_3_q_low5_cut_1900.gpickle")
user_raw_2009_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2009_thr_3_q_low5_cut_1900.gpickle")
user_raw_2010_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2010_thr_3_q_low5_cut_1900.gpickle")
user_raw_2011_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2011_thr_3_q_low5_cut_1900.gpickle")
user_raw_2012_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2012_thr_3_q_low5_cut_1900.gpickle")
user_raw_2013_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2013_thr_3_q_low5_cut_1900.gpickle")
user_raw_2014_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2014_thr_3_q_low5_cut_1900.gpickle")
user_raw_2015_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2015_thr_3_q_low5_cut_1900.gpickle")
user_raw_2016_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2016_thr_3_q_low5_cut_1900.gpickle")
user_raw_2017_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2017_thr_3_q_low5_cut_1900.gpickle")
user_raw_2018_pkl = str(USER_GRAPHS_DIR / "raw" / "user_graph_2018_thr_3_q_low5_cut_1900.gpickle")

# 5. Build Dictionaries
graph_paths_anime = {
    2006: anime_jacth005_2006_pkl,
    2007: anime_jacth005_2007_pkl,
    2008: anime_jacth005_2008_pkl,
    2009: anime_jacth005_2009_pkl,
    2010: anime_jacth005_2010_pkl,
    2011: anime_jacth005_2011_pkl,
    2012: anime_jacth005_2012_pkl,
    2013: anime_jacth005_2013_pkl,
    2014: anime_jacth005_2014_pkl,
    2015: anime_jacth005_2015_pkl,
    2016: anime_jacth005_2016_pkl,
    2017: anime_jacth005_2017_pkl,
    2018: anime_jacth005_2018_pkl
}

graph_paths_users = {
    2006: user_raw_2006_pkl,
    2007: user_raw_2007_pkl,
    2008: user_raw_2008_pkl,
    2009: user_raw_2009_pkl,
    2010: user_raw_2010_pkl,
    2011: user_raw_2011_pkl,
    2012: user_raw_2012_pkl,
    2013: user_raw_2013_pkl,
    2014: user_raw_2014_pkl,
    2015: user_raw_2015_pkl,
    2016: user_raw_2016_pkl,
    2017: user_raw_2017_pkl,
    2018: user_raw_2018_pkl
}
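
If either environment variable is missing, `os.getenv` returns `None` and `Path(None)` fails with a `TypeError` that does not name the variable. The sketch below is one possible way to fail with a clearer message and to build the same year-to-path dictionaries without repeating each line. The helpers `env_dir` and `yearly_paths` are hypothetical and are not part of the project modules.

```python
import os
from pathlib import Path


def env_dir(name):
    """Return the directory stored in environment variable `name` as a Path.

    Raises a descriptive error when the variable is unset or empty.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set; check the .env file")
    return Path(value)


def yearly_paths(base_dir, template, years=range(2006, 2019)):
    """Map each year to str(base_dir / template), with {year} filled in."""
    return {year: str(base_dir / template.format(year=year)) for year in years}


# Example usage, assuming the same variables and file names as in the cell above:
# graph_paths_anime = yearly_paths(env_dir("ANIME_GRAPHS_DIR"), "{year}.pkl")
# graph_paths_users = yearly_paths(
#     env_dir("USER_GRAPHS_DIR") / "raw",
#     "user_graph_{year}_thr_3_q_low5_cut_1900.gpickle",
# )
```

The explicit per-year variables above remain useful when a single year has to be run by hand; the dictionary form is more convenient for loops.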

# **2. Anime Graph Metrics Analysis (2006–2018)**

In this section, we generate the full metric reports for the Anime networks. For each year from 2006 to 2018, the following metrics are calculated to track the network's evolution:

**Network Size & Connectivity**
* **Nodes & Edges:** Total count of anime titles and their similarity connections.
* **Density:** The ratio of actual connections to potential connections.
* **Connected Components:** The number of isolated groups within the network.
* **LCC Size:** The size of the Largest Connected Component.

**Path & Distance Metrics (Weighted)**
* **Weighted Diameter:** The longest shortest path (distance-based).
* **Avg Path Length:** The average number of steps to reach any anime from another (in the LCC).
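
The edge weights in the Anime graph behave like similarities (higher means closer), so a distance-based path metric needs some conversion from weight to length. The notebook does not show which conversion `GraphMetricsIG` uses. The sketch below assumes one common convention, length = 1 / weight, and computes the weighted diameter and average path length on a toy graph with a plain Dijkstra search. All names and data here are illustrative.

```python
import heapq


def dijkstra(adj, source):
    """Shortest weighted distances from `source` to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for w, length in adj[v].items():
            nd = d + length
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist


# Toy similarity weights, invented for illustration
similarity = {("A", "B"): 1.0, ("B", "C"): 0.5, ("A", "C"): 0.25}

# Assumption: edge length is the inverse of the similarity weight
adj = {}
for (a, b), s in similarity.items():
    adj.setdefault(a, {})[b] = 1.0 / s
    adj.setdefault(b, {})[a] = 1.0 / s

all_dist = {v: dijkstra(adj, v) for v in adj}
pair_lengths = [all_dist[a][b] for a in adj for b in adj if a < b]

weighted_diameter = max(pair_lengths)                    # longest shortest path
avg_path_length = sum(pair_lengths) / len(pair_lengths)  # mean over all node pairs
```

In this toy graph the direct A-C edge has length 4.0, but the route through B has length 3.0, so the shortest A-C distance, and the diameter, is 3.0.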

**Centrality & Influence**
* **Average Degree & Std:** The average connectivity (popularity) and its standard deviation.
* **Mean Closeness:** How fast a node can access all other nodes.
* **Mean Betweenness:** How often a node acts as a bridge between groups.
* **Mean Eigenvector:** Measure of influence based on connections to other highly connected nodes.

**Cohesion & Clustering**
* **Clustering Coefficient:** The tendency of nodes to form tight local groups ("bubbles").
* **Transitivity:** The global probability of triangles forming in the graph.
* **Core Number (Mean & Max):** A measure of how deep nodes are embedded in the network (finding the "mainstream" core).
* **Assortativity:** The correlation between linked nodes (do popular anime link to other popular anime?).
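
To make some of these definitions concrete, here is a minimal worked example in plain Python on a four-node toy graph. It is unweighted, so the degree figures are simple neighbor counts rather than the weighted values reported by the notebook, and it treats the local clustering of a node with fewer than two neighbors as 0, which is an assumption (libraries differ on this convention). It is not the implementation used in `GraphMetricsIG`.

```python
from itertools import combinations
from statistics import mean, pstdev

# Toy undirected graph: a triangle 0-1-2 with a pendant node 3 attached to 2
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
nodes = sorted({v for e in edges for v in e})
adj = {v: set() for v in nodes}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

n, m = len(nodes), len(edges)

# Density: actual edges divided by the n(n-1)/2 possible edges
density = 2 * m / (n * (n - 1))

# Average degree and its (population) standard deviation
degrees = [len(adj[v]) for v in nodes]
avg_degree = mean(degrees)
degree_std = pstdev(degrees)


def linked_neighbor_pairs(v):
    """Number of pairs of neighbors of v that are themselves connected."""
    return sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])


def local_clustering(v):
    """Fraction of neighbor pairs of v that are connected (0 if degree < 2)."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    return linked_neighbor_pairs(v) / (k * (k - 1) / 2)


# Clustering coefficient: mean of the local values
avg_clustering = mean(local_clustering(v) for v in nodes)

# Transitivity: closed triples (3 per triangle) over all connected triples
closed_triples = sum(linked_neighbor_pairs(v) for v in nodes)
all_triples = sum(k * (k - 1) // 2 for k in degrees)
transitivity = closed_triples / all_triples
```

The two cohesion measures differ even on this small graph: the average of local values weights every node equally, while transitivity weights nodes by how many neighbor pairs they have.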

In [6]:
print("--- Starting Manual Analysis for Anime Graphs (2006-2018) ---\n")

# 2006
print(">>> Processing 2006...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2006_pkl), 
        "anime_jacth005_2006_pkl"
    )
)

# 2007
print("\n>>> Processing 2007...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2007_pkl), 
        "anime_jacth005_2007_pkl"
    )
)

# 2008
print("\n>>> Processing 2008...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2008_pkl), 
        "anime_jacth005_2008_pkl"
    )
)

# 2009
print("\n>>> Processing 2009...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2009_pkl), 
        "anime_jacth005_2009_pkl"
    )
)

# 2010
print("\n>>> Processing 2010...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2010_pkl), 
        "anime_jacth005_2010_pkl"
    )
)

# 2011
print("\n>>> Processing 2011...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2011_pkl), 
        "anime_jacth005_2011_pkl"
    )
)

# 2012
print("\n>>> Processing 2012...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2012_pkl), 
        "anime_jacth005_2012_pkl"
    )
)

# 2013
print("\n>>> Processing 2013...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2013_pkl), 
        "anime_jacth005_2013_pkl"
    )
)

# 2014
print("\n>>> Processing 2014...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2014_pkl), 
        "anime_jacth005_2014_pkl"
    )
)

# 2015
print("\n>>> Processing 2015...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2015_pkl), 
        "anime_jacth005_2015_pkl"
    )
)

# 2016
print("\n>>> Processing 2016...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2016_pkl), 
        "anime_jacth005_2016_pkl"
    )
)

# 2017
print("\n>>> Processing 2017...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2017_pkl), 
        "anime_jacth005_2017_pkl"
    )
)

# 2018
print("\n>>> Processing 2018...")
GraphMetricsIG.print_anime_graph_metrics(
    GraphMetricsIG.get_full_metrics_dict(
        GraphActionsIG.get_graph(anime_jacth005_2018_pkl), 
        "anime_jacth005_2018_pkl"
    )
)

--- Starting Manual Analysis for Anime Graphs (2006-2018) ---

>>> Processing 2006...
Loading file with graph: 2006.pkl...


Object is iGraph
Edges: 63857. Weights (min/max): 0.0526 / 1.0000

--- Starting Full Metrics Calculation for anime_jacth005_2006_pkl ---
  ... Calculating Basic Metrics (Nodes, Edges, Degree)...
  ... Calculating Path Metrics (Skipping slow ones)...
      > Diameter (Longest shortest path)...
      > Average Path Length...
  ... Calculating Centrality Metrics (Closeness, Betweenness, Eigenvector)...
      > Closeness Centrality...
      > Betweenness Centrality (this may take time)...
      > Eigenvector Centrality...
  ... Calculating Structural Metrics (Clustering, Transitivity, Coreness)...
--- Metrics Calculation Complete ---

Metric                                   | Value        | Description
-----------------------------------------------------------------------------------------------
Number of nodes                          | 732          | Total anime titles
Number of edges                          | 63857        | Total connections (shared audience)
Average degree (weighted


--- Starting Full Metrics Calculation for anime_jacth005_2007_pkl ---
  ... Calculating Basic Metrics (Nodes, Edges, Degree)...
  ... Calculating Path Metrics (Skipping slow ones)...
      > Diameter (Longest shortest path)...
      > Average Path Length...
  ... Calculating Centrality Metrics (Closeness, Betweenness, Eigenvector)...
      > Closeness Centrality...
      > Betweenness Centrality (this may take time)...
      > Eigenvector Centrality...
  ... Calculating Structural Metrics (Clustering, Transitivity, Coreness)...
--- Metrics Calculation Complete ---

Metric                                   | Value        | Description
-----------------------------------------------------------------------------------------------
Number of nodes                          | 1837         | Total anime titles
Number of edges                          | 180838       | Total connections (shared audience)
Average degree (weighted)                | 16.9876      | Avg connectivity (popularity)
De

      > Average Path Length...
  ... Calculating Centrality Metrics (Closeness, Betweenness, Eigenvector)...
      > Closeness Centrality...
      > Betweenness Centrality (this may take time)...
      > Eigenvector Centrality...
  ... Calculating Structural Metrics (Clustering, Transitivity, Coreness)...
--- Metrics Calculation Complete ---

Metric                                   | Value        | Description
-----------------------------------------------------------------------------------------------
Number of nodes                          | 2294         | Total anime titles
Number of edges                          | 179075       | Total connections (shared audience)
Average degree (weighted)                | 13.4074      | Avg connectivity (popularity)
Degree std (weighted)                    | 17.0554      | Variance in popularity
Connected components                     | 1            | Isolated groups
LCC size                                 | 2294         | Size of main comp

# **3. User Graph Metrics Analysis (2006–2018)**

In this section, we analyze the evolution of the User social graph.

**Optimization Note:**
The User graph is significantly larger and denser than the Anime graph. To ensure calculations complete in a reasonable time, we have **disabled path-based metrics** (Betweenness Centrality, Closeness Centrality, Diameter, Radius, and Average Path Length). These metrics require computing shortest paths between all pairs of nodes, which is too expensive for large social networks in this context.

**Active Metrics for Users:**

**Network Scale**
* **Active Users (Nodes):** Total count of unique users participating in the network.
* **Interactions (Edges):** Total number of connections (friendships or shared interactions) between users.
* **Density:** The ratio of actual social connections to potential ones (network saturation).

**Social Structure & Fragmentation**
* **Connected Components:** The number of isolated user groups.
* **LCC Size:** The size of the Largest Connected Component (the main "giant" community).

**Social Cohesion (Clustering)**
* **Global Transitivity:** The probability that two of a user's friends are also friends with each other (global triangle ratio).
* **Avg Clustering Coefficient:** The average local "clique-ness" of user circles.

**Hierarchy & Homophily**
* **Avg Degree & Std:** The average number of friends per user and the inequality of popularity.
* **Core Number (Mean & Max):** A measure of how deep users are embedded in the social web (identifying the "inner circle").
* **Assortativity:** The correlation between linked users (do popular users tend to connect with other popular users?).
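
Two of the remaining metrics, connected components and core numbers, need only local traversals rather than all-pairs shortest paths. The sketch below shows the idea in plain Python on a toy graph: a breadth-first search for components and a simple peeling procedure for core numbers. It is an illustration under those assumptions, not the code behind `get_user_metrics_light`, and the peeling loop is written for clarity rather than speed.

```python
from collections import deque


def connected_components(adj):
    """Return a list of node sets, one per connected component (BFS)."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components


def core_numbers(adj):
    """Core number of each node, found by repeatedly removing a minimum-degree node."""
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])  # the core level never decreases while peeling
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


# Toy graph: triangle 0-1-2, pendant node 3, and a separate pair 4-5
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (4, 5)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

components = connected_components(adj)
lcc_size = max(len(c) for c in components)
core = core_numbers(adj)
```

The triangle nodes end up with core number 2, while the pendant node and the isolated pair stay at 1, which matches the intuition of an "inner circle" surrounded by a periphery.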

In [5]:
print("--- Starting Manual Analysis for User Graphs (2006-2018) ---\n")

# 2006
print(">>> Processing Users 2006...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2006_pkl), 
        "user_2006"
    ), "user_2006"
)

# 2007
print(">>> Processing Users 2007...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2007_pkl), 
        "user_2007"
    ), "user_2007"
)

# 2008
print(">>> Processing Users 2008...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2008_pkl), 
        "user_2008"
    ), "user_2008"
)

# 2009
print(">>> Processing Users 2009...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2009_pkl), 
        "user_2009"
    ), "user_2009"
)

# 2010
print(">>> Processing Users 2010...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2010_pkl), 
        "user_2010"
    ), "user_2010"
)

# 2011
print(">>> Processing Users 2011...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2011_pkl), 
        "user_2011"
    ), "user_2011"
)

# 2012
print(">>> Processing Users 2012...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2012_pkl), 
        "user_2012"
    ), "user_2012"
)

# 2013
print(">>> Processing Users 2013...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2013_pkl), 
        "user_2013"
    ), "user_2013"
)

# 2014
print(">>> Processing Users 2014...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2014_pkl), 
        "user_2014"
    ), "user_2014"
)

# 2015
print(">>> Processing Users 2015...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2015_pkl), 
        "user_2015"
    ), "user_2015"
)

# 2016
print(">>> Processing Users 2016...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2016_pkl), 
        "user_2016"
    ), "user_2016"
)

# 2017
print(">>> Processing Users 2017...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2017_pkl), 
        "user_2017"
    ), "user_2017"
)

# 2018
print(">>> Processing Users 2018...")
GraphActionsIG.print_user_graph_metrics(
    GraphActionsIG.get_user_metrics_light(
        GraphActionsIG.get_graph(user_raw_2018_pkl), 
        "user_2018"
    ), "user_2018"
)

--- Starting Manual Analysis for User Graphs (2006-2018) ---

>>> Processing Users 2006...
Loading file with graph: user_graph_2006_thr_3_q_low5_cut_1900.gpickle...
Object is iGraph
Edges: 2123. Weights (min/max): 4.0000 / 29.0000
Metric                              | Value      | Description
--------------------------------------------------------------------------------
Total Active Users                  | 308        | Unique users in the network
Total Interactions                  | 2123       | Votes for the same anime between users
Network Density                     | 0.0449     | Saturation (How connected is the community?)
Avg Connections                     | 13.7857    | Avg number of friends per user
Connection Variance                 | 19.3880    | Inequality in popularity
Isolated Groups                     | 1          | Fragmented sub-communities
LCC Size                            | 308        | Users in the main 'giant' community
Global Transitivity                 |