# Notebook 02 – Network Metrics

**Author:** Demetrios Agourakis  
**ORCID:** [0000-0002-8596-5097](https://orcid.org/0000-0002-8596-5097)  
**License:** MIT License  
**Code DOI:** [10.5281/zenodo.16752238](https://doi.org/10.5281/zenodo.16752238)  
**Data DOI:** [10.17605/OSF.IO/2AQP7](https://doi.org/10.17605/OSF.IO/2AQP7)  
**Version:** 1.0 – Last updated: 2025-08-07

This notebook loads the symbolic graph generated from the SWOW-EN dataset and computes key topological and influence metrics for each node.  
The resulting symbolic profile will be saved as a structured table for downstream analysis (embedding, clustering, cognitive manifold modeling).


In [1]:
import networkx as nx
import pandas as pd
from pathlib import Path


def get_root_path():
    current = Path.cwd()
    while current != current.parent:
        if (current / "README.md").exists():
            return current
        current = current.parent
    return Path.cwd()


ROOT = get_root_path()
DATA = ROOT / "data"
RESULTS = ROOT / "results"
DATA.mkdir(exist_ok=True)
RESULTS.mkdir(exist_ok=True)

graph_path = RESULTS / "word_network.graphml"
if not graph_path.exists():
    raise FileNotFoundError(f"Expected graph file not found at: {graph_path}")

G = nx.read_graphml(graph_path)
print(f"Graph loaded with {G.number_of_nodes()} nodes and {G.number_of_edges()} edges.")

Graph loaded with 77165 nodes and 542600 edges.


In [2]:
# Compute basic network metrics

metrics_df = pd.DataFrame(index=G.nodes)

# Degree (in/out)
metrics_df["in_degree"] = pd.Series(dict(G.in_degree()))
metrics_df["out_degree"] = pd.Series(dict(G.out_degree()))
metrics_df["total_degree"] = metrics_df["in_degree"] + metrics_df["out_degree"]

# Strength (in/out degree weighted by edge weight)
metrics_df["in_strength"] = pd.Series(dict(G.in_degree(weight="weight")))
metrics_df["out_strength"] = pd.Series(dict(G.out_degree(weight="weight")))
metrics_df["total_strength"] = metrics_df["in_strength"] + metrics_df["out_strength"]

# PageRank
pagerank = nx.pagerank(G, weight="weight")
metrics_df["pagerank"] = pd.Series(pagerank)

# Closeness centrality (unweighted; on a directed graph NetworkX uses incoming distances)
closeness = nx.closeness_centrality(G)
metrics_df["closeness"] = pd.Series(closeness)

# Betweenness centrality (unweighted shortest paths, exact computation over all nodes)
betweenness = nx.betweenness_centrality(G)
metrics_df["betweenness"] = pd.Series(betweenness)

# Clustering coefficient, computed on the undirected projection of the directed graph
undirected = G.to_undirected()
clustering = nx.clustering(undirected)
metrics_df["clustering"] = pd.Series(clustering)

metrics_df = metrics_df.fillna(0)
print("Finished computing metrics.")
metrics_df.head()

Finished computing metrics.


| node | in_degree | out_degree | total_degree | in_strength | out_strength | total_strength | pagerank | closeness | betweenness | clustering |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| there | 84 | 36 | 120 | 238.0 | 113.0 | 351.0 | 0.0001 | 0.056718 | 5.7e-05 | 0.050709 |
| position | 83 | 62 | 145 | 158.0 | 119.0 | 277.0 | 4.6e-05 | 0.057992 | 0.000128 | 0.051122 |
| true | 161 | 36 | 197 | 504.0 | 115.0 | 619.0 | 0.00015 | 0.059237 | 7.5e-05 | 0.050411 |
| honest | 108 | 51 | 159 | 453.0 | 112.0 | 565.0 | 8.4e-05 | 0.057231 | 9.9e-05 | 0.080012 |
| beat | 100 | 52 | 152 | 276.0 | 113.0 | 389.0 | 7.9e-05 | 0.057367 | 0.000127 | 0.036179 |
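
Exact betweenness runs a shortest-path search from every node, which can be slow on a graph of this size. NetworkX's `betweenness_centrality` also accepts a `k` parameter that samples `k` source nodes instead. The sketch below shows that option on a small random graph; the stand-in graph `demo`, the value `k=10`, and the seeds are illustrative assumptions, not settings used in this notebook.

```python
import networkx as nx

# Hypothetical stand-in graph; in the notebook, G would take its place.
demo = nx.gnp_random_graph(30, 0.2, seed=1, directed=True)

# Sampled estimate: only k source nodes are used (k=10 is an assumed value).
approx_betweenness = nx.betweenness_centrality(demo, k=10, seed=42)

# Exact computation over all source nodes, for comparison.
exact_betweenness = nx.betweenness_centrality(demo)

# Compare the estimate with the exact score for the top exact node.
top_node = max(exact_betweenness, key=exact_betweenness.get)
print(top_node, exact_betweenness[top_node], approx_betweenness[top_node])
```

Sampled scores are estimates, so rankings among low-betweenness nodes may differ from the exact result.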


In [3]:
# Save results
output_path = DATA / "symbolic_metrics.csv"
metrics_df.to_csv(output_path)
print(f"Symbolic metrics saved to: {output_path}")

Symbolic metrics saved to: /Users/demetriosagourakis/Library/Mobile Documents/com~apple~CloudDocs/Biologia Fractal/entropic-symbolic-society/NHB_Symbolic_Mainfold/data/symbolic_metrics.csv
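
`to_csv` writes the node labels as a first column with an empty header, so reading the file back without `index_col=0` yields a column named `Unnamed: 0`. A minimal sketch of reloading the table with the labels restored as the index follows; it uses an in-memory excerpt rather than the real file, and the index name `node` is an assumption chosen for readability.

```python
import io

import pandas as pd

# In-memory excerpt standing in for symbolic_metrics.csv (first columns only).
csv_text = (
    ",in_degree,out_degree,total_degree\n"
    "there,84,36,120\n"
    "position,83,62,145\n"
)

# index_col=0 restores the node labels as the DataFrame index.
reloaded = pd.read_csv(io.StringIO(csv_text), index_col=0)
reloaded.index.name = "node"  # assumed name; the saved file leaves it blank

print(reloaded.loc["position", "total_degree"])
```

With the real file, `io.StringIO(csv_text)` would be replaced by the `output_path` defined above.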


## ✅ Notebook Summary

In this notebook, we:

- Loaded the symbolic graph created from the SWOW-EN dataset,
- Calculated key network metrics for each node:
  - In/Out degree and strength,
  - PageRank,
  - Closeness and betweenness centrality,
  - Clustering coefficient (undirected projection),
- Saved the results as `symbolic_metrics.csv` in the data folder.
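
To make the difference between degree and strength concrete, here is a small sketch on a hypothetical toy graph (the words and weights are invented for illustration and are not SWOW-EN data). It uses the same NetworkX calls as the notebook.

```python
import networkx as nx

# Hypothetical cue -> response edges with invented weights.
toy = nx.DiGraph()
toy.add_weighted_edges_from([
    ("dog", "cat", 3.0),
    ("cat", "dog", 2.0),
    ("dog", "bone", 1.0),
    ("bone", "cat", 4.0),
])

# Degree counts incoming edges; strength sums their weights.
in_degree = dict(toy.in_degree())
in_strength = dict(toy.in_degree(weight="weight"))

# The undirected projection merges dog->cat and cat->dog into one edge,
# leaving a triangle between the three nodes.
clustering = nx.clustering(toy.to_undirected())

print(in_degree["cat"], in_strength["cat"], clustering["cat"])
```

Here `cat` has two incoming edges with a combined weight of 7.0, and every node has clustering 1.0 because the projection is a single triangle.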

---

## ▶️ Next Step

Proceed to **Notebook 03 – Generate Embeddings**, where we will compute vector representations (embeddings) of each node based on the symbolic topology.
