Author: Keenan Manpearl

Date: 2024-09-10


This notebook reports the number of nodes, the number of edges, and the edge density
of each network, to help pick the most suitable PecanPy graph implementation.

In [1]:
from pecanpy.graph import AdjlstGraph

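def display_graph_properties(edg_file):
    """Print node count, edge count, and edge density for an edge-list file."""
    g = AdjlstGraph()
    # Load the edge list as a weighted, undirected graph
    g.read(edg_file, weighted=True, directed=False)
    print(f'Number of nodes: {g.num_nodes}')
    print(f'Number of edges: {g.num_edges}')
    print(f'Edge density: {g.density}')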
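
The edge density printed by this function can be cross-checked from the node and edge counts. The sketch below assumes that density equals the edge count divided by the number of ordered node pairs, n * (n - 1). That formula is inferred from the values printed in this notebook, not confirmed from the PecanPy source, and `edge_density` is a hypothetical helper, not part of the library.

```python
def edge_density(num_nodes: int, num_edges: int) -> float:
    """Hypothetical helper: edge count over the number of ordered node pairs."""
    if num_nodes < 2:
        return 0.0  # no node pairs exist, so report zero density
    return num_edges / (num_nodes * (num_nodes - 1))


# Counts reported for fold 1 of the tau-filtered network in this notebook
print(edge_density(14816, 2408958))
```

If the assumption holds, the reported figure is either a directed-style density or the edge count includes each undirected edge in both directions. It is worth checking which before comparing these numbers with densities from other tools.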

In [2]:
for fold in range(1, 6):
    edg_file = f'../data/edg/tau/edge_list_fold_{fold}.tsv'
    print(f'Fold {fold}')
    display_graph_properties(edg_file)
    print()

Fold 1
Number of nodes: 14816
Number of edges: 2408958
Edge density: 0.010974799707552252

Fold 2
Number of nodes: 14586
Number of edges: 2365834
Edge density: 0.011120943291384317

Fold 3
Number of nodes: 14741
Number of edges: 2400564
Edge density: 0.011048132121552078

Fold 4
Number of nodes: 14737
Number of edges: 2395044
Edge density: 0.011028712105120419

Fold 5
Number of nodes: 14585
Number of edges: 2368768
Edge density: 0.011136261960313226

In [3]:
print('Full network')
display_graph_properties('../data/edg/all_features/edge_list_full.tsv')
print()
print('Missingness filter')
display_graph_properties('../data/edg/missingness/edge_list_full.tsv')
print()
print('Tau filter')
display_graph_properties('../data/edg/tau/edge_list_full.tsv')

Full network
Number of nodes: 27531
Number of edges: 5035102
Edge density: 0.006643242027482727

Missingness filter
Number of nodes: 10377
Number of edges: 1487394
Edge density: 0.01381415248077323

Tau filter
Number of nodes: 15053
Number of edges: 2458182
Edge density: 0.010849176209512819
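
The node counts and densities above can be turned into a mode suggestion with a small rule-of-thumb helper. This is only a sketch: `suggest_mode` is a hypothetical function, and the cutoffs (10,000 nodes, 10% density) are assumptions based on the general guidance in the PecanPy README, which should be checked against the current documentation before relying on them. The mode names `PreComp`, `SparseOTF`, and `DenseOTF` are the ones PecanPy uses.

```python
def suggest_mode(num_nodes: int, density: float,
                 node_cutoff: int = 10_000,
                 density_cutoff: float = 0.1) -> str:
    """Hypothetical rule of thumb for choosing a PecanPy mode.

    Both cutoffs are assumptions, not values read from the library.
    """
    if num_nodes <= node_cutoff:
        return "PreComp"    # small network: precomputing transitions is affordable
    if density <= density_cutoff:
        return "SparseOTF"  # large and sparse: compute transitions on the fly
    return "DenseOTF"       # large and dense: on the fly with a dense matrix


# Node counts and densities reported above for the three full networks
networks = {
    "Full network": (27531, 0.006643242027482727),
    "Missingness filter": (10377, 0.01381415248077323),
    "Tau filter": (15053, 0.010849176209512819),
}
for name, (n, d) in networks.items():
    print(f"{name}: {suggest_mode(n, d)}")
```

Under these assumed cutoffs, the missingness-filtered network (10,377 nodes) sits close to the node cutoff, so the suggestion for it is sensitive to the exact threshold chosen.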
