## The purpose of this file
Compute summary statistics of the flattened temporally aggregated network and save them.<br>The flattened network is built by projecting the bipartite graph that covers the whole time span onto the hashtag graph.<br>**Note** that the very last time interval, in which the last interaction occurred, is excluded.<br>The resulting network should be identical to the temporally aggregated network (**see section 2**) with all edge weights set to one.<br>The nodes to be removed should be determined in **section 3**.
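
The projection described above can be illustrated on a toy example. This is only a sketch using `networkx` directly; the post names, the hashtags, and the variable `G_toy` are hypothetical, and the actual `toolbox` implementation may differ.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical toy data: each post carries a list of hashtags.
posts = {
    "p1": ["a", "b", "c"],
    "p2": ["a", "b"],
    "p3": ["c", "d"],
}

# Bipartite post-hashtag graph covering the whole time span.
B = nx.Graph()
hashtags = {h for tags in posts.values() for h in tags}
B.add_nodes_from(posts, bipartite=0)
B.add_nodes_from(hashtags, bipartite=1)
B.add_edges_from((p, h) for p, tags in posts.items() for h in tags)

# Unweighted projection onto the hashtag side: two hashtags are linked
# if they co-occur in at least one post. Repeated co-occurrences
# ("a" and "b" appear together in p1 and p2) still give a single edge,
# which corresponds to setting every edge weight to one.
G_toy = bipartite.projected_graph(B, hashtags)
print(sorted(tuple(sorted(e)) for e in G_toy.edges()))
```

Because the projection is unweighted, it discards how often and when two hashtags co-occurred, which is why it is expected to match the weighted aggregate only after its weights are set to one.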

In [1]:
import sys
sys.path.append('../')
import networkx as nx
import toolbox as tb
import pandas as pd

In [2]:
tag = "momiji"
hashtag = "紅葉"
timespan = "21-29"
file = f"../data/datasets/{tag}/{tag}_{timespan}.pkl"
df = tb.get_dataframe(hashtag, file)

In [3]:
start = "2022-11-22T12:00+09:00"
end = "2022-11-22T23:59+09:00"
timespan = "22-22"
start = pd.to_datetime(start)
end = pd.to_datetime(end)
DF = df[(start <= df.index) & (df.index <= end)]

In [4]:
tau, snapshots = tb.get_snapshots_closed_intervals(DF, delta='minutes=10')
#tau, snapshots = tb.get_snapshots_closed_intervals(DF, 'hours=1')

In [5]:
G = tb.get_flattened_temporally_aggregated_network(DF, snapshots)

In [6]:
G.remove_node(hashtag)
print(f"Isolates after removal of the searchtag: \n{list(nx.isolates(G))}")
print("These isolate nodes are to be removed.")
G.remove_nodes_from(list(nx.isolates(G)))
assert len(list(nx.isolates(G))) == 0, "There is at least one isolate node left."
print("============================================")
print("The isolate nodes were successfully deleted.")

Isolates after removal of the searchtag: 
['冬がはじまるよ', '北海道大学', '霞間ケ渓', '熊本県阿蘇市赤水', '岐阜女子大学', 'スポーツスターｓ', '航空公園', '風車', '東京江戸たてもの園', '静寂の中で', '大宮氷川神社参道', '上田城跡公園', '清水坂公園', 'まだ真っ赤じゃなかった', 'いばらきガーデンオーチャードツーリズム']
These isolate nodes are to be removed.
The isolate nodes were successfully deleted.
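
Isolates appear here because some hashtags co-occurred only with the search tag itself. A minimal sketch on a hypothetical toy graph (the node names and the variable `H` are illustrative assumptions, not part of the dataset):

```python
import networkx as nx

# Hypothetical toy graph: "search" plays the role of the search tag.
H = nx.Graph()
H.add_edges_from([
    ("search", "x"),
    ("search", "y"),
    ("search", "z"),
    ("x", "y"),
])

# Removing the hub leaves "z" without neighbours, since its only
# edge went through "search".
H.remove_node("search")
isolates = list(nx.isolates(H))
print(isolates)

# Drop the isolates so that they do not count towards N.
H.remove_nodes_from(isolates)
print(sorted(H.nodes()))
```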


In [7]:
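N = G.number_of_nodes()  # number of nodes
L = G.number_of_edges()  # number of edges
braket_k = 2 * L / N  # mean degree <k> of an undirected graph
braket_C = nx.average_clustering(G)  # average clustering coefficient <C>
density = nx.density(G)  # 2L / (N(N-1)) for an undirected graph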

In [8]:
fname = f'../data/temporal_network_summary_statistics/{tag}/ssftn_{timespan}_{tau}.pkl'
statistics = pd.DataFrame({r"$N$": N, r"$L$": L, r"<k>":braket_k, r"<C>":braket_C, r"$Density$": density}, index=[hashtag])
statistics.to_pickle(fname)
print(fname)
statistics

../data/temporal_network_summary_statistics/momiji/ssftn_22-22_66.pkl


| | $N$ | $L$ | `<k>` | `<C>` | $Density$ |
|---|---|---|---|---|---|
| 紅葉 | 13710 | 226864 | 33.094675 | 0.884943 | 0.002414 |
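
As a quick consistency check, the mean degree and the density in the table can be recomputed from $N$ and $L$ alone, using the standard relations for a simple undirected graph, $\langle k \rangle = 2L/N$ and density $= 2L/(N(N-1))$. The variable names below are illustrative.

```python
# Values of N and L taken from the table above.
N, L = 13710, 226864

# Mean degree of an undirected graph: each edge contributes to two nodes.
mean_degree = 2 * L / N

# Density of a simple undirected graph: realised edges over possible edges.
density = 2 * L / (N * (N - 1))

print(round(mean_degree, 6), round(density, 6))
```

Both values agree with the table to the printed precision. The average clustering coefficient cannot be checked this way, since it depends on the full edge list.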
