#%%
import markdown
from IPython.display import display, HTML


def md(text):
    # Render a markdown string as HTML in the interactive output
    display(HTML(markdown.markdown(text + "\n")))

#%%
md("""
# 1.8 Exploration: Multi-order analysis of [paths and time-stamped social networks](https://github.com/IngoScholtes/kdd2018-tutorial/tree/master/data)

**Ingo Scholtes**
Data Analytics Group
Department of Informatics (IfI)
University of Zurich

**August 22 2018**

In the last (open-ended) exploration, you get the chance to apply multi-order representations in the analysis of real data. In addition to the pathway data from session 1, we will consider data that we provide in the SQLite database temporal_networks.db. You can see which tables it contains by querying the metadata table:
""")

#%% In [2]
import pathpy as pp
import sqlite3

con = sqlite3.connect('data/temporal_networks.db')
con.row_factory = sqlite3.Row

# List the tag and name of every data set in the database
for row in con.execute('SELECT * FROM metadata'):
    print('{0} \t\t {1}'.format(row['tag'], row['name']))

#%%
md("""
Details on the origin of these data can be found [here](https://github.com/IngoScholtes/kdd2018-tutorial/tree/master/data).

Below, we include boilerplate code to load these data sets into the TemporalNetwork class in pathpy:
""")

#%% In [3]
table = 'manufacturing_email'

# Check whether network is directed or not
directed_network = bool(con.execute('SELECT directed FROM metadata WHERE tag = ?', (table,)).fetchone()['directed'])

# Load all time-stamped edges of the chosen table into a temporal network
t = pp.TemporalNetwork.from_sqlite(con.execute('SELECT source, target, time FROM ' + table), directed=directed_network)
print(t)

#%%
md("""
Using these data and the methods introduced in our tutorial, we suggest studying the following problems (in ascending order of difficulty):

- Generate higher-order visualisations of the US Flights and London Tube data and visually compare the graph layouts calculated for the first-order and optimal-order models.
- Use the MultiOrderModel class to learn the optimal order of a temporal network. How does the detected optimal order change with the time scale $\\delta$ that you use in the extraction of causal paths?
- Use the MultiOrderModel class to learn the optimal order of the London Tube data set. How does the detected optimal order compare to the prediction performance studied in exploration 1.4?
- Study the change in the algebraic connectivity between the second-order model and the second-order null model for (i) a temporal network data set and (ii) the US Flights data.
- Perform a spectral clustering of a dynamic social network based on the Laplacian of higher-order networks at different orders. How does the clustering differ from a first-order clustering?

Again, these are only suggestions and you are welcome to use the time to study other data sets or questions that come to mind. We'll be happy to help you with the analysis.
""")
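The metadata lookup and edge query from the boilerplate can be tried without the tutorial database. The following is a minimal sketch, assuming an in-memory SQLite database whose rows are hypothetical and whose columns are only those the code above reads (`tag`, `name`, `directed`, and `source`, `target`, `time`). The helper `load_edges` is an illustrative name, not part of pathpy.

```python
#%%
import sqlite3

# In-memory stand-in for data/temporal_networks.db; all rows are hypothetical
# and only mimic the columns read by the tutorial code.
con = sqlite3.connect(':memory:')
con.row_factory = sqlite3.Row
con.execute('CREATE TABLE metadata (tag TEXT, name TEXT, directed INTEGER)')
con.execute('INSERT INTO metadata VALUES (?, ?, ?)',
            ('example_directed', 'Hypothetical directed network', 1))
con.execute('CREATE TABLE example_directed (source TEXT, target TEXT, time INTEGER)')
con.executemany('INSERT INTO example_directed VALUES (?, ?, ?)',
                [('a', 'b', 1), ('b', 'c', 2), ('a', 'c', 5)])


def load_edges(con, table):
    """Return (directed, [(source, target, time), ...]) for a table listed in metadata."""
    # The tag is bound as a query parameter; an unknown tag yields no row.
    row = con.execute('SELECT directed FROM metadata WHERE tag = ?', (table,)).fetchone()
    if row is None:
        raise ValueError('unknown table tag: {0}'.format(table))
    # Table names cannot be bound as parameters, so the name is interpolated
    # only after it has been confirmed against the metadata table.
    cursor = con.execute('SELECT source, target, time FROM "{0}" ORDER BY time'.format(table))
    edges = [(r['source'], r['target'], r['time']) for r in cursor]
    return bool(row['directed']), edges


directed, edges = load_edges(con, 'example_directed')
print(directed, edges)
```

Checking the tag against the metadata table first means a mistyped table name fails with a clear error rather than an SQL error from the second query.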
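For the algebraic connectivity problem, it may help to recall the quantity on a small example first. The sketch below assumes the standard combinatorial Laplacian $L = D - A$ of a connected, undirected, unweighted graph and computes its second-smallest eigenvalue by power iteration in pure Python. It is an illustration of the concept only, not the routine pathpy applies to higher-order models, and the function name `algebraic_connectivity` is hypothetical here.

```python
#%%
def algebraic_connectivity(n, edges, iterations=500):
    """Second-smallest Laplacian eigenvalue of a connected undirected graph on nodes 0..n-1."""
    # Build the combinatorial Laplacian L = D - A.
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    # All eigenvalues of L lie in [0, 2 * max degree], so M = shift * I - L has
    # the non-negative eigenvalues shift - lambda_i.
    shift = 2.0 * max(L[i][i] for i in range(n))

    def project(x):
        # Remove the component along the all-ones vector (eigenvalue 0 of L).
        mean = sum(x) / n
        return [xi - mean for xi in x]

    def normalise(x):
        norm = sum(xi * xi for xi in x) ** 0.5
        return [xi / norm for xi in x]

    # Power iteration on M, restricted to vectors orthogonal to the all-ones
    # vector, converges to the eigenvector of the second-smallest eigenvalue of L.
    x = normalise(project([float(i) for i in range(n)]))
    for _ in range(iterations):
        y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = normalise(project(y))
    # Rayleigh quotient x^T L x of the converged unit vector.
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


# A path on four nodes versus the cycle obtained by closing it with one edge.
path_value = algebraic_connectivity(4, [(0, 1), (1, 2), (2, 3)])
cycle_value = algebraic_connectivity(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(path_value, cycle_value)
```

For the path the value is $2 - \sqrt{2} \approx 0.586$ and for the cycle it is $2$, so closing the path into a cycle makes the graph better connected in the spectral sense. The same kind of comparison applies to a second-order model and its null model.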