
Closeness centrality running without interruption #64

Closed
sopeKhadim opened this issue Apr 25, 2020 · 9 comments

Comments

@sopeKhadim

sopeKhadim commented Apr 25, 2020

Hi,
I am using Teneto to calculate closeness centrality. My code has been running for more than two days without finishing. What could the problem be?
Resources

  • Processor(s): 32
  • Threads per core: 2
  • Cores per socket: 8
  • Model: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
  • Memory: 130 GB

Code:

import pandas as pd
import teneto as tn
import numpy as np
from teneto import TemporalNetwork

filename = 'data/Sociopatterns/primaryschool.csv'
df_PS = pd.read_csv(filename, names = ['t', 'i', 'j', 'Ci', 'Cj'], header=None, delimiter='\t')
df_PS_Day = df_PS[ df_PS['t'] < 63000] 
df_PS_final = df_PS_Day[[ 't', 'i', 'j']]
df_PS_final
t i j
0 31220 1558 1567
1 31220 1560 1570
2 31220 1567 1574
3 31220 1632 1818
4 31220 1632 1866
... ... ... ...
60618 62300 1719 1859
60619 62300 1720 1843
60620 62300 1739 1872
60621 62300 1767 1787
60622 62300 1800 1820

60623 rows × 3 columns

# The time resolution between two signals is 20 seconds.
# This function relabels the nodes and normalizes the timestamps.
def contact_graph(data, cfg=None, resolution=0): 
    
    nodes = pd.concat([data['i'], data['j']]) 
    nodes = nodes.sort_values()
    nodes = nodes.unique()
    nb_nodes = len(nodes)
    
    ## changes the label of each node to be the index of np.array
    keys = nodes
    values = list(range(nb_nodes))
    nodeslabels = dict(zip(keys, values))
   
    if resolution == 0 :
        times = data['t']
        times = times.sort_values()
        times = times.unique()
        n_times = len(times)
    else:
        times = range(min(data['t']), max(data['t'])+resolution, resolution)
        n_times = len(times)
    
    ## changes the time counter 
    keys_times = times
    items_times = list(range(n_times))
    timesCounter  = dict(zip(keys_times,items_times ))          
    
    df = data.copy()
    frame =pd.DataFrame()
    frame['i'] = df['i'].map(nodeslabels)
    frame['j'] = df['j'].map(nodeslabels)
    frame['t'] = df['t'].map(timesCounter)
                  
    contacts = frame[['i', 'j', 't']].to_numpy()
    contacts = contacts[contacts[:, 2].argsort()]
    
    return contacts
contacts_PS = contact_graph(df_PS_final, resolution=20)
contacts_PS[:50]
array([[ 58,  63,   0],
       [ 59,  64,   0],
       [ 63,  66,   0],
       [ 85, 185,   0],
       [ 85, 209,   0],
       [101, 114,   0],
       [186, 194,   0],
       [186, 209,   0],
       [186, 194,   1],
       [183, 189,   1],
       [141, 187,   1],
       [101, 114,   1],
       [ 85, 185,   1],
       [ 63,  66,   1],
       [ 58,  63,   1],
       [ 85, 209,   1],
       [101, 114,   2],
       [179, 191,   2],
       [179, 181,   2],
       [159, 168,   2],
       [141, 187,   2],
       [ 85, 209,   2],
       [ 58,  62,   2],
       [ 62,  66,   2],
       [ 62,  63,   2],
       [ 59,  64,   2],
       [ 58,  63,   2],
       [ 63,  66,   2],
       [ 85, 186,   3],
       [179, 181,   3],
       [150, 152,   3],
       [101, 114,   3],
       [ 85, 185,   3],
       [ 85, 209,   3],
       [ 62,  66,   3],
       [ 62,  63,   3],
       [ 58,  63,   3],
       [ 58,  62,   3],
       [ 36,  51,   3],
       [ 63,  66,   3],
       [183, 189,   4],
       [150, 153,   4],
       [101, 114,   4],
       [ 85, 209,   4],
       [ 63,  66,   4],
       [ 58,  63,   4],
       [ 17,  39,   4],
       [ 80,  87,   4],
       [162, 164,   5],
       [158, 169,   5]])
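As an aside, the row-by-row tuple construction in contact_graph above is itself slow for large frames. A vectorized sketch of the same relabelling (my own, not part of Teneto; equivalent when every timestamp is a multiple of the resolution) could look like:

```python
import numpy as np
import pandas as pd

# Hypothetical sample in the same (t, i, j) layout as the data above.
df = pd.DataFrame({'t': [31220, 31220, 31240],
                   'i': [1558, 1560, 1558],
                   'j': [1567, 1570, 1570]})

def contact_graph_vectorized(data, resolution=20):
    """Same relabelling as contact_graph, without the Python loop."""
    # Map node labels to consecutive integers shared by both endpoints.
    nodes = np.unique(pd.concat([data['i'], data['j']]))
    i = np.searchsorted(nodes, data['i'])
    j = np.searchsorted(nodes, data['j'])
    # Convert timestamps to frame indices at the given resolution.
    t = (data['t'] - data['t'].min()) // resolution
    contacts = np.column_stack([i, j, t])
    # Stable sort preserves the input order within each time step.
    return contacts[contacts[:, 2].argsort(kind='stable')]

print(contact_graph_vectorized(df))
```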
tnetHS = TemporalNetwork(from_edgelist=list(contacts_PS))
tnetHS.network.head()
i j t
0 58 63 0
1 59 64 0
2 63 66 0
3 85 185 0
4 85 209 0
closeness = tn.networkmeasures.shortest_temporal_path(tnetHS)
@wiheto
Owner

wiheto commented Apr 26, 2020

Hi,

Unfortunately, shortest_temporal_paths is quite slow in its current implementation, especially for larger networks (the runtime grows exponentially with size). I've only really used it on networks of about 200 nodes and 400 time points. Increasing its speed is on the todo list.
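For reference, single-source earliest-arrival ("foremost") temporal paths can be computed in one pass over time-ordered contacts. This is a minimal sketch of that standard algorithm, not Teneto's implementation, assuming undirected contacts as in the data above:

```python
import numpy as np

def earliest_arrival(contacts, n_nodes, source):
    """Earliest arrival time at every node from `source`, given
    undirected contacts (u, v, t) sorted by time t."""
    arrival = np.full(n_nodes, np.inf)
    arrival[source] = -np.inf  # the source is reachable before any contact
    for u, v, t in contacts:
        # A contact at time t extends a path if one endpoint was
        # already reached at or before t.
        if arrival[u] <= t < arrival[v]:
            arrival[v] = t
        elif arrival[v] <= t < arrival[u]:
            arrival[u] = t
    return arrival

# Tiny example: node 2 is first reached at t=1 via 0 -> 1 -> 2.
print(earliest_arrival([(0, 1, 0), (1, 2, 1), (0, 2, 5)], 3, 0))
```

Since each contact is touched once, this scales linearly in the number of contacts per source node.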

@wiheto
Owner

wiheto commented Apr 26, 2020

So the code should finish eventually, but I can't say how long it will take.

@sopeKhadim
Author

Thank you for your response.
Can we hope that you will soon improve the performance of the shortest-path code for large temporal networks?
I think other measures, such as betweenness, depend on it.
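For what it's worth, once pairwise temporal distances are available the centrality step itself is cheap. Under one common definition (inverse mean temporal distance; Teneto's own temporal_closeness_centrality may differ in details), closeness is just:

```python
import numpy as np

# Hypothetical matrix of pairwise temporal distances D[u, v]
# (e.g. derived from a shortest temporal path routine).
D = np.array([[0., 1., 2.],
              [1., 0., 1.],
              [2., 1., 0.]])

n = D.shape[0]
# Closeness of node u: (n - 1) divided by the sum of distances from u.
closeness = (n - 1) / D.sum(axis=1)
print(closeness)
```

So the bottleneck really is the shortest temporal path computation, not the closeness formula.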

@wiheto
Owner

wiheto commented Apr 30, 2020

Can we hope that you will soon improve the performance of the shortest-path code for large temporal networks?

I'll bump it up the priority queue, because it is something I've wanted to fix for some time.

@zhao-snoe

Have you solved your problem? At the moment I have run into the same problem in my calculations.

@zhao-snoe

In your article, I saw that you used more than 200 nodes and more than 400 time points. How long did it take to calculate the closeness centrality? That may give a useful time reference for my current calculation: I have 60 nodes, 51 networks, and 10% sparsity, and my code has been running for more than four days without stopping.

@wiheto
Owner

wiheto commented Dec 23, 2020

Development on the toolbox stalled this year for other reasons.

The original article used an older version of this function, which was quick but could not handle larger networks. The speed issue was introduced into the shortest temporal paths function when the rest of the toolbox was modified to handle larger networks.

You can use an older version of the code, which should be quicker; I think it can be found here: https://github.com/wiheto/teneto/releases/tag/0.3.5

Until then: despite being a high priority when developing this toolbox, this is currently only something I do in my spare time due to other obligations. I cannot give a time estimate for when I can push the next version, in which the shortest path estimation will be substantially quicker.

@zhao-snoe

zhao-snoe commented Dec 24, 2020 via email

@wiheto
Owner

wiheto commented Feb 17, 2021

So now I have time to solve this issue. I'm gathering all the issues regarding speed into one issue. Follow #74 for more info about this.

@wiheto wiheto closed this as completed Feb 17, 2021