<b>This notebook converts a mailing list (or a set of mailing lists) into an interaction network</b>

What it does:
- It creates a network of interactions between the senders and receivers of emails on one or more mailing lists.
- It generates a .gexf file that can be imported into Gephi for visualization and analysis.

Parameters to set:
- The notebook can read one or more mailing lists, depending on how many archive names are listed in the 'archives_names' variable; the networks are aggregated across mailing lists.
- The network can be filtered by date: set the variables 'date_from' and 'date_to' to a date range consistent with the data.
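
To make the idea of an interaction network concrete, here is a minimal, standard-library sketch of the kind of counting involved. It is not BigBang's implementation: the toy `replies` list and the names `edge_weights` and `sent_counts` are hypothetical, and it assumes each interaction is a reply from one person to another.

```python
from collections import Counter

# Hypothetical toy data: one (sender, person replied to) pair per reply.
replies = [
    ("alice", "bob"),
    ("alice", "bob"),
    ("bob", "alice"),
    ("carol", "alice"),
]

# Directed edge weights: how many times each sender replied to each receiver.
edge_weights = Counter(replies)

# Per-node activity: how many replies each sender wrote.
sent_counts = Counter(sender for sender, _ in replies)

for (sender, receiver), weight in sorted(edge_weights.items()):
    print(f"{sender} -> {receiver}: weight {weight}")
```

Each distinct (sender, receiver) pair becomes one directed edge, and repeated replies only increase its weight.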


In [1]:
%matplotlib inline

In [2]:
from bigbang.archive import Archive
from bigbang.archive import load as load_archive
import bigbang.parse as parse
import bigbang.graph as graph
import bigbang.mailman as mailman
import bigbang.process as process
import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd
from pprint import pprint as pp
import pytz
import os 

In [3]:
#Insert a list of archive names
archives_names = ["6lo"]

cwd = os.getcwd()  

archives_paths = list()
for archive_name in archives_names:
    archives_paths.append('../../archives/'+archive_name+'.csv')
    

archives_list = [load_archive(archive_path).data for archive_path in archives_paths]
    
archives = Archive(pd.concat(archives_list))

archives_data = archives.data

Set a valid date range for building the network.

In [4]:
#The oldest and most recent dates across the loaded mailing lists are printed, so that you don't set an invalid date range
print(archives_data['Date'].min())
print(archives_data['Date'].max())

2013-05-24 18:18:40
2018-03-16 12:15:45


In [5]:
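#set the date range
#pd.datetime was removed from recent pandas releases; pd.Timestamp builds the same UTC-aware bounds
#(if the 'Date' column is timezone-naive, drop the tz argument so the comparison below stays valid)
date_from = pd.Timestamp('2000-11-01', tz='UTC')
date_to = pd.Timestamp('2111-12-01', tz='UTC')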
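
The date bounds above are timezone-aware (UTC), while the dates printed earlier show no offset. If the 'Date' column turns out to be timezone-naive, comparing it with an aware bound fails in recent pandas versions. The sketch below uses hypothetical toy dates and assumes the naive values are already expressed in UTC.

```python
import pandas as pd

# Hypothetical toy dates without timezone information.
naive = pd.Series(pd.to_datetime(['2013-05-24 18:18:40', '2018-03-16 12:15:45']))
date_from = pd.Timestamp('2000-11-01', tz='UTC')

# Mixing naive and aware values in an ordering comparison raises TypeError
# in recent pandas versions, so it is guarded here.
try:
    naive > date_from
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False

# Assumption: the naive timestamps are already in UTC, so they are only labelled.
aware = naive.dt.tz_localize('UTC')
all_after = bool((aware > date_from).all())
print(mixed_comparison_ok, all_after)
```

If the stored times are in another zone, localize to that zone first and then convert with `tz_convert('UTC')`.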

Filter data according to date frame and export to .gexf file

In [6]:
def filter_by_date(df,d_from,d_to):
    return df[(df['Date'] > d_from) & (df['Date'] < d_to)]
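
Note that `filter_by_date` uses strict inequalities, so messages stamped exactly at either bound are dropped. The sketch below illustrates this on a hypothetical toy frame (the 'From' values and dates are made up) and shows `Series.between` as one possible inclusive alternative.

```python
import pandas as pd

def filter_by_date(df, d_from, d_to):
    # Same logic as the cell above: both bounds are exclusive.
    return df[(df['Date'] > d_from) & (df['Date'] < d_to)]

# Hypothetical toy messages; the first and last fall exactly on the bounds.
toy = pd.DataFrame({
    'From': ['alice', 'bob', 'carol'],
    'Date': pd.to_datetime(['2013-05-24', '2015-01-01', '2018-03-16'], utc=True),
})

d_from = pd.Timestamp('2013-05-24', tz='UTC')
d_to = pd.Timestamp('2018-03-16', tz='UTC')

strict = filter_by_date(toy, d_from, d_to)          # excludes both boundary rows
inclusive = toy[toy['Date'].between(d_from, d_to)]  # keeps both boundary rows

print(len(strict), len(inclusive))
```

With the wide default range used in this notebook the difference does not matter, but it can when the bounds are copied from the printed minimum and maximum dates.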

In [7]:
#create filtered network
archives_data_filtered = filter_by_date(archives_data, date_from, date_to)
network = graph.messages_to_interaction_graph(archives_data_filtered)

In [8]:
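#export the network in a format that you can open in Gephi

#insert a file name
file_name = 'architecture_discuss_for_gephi.gexf'

#write_gexf returns None, so its result is not assigned back to 'network';
#os.path.join adds the path separator that cwd+file_name would omit
nx.write_gexf(network, os.path.join(cwd, file_name))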
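
One way to sanity-check the export is a round trip: write a small graph to a .gexf file and read it back. The sketch below uses a hypothetical toy graph and a temporary directory; the 'sent' node attribute and 'weight' edge attribute are assumed to mirror what the interaction graph carries.

```python
import os
import tempfile

import networkx as nx

# Hypothetical toy interaction graph.
G = nx.DiGraph()
G.add_node('alice', sent=3)
G.add_node('bob', sent=1)
G.add_edge('alice', 'bob', weight=2)
G.add_edge('bob', 'alice', weight=1)

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, 'toy_network.gexf')
    nx.write_gexf(G, out_path)  # writes the file and returns None
    H = nx.read_gexf(out_path)  # reload to confirm the file is readable

print(H.number_of_nodes(), H.number_of_edges())
```

If the reloaded graph has the expected nodes, edges, and attributes, the same file should open in Gephi.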
    
    

| Node | sent |
|---|---|
| Brian E Carpenter | 17 |
| johui | 3 |
| The IESG | 23 |
| Rene Struik | 11 |
| =?utf-8?B?7LWc7JiB7ZmY?= | 5 |
| Joosang Youn | 1 |
| Thagadur Prakash Shiva | 1 |
| Adrian Farrel | 9 |
| Richard Kelsey | 8 |
| <yoshihiro.ohba@toshiba.co.jp | 4 |
| PA62 | 3 |
| Rahul Jadhav | 8 |
| Ben Campbell | 3 |
| Tengfei Chang | 17 |
| Wang Qin | 1 |
| Tom Taylor | 1 |
| Owen Kirby | 3 |
| Benjamin Damm | 3 |
| peter van der Stok | 31 |
| Dan | 8 |
| <Paul_Koning@Dell.com | 2 |
| Randy Turner | 5 |
| Mahesh Jethanandani | 1 |
| =?UTF-8?Q?David_Fern=C3=A1ndez_Ros?= | 1 |
| Nokia-CTO/Tampere | 2 |
| Ted Lemon | 2 |
| Miguel Angel Rein | |

| Source | Target | Edge id | Weight |
|---|---|---|---|
| Niclas Granqvist | peter van der Stok | 422 | 1 |
| Niclas Granqvist | Cao,Zhen | 423 | 1 |
| Niclas Granqvist | Hannes Tschofenig | 424 | 1 |
| Stephen Farrell | Samita Chakrabarti | 425 | 2 |
| Ines  Robles | Michael Richardson | 426 | 1 |
| Ines  Robles | Gabriel Montenegro | 427 | 1 |
| Ines  Robles | pthubert | 428 | 3 |
| Ines  Robles | Ines  Robles | 429 | 1 |
| Ines  Robles | Carsten Bormann | 430 | 2 |
| Joe Touch | Joe Touch | 431 | 1 |
| Marcin Piotr Pawlowski | Gianluca Rizzo | 432 | 1 |
| Rashid Sangi | Tengfei Chang | 433 | 1 |
| sajjad akbar | YongGeun Hong | 434 | 2 |
| sajjad akbar | Lijo Thomas | 435 | 1 |
| sajjad akbar | internet-drafts@ietf.org | 436 | 1 |
| sajjad akbar | Thomas Watteyne | 437 | 1 |
| sajjad akbar | Martin | 438 | 1 |
| sajjad akbar | Michael Richardson | 439 | 1 |
| sajjad akbar | samita Chakrabarti | 440 | 1 |
| sajjad akbar | Dale R. Worley | 441 | 1 |
| van de Logt, Marco | <teemu.savolainen@nokia.com | 442 | 2 |
| van de Logt, Marco | Hannes Tschofenig | 443 | 1 |
| Cao,Zhen | Samita Chakrabarti | 444 | 1 |
| Cao,Zhen | <teemu.savolainen@nokia.com | 445 | 1 |
| pwetterw | AbdurRashidSangi | 446 | 1 |
| pwetterw | Prof. Diego Dujovne | 447 | 1 |
| Yong-Geun Hong | Kerry Lynn | 448 | 1 |
| Yong-Geun Hong | Alex | | |