
DiffuGreedy Influence Maximization

Code and instructions to reproduce the analysis of the paper "DiffuGreedy: An Influence Maximization Algorithm Based on Diffusion Cascades".

Folder structure

Root folders: Code, Data, Figures

Code: Contains the contents of this folder and the code of NETRATE. You will also need the code for IMM and SIMPATH. PMIA.py and runIAC.py are taken from a Python implementation of PMIA.

Data -> Init Data: Contains the cascades and the follower network from Sina Weibo, i.e. total.txt and graph_170w_1month.txt
Data -> Logs: Empty folder
Data -> Netrate: Empty folder
Data -> Seeds: Empty folder
Data -> Results: Empty folder
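
The folder layout can be prepared with a short script before running anything else. This is a minimal sketch: the function name create_layout and the root argument are illustrative and not part of the repository, and the input files still have to be copied into Data -> Init Data by hand.

```python
import os


def create_layout(root):
    """Create the folder layout described in this README under `root`."""
    folders = [
        "Code",
        "Figures",
        os.path.join("Data", "Init Data"),
        os.path.join("Data", "Logs"),
        os.path.join("Data", "Netrate"),
        os.path.join("Data", "Seeds"),
        os.path.join("Data", "Results"),
    ]
    created = []
    for folder in folders:
        path = os.path.join(root, folder)
        # Existing folders are left untouched.
        if not os.path.isdir(path):
            os.makedirs(path)
        created.append(path)
    return created
```

Calling create_layout(".") from the intended root creates any folder that is missing and leaves existing ones as they are.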

Requirements

gcc version >=4.7

MATLAB 2017b

Python 2.7, packages: igraph, pandas, numpy, networkx

R packages: ggplot2, reshape2

Code

The scripts run in the order given by the number prefix in their file names.
Below is an explanation of how each influence maximization technique is implemented through the scripts.

Diffusion Greedy

  • _2_diffusion_greedy.py runs diffusion-based influence maximization using the training cascades.
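
To illustrate the general idea of selecting seeds greedily from observed cascades, here is a minimal sketch. It is not the code of _2_diffusion_greedy.py: the input format (pairs of cascade initiator and participants), the function name greedy_seeds, and the use of plain set coverage as the marginal gain are all assumptions made for illustration.

```python
def greedy_seeds(cascades, k):
    """Pick up to k initiators that together cover the most distinct nodes.

    cascades: list of (initiator, participants) pairs (assumed format).
    """
    # Collect the distinct nodes reached by each initiator over all its cascades.
    reach = {}
    for initiator, participants in cascades:
        reach.setdefault(initiator, set()).update(participants)

    covered = set()
    seeds = []
    for _ in range(k):
        best, best_gain = None, 0
        # Sorted iteration makes tie-breaking deterministic.
        for node in sorted(reach):
            if node in seeds:
                continue
            gain = len(reach[node] - covered)  # marginal gain of adding node
            if gain > best_gain:
                best, best_gain = node, gain
        if best is None:  # no remaining initiator adds new nodes
            break
        seeds.append(best)
        covered |= reach[best]
    return seeds
```

Each round adds the initiator whose cascades reach the most nodes not yet covered, so an initiator whose audience overlaps with earlier seeds is ranked lower than its raw cascade size would suggest.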

Ranking by K-core decomposition

  • _2_train.py computes the k-core decomposition of the active graph and stores each node's core number in kcores.csv.
  • _3_rank_nodes.py derives the top nodes from these core numbers and stores them in the Seeds folder.
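
The two steps above can be sketched in plain Python as follows. This is an illustration only: the repository scripts rely on graph libraries and their own file formats, while the names core_numbers and top_nodes and the adjacency dictionary input are assumptions. The peeling loop is a simple quadratic version, so it suits small graphs rather than the full Weibo network.

```python
def core_numbers(adj):
    """Core number of every node by repeatedly peeling a minimum-degree node.

    adj: dict mapping node -> set of neighbours (undirected, symmetric).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=lambda n: degree[n])
        # The core number never decreases along the peeling order.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def top_nodes(core, n):
    """Return the n nodes with the highest core number (ties broken by node id)."""
    return sorted(core, key=lambda v: (-core[v], v))[:n]
```

For a triangle with one pendant node attached, the triangle nodes get core number 2 and the pendant node gets 1, so the triangle nodes are ranked first.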

Influence Maximization via Martingales

  • _2_train.py extracts the active network for the first 25 days and stores it in train_network.pickle.
  • _3_extract_weighted_cascade.py adds edge weights to the network based on the weighted cascade model and stores the result in follower_weighted.txt. It also creates the attribute file required by the IMM algorithm.
  • Use the IMM code to produce the seed set of follower_weighted.txt and store it in a file with the same name in Data\Seeds.
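
In the weighted cascade model, the weight of an edge into node v is 1 divided by the in-degree of v. A minimal sketch of that weighting is shown below; the edge-list input, the convention that (u, v) means u can influence v, and the function name weighted_cascade are assumptions and do not reflect the file format used by _3_extract_weighted_cascade.py.

```python
from collections import Counter


def weighted_cascade(edges):
    """Attach weighted cascade probabilities to a list of unique directed edges.

    edges: list of (u, v) pairs, read as "u can influence v" (assumed convention).
    """
    in_degree = Counter(v for _, v in edges)
    # Every edge into v gets the same weight, 1 / in-degree(v).
    return [(u, v, 1.0 / in_degree[v]) for u, v in edges]
```

With this weighting the incoming weights of every node sum to 1, so a node with many influencers is less affected by each of them individually.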

PMIA on the Diffusion-based Network

  • _4_reform_cascades.py uses top_nodes.csv, created by _3_rank_nodes.py, to filter the training cascades so that they include only the top nodes by degree and follow the format required by NETRATE. The cascade file is stored in Data\Netrate.
  • _5_call_netrate.m calls the NETRATE algorithm for each cascade file and stores the resulting adjacency list in Data\Netrate.
  • _6_run_pmia.py creates a network out of the adjacency matrix, weighs it based on weighted cascade and computes NETRATE's accuracy in retrieving follow relationships. It then uses PMIA to derive the seed set.
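
One way to measure how well an inferred network recovers follow relationships is to compare its edges with the follower edges and report precision and recall. The sketch below does that; the metric actually computed by _6_run_pmia.py is not specified here, so the choice of precision and recall, the function name edge_accuracy, and the edge-pair inputs are assumptions.

```python
def edge_accuracy(inferred_edges, follow_edges):
    """Precision and recall of inferred directed edges against follower edges."""
    inferred = set(inferred_edges)
    truth = set(follow_edges)
    hits = len(inferred & truth)
    # Guard against empty inputs to avoid division by zero.
    precision = hits / float(len(inferred)) if inferred else 0.0
    recall = hits / float(len(truth)) if truth else 0.0
    return precision, recall
```

Both edge lists must use the same direction convention, otherwise a correctly inferred relationship is counted as a miss.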

SIMPATH on the Data-based weighted Network

  • _2_train.py extracts the active network for the first 25 days and stores it in train_network.pickle.
  • _3_extract_bernouli_and_time.py extracts three weighted networks, with edge weights based on influence strength (known in the literature as Bernoulli IC), the inverse of the average influence delay, and the product of the two.
  • Use the SIMPATH code and the .inf files from the previous step to derive the seed sets. Store each seed set in Data\Seeds, in a text file with the same name as the corresponding .inf file and the format "seed1 seed2 seed3 etc..".
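
The seed file format described above, a single line of space-separated seed ids, can be written and read back with a few lines of Python. This is a sketch: the helper names write_seeds and read_seeds are hypothetical, and the output path is left to the caller because it depends on the name of the .inf file.

```python
def write_seeds(path, seeds):
    """Write seeds as a single line: "seed1 seed2 seed3"."""
    with open(path, "w") as handle:
        handle.write(" ".join(str(seed) for seed in seeds))


def read_seeds(path):
    """Read a seed file back into a list of id strings."""
    with open(path) as handle:
        return handle.read().split()
```

Seed ids are returned as strings, so convert them with int() if the evaluation code expects numeric node ids.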