This repository contains the Python code used to generate the results in the paper "A Numerical Evaluation of the Accuracy of Influence Maximization Algorithms" by Kingi, Wang, Shafer et al., which has the following abstract:
We develop an algorithm to compute exact solutions to the influence maximization problem using concepts from reverse influence sampling (RIS). We implement the algorithm using GPU resources to evaluate the empirical accuracy of theoretically-guaranteed greedy and RIS approximate solutions. We find that the approximation algorithms yield solutions that are remarkably close to optimal - usually achieving greater than 99% of the optimal influence spread. These results are consistent across a wide range of network structures.
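For readers unfamiliar with reverse influence sampling, its core primitive is the random reverse reachable (RRR) set: pick a uniformly random root node, then walk edges backwards, keeping each incoming edge independently with the propagation probability. The following is a minimal pure-Python sketch under the independent cascade (IC) model; the function and variable names are hypothetical and this is not the repository's GPU implementation:

```python
import random

def rrr_set(nodes, in_neighbors, p, rng=random):
    """Sample one random reverse reachable (RRR) set under the IC model.

    Choose a uniform random root, then traverse edges in reverse,
    keeping each incoming edge independently with probability p.
    """
    root = rng.choice(nodes)
    visited = {root}
    frontier = [root]
    while frontier:
        node = frontier.pop()
        for pred in in_neighbors.get(node, []):
            if pred not in visited and rng.random() < p:
                visited.add(pred)
                frontier.append(pred)
    return visited

# Toy directed graph with edges 1 -> 2 and 2 -> 3, as in-neighbor lists.
in_nbrs = {2: [1], 3: [2]}
samples = [rrr_set([1, 2, 3], in_nbrs, p=0.5) for _ in range(5)]
```

The RIS idea is that a seed set covering many such sampled RRR sets has, in expectation, a proportionally large influence spread.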
Python File Descriptions:
`function_file.py` defines the various functions required to implement the IM algorithms described in the paper on a GPU architecture.
`1.create_networks.py` is self-contained, and generates the network csv files used in the paper, which are stored in the `network_data` folder. The optional keyword flags are `-n`, the number of nodes in the generated networks, and `-vers`, the number of versions of each graph to create. Run this first, in the usual fashion:

python3 1.create_networks.py -n 100 -vers 10

where the values given above are the default values used in the paper.
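As an illustration of the kind of output `1.create_networks.py` produces, here is a minimal sketch that generates an Erdős–Rényi edge list and writes it to CSV. The function names and CSV columns are hypothetical, and the actual script also covers the SF and WS network types, possibly via a graph library:

```python
import csv
import random

def erdos_renyi_edges(n, p, rng):
    """Generate the undirected edges of an Erdos-Renyi G(n, p) graph:
    each of the n*(n-1)/2 possible edges is included with probability p."""
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if rng.random() < p]

def write_edge_csv(path, edges):
    """Write an edge list to a CSV file with a header row (hypothetical format)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["source", "target"])
        writer.writerows(edges)

# Example: one small ER graph with a fixed seed for reproducibility.
edges = erdos_renyi_edges(10, 0.3, random.Random(1))
```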
`2.run_simulations.py` conducts the analyses. It calls the functions defined in `function_file.py`. Run it from the command line as follows:

python3 2.run_simulations.py -t SF -p 0.01 -k 4 -th 100000 -mcg 100000 -n 100

`-t` is either 'SF', 'ER' or 'WS', depending on which network type is desired;
`-p` specifies the propagation probability;
`-k` the seed set size;
`-th` the number of RRR sets to generate for the RIS procedures;
`-mcg` the number of MC iterations to perform to compute the spread of the IC function; and
`-n` an optional parameter that allows the user to specify a maximum number of graphs to simulate in one run.

This file needs to be run several times (once for each network type and propagation probability combination) to generate the results in the paper, which takes approximately 4 days. The results from these runs are stored in the
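The `-mcg` flag controls a Monte Carlo estimate of influence spread under the IC model: run many independent cascades from the seed set and average the number of activated nodes. A minimal pure-Python sketch of that estimate (hypothetical names; the repository's version runs on the GPU):

```python
import random

def ic_spread_mc(seeds, out_neighbors, p, iters, rng=random):
    """Estimate the expected IC spread of a seed set by Monte Carlo.

    Each iteration simulates one cascade: every newly activated node
    tries to activate each out-neighbor once, succeeding independently
    with probability p. Returns the average cascade size over iters runs.
    """
    total = 0
    for _ in range(iters):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            node = frontier.pop()
            for nxt in out_neighbors.get(node, []):
                if nxt not in active and rng.random() < p:
                    active.add(nxt)
                    frontier.append(nxt)
        total += len(active)
    return total / iters

# Toy chain 1 -> 2 -> 3 with a small propagation probability.
est = ic_spread_mc([1], {1: [2], 2: [3]}, p=0.01, iters=1000)
```

Larger values of `iters` (the `-mcg` analogue here) reduce the variance of the estimate at a linear cost in runtime.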
`3.make_graphs.ipynb` is a Jupyter notebook that produces the two graphs used in the paper as well as a number of exploratory analyses. The graphs are saved in the
The code was run on an AWS instance. The notes below were useful reminders to myself.
- Launch instance on the AWS website: Deep Learning Base AMI (Amazon Linux) Version 19.1 (ami-00a1164673faf2ac3), p2.xlarge
- Log in to the AWS instance via:
ssh -i path/to/amazonkey.pem email@example.com
- Setup AWS instance with:
sudo pip3 install numba
- Transfer file to instance:
scp -i amazonkey.pem file_name firstname.lastname@example.org:
- Transfer folder to instance:
scp -i amazonkey.pem -r folder_name email@example.com:
- Transfer files back to local machine:
scp -i amazonkey.pem -r firstname.lastname@example.org: .
- Tip: Use the `tmux` command before running a script to open a new screen. Detach back to the main screen with `ctrl+b, d`, and reattach using `tmux attach -d`. This allows you to log out of AWS while keeping a script running.