Dynamic Eigenvector Centralities
The original code for this project was developed by Neela Avudaiappan. This version is a fork of Grace Glenn's implementation, which fixed bugs and restructured the code.
Paper / Purpose
This code implements the method for identifying keywords and tracking their emerging importance described in:
Neela Avudaiappan, Alexander Herzog, Sneha Kadam, Yuheng Du, Jason Thatcher, and Ilya Safro, "Detecting and summarizing emergent events in microblogs and social media streams by dynamic centralities", in Proceedings of the 2017 IEEE International Conference on Big Data, 2017.
The results in the paper have been replicated on the Boston dataset using time intervals of 60 and 15 minutes, located in
- All code is run in Python 3.6 (Anaconda 4.3.0)
- Data to be processed should be stored in numbered text files, one per time interval (e.g., file1.txt, file2.txt, ... fileN.txt for N intervals, or some other numbered format).
- Each text file should contain one document (e.g., one tweet) per line.
- Ensure all requirements are satisfied. The program can be run as follows.
    # after the repo has been downloaded
    cd dynamic_eigenvector_centralities
    pip install -r requirements.txt
    python dec_main.py --input_folder /home/username/time_series_data/ --P 6 --output_folder /home/username/dec_results/
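The input layout described above (numbered text files, one document per line) can be checked before a run with a small loader. This is a minimal sketch and not code from this repository: `interval_key` and `load_intervals` are hypothetical helper names, and it assumes the files end in `.txt` and that the number in the file name gives the interval order.

```python
import re
from pathlib import Path


def interval_key(path):
    """Sort key: the integers embedded in the file name, so file10 follows file9."""
    return [int(n) for n in re.findall(r"\d+", path.stem)]


def load_intervals(input_folder):
    """Return one list of documents per interval file, in numeric file order.

    Hypothetical helper for illustration; dec_main.py may read its input differently.
    """
    files = sorted(Path(input_folder).glob("*.txt"), key=interval_key)
    intervals = []
    for path in files:
        with path.open(encoding="utf-8") as handle:
            # One document (e.g., one tweet) per line; blank lines are skipped.
            docs = [line.strip() for line in handle if line.strip()]
        intervals.append(docs)
    return intervals
```

Sorting on the embedded number rather than the raw file name matters because a plain alphabetical sort would place file10.txt before file2.txt and scramble the time series.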
- dec_main.py: runs the full algorithm to compute DEC values described in the
- dec_graph.py: contains code for the graph logic of the algorithm
- dec_text.py: contains code for preprocessing and cleaning the data
- break_files.py: a useful script for dividing time-series CSV data
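To illustrate what dividing time-series CSV data into interval files can look like, here is a rough sketch. It is not the logic of break_files.py, whose options are not documented here. The function name `split_csv_by_interval`, the column names `timestamp` and `text`, and the timestamp format are all assumptions made for the example.

```python
import csv
from datetime import datetime, timedelta
from pathlib import Path


def split_csv_by_interval(csv_path, output_folder, minutes=60,
                          time_col="timestamp", text_col="text",
                          time_format="%Y-%m-%d %H:%M:%S"):
    """Write file1.txt, file2.txt, ... with one document per line per interval.

    Hypothetical helper; column names and time format are assumed, not taken
    from break_files.py. Returns the number of files written.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = [(datetime.strptime(r[time_col], time_format), r[text_col])
                for r in csv.DictReader(handle)]
    if not rows:
        return 0
    rows.sort(key=lambda row: row[0])
    start = rows[0][0]
    width = timedelta(minutes=minutes)
    buckets = {}
    for stamp, text in rows:
        index = (stamp - start) // width  # whole intervals since the first row
        # Collapse embedded newlines so each document stays on one line.
        buckets.setdefault(index, []).append(" ".join(text.split()))
    out = Path(output_folder)
    out.mkdir(parents=True, exist_ok=True)
    count = max(buckets) + 1
    for index in range(count):
        # Intervals with no rows still get an empty file so numbering stays contiguous.
        lines = buckets.get(index, [])
        target = out / "file{}.txt".format(index + 1)
        target.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return count
```

With `minutes=60` or `minutes=15` this produces the numbered, one-document-per-line layout that the input folder is expected to have. Whether empty intervals should be written as empty files or skipped is a design choice in this sketch and should be checked against what dec_main.py accepts.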