Replication data and code for "Prestige drives epistemic inequality in the diffusion of scientific ideas"
Contains pickles of epidemic size and length. The script
summary.py reports how many simulations were run for each transmission probability and each starting node. Each cache of the SI model contains 1000 trials for each (node, transmission probability) pair. Each cache of the SI model with random jumps contains 500 trials for each (node, transmission probability) pair.
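As a rough illustration of this kind of per-pair count, the sketch below tallies trials in a cache. The layout is an assumption made only for the example: a dict keyed by (node, transmission probability) whose values are lists of per-trial results. The real pickles may be organized differently, and `load_cache` and `count_trials` are hypothetical helper names, not functions from summary.py.

```python
import pickle


def load_cache(path):
    """Load one pickled cache from disk."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def count_trials(cache):
    """Return {(node, p): number_of_trials}.

    ASSUMPTION: `cache` maps (node, p) -> list of per-trial results.
    This layout is hypothetical and not taken from the repository.
    """
    return {key: len(trials) for key, trials in cache.items()}
```

With the assumed layout, a complete SI cache would show 1000 for every key and a complete random-jump cache would show 500.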
updatehistoryresults.py will add more runs of each epidemic simulation to the files in
epidemic.py describes the SI simulation we've implemented.
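For readers unfamiliar with SI-style simulations, here is a minimal sketch of a single trial that returns an epidemic size and length. It is not the code in epidemic.py. It assumes that each newly infected node gets one transmission attempt per outgoing edge, so that the process terminates, and that the network is given as a plain dict of adjacency lists rather than a networkx graph. The name `simulate_si` and its parameters are hypothetical.

```python
import random


def simulate_si(adjacency, seed_node, p, rng=None):
    """Run one trial and return (size, length).

    adjacency: dict mapping node -> iterable of out-neighbors (assumed format).
    p: transmission probability per edge.
    ASSUMPTION: each infected node tries each outgoing edge exactly once.
    """
    rng = rng if rng is not None else random.Random()
    infected = {seed_node}
    frontier = [seed_node]
    length = 0  # number of steps in which at least one new infection occurred
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in adjacency.get(node, ()):
                if neighbor not in infected and rng.random() < p:
                    infected.add(neighbor)
                    next_frontier.append(neighbor)
        if next_frontier:
            length += 1
        frontier = next_frontier
    return len(infected), length
```

Repeating this call many times for each (starting node, p) pair gives the kind of per-pair trial collections described above. Size is returned here as a node count; normalizing by the network size is a separate choice.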
importhistory.py generates networkx networks and parses prestige metadata from the edge and vertex lists from
The files called
topic_modeling_titles.txt contain the titles extracted under our choice of keywords for each topic. These titles have been selected from the computer science bibliography, dblp. This data is available under the Open Data Commons ODC-BY 1.0 license. The pickle files
topic_modeling.p contain the fraction of transmissions due to hiring under 10,000 permutation tests. The notebook
spread_of_research_ideas.ipynb documents our permutation test.
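The notebook is the authoritative description of the test. As a generic reminder of how a permutation test works, the sketch below shuffles labels to build a null distribution for a statistic and computes a one-sided p-value. The statistic, the label scheme, and the function names are all assumptions made for illustration and are not taken from the notebook.

```python
import random


def fraction_true_among_flagged(values, labels):
    """Hypothetical statistic: mean of `values` over items whose label is True."""
    flagged = [v for v, flag in zip(values, labels) if flag]
    return sum(flagged) / len(flagged) if flagged else 0.0


def permutation_test(values, labels, statistic, n_permutations=10_000, rng=None):
    """Return (observed, null_samples, p_value) for a one-sided test.

    The null distribution is built by shuffling `labels` relative to `values`.
    """
    rng = rng if rng is not None else random.Random()
    observed = statistic(values, labels)
    shuffled = list(labels)
    null_samples = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        null_samples.append(statistic(values, shuffled))
    # Add-one correction so the estimated p-value is never exactly zero.
    exceed = sum(s >= observed for s in null_samples)
    p_value = (1 + exceed) / (1 + n_permutations)
    return observed, null_samples, p_value
```

The default of 10,000 permutations mirrors the number quoted above. Passing a seeded `random.Random` makes a run reproducible.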
Contains all of the plots from the paper. Code to generate these plots can be found in
getplots.py. The file
plot_utils.py has been reproduced from
samplotlib under the
BSD 2-Clause "Simplified" license.