This repository contains scripts and notebooks to:
- Reproduce the analysis presented in Nesta's report about AI and the fight against COVID-19 (AI-C19)
- Update data collections (this requires access to Nesta's data production system)
- Reproduce future analyses based on updated data collections
- Create a conda virtual environment with the packages we use in our analysis:
conda env create -f conda_environment.yaml
- Install scripts as a package:
pip install -e .
- If you have access to Nesta DAPS and plan to collect data from there, install the data_getters package:
pip install -r nesta_packages.txt
You may need to run pip install tornado --upgrade afterwards.
You can collect the processed data we used in the AI-C19 report from figshare by running:
python ai_covid_19/fetch_dataset.py
The downloaded files also include data dictionaries.
You can make a new dataset (probably with updated data) by running:
python ai_covid_19/make_rxiv_data.py
python ai_covid_19/make_citation_data.py
And train a new hierarchical topic model by running:
python ai_covid_19/train_topsbm.py
Note: This requires putting your credentials in a .env file that will be read by the relevant scripts.
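For readers unfamiliar with .env files, here is a minimal sketch of how KEY=VALUE pairs in such a file can be read and exported to the environment. It uses only the standard library and is an illustration, not the loader the project's scripts actually use. The function names load_env_file and apply_env are hypothetical, and this README does not say which credential names the scripts expect.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines that are not KEY=VALUE pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def apply_env(values):
    """Export parsed values without overwriting variables that are already set."""
    for key, value in values.items():
        os.environ.setdefault(key, value)
```

Calling apply_env(load_env_file()) from the repository root would make the values available through os.environ. Keep the .env file out of version control, since it holds credentials.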
Each notebook in the notebooks/ai-c19 folder corresponds to a section of the report.
You can re-run them individually. All visual outputs will be saved as HTML files in report/figures/nesta_report_figures.
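After re-running a notebook, a quick way to check which figures were written is to list the HTML files in the output folder. The sketch below assumes it is run from the repository root; list_report_figures is a hypothetical helper, not part of the ai_covid_19 package.

```python
from pathlib import Path


def list_report_figures(figures_dir="report/figures/nesta_report_figures"):
    """Return the sorted names of the HTML files in the figures folder."""
    folder = Path(figures_dir)
    # An absent folder simply means no notebook has saved figures yet.
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.html"))


if __name__ == "__main__":
    for name in list_report_figures():
        print(name)
```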
Project based on the Nesta cookiecutter data science project template.