Overview

This repository provides replication materials for the paper: (final paper name)

Step 1: Data Downloading

For replication purposes, you will need to download the following large files from Google Drive:

Download data/url_to_flag_updatedtoFeb24_obitfilter.csv from here
Download data/theta_Feb27update_withdomtopic.csv from here
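
Before running the R scripts, it can help to confirm that both downloads landed in the data/ directory. The following is a minimal sketch and not part of the repository; the helper name missing_files and the assumption that it is run from the repository root are illustrative.

```python
from pathlib import Path

# File names as listed in Step 1; paths are taken relative to the
# repository root (an assumption about where the check is run).
REQUIRED_FILES = [
    "data/url_to_flag_updatedtoFeb24_obitfilter.csv",
    "data/theta_Feb27update_withdomtopic.csv",
]

def missing_files(root=".", required=REQUIRED_FILES):
    """Return the required files that do not exist under root."""
    root = Path(root)
    return [name for name in required if not (root / name).is_file()]

if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing downloads:", ", ".join(missing))
    else:
        print("All required data files are present.")
```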

Replicating Paper Results

Results for the main text of the paper can then be replicated using the R files figures_1_2_3.R and figures_4_5.R. If you would also like to replicate the supplement, you can do so with the file plots_for_supplement.R.

Data Sources

The paper uses data from the following sources:

The NYTimes COVID Case/Death Rate Repository. This is pulled directly from the source repository by util.r.
MEDSL's Election and County Data. This is pulled directly from the source repository by util.r.
Kieran Healy's 2020 Election Results. This is pulled directly from the source repository by util.r.
2019 population estimates of U.S. counties from the Census Bureau. This is contained in the file data/co-est2019-alldata.csv.
Community Resilience Estimates provided by the U.S. Census to estimate the percentage of individuals in each county that had 0 risk factors, 1-2 risk factors, or 3+ risk factors for COVID. This is contained in the file data/cre-2018-a11.csv.
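
County-level sources like these are usually combined on a shared county identifier, most often the five-digit FIPS code (a two-digit state code followed by a three-digit county code). The sketch below shows that construction only; whether the repository's code joins on FIPS is an assumption, and the function name county_fips is hypothetical.

```python
def county_fips(state_code, county_code):
    """Build a five-digit county FIPS string from state and county codes.

    Accepts ints or numeric strings. Joining the sources on this key is
    an assumption for illustration, not the repository's confirmed method.
    """
    state, county = int(state_code), int(county_code)
    if not (0 <= state <= 99 and 0 <= county <= 999):
        raise ValueError("state code must fit in 2 digits, county code in 3")
    return f"{state:02d}{county:03d}"

print(county_fips(1, 1))          # Autauga County, Alabama
print(county_fips("36", "061"))   # New York County, New York
```

Keeping the code as a zero-padded string, rather than an integer, avoids losing the leading zero for states such as Alabama.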

Replicating the Topic Model

Given the cleaned and preprocessed CSV data file (download it here), the STM model and Theta output from the model can be replicated by using code_for_stm/run_stm.R.

Given the Theta output from the STM model, in some of our analyses we map each article to a single topic by finding the dominant topic in that article's topic distribution. This code can be found here: code_for_stm/add_domtopic_to_theta.py.
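
The mapping described above amounts to taking the argmax of each article's topic proportions. The sketch below illustrates the idea on a toy Theta matrix; it is not the code in code_for_stm/add_domtopic_to_theta.py, and the function name, the zero-based topic indices, and the tie-breaking rule are assumptions.

```python
def dominant_topic(theta_row):
    """Return the index of the largest topic proportion in one article's row.

    Ties resolve to the lowest index (an assumption; the repository's
    script may handle ties differently).
    """
    if not theta_row:
        raise ValueError("theta_row must contain at least one topic proportion")
    return max(range(len(theta_row)), key=lambda k: theta_row[k])

# Toy Theta: each row is one article's distribution over 3 topics
# (made-up values, not taken from the paper's data).
theta = [
    [0.70, 0.20, 0.10],
    [0.05, 0.15, 0.80],
    [0.30, 0.40, 0.30],
]
dominant = [dominant_topic(row) for row in theta]
print(dominant)
```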

If you are interested in how we preprocessed the data and decided on the parameter k, the files code_for_stm/preprocess_text_for_topicmodel.py and code_for_stm/findK.py provide those details.

kennyjoseph/covid_localnews_public

Folders and files

Latest commit

History

Repository files navigation

Overview

Step 1: Data Downloading

Replicating Paper Results

Data Sources

Replicating the Topic Model

About

Resources

Stars

Watchers

Forks

Languages