This repo scrapes data from PDF reports (now also located here) put together by the Gw____tt School District, packages the data with a shiny web app built by a collaborator, then deploys the app to https://shinyapps.io. We also automatically upload up-to-date consolidated datasets to the project's Open Science Framework page; if you like, hop over there to grab a copy of the raw data.
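If you'd rather pull those consolidated datasets programmatically than click around OSF, the osfr package can fetch them. Here's a minimal sketch; the node ID is a placeholder standing in for the project's real one, not something taken from this repo:

```r
# Minimal sketch of pulling the consolidated datasets with the osfr package.
# "abcde" is a placeholder -- replace it with the project's actual OSF node ID.
library(osfr)

dir.create("osf-data", showWarnings = FALSE)

project <- osf_retrieve_node("abcde")

# list the files stored on the node and download them locally
osf_ls_files(project) |>
  osf_download(path = "osf-data", conflicts = "overwrite")
```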
Check out the live Covid Dashboard app here.
We use Travis CI to build. Travis automatically launches jobs in response to fresh commits to certain branches on this repo. You can peek at this repo's current build jobs here.
The pipeline runs like this:

- A pdf gets added to the `input/` folder on the `raw-data` branch.
- Travis launches a build on the `raw-data` branch to extract tables from all pdfs in `input/`, generating one csv file per dataset in the pdf. (All content from a dataset split over multiple pages ends up in the same csv.) A rough sketch of this step appears after this list.
- A copy of the `data-joiner` branch is cloned down and the csv files are loaded into its `input/` folder.
- The loaded-up `data-joiner` branch is force-pushed to the `parsed-data` branch.
- Travis launches a job on the `parsed-data` branch to join csv files corresponding to different report dates (also sketched below).
- A copy of the `app-builder` branch is cloned down and the joined csv files are loaded into its root directory.
- The loaded-up `app-builder` branch is force-pushed to the `built-app` branch.
- Travis launches a job on the `built-app` branch to build our shiny app and deploy it (also sketched below).
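For a rough sense of what the extraction job on `raw-data` does, here is a minimal sketch using the tabulizer package. The package choice, the folder names, and the one-csv-per-detected-table behaviour are illustrative assumptions, not a description of the actual build script; in particular, the sketch doesn't show how datasets spanning multiple pages get merged into one csv.

```r
# Sketch of the table-extraction step (assumes the tabulizer package and an
# output/ folder; the real raw-data build script may differ).
library(tabulizer)
library(readr)

dir.create("output", showWarnings = FALSE)
pdfs <- list.files("input", pattern = "\\.pdf$", full.names = TRUE)

for (pdf in pdfs) {
  # extract_tables() returns one data frame per table it detects in the pdf
  tables <- extract_tables(pdf, output = "data.frame")
  for (i in seq_along(tables)) {
    out <- sprintf("%s_table%02d.csv",
                   tools::file_path_sans_ext(basename(pdf)), i)
    write_csv(tables[[i]], file.path("output", out))
  }
}
```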
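The join job on `parsed-data` could look roughly like the following. The assumption that each per-date csv carries its report date in the filename (e.g. `cases_2020-09-14.csv`) is mine for illustration, not taken from the repo.

```r
# Sketch of joining per-report-date csv files for one dataset; the filename
# convention (cases_YYYY-MM-DD.csv) is an illustrative assumption.
library(dplyr)
library(purrr)
library(readr)

files <- list.files("input", pattern = "^cases_.*\\.csv$", full.names = TRUE)

# stack all dates into one table, tagging each row with its report date
joined <- map_dfr(files, function(f) {
  read_csv(f, show_col_types = FALSE) |>
    mutate(report_date = as.Date(gsub("^cases_|\\.csv$", "", basename(f))))
})

write_csv(joined, "cases_joined.csv")
```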
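Finally, the build-and-deploy job on `built-app` boils down to a call to rsconnect. The app name and the environment-variable names below are placeholders, not the project's actual settings.

```r
# Sketch of deploying the shiny app to shinyapps.io; account, token, secret,
# and app name are placeholders read from (hypothetical) environment variables.
library(rsconnect)

rsconnect::setAccountInfo(
  name   = Sys.getenv("SHINYAPPS_ACCOUNT"),
  token  = Sys.getenv("SHINYAPPS_TOKEN"),
  secret = Sys.getenv("SHINYAPPS_SECRET")
)

rsconnect::deployApp(appDir = ".", appName = "covid-dashboard", forceUpdate = TRUE)
```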
The idea of scraping data out of pdfs for this project was inspired by @jaredmoore's covid_tracker project.