Undergraduate thesis: The Economy (Taylor's Version): Pop Concerts Helped The Economy Shake Off a Pandemic Slump

Project Overview

This project explores the effects of a Taylor Swift concert on local business traffic and sales. It utilizes data from taylorswift.com to index her concert dates and locations, and merges this information with local business data on traffic and sales.

The project consists of two main parts: data collection and analysis. Initially, the concert dates and locations are scraped from taylorswift.com using web scraping techniques. The local business data on traffic and sales is collected from various sources. The collected data is then cleaned and merged to create a unified dataset for analysis. The analysis is performed using R, and the findings are documented in an RMarkdown report.
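The merge step described above can be sketched in base R. This is only an illustration: the data frames, column names (`city`, `date`, `concert`, `sales`), and values are placeholders, not the schema or data used in the thesis.

```r
# Placeholder concert schedule (hypothetical columns and values)
concerts <- data.frame(
  city    = c("City A", "City B"),
  date    = as.Date(c("2023-01-02", "2023-01-09")),
  concert = TRUE
)

# Placeholder daily business data for the same cities
business <- data.frame(
  city  = c("City A", "City A", "City B"),
  date  = as.Date(c("2023-01-01", "2023-01-02", "2023-01-09")),
  sales = c(100, 180, 250)
)

# Left join: keep every business-day row and attach the concert flag where one exists
merged <- merge(business, concerts, by = c("city", "date"), all.x = TRUE)

# Days without a matching concert come back as NA; mark them as non-concert days
merged$concert[is.na(merged$concert)] <- FALSE
```

Keeping every business-day row (`all.x = TRUE`) preserves non-concert days, which an analysis of concert effects would presumably need as a comparison group.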

To replicate the project, an internet connection and access to R are required. Running the code files in order will generate the final report. It is important to run the scripts before attempting to knit the RMarkdown document.

Steps to Replicate

Note: See below for a version that uses Stata instead.

Setting up the project

Clone this repository to your local machine. (Alternatively, fork and clone the repository if you would like to suggest changes.)
Set your working directory to the main project folder. All file paths are set relative to this folder.

Running scripts (R version)

The file that runs all other files is main_script.R. Run this file to execute all other scripts in the correct order.
main_script.R first builds the data using numbered scripts in the build folder. Inside each script there is a docstring with additional information on what it does and what to run next. Follow the numbers and run the scripts in order. The scripts will download and clean the data.
After running all files in build, main_script.R runs the scripts in the analysis folder (within the code folder). The scripts in analysis are numbered, but they can be run in any order, since all are based on variations of the final data created in build. Ideally, all of them run as part of knitting the RMarkdown, so they do not need to be run individually. Currently, the relative file paths are set for running from the RMarkdown file, so they would need to be altered if the scripts are run from a different working directory.
Knit the RMarkdown. This will produce the final paper by combining all of the other RMarkdown files from the draft folder. The RMarkdown final draft can be found in the final folder within the writing folder.
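For readers curious how a runner of this kind can work, here is a minimal sketch of sourcing numbered scripts in order. It is not the contents of main_script.R; the function name `run_numbered_scripts` and the file-name pattern are assumptions made for illustration.

```r
# Source every numbered script in a folder, ordered by its numeric prefix.
# Hypothetical helper; the real main_script.R may be organized differently.
run_numbered_scripts <- function(dir) {
  # Match files such as 01_import_census.R
  scripts <- list.files(dir, pattern = "^[0-9]+_.*\\.R$", full.names = TRUE)

  # Sort numerically so that 10_ runs after 9_ even without zero padding
  prefix  <- as.integer(sub("_.*$", "", basename(scripts)))
  scripts <- scripts[order(prefix)]

  for (script in scripts) {
    message("Running ", basename(script))
    source(script)
  }

  # Return the scripts in the order they were run
  invisible(scripts)
}

# Example call, assuming the working directory is the main project folder:
# run_numbered_scripts(file.path("code", "build"))
```

Sorting on the parsed number rather than the file name avoids the case where plain alphabetical order would run a script numbered 10 before one numbered 9.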

Running scripts (Stata version)

The file that runs all other files is main_script.do. Run this file to execute all other scripts in the correct order.
main_script.do first builds the data using numbered scripts in the build folder. Inside each script, there is a docstring with additional information on what it does and what to run next. Follow the numbers and run the scripts in order. The scripts will download and clean the data.
After running all files in build, main_script.do runs the scripts in the analysis folder (within the code folder). The scripts in analysis are numbered, but they can be run in any order, since all are based on variations of the final data created in build. Ideally, all of them run as part of compiling the .tex file, so they do not need to be run individually. Currently, the relative file paths are set for running from the .tex file, so they would need to be altered if the scripts are run from a different working directory.
Compile the .tex file. This will produce the final paper by combining all of the other .tex files from the draft folder. The .tex final draft can be found in the final folder within the writing folder.

Important folder locations to know:

code, data, output, literature, presentations, and writing are all folders within the main project folder.

Critical to replication:

code folder:
- build: scripts to build the dataset for analysis
- analysis: scripts to analyze the data
data folder:
- raw: raw data
- temp: temporary data saves made throughout building
- clean: cleaned data
- final: final data
output folder:
- figures: figures produced in analysis
- tables: tables produced in analysis
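Before running anything, it can help to confirm that the folders listed above exist. The following is a small sketch of such a check; the helper name `missing_dirs` is hypothetical and not part of the repository.

```r
# Folders described above, relative to the main project folder
expected_dirs <- c(
  file.path("code", c("build", "analysis")),
  file.path("data", c("raw", "temp", "clean", "final")),
  file.path("output", c("figures", "tables"))
)

# Hypothetical helper: return the expected folders that are absent under `root`
missing_dirs <- function(root = ".") {
  expected_dirs[!dir.exists(file.path(root, expected_dirs))]
}

# Example call from the main project folder:
# missing_dirs()
```

An empty result means every expected folder is present. Any folder reported as missing could be created with `dir.create(path, recursive = TRUE)` before running the scripts.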

Other folders:

literature folder: for any literature used in the project
presentations folder: for any presentations given on the project
writing folder: drafts for RMarkdown sections, final for the RMarkdown final draft.

housekeeping.R is an R script in the main directory that sets relative file paths and loads all packages. It is run at the beginning of all other R scripts.
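As a rough illustration of what a housekeeping script of this kind might contain, the sketch below defines relative paths and loads packages. The `paths` list and the package vector are assumptions; the actual script may use different names and will load the packages the project really depends on.

```r
# Relative paths, assuming the working directory is the main project folder
paths <- list(
  build    = file.path("code", "build"),
  analysis = file.path("code", "analysis"),
  raw      = file.path("data", "raw"),
  temp     = file.path("data", "temp"),
  clean    = file.path("data", "clean"),
  final    = file.path("data", "final"),
  figures  = file.path("output", "figures"),
  tables   = file.path("output", "tables")
)

# Placeholder package list (base packages only, so the sketch runs anywhere);
# replace with the packages the project actually uses
packages <- c("stats", "utils")
for (pkg in packages) {
  library(pkg, character.only = TRUE)
}
```

Other scripts could then refer to, for example, `file.path(paths$raw, "some_file.csv")` instead of hard-coding folder names, where `some_file.csv` is a made-up file name.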

Relevant files (R version, Stata version is similar)

main_script.R or main_script.do: runs all other scripts in the correct order
housekeeping.R: sets relative file paths and loads all packages
code/build/clean_functions.R
code/analysis/analysis_functions.R
code/build/01_import_census.R
code/build/02_import_admin_data.R
code/build/03_clean_census.R
code/build/04_clean_admin_data.R
code/build/05_merge_census_admin.R
code/analysis/06_summary_stats.R
code/analysis/07_basic_regression.R
code/analysis/08_make_sum_figures.R
code/analysis/09_make_reg_figures.R
code/analysis/10_make_sum_tables.R
code/analysis/11_make_reg_tables.R

Data sources

Here are the main data sources used in the project. Descriptions to come!

taylorswift.com: concert dates and locations
Yelp scrapes: traffic and sales
Google Maps mobility data: traffic
International database of daily sales tax payment records: sales data from local businesses

