NicoleGolden/OC_covid_project

Web Scraping:
Why Were Covid Cases Lower During Holidays in Orange County?

Author: Nicole Golden
Date: April 20, 2022

This data visualization project was inspired by my machine learning class. It starts from an observation: recorded covid cases in Orange County, CA, were low during Thanksgiving and Christmas. First, I visualize the raw data. Then, after showing this "paradox," I use a loess model to produce smoothed predictions.

(You can find the data and code on my GitHub page.)

The data come from OC Health Care Agency Table 1 and Table 2. I collected them by web scraping in R. The data cover the period from 2020-08-16 to 2022-04-09.
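As a rough illustration of the scraping step, here is a minimal sketch using the rvest package. It is not the code from this repository: the README does not give the page URL or the table layout, so the inline HTML, the column names `date` and `cases`, and the numbers in it are all made-up placeholders. For a live page, `read_html()` would receive the URL instead of a string.

```r
library(rvest)

# Hypothetical stand-in for the agency page. The real URL, column names,
# and values are not given in this README; these rows are placeholders.
sample_html <- '
<html><body>
<table>
  <tr><th>date</th><th>cases</th></tr>
  <tr><td>2020-08-16</td><td>310</td></tr>
  <tr><td>2020-08-17</td><td>285</td></tr>
  <tr><td>2020-08-18</td><td>342</td></tr>
</table>
</body></html>'

# Parse the HTML (pass a URL here when scraping a live page).
page <- read_html(sample_html)

# Extract every <table> on the page as a data frame; keep the first one.
tables <- html_table(html_elements(page, "table"))
cases_raw <- as.data.frame(tables[[1]])

# Basic cleaning: proper date and numeric types, sorted by date.
cases_raw$date <- as.Date(cases_raw$date)
cases_raw$cases <- as.numeric(cases_raw$cases)
cases_raw <- cases_raw[order(cases_raw$date), ]
```

If the page holds several tables, indexing `tables[[1]]` and `tables[[2]]` would be one way to pull Table 1 and Table 2 separately, assuming they appear in that order.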

After some basic data cleaning, I can visualize the raw data for each table.

Figure 1 (Image Source: Nicole Golden. Data Source: OC Health Care Agency)

Figure 2 (Image Source: Nicole Golden. Data Source: OC Health Care Agency)

The plots above show that recorded covid cases during Thanksgiving and Christmas were very low. A likely reason is that people traveled to other areas for the holidays, so fewer cases were recorded locally. We can use a loess model to produce smoothed predictions for these periods.

The ggplot2 package in R can plot a loess fit. I used two methods to plot recorded vs. predicted cases: (i) plot recorded cases against predicted cases from a single fit; (ii) plot recorded cases against predictions for three separate waves: the original wave, the delta wave, and the omicron wave.
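The two methods can be sketched roughly as follows. This is an illustration on synthetic data, not the repository's code: the case counts are simulated, the `span` value is arbitrary, and the wave cutoff dates (2021-07-01 and 2021-12-01) are assumptions chosen only to split the series into three groups.

```r
library(ggplot2)

set.seed(1)

# Synthetic daily series over the study period (NOT the real OC data).
dates <- seq(as.Date("2020-08-16"), as.Date("2022-04-09"), by = "day")
df <- data.frame(date = dates, day = as.numeric(dates - dates[1]))

# Three smooth bumps plus noise, to mimic three waves.
bump <- function(x, center, width, height) {
  height * exp(-((x - center) / width)^2)
}
df$cases <- pmax(0,
  bump(df$day, 140, 35, 3000) +
  bump(df$day, 370, 30, 900) +
  bump(df$day, 510, 15, 9000) +
  rnorm(nrow(df), sd = 150))

# Assumed wave boundaries (hypothetical cutoffs, for illustration only).
cutoffs <- as.numeric(as.Date(c("2021-07-01", "2021-12-01")))
wave_names <- c("original", "delta", "omicron")
df$wave <- factor(
  wave_names[findInterval(as.numeric(df$date), cutoffs) + 1],
  levels = wave_names)

# Predicted cases from a single loess fit over the whole period.
fit <- loess(cases ~ day, data = df, span = 0.2)
df$predicted <- predict(fit)

# Method (i): recorded cases with one loess curve.
p_overall <- ggplot(df, aes(date, cases)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.2, se = FALSE)

# Method (ii): recorded cases with a separate loess curve per wave.
p_waves <- ggplot(df, aes(date, cases)) +
  geom_point(alpha = 0.3) +
  geom_smooth(aes(colour = wave), method = "loess", formula = y ~ x,
              se = FALSE)
```

Printing `p_overall` or `p_waves` draws the plots. A smaller `span` follows the data more closely, while a larger one gives a smoother curve, so the choice affects how strongly the holiday dips are smoothed over.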

Figure 3 (Image Source: Nicole Golden. Data Source: OC Health Care Agency)

Figure 4 (Image Source: Nicole Golden. Data Source: OC Health Care Agency)

Figure 3 and Figure 4 plot the raw data against the loess predictions. The smoothed curves suggest that actual covid cases during the holidays were higher than the recorded counts.
