# **Directing Resources to Address COVID-19 Health Inequities**

Team B04 Fanfei Zhao, Sixuan Wang, Shu Wang, Yuesen Zhang, Jiadai Yu

### Project Proposal
Our goal is to establish a strategy for the equal distribution of health care resources for the treatment of COVID-19. We will analyze the impact of inequities on community health status, find the hardest-hit areas, and map out how to allocate medical resources (vaccines, tests, treatments, etc.) equally and efficiently to protect the safety of all communities.

### Data Source
California Health and Human Services (https://data.chhs.ca.gov/dataset/covid-19-equity-metrics)

### Summary
COVID-19 reveals health inequities, which in turn, through population mixing, affect the health of the entire society. We therefore analyzed the influence of social determinants, race, age, gender, and geographic location on COVID-19 outcomes to identify the populations most affected:

Essential workers with incomes of $40k-60k had the highest number of confirmed cases. Among racial and ethnic groups, Native Hawaiian and other Pacific Islander residents had the highest case rate, 238.89 per 100k, while Asian Americans had the lowest, only 57.49. In September 2022, there were no recorded deaths among those aged 0-49, while the 65+ group had 290. Among counties, Santa Cruz had a test positivity rate of only 3.55%, while Imperial had a rate of 22.21%.

https://public.tableau.com/views/COVID-equity-dashboard-1017/Dashboard1?:language=zh-CN&:display_count=n&:origin=viz_share_link
<div class='tableauPlaceholder' id='viz1666122793564' style='position: relative'><noscript><a href='#'><img alt='Dashboard 1 ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-1017&#47;Dashboard1&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='COVID-equity-dashboard-1017&#47;Dashboard1' /><param name='tabs' value='no' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-1017&#47;Dashboard1&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='language' value='zh-CN' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1666122793564');                    var vizElement = divElement.getElementsByTagName('object')[0];                    if ( divElement.offsetWidth > 800 ) { vizElement.style.width='1800px';vizElement.style.height='1327px';} else if ( divElement.offsetWidth > 500 ) { vizElement.style.width='1800px';vizElement.style.height='1327px';} else { vizElement.style.width='100%';vizElement.style.height='2027px';}                     var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

### Contents
1. Introduction
    
    1.1 Motivation
    1.2 Data Dictionary
2. Explanatory Analysis
    
    2.1 Compare 7-Day Cases in Different Social Classes
    2.2 Difference of Cases and Death Rate between Races 
    2.3 Death Rate Between Different Age Groups and Genders
    2.4 Geographic Difference of Test Positivity Rate
    
3. Conclusion

4. References

## **1. Introduction**

### **1.1 Motivation**

COVID-19 has affected the whole world: a significant share of the population has tested positive, and public health remains under threat. Social inequality has become more severe during the pandemic. In other words, social resources cannot be distributed equally among people or countries, and COVID-19 has made some previously intangible disparities visible. We therefore want to study this topic in more detail, using California data as our example.

We chose California because its publicly available data is more detailed than that of other states, and its population is diverse enough to support a variety of analyses on this topic.


### **1.2 Data Dictionary**

These are explanations of the data fields that we cover in the following sections, to give readers a brief understanding of the concepts we use.

**Date**: The date on which the record was taken.

**Metric**: The variable of interest recorded for the analysis, such as positive_rate or the 7-day case rate.

**Demographic**: The demographic characteristic used in the analysis, such as race/ethnicity, age, or gender.

**Demographic category**: A subdivision (subcategory) of records within the chosen demographic characteristic.

**Social determinants**: Conditions in the environments where people are born, live, learn, work, play, worship, and age that affect a wide range of health, functioning, and quality-of-life outcomes and risks. They include income, income_cumulative, crowding, and insurance, among others.

**Social Tier**: A subdivision (subcategory) of records within the chosen social determinant.

**Location**: The name of the county in California.

**Value**: The value of the given metric. For convenience, rates in these datasets are expressed per 100k people.
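
To make the "per 100k" convention concrete, here is a minimal Python sketch of how such a rate is typically derived from a raw count and a population size. The function name and the example figures are hypothetical and do not come from the CHHS dataset, which already stores the computed rates.

```python
def rate_per_100k(count: float, population: float) -> float:
    """Return the number of events per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical group: 150 weekly cases in a population of 250,000.
example_rate = rate_per_100k(150, 250_000)
print(f"{example_rate:.1f} cases per 100k")
```

Expressing counts this way lets groups of very different sizes be compared on the same scale.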

## **2. Explanatory Analysis**

In the analysis part, we first discuss the influence of social class on COVID-19 and then explain the differences by race and ethnicity. The death rate and the test positivity rate are discussed in the later sections.

### **2.1 Compare 7-Day Cases in Different Social Classes**

COVID-19 has had a huge impact all over the world, and people in different social classes feel that impact to different degrees. What is the relationship between social tier and COVID-19 cases, and what does it tell us? To begin, we use the 7-day case data to get a basic picture of the state of California.

In [20]:
%%bigquery
SELECT EXTRACT(MONTH FROM date) AS Month_of_2021, social_det,social_tier,AVG(case_rate_per_100k) AS avg_case_rate_per_100k
FROM `ba775-team-project-04.ba775_team04_covid_19_equity_metrics_v1kpo4.covid-19-case-rate-by-social-det`
WHERE EXTRACT( YEAR FROM date) = 2021
GROUP BY Month_of_2021,social_det,social_tier,sort
ORDER BY Month_of_2021,social_det,sort;



| | Month_of_2021 | social_det | social_tier | avg_case_rate_per_100k |
|---|---|---|---|---|
| 0 | 1 | crowding | less than 2% | 44.695407 |
| 1 | 1 | crowding | 2% - 5% | 56.781871 |
| 2 | 1 | crowding | 5% - 10% | 75.676776 |
| 3 | 1 | crowding | 10% - 15% | 97.046162 |
| 4 | 1 | crowding | 15% - 20% | 115.130432 |
| ... | ... | ... | ... | ... |
| 283 | 12 | insurance | 5% - 10% | 10.157431 |
| 284 | 12 | insurance | 10% - 15% | 11.496048 |
| 285 | 12 | insurance | 15% - 25% | 12.661453 |
| 286 | 12 | insurance | 25% - 35% | 13.047660 |


In [21]:
%%bigquery
SELECT EXTRACT(MONTH FROM date) AS Month_of_2022, social_det,social_tier,AVG(case_rate_per_100k) AS avg_case_rate_per_100k
FROM `ba775-team-project-04.ba775_team04_covid_19_equity_metrics_v1kpo4.covid-19-case-rate-by-social-det`
WHERE EXTRACT( YEAR FROM date) = 2022
GROUP BY Month_of_2022,social_det,social_tier,sort
ORDER BY Month_of_2022,social_det,sort;



| | Month_of_2022 | social_det | social_tier | avg_case_rate_per_100k |
|---|---|---|---|---|
| 0 | 1 | crowding | less than 2% | 130.443291 |
| 1 | 1 | crowding | 2% - 5% | 145.866734 |
| 2 | 1 | crowding | 5% - 10% | 166.359762 |
| 3 | 1 | crowding | 10% - 15% | 190.601379 |
| 4 | 1 | crowding | 15% - 20% | 205.739715 |
| ... | ... | ... | ... | ... |
| 211 | 9 | insurance | 5% - 10% | 11.431043 |
| 212 | 9 | insurance | 10% - 15% | 12.656302 |
| 213 | 9 | insurance | 15% - 25% | 13.464187 |
| 214 | 9 | insurance | 25% - 35% | 13.857676 |


<div class='tableauPlaceholder' id='viz1666122818145' style='position: relative'><noscript><a href='#'><img alt='7 Day Cases in Different Social Class ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-5&#47;3F7DayCasesinDifferentSocialClass&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='COVID-equity-dashboard-5&#47;3F7DayCasesinDifferentSocialClass' /><param name='tabs' value='no' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-5&#47;3F7DayCasesinDifferentSocialClass&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='language' value='zh-CN' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1666122818145');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

From the graph, we can see that people in the lower-middle and middle classes have the highest number of diagnoses, which makes sense: in real life, COVID-19 accelerates inequality in society. More specifically, the graph also shows that people of color have a higher rate of COVID-19 infection than White Americans, because they are more likely to be exposed to COVID-19 risk factors such as crowded workplaces, face-to-face contact, and low awareness of protective measures [1].

Lower- and middle-class people are often essential workers who have to go to work to earn money, and the crowding data also suggests that their households are large, which increases the infection rate. That is why the 7-day case numbers are high in these social tiers. COVID-19 was brought somewhat under control by the government in 2022, so the 7-day case numbers decreased by 50%, but the lower and middle classes still had higher numbers of diagnosed cases. The case numbers tell us that COVID-19 affects not only people's health but also the social environment: resources such as money, food, education, jobs, and healthcare are limited under the pandemic, which makes lower-class people more vulnerable [1].

### **2.2 Difference of Cases and Death Rate between Races**

In this section, we analyze the death rate and confirmed case rate among different ethnic groups in the state of California to investigate how COVID-19 affects different races and ethnicities.

In [22]:
%%bigquery
select*
from `ba775-team-project-04.ba775_team04_covid_19_equity_metrics_v1kpo4.covid-19-race-ethnicity-timeseries`
limit 5



| | DATE | LOCATION | LOCATION_LEVEL | DEMOG | DEMOG_CAT | METRIC_CAT | METRIC | VALUE |
|---|---|---|---|---|---|---|---|---|
| 0 | 2022-09-03 | CA | state | race_ethnicity | White | cases | cases_week_rate | 63.15528 |
| 1 | 2022-08-27 | CA | state | race_ethnicity | White | cases | cases_week_rate | 76.77461 |
| 2 | 2022-08-20 | CA | state | race_ethnicity | White | cases | cases_week_rate | 87.06567 |
| 3 | 2022-08-13 | CA | state | race_ethnicity | White | cases | cases_week_rate | 98.50019 |
| 4 | 2022-08-06 | CA | state | race_ethnicity | White | cases | cases_week_rate | 118.29279 |


As the preview above shows, this dataset contains the weekly case rate and death rate for each ethnic group, expressed as cases or deaths per 100k, from 2020 to 2022. Is there any difference in the confirmed case rate or death rate between ethnic groups? How can we use this finding to help reduce the spread of COVID-19?

In [23]:
%%bigquery
select distinct DEMOG_CAT
from `ba775-team-project-04.ba775_team04_covid_19_equity_metrics_v1kpo4.covid-19-race-ethnicity-timeseries`
#An overview of different ethnicity categories



| | DEMOG_CAT |
|---|---|
| 0 | White |
| 1 | Latino |
| 2 | Multi-Race |
| 3 | Asian American |
| 4 | American Indian |
| 5 | African American |
| 6 | Native Hawaiian and other Pacific Islander |


In [24]:
%%bigquery
select distinct METRIC_CAT
from `ba775-team-project-04.ba775_team04_covid_19_equity_metrics_v1kpo4.covid-19-race-ethnicity-timeseries`
#An overview of different metric categories



| | METRIC_CAT |
|---|---|
| 0 | cases |
| 1 | tests |
| 2 | deaths |


In [25]:
%%bigquery
select sum(value) as total_cases_per100K,DEMOG_CAT
from `ba775-team-project-04.ba775_team04_covid_19_equity_metrics_v1kpo4.covid-19-race-ethnicity-timeseries`

where METRIC_CAT = 'cases'
and EXTRACT(year from DATE) = 2021

group by DEMOG_CAT
order by total_cases_per100K desc



| | total_cases_per100K | DEMOG_CAT |
|---|---|---|
| 0 | 12422.54331 | Native Hawaiian and other Pacific Islander |
| 1 | 7091.57208 | Latino |
| 2 | 6137.70867 | African American |
| 3 | 6056.66728 | American Indian |
| 4 | 4606.08161 | White |
| 5 | 4062.35234 | Multi-Race |
| 6 | 2989.49561 | Asian American |


The results above show significant differences in total cases between ethnic groups. The relatively high number of confirmed cases among Native Hawaiians and other Pacific Islanders may be because this group is concentrated in certain areas and is easily infected once the virus has spread to those communities. The relatively low number of confirmed cases among Asian Americans may be because they tended to be more cautious about the virus and took more protective measures than other ethnic groups.

In [26]:
%%bigquery
select sum(value) as total_deaths_per100K,DEMOG_CAT
from `ba775-team-project-04.ba775_team04_covid_19_equity_metrics_v1kpo4.covid-19-race-ethnicity-timeseries`

where METRIC_CAT = 'deaths'
and EXTRACT(year from DATE) = 2021

group by DEMOG_CAT
order by total_deaths_per100K desc



| | total_deaths_per100K | DEMOG_CAT |
|---|---|---|
| 0 | 239.46186 | Native Hawaiian and other Pacific Islander |
| 1 | 136.76972 | American Indian |
| 2 | 134.06235 | African American |
| 3 | 133.65878 | Latino |
| 4 | 109.7033 | White |
| 5 | 85.59798 | Multi-Race |
| 6 | 81.66826 | Asian American |


Overall, the death rate is positively related to the confirmed case rate: a higher confirmed case rate within a group generally comes with a higher death rate. Combined with a comparison of death rates between 2021 and 2022, we see no significant pattern affecting the death rate between ethnic groups other than the case rate.

<div class='tableauPlaceholder' id='viz1666133561417' style='position: relative'><noscript><a href='#'><img alt='Cases Among Race and Ethnicity ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-3&#47;2Aracetime&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='COVID-equity-dashboard-3&#47;2Aracetime' /><param name='tabs' value='no' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-3&#47;2Aracetime&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='language' value='zh-CN' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1666133561417');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

<div class='tableauPlaceholder' id='viz1666133578899' style='position: relative'><noscript><a href='#'><img alt='Death Rate Among Race and Ethnicity ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-4&#47;2Aracesbar&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='COVID-equity-dashboard-4&#47;2Aracesbar' /><param name='tabs' value='no' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-4&#47;2Aracesbar&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='language' value='zh-CN' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1666133578899');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

The graph of confirmed case rates from 2021 to 2022 shows that the spread of COVID-19 follows a consistent pattern across groups. The peak occurs in January 2022, at the start of the Omicron wave. Native Hawaiians make up only 0.38% of California's total population [2], yet have the highest confirmed case rate.

How do we use these findings to guide our strategy for dealing with COVID-19?
Overall, we need to slow down the spread of COVID-19. For Native Hawaiians and other Pacific Islanders, it is crucial to publicize precautions against COVID-19 in their neighborhoods and to take further steps to ensure the safety of the whole living area.

African Americans and American Indians have a slightly higher death-to-case ratio than the other groups with high case rates (Latino and Native Hawaiian and other Pacific Islander), so we might reallocate medical resources, provide primary health care access in the neighborhoods of these ethnic groups, and increase their chances of treatment when they get COVID-19 [3].
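
The death-to-case ratio mentioned here can be recomputed from the two 2021 query outputs above. The following Python sketch simply divides the summed weekly death rates by the summed weekly case rates for each group. The dictionary and function names are our own, and a ratio of summed weekly rates is only a rough proxy for a case fatality rate, so the output should be read as indicative rather than definitive.

```python
# 2021 totals copied from the query outputs above (summed weekly rates per 100k).
cases_2021 = {
    "Native Hawaiian and other Pacific Islander": 12422.54331,
    "Latino": 7091.57208,
    "African American": 6137.70867,
    "American Indian": 6056.66728,
    "White": 4606.08161,
    "Multi-Race": 4062.35234,
    "Asian American": 2989.49561,
}
deaths_2021 = {
    "Native Hawaiian and other Pacific Islander": 239.46186,
    "American Indian": 136.76972,
    "African American": 134.06235,
    "Latino": 133.65878,
    "White": 109.7033,
    "Multi-Race": 85.59798,
    "Asian American": 81.66826,
}


def death_to_case_ratio(cases: dict, deaths: dict) -> dict:
    """Divide deaths by cases for every group present in both inputs."""
    return {
        group: deaths[group] / cases[group]
        for group in cases
        if group in deaths and cases[group] > 0
    }


ratios = death_to_case_ratio(cases_2021, deaths_2021)

# Print the groups from the highest ratio to the lowest.
for group, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{group:45s} {ratio:.2%}")
```

Printing the full ranking rather than only selected groups makes it easy to check how each group compares with all the others before deciding where to direct resources.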

### **2.3 Death Rate Between Different Age Groups and Genders**

Here we analyze the death rate by age and gender for a more detailed view. The question is whether death rates differ between age groups and between genders.

In [27]:
%%bigquery
-- Sum county-level records by date, demographic set, category, and metric.
WITH age AS(
SELECT date,demographic_set,demographic_set_category,metric,sum(metric_value) as total_sum
FROM `ba775-team-project-04.ba775_team04_covid_19_equity_metrics_v1kpo4.covid-19-demographic-rate-cumulative`
WHERE demographic_set in ('age_gp4','gender')
AND county!='California'
AND demographic_set_category IS NOT NULL
AND metric_value!=0
GROUP BY date,demographic_set,demographic_set_category,metric)

-- Pair cases with deaths for the same date and category; death_rate = deaths / confirmed.
SELECT A.date,A.demographic_set,A.demographic_set_category,A.total_sum as Confirmed,B.total_sum AS Deaths,(B.total_sum/A.total_sum) AS death_rate
FROM
(SELECT date,demographic_set,demographic_set_category,total_sum
FROM age
WHERE metric = 'cases' ) A
LEFT JOIN(
SELECT date,demographic_set,demographic_set_category,total_sum
FROM age
WHERE metric = 'deaths'
) B ON A.demographic_set_category=B.demographic_set_category
AND A.demographic_set=B.demographic_set
AND A.date=B.date
ORDER BY A.demographic_set,A.demographic_set_category



| | date | demographic_set | demographic_set_category | Confirmed | Deaths | death_rate |
|---|---|---|---|---|---|---|
| 0 | 2022-09-28 | age_gp4 | 0-17 | 16923.0 | | |
| 1 | 2022-09-28 | age_gp4 | 18-49 | 61807.0 | | |
| 2 | 2022-09-28 | age_gp4 | 50-64 | 25137.0 | 23.0 | 0.000915 |
| 3 | 2022-09-28 | age_gp4 | 65+ | 22144.0 | 290.0 | 0.013096 |
| 4 | 2022-09-28 | gender | F | 68030.0 | 143.0 | 0.002102 |
| 5 | 2022-09-28 | gender | M | 57771.0 | 144.0 | 0.002493 |
| 6 | 2022-09-28 | gender | U | 1568.0 | | |


According to the data grouped by age, no one aged 49 or under died in the past 30 days. However, 23 people aged 50 to 64 died, and 290 people aged 65 and above died. The death rates for these two groups are 0.091% and 1.310%, respectively. We can conclude that the death rate increases with age. In terms of gender, the death rate is 0.21% for females and 0.25% for males. The difference is not large enough for us to conclude that gender affects the death rate.
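
The death rate used here is simply deaths divided by confirmed cases. As a quick check, the Python sketch below recomputes it from the rows of the query output above that have a recorded death count. The variable and function names are illustrative only.

```python
# (demographic_set, category, confirmed, deaths) copied from the query output above.
rows = [
    ("age_gp4", "50-64", 25137.0, 23.0),
    ("age_gp4", "65+", 22144.0, 290.0),
    ("gender", "F", 68030.0, 143.0),
    ("gender", "M", 57771.0, 144.0),
]


def death_rate(confirmed: float, deaths: float) -> float:
    """Return deaths as a fraction of confirmed cases."""
    return deaths / confirmed


rates = {
    (demo_set, category): death_rate(confirmed, deaths)
    for demo_set, category, confirmed, deaths in rows
}

for (demo_set, category), rate in rates.items():
    print(f"{demo_set:8s} {category:6s} {rate:.3%}")
```

Groups without a recorded death count (0-17, 18-49, and unknown gender) are left out because the query returns no value for them.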

As the elderly are the most vulnerable group, we should introduce some ways to protect them.

The first is to reduce their likelihood of infection by reducing the time they spend in crowded places. Woolworths supermarkets in Australia introduced special shopping hours: from 7 a.m. to 8 a.m., only elderly and disabled people can shop. The British government has coordinated with retailers to ensure that older people can still get supplies remotely, for example by ordering groceries over the phone.

The second is to improve the recovery rate of the elderly after infection. Possible approaches include developing vaccines or drugs specifically designed for the elderly. We should also set aside some medical resources for the elderly only.

### **2.4 Geographic Difference of Test Positivity Rate**

In this last part, we examine the test positivity rate in California. COVID-19 spreads with population movement, so geographic factors are highly relevant to the epidemic. For example, the average 30-day death rate in Los Angeles reached 4.37%, while in Alameda it was only 1.43%. The graph below shows the distribution of positive tests by county, labeled with the number of confirmed cases and the death rate.

<div class='tableauPlaceholder' id='viz1666122361140' style='position: relative'><noscript><a href='#'><img alt='Geographic View of Test Positivity Rate ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-1&#47;5Dpositiverate&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='COVID-equity-dashboard-1&#47;5Dpositiverate' /><param name='tabs' value='no' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-1&#47;5Dpositiverate&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='language' value='zh-CN' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1666122361140');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

Within each county, we can look in more detail at how COVID-19 differs between neighborhoods.
The CDPH defines the Healthy Places Index (HPI) as "a composite measure of socioeconomic opportunity applied to census tracts that includes 25 individual indicators across economic, social, education, transportation, housing, environmental and neighborhood sectors" [4] and divides each county's population into quartiles based on HPI.
To illustrate the inequalities between neighborhoods, we compared the test positivity rate of each county overall with that of its low-HPI areas:

In [28]:
%%bigquery
WITH all_nopris AS(
    SELECT 
        county,
        date,
        metric_value AS positive_rate_all_nopris
    FROM `ba775-team-project-04.ba775_team04_covid_19_equity_metrics_v1kpo4.covid-19-health-equity-metric-pos-30-day-by-cnt`
    WHERE metric = 'county_positivity_all_nopris'),

 low_hpi AS(
    SELECT 
        county,
        date,
        metric_value AS positive_rate_low_hpi
    FROM `ba775-team-project-04.ba775_team04_covid_19_equity_metrics_v1kpo4.covid-19-health-equity-metric-pos-30-day-by-cnt`
    WHERE metric = 'county_positivity_low_hpi')

SELECT 
    all_nopris.county,
    all_nopris.date,
    positive_rate_all_nopris,
    positive_rate_low_hpi,
    (positive_rate_low_hpi - positive_rate_all_nopris) as positive_rate_diff
FROM all_nopris
INNER JOIN low_hpi
ON all_nopris.county = low_hpi.county
    AND all_nopris.date = low_hpi.date
LIMIT 5;



| | county | date | positive_rate_all_nopris | positive_rate_low_hpi | positive_rate_diff |
|---|---|---|---|---|---|
| 0 | Alameda | 2022-08-20 | 0.08609 | 0.09233 | 0.00624 |
| 1 | Alameda | 2022-08-21 | 0.085055 | 0.090442 | 0.005387 |
| 2 | Alameda | 2022-08-22 | 0.082273 | 0.084493 | 0.00222 |
| 3 | Alameda | 2022-08-23 | 0.079097 | 0.079782 | 0.000684 |
| 4 | Alameda | 2022-08-24 | 0.078397 | 0.078357 | -4e-05 |


The positivity rate indicates the severity of COVID-19. We picked Imperial, a county with a high positivity rate (around 20%), and Santa Cruz, a county with a positivity rate below 10%, to represent counties that are strongly and moderately hit by COVID-19, respectively.
The graph below shows the change in test positivity over time for the two counties; the thick line represents the average positivity rate for the whole county, and the thin line represents the positivity rate in low-HPI areas.

<div class='tableauPlaceholder' id='viz1666122497921' style='position: relative'><noscript><a href='#'><img alt='Healthy Place Index Quartile Test Positivity Rate ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-2&#47;5Dtimeseriesbestandworst&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='COVID-equity-dashboard-2&#47;5Dtimeseriesbestandworst' /><param name='tabs' value='no' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;CO&#47;COVID-equity-dashboard-2&#47;5Dtimeseriesbestandworst&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='language' value='zh-CN' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1666122497921');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

From the graph we can tell that neighborhoods with a low Healthy Places Index (HPI) tend to have a higher positivity rate than the county average, indicating a more severe epidemic and inadequate testing. Further, the higher the overall positivity rate, the larger the differences in positivity rates across neighborhoods, while counties with lower overall positivity rates had smaller differences across neighborhoods.
The COVID-19 epidemic reveals health inequalities across neighborhoods, and the worse the epidemic, the more significant the inequalities.
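
The gap between low-HPI neighborhoods and the county as a whole is a simple difference of two positivity rates. The Python sketch below recomputes it for the five Alameda preview rows returned by the query above. The names are our own, and five consecutive days from one county are only a preview, not a representative sample, so the sketch illustrates the calculation rather than the overall finding.

```python
# (county, date, overall positivity, low-HPI positivity) from the query preview above.
preview = [
    ("Alameda", "2022-08-20", 0.08609, 0.09233),
    ("Alameda", "2022-08-21", 0.085055, 0.090442),
    ("Alameda", "2022-08-22", 0.082273, 0.084493),
    ("Alameda", "2022-08-23", 0.079097, 0.079782),
    ("Alameda", "2022-08-24", 0.078397, 0.078357),
]


def positivity_gap(overall: float, low_hpi: float) -> float:
    """Positive values mean low-HPI areas test positive more often than the county overall."""
    return low_hpi - overall


gaps = [positivity_gap(overall, low_hpi) for _, _, overall, low_hpi in preview]
mean_gap = sum(gaps) / len(gaps)

for (county, date, _, _), gap in zip(preview, gaps):
    print(f"{county} {date}: {gap:+.4%}")
print(f"Mean gap over the preview rows: {mean_gap:+.4%}")
```

The same difference, averaged per county over the full date range, is what allows strongly hit counties such as Imperial to be compared with moderately hit ones such as Santa Cruz.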

## **3. Conclusion**

From the analysis above, the most vulnerable groups are lower-class people, Native Hawaiians, the elderly, and people living in neighborhoods with a low Healthy Places Index (HPI). Potential reasons include living environment, awareness of protective measures, and level of immunity (for the elderly). The four datasets we used may seem to tell separate stories, but for historical reasons minorities, low-income people, and residents of low-HPI neighborhoods overlap, so the aim of the analysis is to find the groups most affected by COVID-19 rather than to analyze each dimension in isolation.

Our suggestions for government policy are to reallocate medical resources toward these vulnerable groups and to dedicate at least a certain percentage of funding to interrupting disease transmission in these populations. Although medical inequality is difficult to eradicate, it is still meaningful to identify vulnerable groups and look for possible solutions for further improvement.

## **4. References**

[1] Kayitsinga, J., & Martinez, R. O. (2021, August 8). Social and Structural Inequalities and COVID-19 in the United States. Julian Samora Research Institute, Michigan State University. Retrieved October 18, 2022, from https://jsri.msu.edu/publications/nexo/vol/no-1-fall-2020/social-and-structural-inequalities-and-covid-19-in-the-united-states

[2] California Population 2022 (Demographics, Maps, Graphs). (n.d.). Retrieved October 18, 2022, from https://worldpopulationreview.com/states/california-population

[3] Harris, M. (2020, June 9). To address COVID-19 disparities, PCDC urges New York State to invest in Primary Care. Primary Care Development Corporation. Retrieved October 18, 2022, from https://www.pcdc.org/covid-19-disparities-new-york-testimony/?creative=490896925902&keyword=covid-19+racial+disparities&matchtype=e&network=g&device=c&gclid=Cj0KCQjwnbmaBhD-ARIsAGTPcfXq-M9l49l5jql7F0aEgiklG73BvC7CeeCZBjkwl2zwsst4lXNrgOIaAtu-EALw_wcB

[4] Blueprint for a Safer Economy: Equity Focus. https://www.cdph.ca.gov/Programs/CID/DCDC/Pages/COVID-19/CaliforniaHealthEquityMetric.aspx