# How does COVID-19 affect crime in Chicago, Illinois?

**Business Problem:**
Analyze the relationship between COVID-19 and the types, locations, and frequencies of crime. Understand the patterns of crime in Chicago and how they changed in relation to city shutdowns. We also intended to obtain a data set on excessive use of force by police and related complaints, and to connect it with crimes and crime rates. *Note: we did not receive that data set in time.*

 > **Used data sets:** 
 > - Chicago Crime Data Set: [chicago_crime_2014_2020](https://console.cloud.google.com/bigquery?project=ba775-project-291117&p=bigquery-public-data&d=chicago_crime&t=crime&page=table) (Crime cases in Chicago between 2014 and 2020)
 > - Chicago Covid-19 Data Set: [chicago_city_covis19](https://console.cloud.google.com/bigquery?project=ba775-project-291117&p=bigquery-public-data&d=covid19_open_data&t=covid19_open_data&page=table) (Corona cases in Chicago since March 2020)
 > - Chicago Covid-19 areas: [Chicago_Covid_with_zips](https://console.cloud.google.com/bigquery?project=ba775-team-8b&p=ba775-team-8b&d=data_covid_19_chicago_city&t=Chicago_Covid_with_zips&page=table) (Corona cases in different areas in Chicago)
 
 > **Steps in advance:**
 > - Extracted data from larger data sets and saved them in new tables (named above)
 > - Converted the timestamp in the chicago_crime_2014_2020 into a date format to match the format with the other tables
 
 > **Restrictions and Limitations:**
 > - Chicago Crime Data Set: The addresses of the crimes are approximate because part of each address is masked with Xs, so any analysis by location or zip code is approximate. Another limitation is the labeling of primary crime types: there are two labels, "Criminal Sexual Assault" and "Crim Sexual Assault", that presumably indicate the same type of crime, but because they are separate it is hard to know whether they were meant to represent the same crime. Finally, the data is reported directly by the police, so it could be biased.
 > - Chicago Covid-19 Data Set: The reporting date of the Covid-19 case is not the date of illness, and there is also the incubation period when the person may already be contagious but does not yet know that they have the virus. It is therefore possible that, especially with regard to the weekly comparison, there is a time lag that cannot be identified.
 

## Approach:
1. Get an overview over each data set on its own
2. Ask multiple questions about every data set and answer them
3. Understand and discuss the results
4. Figure out where to combine and what to compare between the different data sets
5. Formulate questions to answer
6. Answer the questions and analyze the results
7. Summary of the findings

## Executive Summary

The analysis of the different datasets shows that a connection between COVID-19 and crime in Chicago can be established. In general, crime counts in 2020 are significantly lower than the average of the previous 6 years (2014 to 2019). The decline was strongest in the months when the coronavirus was new in the USA: in April 2020 the gap reached 69.7% when measured relative to the 2020 count, which corresponds to about 41% fewer crimes than the 2014 to 2019 average for April. Comparing crime types in 2019 and 2020, the top 5 crime types are the same in both years. However, the number of crimes in each of the top 5 types dropped in 2020, with theft dropping the most, by 31%. The same pattern appears for crime locations: among the top 5 locations, crimes on sidewalks dropped the most, by 33%. The analysis is not comprehensive enough to attribute the decline in crime exclusively to the coronavirus. In addition, the arrest rate decreased during COVID-19, which we attribute to a police strike and to efforts to avoid spreading COVID-19 in jails. Lastly, while overall crime went down, we found that domestic violence as a share of all crimes increased by 3% from 2019 to 2020, and several specific crime types associated with domestic violence also increased.

## Tableau Dashboard
https://prod-useast-a.online.tableau.com/t/soltaniehha/views/Team8b/covid-19vs_crimes?:showAppBanner=false&:display_count=n&:showVizHome=n&:origin=viz_share_link

![](https://drive.google.com/uc?id=1tEHjvt2vIgSJ7mlV-XvNt0fYVaxn2Emc)

### Overview Chicago Crime data set:

- **date** - Date when the incident occurred. This is sometimes a best estimate.
- **case_number** - The Chicago Police Department RD Number (Records Division Number), which is unique to the incident.
- **block** - The partially redacted address where the incident occurred, placing it on the same block as the actual address.
- **primary_type** - The primary description of the IUCR code.
- **description** - The secondary description of the IUCR code, a subcategory of the primary description.
- **location_description** - Description of the location where the incident occurred.
- **arrest** - Indicates whether an arrest was made.
- **domestic** - Indicates whether the incident was domestic-related as defined by the Illinois Domestic Violence Act.
- **beat** - Indicates the beat where the incident occurred. A beat is the smallest police geographic area – each beat has a dedicated police beat car. Three to five beats make up a police sector, and three sectors make up a police district.
- **district** - Indicates the police district where the incident occurred. 
- **ward** - The ward (City Council district) where the incident occurred.
- **community_area** - Indicates the community area where the incident occurred. Chicago has 77 community areas.
- **fbi_code** - Indicates the crime classification as outlined in the FBI's National Incident-Based Reporting System (NIBRS).
- **year** - Year the incident occurred.
- **updated_on** - Date and time the record was last updated.
- **location** - The location where the incident occurred in a format that allows for creation of maps and other geographic operations on this data portal.


In [2]:
%%bigquery
SELECT * FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
LIMIT 5
#chicago crime

Unnamed: 0,date,case_number,block,primary_type,description,location_description,arrest,domestic,beat,district,ward,community_area,fbi_code,year,updated_on,location
0,2019-07-15,JC350657,010XX E 133RD ST,OTHER OFFENSE,OTHER WEAPONS VIOLATION,SIDEWALK,True,False,533,5,9,54,26,2019,2019-07-22 16:17:57+00:00,"(41.653736665, -87.595510037)"
1,2019-07-28,JC371048,131XX S EXCHANGE AVE,OTHER OFFENSE,HARASSMENT BY ELECTRONIC MEANS,OTHER,False,False,433,4,10,55,26,2019,2019-08-04 16:06:55+00:00,"(41.656670884, -87.551985047)"
2,2019-08-12,JC389200,060XX N MENARD AVE,CRIMINAL TRESPASS,TO RESIDENCE,RESIDENCE,False,False,1611,16,39,11,26,2019,2019-08-19 16:24:12+00:00,"(41.991147242, -87.773184326)"
3,2019-09-11,JC429440,014XX E 94TH ST,NARCOTICS,POSS: CANNABIS MORE THAN 30GMS,STREET,True,False,413,4,8,48,18,2019,2019-09-18 16:24:45+00:00,"(41.723626269, -87.587392017)"
4,2019-10-23,JC483293,039XX W PETERSON AVE,WEAPONS VIOLATION,UNLAWFUL USE OTHER DANG WEAPON,"SCHOOL, PUBLIC, BUILDING",False,False,1711,17,39,13,15,2019,2019-10-30 15:56:31+00:00,"(41.990159786, -87.726830225)"


### Overview Chicago Covid-19 data set:

- **Date** - Date when the COVID-19 figures were reported.
- **Cases___Total** - Number of cases on that day.
- **Deaths___Total** - Number of deaths on that day.
- **Hospitalizations___Total** - Number of hospitalizations on that day.
- **Cases___Age** - Number of cases in each age range.
- **Hospitalizations___** - Number of hospitalizations by age range, gender, and race/ethnicity.

In [16]:
%%bigquery
SELECT * FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_city_covis19`
LIMIT 5
#chicago_city_covid19

Unnamed: 0,Date,Cases___Total,Deaths___Total,Hospitalizations___Total,Cases___Age_0_17,Cases___Age_18_29,Cases___Age_30_39,Cases___Age_40_49,Cases___Age_50_59,Cases___Age_60_69,...,Hospitalizations___Age_Unknown,Hospitalizations___Female,Hospitalizations___Male,Hospitalizations___Unknown_Gender,Hospitalizations___Latinx,Hospitalizations___Asian_Non_Latinx,Hospitalizations___Black_Non_Latinx,Hospitalizations___White_Non_Latinx,Hospitalizations___Other_Race_Non_Latinx,Hospitalizations___Unknown_Race_Ethnicity
0,2020-03-01,0,0,2,0,0,0,0,0,0,...,0,1,1,0,0,0,0,2,0,0
1,2020-03-02,0,0,2,0,0,0,0,0,0,...,0,2,0,0,1,0,1,0,0,0
2,2020-03-03,0,0,3,0,0,0,0,0,0,...,0,1,2,0,0,0,3,0,0,0
3,2020-03-04,0,0,4,0,0,0,0,0,0,...,0,1,3,0,0,0,3,1,0,0
4,2020-03-05,1,0,6,0,0,0,0,1,0,...,0,2,4,0,0,0,2,4,0,0


This is an overview of total crimes, total arrests, and total counts of domestic violence for each year between 2014 and 2020, restricted to January through September. For most of the remainder of the notebook, we compare 2019 and 2020 data, because the years 2014 to 2019 are very similar to each other and 2020 stands out as an outlier.

In [10]:
%%bigquery
SELECT
  EXTRACT(YEAR FROM date) AS year,
  COUNT(*) AS total_crimes,
  SUM(CASE WHEN arrest = TRUE THEN 1 END) AS total_arrested,
  SUM(CASE WHEN domestic = TRUE THEN 1 END) AS total_domestic
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020`
# January to September only
WHERE EXTRACT(MONTH FROM date) BETWEEN 1 AND 9
GROUP BY year
ORDER BY year

Unnamed: 0,year,total_crimes,total_arrested,total_domestic
0,2014,209808,61387,30920
1,2015,199661,54650,32106
2,2016,203604,41513,33174
3,2017,203218,40583,32765
4,2018,202794,40976,33394
5,2019,198125,43172,33125
6,2020,157709,25477,30056


## Analyzing the correlation between Covid-19 cases and crime cases in Chicago:

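The following query compares the average monthly crime count, calculated from 2014 to 2019 data, with the monthly crime count in 2020 *(January to October; note: October is not complete)* and with monthly COVID-19 cases in 2020 *(note: January and February are null because no COVID cases were recorded for Chicago in those months)*. In addition, the difference between the 2020 count and the average is shown as a percentage *(perc_diff_vs_last_6yrs)*; negative values mean fewer crimes in 2020. Note that the query divides this difference by the 2020 count rather than by the 2014 to 2019 average, so the column overstates the decline relative to the historical baseline: April shows -69.7%, while the drop relative to the average is about 41%.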

In [1]:
%%bigquery
#COVID and CRIME
SELECT 
  a.month, c.total_monthly_covid_cases_2020, a.total_monthly_crimes_2020, b.avg_monthly_crime_cases, 
  (a.total_monthly_crimes_2020-b.avg_monthly_crime_cases)/a.total_monthly_crimes_2020*100 as perc_diff_vs_last_6yrs
FROM
(SELECT 
  EXTRACT(MONTH from date) as month, 
  COUNT(case_number) as total_monthly_crimes_2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020`
WHERE year = 2020
GROUP BY month) as a
LEFT JOIN 
(SELECT
  EXTRACT(MONTH from date) as month,
  ROUND(COUNT(case_number)/6) as avg_monthly_crime_cases
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020` 
WHERE year BETWEEN 2014 AND 2019
GROUP BY  month
ORDER BY  month) as b
USING(month)
LEFT JOIN
(SELECT
  EXTRACT(MONTH from date) as month,
  SUM(Cases___Total) as total_monthly_covid_cases_2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_city_covis19` 
GROUP BY  month
ORDER BY  month) as c
USING(month)
ORDER BY month

Unnamed: 0,month,total_monthly_covid_cases_2020,total_monthly_crimes_2020,avg_monthly_crime_cases,perc_diff_vs_last_6yrs
0,1,,19664,20637.0,-4.948129
1,2,,17995,18022.0,-0.150042
2,3,4416.0,16520,21341.0,-29.182809
3,4,21576.0,12726,21596.0,-69.699827
4,5,21355.0,17356,23961.0,-38.056004
5,6,6310.0,17406,24047.0,-38.15351
6,7,8282.0,19262,25087.0,-30.240889
7,8,9988.0,19487,25014.0,-28.362498
8,9,9161.0,17293,23163.0,-33.944371
9,10,7172.0,6714,23024.0,-242.925231
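The choice of denominator changes the percentages noticeably. The following is a minimal sketch, not part of the original notebook, that recomputes the monthly difference both ways from the counts printed above (March to September). The helper name `pct_diff` and the dictionary names are illustrative assumptions.

```python
# Monthly counts copied from the query output above (March to September 2020).
crimes_2020 = {3: 16520, 4: 12726, 5: 17356, 6: 17406, 7: 19262, 8: 19487, 9: 17293}
avg_2014_2019 = {3: 21341, 4: 21596, 5: 23961, 6: 24047, 7: 25087, 8: 25014, 9: 23163}


def pct_diff(value, reference):
    """Percentage difference of value from reference, relative to reference."""
    return (value - reference) / reference * 100


for month in sorted(crimes_2020):
    now, base = crimes_2020[month], avg_2014_2019[month]
    vs_2020_count = pct_diff(base, now) * -1  # same as (now - base) / now * 100, the query's formula
    vs_baseline = pct_diff(now, base)         # relative to the 2014 to 2019 average
    print(f"month {month}: {vs_2020_count:7.2f}% (query) {vs_baseline:7.2f}% (vs baseline)")
```

Dividing by the historical average answers the question "how far below a typical year was 2020", which is usually what a percentage drop is taken to mean.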


### Weekly comparison:
This query makes the same comparison as the query above, but on a weekly basis. For context, the city of Chicago publishes the [restrictions](https://www.chicago.gov/city/en/sites/covid-19/home/health-orders.html) it imposed. Notable observations:

- Week 13: the city closed all parks and beaches, and some sports such as soccer were no longer allowed. COVID-19 cases still increased, but crime cases decreased.
- Week 22: very high crime count and relatively few COVID-19 cases.
- Weeks 31 and 32: a significant increase in crime within a week.
- Week 30: an order allowed businesses to open again. Two weeks later there were more crimes, but COVID-19 cases remained fairly stable.
- Overall, crime decreased significantly after COVID-19 arrived, especially at the very beginning of the pandemic *(see weeks 12 to 18)*.

In [11]:
%%bigquery
#COVID and CRIME
SELECT a.week, c.total_weekly_covid_cases_2020, a.total_weekly_crimes_2020, b.avg_weekly_crime_cases,
(a.total_weekly_crimes_2020-b.avg_weekly_crime_cases)/a.total_weekly_crimes_2020*100 as perc_diff_weekly_to_avg
FROM
(SELECT 
  EXTRACT(WEEK from date) as week, 
  COUNT(case_number) as total_weekly_crimes_2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020`
WHERE year = 2020 
GROUP BY week
HAVING week BETWEEN 10 AND 40
ORDER BY week) as a
LEFT JOIN 
(SELECT
  EXTRACT(WEEK from date) as week,
  CAST(ROUND(COUNT(case_number)/6) as INT64) as avg_weekly_crime_cases
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020` 
WHERE year BETWEEN 2014 AND 2019
GROUP BY  week
HAVING week BETWEEN 10 AND 40
ORDER BY  week) as b
USING(week)
LEFT JOIN
(SELECT
  EXTRACT(WEEK from date) as week,
  SUM(Cases___Total) as total_weekly_covid_cases_2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_city_covis19`
GROUP BY  week
HAVING week BETWEEN 10 AND 40
ORDER BY  week) as c
USING(week)
ORDER BY week
limit 10

Unnamed: 0,week,total_weekly_covid_cases_2020,total_weekly_crimes_2020,avg_weekly_crime_cases,perc_diff_weekly_to_avg
0,10,115,4256,4956,-16.447368
1,11,840,3594,4845,-34.808013
2,12,2244,2931,4800,-63.766633
3,13,2900,3134,5039,-60.784939
4,14,3680,3037,4964,-63.450774
5,15,4314,2818,5043,-78.956707
6,16,6553,2942,5072,-72.399728
7,17,7226,3145,5246,-66.804452
8,18,6033,3120,5312,-70.25641
9,19,5404,3468,5281,-52.27797


### Linear regression 1: COVID and CRIME

This regression indicates a negative relationship between Chicago weekly COVID cases and Chicago weekly crimes. The estimated slope is -0.225421: when weekly COVID cases increase by 10, the expected number of weekly crimes decreases by 2.25421, and when they increase by 20, it decreases by 4.50842.

In [18]:
%%bigquery
#COVID and CRIME
#weekly average
SELECT avg(total_weekly_covid_cases_2020) as covid_bar,
       avg(total_weekly_crimes_2020) as crime_bar
from `ba775-team-8b.data_covid_19_chicago_city.compare_weekly_covid_with_weekly_crime`;

Unnamed: 0,covid_bar,crime_bar
0,2754.677419,3910.903226


In [2]:
%%bigquery
#COVID and CRIME
#Linear regression slope
SELECT SUM((total_weekly_covid_cases_2020 - covid_bar) * (total_weekly_crimes_2020 - crime_bar)) 
/ SUM((total_weekly_covid_cases_2020 - covid_bar) * (total_weekly_covid_cases_2020 - covid_bar)) as slope
from (
    SELECT total_weekly_covid_cases_2020, AVG(total_weekly_covid_cases_2020) over () as covid_bar,
           total_weekly_crimes_2020, AVG(total_weekly_crimes_2020) over () as crime_bar
    from `ba775-team-8b.data_covid_19_chicago_city.compare_weekly_covid_with_weekly_crime`) s;

Unnamed: 0,slope
0,-0.225421


In [20]:
%%bigquery
#COVID and CRIME
#Linear regression slope and intercept
# Chicago weekly crimes = -0.225421 * Chicago weekly COVID cases + 4531.866711
# Linear regression explanation:
# 1. intercept = 4531.866711: with no COVID cases, the expected number of Chicago weekly crimes is 4531.866711.
# 2. slope = -0.225421: the slope is negative, which indicates a negative relationship between
# Chicago weekly COVID cases and Chicago weekly crimes. For example, when weekly COVID cases increase by 10,
# the expected number of weekly crimes decreases by 2.25421. When they increase by 20,
# it decreases by 4.50842.

SELECT slope, 
       crime_bar_max - covid_bar_max * slope as intercept 
FROM (
    SELECT SUM((total_weekly_covid_cases_2020 - covid_bar) * (total_weekly_crimes_2020 - crime_bar))
    / SUM((total_weekly_covid_cases_2020 - covid_bar) * (total_weekly_covid_cases_2020 - covid_bar)) as slope,
           MAX(covid_bar) AS covid_bar_max,
           MAX(crime_bar) AS crime_bar_max    
    from (
        SELECT total_weekly_covid_cases_2020, avg(total_weekly_covid_cases_2020) over () as covid_bar,
               total_weekly_crimes_2020, avg(total_weekly_crimes_2020) over () as crime_bar
        from  `ba775-team-8b.data_covid_19_chicago_city.compare_weekly_covid_with_weekly_crime`) s
) 

Unnamed: 0,slope,intercept
0,-0.225421,4531.866711
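The SQL above implements the closed-form least-squares estimates for a single predictor: the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept is the mean of y minus the slope times the mean of x. The sketch below restates this in Python and checks the reported intercept against the reported means and slope. The function name `ols_fit` and the toy data are illustrative assumptions, not part of the notebook.

```python
def ols_fit(xs, ys):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Toy data (hypothetical) lying exactly on y = 2x + 1.
toy_slope, toy_intercept = ols_fit([0, 1, 2], [1, 3, 5])

# Consistency check using the values reported in the outputs above.
covid_bar, crime_bar, slope = 2754.677419, 3910.903226, -0.225421
intercept_from_means = crime_bar - slope * covid_bar
print(toy_slope, toy_intercept, round(intercept_from_means, 2))
```

The recomputed intercept agrees with the reported 4531.866711 up to the rounding of the slope to six decimals.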


In [26]:
%%bigquery
#COVID and CRIME
#Plot the linear regression line (and forecast) using Tableau
SELECT total_weekly_covid_cases_2020, total_weekly_crimes_2020,
-0.225421 * total_weekly_covid_cases_2020 + 4531.866711 as crime_fit from (
    select total_weekly_covid_cases_2020, total_weekly_crimes_2020 
    from `data_covid_19_chicago_city.compare_weekly_covid_with_weekly_crime`
) s
LIMIT 5

Unnamed: 0,total_weekly_covid_cases_2020,total_weekly_crimes_2020,crime_fit
0,2204,4097,4035.038827
1,4314,2818,3559.400517
2,1710,5636,4146.396801
3,840,3594,4342.513071
4,3119,4365,3828.778612


In [25]:
%%bigquery
#COVID and CRIME
#Correlation between Chicago weekly crimes and Chicago weekly COVID cases
SELECT CORR(total_weekly_covid_cases_2020, total_weekly_crimes_2020) as correlation_coefficient 
from `ba775-team-8b.data_covid_19_chicago_city.compare_weekly_covid_with_weekly_crime`;

Unnamed: 0,correlation_coefficient
0,-0.609421


### Linear Regression 2: COVID and CRIME DIFFERENCE

This regression indicates a negative relationship between Chicago weekly COVID cases and the percentage difference between Chicago crimes in 2020 and the average of the previous 6 years (2014 to 2019). The estimated slope is -0.007787: when weekly COVID cases increase by 1000, the percentage difference falls by 7.787 percentage points, meaning that 2020 crime drops further below the historical average. When weekly COVID cases increase by 2000, the percentage difference falls by 15.574 percentage points.

In [23]:
%%bigquery
#COVID and CRIME PERC DIFFERENCE
#weekly average
SELECT avg(total_weekly_covid_cases_2020) as covid_bar,
       avg(perc_diff_weekly_to_avg) as crime_difference_bar
from `ba775-team-8b.data_covid_19_chicago_city.compare_weekly_covid_with_weekly_crime`;

Unnamed: 0,covid_bar,crime_difference_bar
0,2754.677419,-40.493746


In [24]:
%%bigquery
#COVID and CRIME PERC DIFFERENCE
#Linear regression slope
SELECT SUM((total_weekly_covid_cases_2020 - covid_bar) * (perc_diff_weekly_to_avg - crime_difference_bar)) 
/ SUM((total_weekly_covid_cases_2020 - covid_bar) * (total_weekly_covid_cases_2020 - covid_bar)) as slope
from (
    SELECT total_weekly_covid_cases_2020, AVG(total_weekly_covid_cases_2020) over () as covid_bar,
           perc_diff_weekly_to_avg, AVG(perc_diff_weekly_to_avg) over () as crime_difference_bar
    from `ba775-team-8b.data_covid_19_chicago_city.compare_weekly_covid_with_weekly_crime`) s;

Unnamed: 0,slope
0,-0.007787


In [28]:
%%bigquery
#COVID and CRIME PERC DIFFERENCE
#Linear regression slope and intercept
# Chicago crime perc difference = -0.007787 * Chicago weekly COVID cases - 19.044333
# Linear regression explanation:
# 1. intercept = -19.044333: with no COVID cases, the expected Chicago crime percentage difference is -19.044333.
# 2. slope = -0.007787: the slope is negative, which indicates a negative relationship between
# Chicago weekly COVID cases and the Chicago crime percentage difference. For example, when weekly COVID cases
# increase by 1000, the percentage difference falls by 7.787 percentage points, so 2020 crime drops further
# below the historical average. When weekly COVID cases increase by 2000, it falls by 15.574 percentage points.

SELECT slope, 
       crime_difference_bar_max - covid_bar_max * slope as intercept 
FROM (
    SELECT SUM((total_weekly_covid_cases_2020 - covid_bar) * (perc_diff_weekly_to_avg - crime_difference_bar))
    / SUM((total_weekly_covid_cases_2020 - covid_bar) * (total_weekly_covid_cases_2020 - covid_bar)) as slope,
           MAX(covid_bar) AS covid_bar_max,
           MAX(crime_difference_bar) AS crime_difference_bar_max    
    from (
        SELECT total_weekly_covid_cases_2020, avg(total_weekly_covid_cases_2020) over () as covid_bar,
               perc_diff_weekly_to_avg, avg(perc_diff_weekly_to_avg) over () as crime_difference_bar
        from  `ba775-team-8b.data_covid_19_chicago_city.compare_weekly_covid_with_weekly_crime`) s
) 

Unnamed: 0,slope,intercept
0,-0.007787,-19.044333


In [29]:
%%bigquery
#COVID and CRIME PERC DIFFERENCE
#Plot the linear regression line (and forecast) using Tableau
SELECT total_weekly_covid_cases_2020, perc_diff_weekly_to_avg,
-0.007787 * total_weekly_covid_cases_2020 - 19.044333 as crime_difference_fit from (
    select total_weekly_covid_cases_2020, perc_diff_weekly_to_avg
    from `data_covid_19_chicago_city.compare_weekly_covid_with_weekly_crime`
) s
LIMIT 5

Unnamed: 0,total_weekly_covid_cases_2020,perc_diff_weekly_to_avg,crime_difference_fit
0,2204,-40.346595,-36.206881
1,4314,-78.956707,-52.637451
2,1710,-0.141945,-32.360103
3,840,-34.808013,-25.585413
4,3119,-27.491409,-43.331986


In [1]:
%%bigquery
#COVID and CRIME PERC DIFFERENCE
#Correlation between Chicago weekly COVID cases and the percentage difference between 2020 Chicago crimes
# and the average of the previous 6 years' (2014 to 2019) Chicago crimes
SELECT CORR(total_weekly_covid_cases_2020, perc_diff_weekly_to_avg) as correlation_coefficient 
from `ba775-team-8b.data_covid_19_chicago_city.compare_weekly_covid_with_weekly_crime`;

Unnamed: 0,correlation_coefficient
0,-0.716634
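For a regression with a single predictor, the coefficient of determination R² equals the square of the Pearson correlation coefficient, so the two correlations reported in this notebook can be translated into the share of weekly variation each linear fit accounts for. This is a small sketch under that assumption; the variable names are illustrative.

```python
# Correlation coefficients copied from the two CORR query outputs in this notebook.
r_crime_counts = -0.609421   # weekly COVID cases vs weekly crime counts
r_crime_pct_diff = -0.716634  # weekly COVID cases vs percentage difference to the average

# With one predictor, R^2 of the least-squares line is the squared correlation.
r2_crime_counts = r_crime_counts ** 2
r2_crime_pct_diff = r_crime_pct_diff ** 2

print(f"R^2 (crime counts):   {r2_crime_counts:.3f}")
print(f"R^2 (perc. difference): {r2_crime_pct_diff:.3f}")
```

Read this way, the fit on the percentage difference accounts for roughly half of the weekly variation and the fit on raw counts for somewhat more than a third, which leaves substantial variation unaccounted for by COVID case counts alone.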


### What are the top 5 crime types in 2019 and 2020, and how did their counts change?

The top 5 crime types were the same in 2019 and 2020: Theft, Battery, Criminal Damage, Assault, and Deceptive Practice. The number of crimes in each of these types decreased from 2019 to 2020 (January to August in both years). Theft decreased by 31.06%, the largest decrease among the top 5 crime types. Deceptive Practice decreased by 25.16%, the second largest decrease. The third is Battery (-14.73%), the fourth is Assault (-11.90%), and the fifth is Criminal Damage (-5.60%). The biggest impact of COVID-19 on daily life is that it reduced how often and how long people went out. Under these conditions, theft, which is often related to people being out, shows the largest decrease among the top 5 crime types.

In [3]:
%%bigquery
#CHICAGOCRIME
SELECT A.primary_type, num_primary2019, num_primary2020, (num_primary2020 - num_primary2019)/num_primary2019*100 AS difference_change
FROM
(SELECT primary_type, count(primary_type) AS num_primary2019
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2019-01-01' AND '2019-08-31'
GROUP BY primary_type) AS A
INNER JOIN
(SELECT primary_type, count(primary_type) AS num_primary2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2020-01-01' AND '2020-08-31'
GROUP BY primary_type) AS B
ON A.primary_type = B.primary_type
GROUP BY A.primary_type, num_primary2019, num_primary2020
ORDER BY num_primary2019 DESC limit 5

Unnamed: 0,primary_type,num_primary2019,num_primary2020,difference_change
0,THEFT,40811,28137,-31.055353
1,BATTERY,33822,28841,-14.727101
2,CRIMINAL DAMAGE,17964,16958,-5.600089
3,ASSAULT,14081,12405,-11.902564
4,DECEPTIVE PRACTICE,12170,9108,-25.16023


### Where did crimes happen most in 2019 and 2020, and how did the counts per location change?

The top 5 locations with the most crimes in 2019 and 2020 are streets, residences, apartments, sidewalks, and small retail stores. The number of crimes decreased in each of these locations from 2019 to 2020 (January to August in both years). Crimes on sidewalks decreased by 32.54%, the largest decrease among the top 5 locations. Crimes in residences decreased by 16.43%, the second largest decrease. The third is small retail stores (-15.93%), the fourth is streets (-14.04%), and the fifth is apartments (-0.57%). Because of the COVID-19 outbreak, people stayed indoors most of the time, which reduces the number of crimes that occur outdoors or in public places such as sidewalks, small retail stores, and streets. Moreover, COVID-19 required social distancing and reduced gatherings, which may also explain the decline in crimes in residences and apartments.

In [3]:
%%bigquery
#CHICAGOCRIME
#question4: number of crimes on different locations
SELECT A.location_description, num_location2019, num_location2020, (num_location2020 - num_location2019)/num_location2019*100 AS difference_change
FROM 
(SELECT location_description, count(location_description) AS num_location2019
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2019-01-01' AND '2019-08-31'
GROUP BY location_description) AS A
INNER JOIN
(SELECT location_description, count(location_description) AS num_location2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2020-01-01' AND '2020-08-31'
GROUP BY location_description) AS B
ON A.location_description = B.location_description
GROUP BY A.location_description, num_location2019, num_location2020
ORDER BY num_location2020 DESC limit 5

Unnamed: 0,location_description,num_location2019,num_location2020,difference_change
0,STREET,38373,32986,-14.038517
1,RESIDENCE,28928,24176,-16.426991
2,APARTMENT,23419,23286,-0.567915
3,SIDEWALK,13812,9318,-32.536924
4,SMALL RETAIL STORE,4570,3842,-15.929978


### How many people were arrested in 2019 and 2020, and how did the arrest rate change?

Between 2019 and 2020 the arrest rate for crimes went down. COVID-19 caused many changes in Chicago, including a police strike and a decrease in arrest rates. To conclude that the arrest rate decreased because of COVID-19, we need to rule out other factors that could also lower it, such as the overall decrease in crime numbers or a shift toward less serious crimes that rarely lead to an arrest. The queries below let us rule these out: for each of the top 5 primary types, the number of arrests fell much more than the number of crimes, which indicates that a person committing the same offense was more likely to be arrested in 2019 than in 2020. We attribute this to COVID-19 and the reduced number of police. Another reason for the disproportionate decrease in arrests is that, according to a study, nearly 16% of Illinois COVID-19 cases were linked to outbreaks at the Chicago jail, so the government reduced arrests to avoid spreading COVID-19 in the jail.

In [21]:
%%bigquery
#percentage change of arrest and non arrest from 2019 to 2020
Select A.arrest, number_arrest2019, number_arrest2020, (number_arrest2020-number_arrest2019)/number_arrest2019*100 AS percent_change
From
(Select arrest, COUNT(arrest) AS number_arrest2019
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2019-01-01' AND '2019-08-31'
group by arrest) AS A
INNER Join 
(SELECT arrest, COUNT(arrest) AS number_arrest2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date` 
where date BETWEEN '2020-01-01' AND '2020-08-31'
group by arrest) AS B
ON A.arrest = B.arrest
GROUP BY A.arrest, number_arrest2019, number_arrest2020

Unnamed: 0,arrest,number_arrest2019,number_arrest2020,percent_change
0,True,38489,23091,-40.006236
1,False,136523,117288,-14.089201
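The query above reports the change in arrest and non-arrest counts, not the arrest rate itself. As a minimal sketch, the rate (arrests divided by all recorded crimes) can be derived from the four counts in the output; the dictionary and function names are illustrative assumptions.

```python
# Counts copied from the query output above (January to August of each year).
arrests = {2019: 38489, 2020: 23091}
non_arrests = {2019: 136523, 2020: 117288}


def arrest_rate(year):
    """Share of recorded crimes that led to an arrest, in percent."""
    total = arrests[year] + non_arrests[year]
    return arrests[year] / total * 100


for year in (2019, 2020):
    print(f"{year}: {arrest_rate(year):.2f}% of recorded crimes led to an arrest")

# Change of the rate in percentage points between the two years.
rate_change_points = arrest_rate(2020) - arrest_rate(2019)
print(f"change: {rate_change_points:.2f} percentage points")
```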


In [7]:
%%bigquery
#primary change and arrested change from 2019 to 2020 of top five primary_type in 2019.
SELECT A.primary_type,(num_primary2020-num_primary2019)/num_primary2019*100 AS primary_change, (arrested2020-arrested2019)/arrested2019*100 AS arrested_change
FROM
(SELECT primary_type, count(primary_type) AS num_primary2019, SUM(CASE WHEN arrest THEN 1 ELSE 0 END) AS arrested2019
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2019-01-01' AND '2019-08-31'
GROUP BY primary_type) AS A
INNER JOIN
(SELECT primary_type, count(primary_type) AS num_primary2020, SUM(CASE WHEN arrest THEN 1 ELSE 0 END) AS arrested2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2020-01-01' AND '2020-08-31'
GROUP BY primary_type) AS B
ON A.primary_type = B.primary_type
GROUP BY A.primary_type, num_primary2020, num_primary2019, arrested2020, arrested2019
ORDER BY num_primary2019 DESC limit 5

Unnamed: 0,primary_type,primary_change,arrested_change
0,THEFT,-31.055353,-48.92513
1,BATTERY,-14.727101,-31.680247
2,CRIMINAL DAMAGE,-5.600089,-23.80523
3,ASSAULT,-11.902564,-36.249525
4,DECEPTIVE PRACTICE,-25.16023,-60.311958
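To separate "fewer crimes" from "fewer arrests per crime", the two percentage changes can be combined: if crimes change by a factor (1 + c) and arrests by a factor (1 + a), arrests per crime change by (1 + a) / (1 + c). The sketch below applies this to the output above; the function name `arrest_rate_change` is an illustrative assumption.

```python
# (primary_change, arrested_change) in percent, copied from the query output above.
changes = {
    "THEFT": (-31.055353, -48.92513),
    "BATTERY": (-14.727101, -31.680247),
    "CRIMINAL DAMAGE": (-5.600089, -23.80523),
    "ASSAULT": (-11.902564, -36.249525),
    "DECEPTIVE PRACTICE": (-25.16023, -60.311958),
}


def arrest_rate_change(crime_change_pct, arrest_change_pct):
    """Relative change (in %) of arrests per crime implied by the two changes."""
    ratio = (1 + arrest_change_pct / 100) / (1 + crime_change_pct / 100)
    return (ratio - 1) * 100


for crime_type, (crime_change, arrest_change) in changes.items():
    implied = arrest_rate_change(crime_change, arrest_change)
    print(f"{crime_type}: arrests per crime changed by {implied:.1f}%")
```

A negative value for every type means the drop in arrests cannot be explained by the drop in crime counts alone.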


### Which primary types changed the most, and how does that affect the domestic violence rate?

We find that between 2019 and 2020, the primary types of crime with the greatest increases were Criminal Sexual Assault, Arson, Other Narcotic Violation, Homicide, and Weapons Violation. While overall crime went down, these types increased by 72.5%, 56.5%, 50%, 49.13%, and 20.49% respectively. Two of these figures should be read with caution: the Other Narcotic Violation increase rests on only 4 and 6 cases, and the data set contains two labels for sexual assault (see Restrictions and Limitations). We also wanted to explore how COVID-19 affected domestic violence and found that domestic violence as a share of all crimes increased by 3% from 2019 to 2020, and that several specific crime types associated with domestic violence also increased.

In [2]:
%%bigquery
#chicagocrime
#top 5 primary types of crime that increased the most during COVID in comparison to 2019
SELECT A.primary_type, ((num_type_crime2020 - num_type_crime2019)/num_type_crime2019)*100 as perc_change, num_type_crime2019, num_type_crime2020,
FROM 
(SELECT primary_type, count(primary_type) as num_type_crime2019, 
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020`
where date BETWEEN '2019-01-01' AND '2019-08-31'
group by primary_type) as A
INNER JOIN
(SELECT primary_type, count(primary_type) as num_type_crime2020, 
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020`
where date BETWEEN '2020-01-01' AND '2020-08-31'
group by primary_type) as B
ON A.primary_type = B.primary_type
Group by A.Primary_type, num_type_crime2019, num_type_crime2020
order by perc_change desc limit 5

Unnamed: 0,primary_type,perc_change,num_type_crime2019,num_type_crime2020
0,CRIMINAL SEXUAL ASSAULT,72.5,400,690
1,ARSON,56.504065,246,385
2,OTHER NARCOTIC VIOLATION,50.0,4,6
3,HOMICIDE,49.132948,346,516
4,WEAPONS VIOLATION,20.49218,4348,5239


In [8]:
%%bigquery
#chicagocrime
#top 5 primary types of crime whose domestic incidents increased the most during COVID in comparison to 2019
SELECT A.primary_type, ((num_type_crime2020 - num_type_crime2019)/num_type_crime2019)*100 as perc_change, num_type_crime2019, num_type_crime2020,
domestic2019, domestic2020, ((domestic2020-domestic2019)/domestic2019)*100 as perc_change_domestic
FROM 
(SELECT primary_type, count(primary_type) as num_type_crime2019, SUM(CASE WHEN domestic THEN 1 ELSE 0 END) AS domestic2019
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020`
where date BETWEEN '2019-01-01' AND '2019-08-31'
group by primary_type) as A
INNER JOIN
(SELECT primary_type, count(primary_type) as num_type_crime2020, SUM(CASE WHEN domestic THEN 1 ELSE 0 END) AS domestic2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020`
where date BETWEEN '2020-01-01' AND '2020-08-31'
group by primary_type) as B
ON A.primary_type = B.primary_type
where domestic2019 >0 
Group by A.Primary_type, num_type_crime2019, num_type_crime2020, domestic2019, domestic2020
order by perc_change_domestic desc limit 5

Unnamed: 0,primary_type,perc_change,num_type_crime2019,num_type_crime2020,domestic2019,domestic2020,perc_change_domestic
0,CRIMINAL SEXUAL ASSAULT,72.5,400,690,60,137,128.333333
1,NARCOTICS,-53.251721,10456,4888,4,7,75.0
2,HOMICIDE,49.132948,346,516,22,32,45.454545
3,WEAPONS VIOLATION,20.49218,4348,5239,10,14,40.0
4,ARSON,56.504065,246,385,13,16,23.076923


In [10]:
%%bigquery
#CHICAGO CRIME
#Compares the number of crimes in 2019 and 2020 (Jan-Aug) that did and didn't involve domestic violence
Select a.domestic, num_domestic2019, num_domestic2020
from (
SELECT domestic, COUNT(domestic) AS num_domestic2019
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2019-01-01' AND '2019-08-31'
GROUP BY domestic) as A
inner join
(SELECT domestic, COUNT(domestic) AS num_domestic2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2020-01-01' AND '2020-08-31'
GROUP BY domestic) as b
on a.domestic = b.domestic

Unnamed: 0,domestic,num_domestic2019,num_domestic2020
0,False,145517,113554
1,True,29495,26825


In [15]:
%%bigquery
#CHICAGO CRIME
#Shows domestic crimes as a percentage of all crimes for each year (Jan-Aug). Shows that in 2020, the share of domestic violence incidents went up
Select percent_of_domestic_2019, percent_of_domestic_2020
from (
SELECT  (sum(case when domestic then 1 else 0 end))/COUNT(*)*100 as percent_of_domestic_2019
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2019-01-01' AND '2019-08-31'
) as A
cross join
(SELECT (sum(case when domestic then 1 else 0 end))/COUNT(*)*100 as  percent_of_domestic_2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date`
WHERE date BETWEEN '2020-01-01' AND '2020-08-31'
) as b

Unnamed: 0,percent_of_domestic_2019,percent_of_domestic_2020
0,16.85313,19.108984
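
The two domestic-violence outputs above measure different things: one gives absolute counts, the other gives the domestic share of all reported crimes. The sketch below, which uses only the counts printed above, shows how the share can rise while the count falls. The variable and function names are illustrative assumptions, not part of the original notebook.

```python
# Counts copied from the domestic / non-domestic comparison above (Jan-Aug).
counts_2019 = {"domestic": 29495, "non_domestic": 145517}
counts_2020 = {"domestic": 26825, "non_domestic": 113554}


def domestic_share(counts):
    """Domestic incidents as a percentage of all reported crimes."""
    total = counts["domestic"] + counts["non_domestic"]
    return counts["domestic"] / total * 100


share_2019 = domestic_share(counts_2019)
share_2020 = domestic_share(counts_2020)

# Change in the share, in percentage points.
point_change = share_2020 - share_2019

# Change in the absolute number of domestic incidents, in percent.
count_change = (
    (counts_2020["domestic"] - counts_2019["domestic"])
    / counts_2019["domestic"] * 100
)

print(f"share 2019: {share_2019:.2f}%, share 2020: {share_2020:.2f}%")
print(f"change in share: {point_change:.2f} percentage points")
print(f"change in domestic count: {count_change:.2f}%")
```

The share rises because non-domestic crime fell faster than domestic crime, so both numbers are worth reporting side by side.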


### How many crimes happened in each month of 2020 compared with the monthly average of the last 6 years, and how many people got arrested?

In [34]:
%%bigquery
SELECT 
  a.month, a.total_monthly_crimes_2020, b.avg_monthly_crime_cases, 
  # both percentage columns are expressed relative to the 2020 value, not the six-year average
  (a.total_monthly_crimes_2020-b.avg_monthly_crime_cases)/a.total_monthly_crimes_2020*100 as perc_diff_vs_last_6yrs, 
  a.total_arrest_2020, b.avg_total_arrest,
  (a.total_arrest_2020-b.avg_total_arrest)/a.total_arrest_2020*100 as arrested_perc_diff_vs_last_6yrs
FROM
(SELECT 
  EXTRACT(MONTH from date) as month, 
  COUNT(case_number) as total_monthly_crimes_2020,
  COUNT(CASE WHEN arrest = TRUE THEN 1 END) AS total_arrest_2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020`
WHERE year = 2020
GROUP BY month) as a
LEFT JOIN 
(SELECT
  EXTRACT(MONTH from date) as month,
  ROUND(COUNT(case_number)/6) as avg_monthly_crime_cases,
  ROUND(COUNT(CASE WHEN arrest = TRUE THEN 1 END)/6) AS avg_total_arrest
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020` 
WHERE year BETWEEN 2014 AND 2019
GROUP BY  month
ORDER BY  month) as b
USING(month)
ORDER BY month

Unnamed: 0,month,total_monthly_crimes_2020,avg_monthly_crime_cases,perc_diff_vs_last_6yrs,total_arrest_2020,avg_total_arrest,arrested_perc_diff_vs_last_6yrs
0,1,19664,20637.0,-4.948129,4450,4862.0,-9.258427
1,2,17995,18022.0,-0.150042,4074,4560.0,-11.929308
2,3,16520,21341.0,-29.182809,3047,5349.0,-75.549721
3,4,12726,21596.0,-69.699827,1409,5190.0,-268.346345
4,5,17356,23961.0,-38.056004,2468,5639.0,-128.484603
5,6,17406,24047.0,-38.15351,2238,5364.0,-139.678284
6,7,19262,25087.0,-30.240889,2614,5566.0,-112.930375
7,8,19487,25014.0,-28.362498,2800,5437.0,-94.178571
8,9,17293,23163.0,-33.944371,2377,5081.0,-113.756836
9,10,6714,23024.0,-242.925231,867,4950.0,-470.934256
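
The percentage columns above divide by the 2020 value, which is why declines can exceed -100%. A more conventional reading divides by the baseline, here the 2014-2019 monthly average. The sketch below recomputes both columns that way from the numbers in the output above. It is an alternative calculation under that assumption, not the notebook's original query, and the names are hypothetical.

```python
# (month, crimes_2020, avg_crimes_2014_2019, arrests_2020, avg_arrests_2014_2019)
# copied from the query output above.
rows = [
    (1, 19664, 20637, 4450, 4862),
    (2, 17995, 18022, 4074, 4560),
    (3, 16520, 21341, 3047, 5349),
    (4, 12726, 21596, 1409, 5190),
    (5, 17356, 23961, 2468, 5639),
    (6, 17406, 24047, 2238, 5364),
    (7, 19262, 25087, 2614, 5566),
    (8, 19487, 25014, 2800, 5437),
    (9, 17293, 23163, 2377, 5081),
    (10, 6714, 23024, 867, 4950),
]


def change_vs_baseline(value, baseline):
    """Percent change of value relative to the baseline."""
    return (value - baseline) / baseline * 100


crime_change = {}
arrest_change = {}
for month, crimes, avg_crimes, arrests, avg_arrests in rows:
    crime_change[month] = change_vs_baseline(crimes, avg_crimes)
    arrest_change[month] = change_vs_baseline(arrests, avg_arrests)
    print(
        f"month {month:2d}: crimes {crime_change[month]:7.2f}%, "
        f"arrests {arrest_change[month]:7.2f}%"
    )
```

With this denominator a decline can never go below -100%. The October 2020 count is far lower than in any other month, which may reflect a partial month at the time the data was extracted.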


### Did the number of crime cases change in 2020 compared to the average number of crimes in the last six years?


In [9]:
%%bigquery
SELECT 
  a.month, a.total_monthly_crimes_2020, b.avg_monthly_crime_cases, 
  # the percentage is expressed relative to the 2020 value, not the six-year average
  (a.total_monthly_crimes_2020-b.avg_monthly_crime_cases)/a.total_monthly_crimes_2020*100 as perc_diff_vs_last_6yrs
FROM
(SELECT 
  EXTRACT(MONTH from date) as month, 
  COUNT(case_number) as total_monthly_crimes_2020
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020`
WHERE year = 2020
GROUP BY month) as a
LEFT JOIN 
(SELECT
  EXTRACT(MONTH from date) as month,
  ROUND(COUNT(case_number)/6) as avg_monthly_crime_cases
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_2014_2020` 
WHERE year BETWEEN 2014 AND 2019
GROUP BY  month
ORDER BY  month) as b
USING(month)
ORDER BY month

Unnamed: 0,month,total_monthly_crimes_2020,avg_monthly_crime_cases,perc_diff_vs_last_6yrs
0,1,19664,20637.0,-4.948129
1,2,17995,18022.0,-0.150042
2,3,16520,21341.0,-29.182809
3,4,12726,21596.0,-69.699827
4,5,17356,23961.0,-38.056004
5,6,17406,24047.0,-38.15351
6,7,19262,25087.0,-30.240889
7,8,19487,25014.0,-28.362498
8,9,17293,23163.0,-33.944371
9,10,6714,23024.0,-242.925231


### Additional Points of interest and exploration
Across 2019 and 2020, two of the five days with the most reported crimes were 5/30 and 5/31 of 2020. This is notable because in the week of 5/29 - 6/04 there were 413 complaints against the Chicago Police Department directly related to the Black Lives Matter protests, and because Public Peace Violation (looting) was the second most frequent crime type recorded on 5/31. In the week of 5/29 - 6/04, Public Peace Violations accounted for 445 reported crimes. Taken together, this suggests that the data can carry bias depending on who enters and classifies it. The tables in this section show, in order, the five dates with the most crimes, the most frequent crime types on 5/30 and 5/31, and the daily count of Public Peace Violations for 5/29 - 6/04.

##### What were the top 5 dates that had the highest crimes in 2019 - 2020?

In [30]:
%%bigquery
#CHICAGOCRIME
SELECT date, COUNT(*) AS number_of_crimes, 
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date` 
GROUP BY date
ORDER BY number_of_crimes DESC LIMIT 5

Unnamed: 0,date,number_of_crimes
0,2020-05-31,1893
1,2019-01-01,1014
2,2020-08-10,926
3,2020-05-30,917
4,2019-06-01,904


In [7]:
%%bigquery
#CHICAGOCRIME
SELECT date, COUNT(*) AS number_of_crimes, primary_type, description
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date` 
WHERE date = '2020-05-30' OR date = '2020-05-31'
GROUP BY date, primary_type, description
ORDER BY number_of_crimes DESC limit 5

Unnamed: 0,date,number_of_crimes,primary_type,description
0,2020-05-31,539,BURGLARY,FORCIBLE ENTRY
1,2020-05-31,287,PUBLIC PEACE VIOLATION,LOOTING
2,2020-05-31,247,CRIMINAL DAMAGE,TO PROPERTY
3,2020-05-30,197,BURGLARY,FORCIBLE ENTRY
4,2020-05-30,102,CRIMINAL DAMAGE,TO PROPERTY


In [6]:
%%bigquery
#CHICAGOCRIME
SELECT date, COUNT(*) AS number_of_crimes, primary_type
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_crime_date` 
WHERE date BETWEEN '2020-05-29' AND '2020-06-04' AND primary_type = 'PUBLIC PEACE VIOLATION'
GROUP BY date, primary_type
ORDER BY date, number_of_crimes DESC


Unnamed: 0,date,number_of_crimes,primary_type
0,2020-05-29,7,PUBLIC PEACE VIOLATION
1,2020-05-30,42,PUBLIC PEACE VIOLATION
2,2020-05-31,308,PUBLIC PEACE VIOLATION
3,2020-06-01,56,PUBLIC PEACE VIOLATION
4,2020-06-02,17,PUBLIC PEACE VIOLATION
5,2020-06-03,10,PUBLIC PEACE VIOLATION
6,2020-06-04,5,PUBLIC PEACE VIOLATION
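
The weekly total of Public Peace Violations quoted in the text can be checked directly against the daily counts above. A minimal sketch with illustrative variable names:

```python
# Daily PUBLIC PEACE VIOLATION counts copied from the output above.
daily_counts = {
    "2020-05-29": 7,
    "2020-05-30": 42,
    "2020-05-31": 308,
    "2020-06-01": 56,
    "2020-06-02": 17,
    "2020-06-03": 10,
    "2020-06-04": 5,
}

weekly_total = sum(daily_counts.values())

# Day with the most violations and its share of the week's total.
peak_date = max(daily_counts, key=daily_counts.get)
peak_share = daily_counts[peak_date] / weekly_total * 100

print(f"weekly total: {weekly_total}")
print(f"peak day: {peak_date} ({peak_share:.1f}% of the week)")
```

A single day accounts for most of the week's violations, so the weekly figure is driven largely by 5/31.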


### How does Covid-19 affect different marginalized groups?

In [18]:
%%bigquery 
#COVID19 
SELECT sum(Cases___Total) as total_cases,
(sum(Cases___Latinx)/ sum(Cases___Total))*100 as percent_cases_Latinx, 
sum(Cases___Latinx) as cases_Latinx , 
(sum(Cases___Black_Non_Latinx)/sum(Cases___Total))*100 as percent_cases_Black, 
sum(Cases___Black_Non_Latinx) as cases_Black, 
(sum(Cases___Asian_Non_Latinx)/sum(Cases___Total))*100 as percent_cases_Asian, 
sum(Cases___Asian_Non_Latinx) as cases_Asian, 
(sum(Cases___White_Non_Latinx)/sum(Cases___Total))*100 as percent_cases_White, 
sum(Cases___White_Non_Latinx) as cases_White
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_city_covis19` 

Unnamed: 0,total_cases,percent_cases_Latinx,cases_Latinx,percent_cases_Black,cases_Black,percent_cases_Asian,cases_Asian,percent_cases_White,cases_White
0,88412,38.909876,34401,21.216577,18758,2.086821,1845,15.212867,13450
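
The four groups in the output above do not add up to the total number of cases. The sketch below derives the remainder from the printed counts. Which categories make up that remainder (for example other groups or records with unknown race) is not shown in the output, so it is left unlabeled here. The variable names are illustrative.

```python
# Values copied from the query output above.
total_cases = 88412
cases_by_group = {
    "Latinx": 34401,
    "Black": 18758,
    "Asian": 1845,
    "White": 13450,
}

listed_cases = sum(cases_by_group.values())

# Cases not attributed to any of the four listed groups.
remaining_cases = total_cases - listed_cases
remaining_share = remaining_cases / total_cases * 100

for group, cases in cases_by_group.items():
    print(f"{group}: {cases / total_cases * 100:.2f}%")
print(f"not in the listed groups: {remaining_cases} ({remaining_share:.2f}%)")
```

Because more than a fifth of cases fall outside the listed groups, the group percentages are best read as lower bounds on each group's share.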


### How does Covid-19 affect different age groups?

In [17]:
%%bigquery
#COVID19
SELECT sum(Cases___Total) as total_cases, 
(sum(Cases___Age_0_17)/ sum(Cases___Total))*100 as percent_0_17, 
sum(Cases___Age_0_17) as cases_0_17 , 
(sum( Cases___Age_18_29 )/sum(Cases___Total))*100 as percent_18_29, 
sum( Cases___Age_18_29 ) as cases_18_29, 
(sum( Cases___Age_30_39 )/sum(Cases___Total))*100 as percent_30_39, 
sum( Cases___Age_40_49 ) as cases_40_49, 
(sum( Cases___Age_50_59 )/sum(Cases___Total))*100 as percent_50_59, 
sum( Cases___Age_50_59 ) as cases_50_59, 
(sum( Cases___Age_60_69 )/sum(Cases___Total))*100 as percent_60_69, 
sum( Cases___Age_60_69 ) as cases_60_69, 
(sum( Cases___Age_70_79 )/sum(Cases___Total))*100 as percent_70_79, 
sum( Cases___Age_70_79 ) as cases_70_79
FROM `ba775-team-8b.data_covid_19_chicago_city.chicago_city_covis19`

Unnamed: 0,total_cases,percent_0_17,cases_0_17,percent_18_29,cases_18_29,percent_30_39,cases_40_49,percent_50_59,cases_50_59,percent_60_69,cases_60_69,percent_70_79,cases_70_79
0,88412,7.501244,6632,22.781975,20142,18.811926,14783,14.916527,13188,10.141157,8966,5.312627,4697
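
The output above pairs `percent_30_39` with `cases_40_49`, so the case count for ages 30-39 and the percentage for ages 40-49 are not shown directly. As a rough sketch, both can be recovered from the printed values, since each percentage is the case count divided by `total_cases`. The recovered count is rounded and should be treated as approximate. The variable names are illustrative.

```python
# Values copied from the query output above.
total_cases = 88412
percent_30_39 = 18.811926
cases_40_49 = 14783

# Recover the two figures that the query does not select.
cases_30_39 = round(percent_30_39 / 100 * total_cases)
percent_40_49 = cases_40_49 / total_cases * 100

# Counts that the output does list for the other age bands.
listed_cases = {
    "0_17": 6632,
    "18_29": 20142,
    "50_59": 13188,
    "60_69": 8966,
    "70_79": 4697,
}

# Cases outside the 0-79 bands (not selected by the query).
unaccounted = total_cases - (sum(listed_cases.values()) + cases_30_39 + cases_40_49)

print(f"cases_30_39 (recovered): {cases_30_39}")
print(f"percent_40_49 (recovered): {percent_40_49:.2f}%")
print(f"cases outside the listed age bands: {unaccounted}")
```

The remainder presumably covers ages 80 and over plus any records without an age, but the query output does not confirm that split.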
