# Exploratory Data Analysis on the SARS outbreak of 2003

## - About the SARS outbreak of 2003

Severe acute respiratory syndrome (SARS) is a viral respiratory disease of zoonotic origin caused by the SARS coronavirus (SARS-CoV). Between November 2002 and July 2003, an outbreak of SARS in southern China caused an eventual 8,098 cases, resulting in 774 deaths reported in 17 countries (9.6% fatality rate), with the majority of cases in mainland China and Hong Kong. No cases of SARS have been reported worldwide since 2004. In late 2017, Chinese scientists traced the virus through the intermediary of civets to cave-dwelling horseshoe bats in Yunnan province.

## - About this notebook

This notebook performs EDA on the dataset on the SARS outbreak of 2003 provided on Kaggle. It covers data analysis and data visualization with graphs and plots.

In [None]:
import os
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import sqlite3

## Adding the data to SQL DB

To add a CSV file to a SQLite database, we need to:
1. Read the CSV file using <code>pd.read_csv()</code>
2. Create a connection to the SQLite DB using <code>sqlite3.connect()</code>
3. Write the data read by Pandas to the DB as a SQL table using <code>.to_sql()</code>

In [None]:
# Reading CSV files

data_1 = pd.read_csv("../input/sars-outbreak-2003-complete-dataset/sars_2003_complete_dataset_clean.csv")
data_2 = pd.read_csv("../input/sars-outbreak-2003-complete-dataset/summary_data_clean.csv")

# Creating a SQL connection

conn = sqlite3.connect("sars.db")

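# Inserting data as SQL tables
# if_exists="replace" lets this cell be re-run without failing on an existing table

data_1.to_sql("sars", conn, if_exists="replace")
data_2.to_sql("sars_summ", conn, if_exists="replace")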
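
To check that a table really landed in the database, the same load step can be tried on a throwaway in-memory database. This is only a minimal sketch: the `toy` DataFrame, its values, and the `sars_demo` table name are made up for illustration and are not part of the Kaggle dataset.

```python
import sqlite3
import pandas as pd

# Toy stand-in for the Kaggle CSV; the values are made up for illustration.
toy = pd.DataFrame({
    "Country": ["A", "B"],
    "Cumulative number of case(s)": [10, 3],
})

# In-memory database, so nothing is written to disk
conn_demo = sqlite3.connect(":memory:")

# if_exists="replace" drops and recreates the table, so a second run does not raise;
# index=False skips writing the DataFrame index as an extra column
toy.to_sql("sars_demo", conn_demo, if_exists="replace", index=False)
toy.to_sql("sars_demo", conn_demo, if_exists="replace", index=False)

# sqlite_master lists the tables SQLite currently knows about
tables = pd.read_sql("select name from sqlite_master where type = 'table';", conn_demo)
row_count = pd.read_sql("select count(*) as n from sars_demo;", conn_demo)["n"].iloc[0]
print(tables["name"].tolist(), row_count)

conn_demo.close()
```

With the default `if_exists="fail"`, the second `to_sql()` call would raise a `ValueError` because the table already exists.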

## DATA ANALYSIS

Queries covered in this notebook:

1. Cases in each country
2. Plotting graph of cases in each country
3. Cases graph w.r.t time
4. Cases graph w.r.t time and country
5. Plotting chart for Death Reported in Countries
6. Deaths graph w.r.t time
7. Deaths graph w.r.t time and country
8. Probable first reported case
9. Probable last reported case
10. Time taken by countries to eradicate SARS from their territory
11. Recoveries graph w.r.t time
12. Recoveries graph w.r.t time and country

In [None]:
# Cases in each country

cases_con = pd.read_sql('select "Country/Region", "Cumulative total cases" as "Total Cases" from sars_summ order by "Total Cases" desc;', conn)
cases_con

In [None]:
# Plotting the graph for cases in each country

fig_1 = px.pie(cases_con, values="Total Cases", names="Country/Region", title="Cases in every impacted country")
fig_1.update_traces(textposition="inside", textinfo="percent+label")

In [None]:
# Cases graph w.r.t Date

cases_graph = pd.read_sql('select "Date", sum("Cumulative number of case(s)") as "Total Reported Cases" from sars group by "Date" order by "Date";', conn)
px.line(cases_graph, x="Date", y="Total Reported Cases")

In [None]:
# Cases graph w.r.t Date and Country

cases_conda = pd.read_sql('select Country, Date, sum("Cumulative number of case(s)") as "Total Reported Cases" from sars group by Country, Date order by "Date";', conn)
fig_2 = px.line(cases_conda, x="Date", y="Total Reported Cases", color="Country", line_group="Country", hover_name="Country", title="Cases graph w.r.t date and country")
fig_2.show()

In [None]:
# Plotting chart for deaths reported in countries

death_con = pd.read_sql('select "Country/Region", "No. of deaths" as "Reported Deaths" from sars_summ order by "Reported Deaths" desc;', conn)
fig_3 = px.pie(death_con, values="Reported Deaths", names="Country/Region", title="Deaths reported in every impacted country (Pie Chart)")
fig_3.update_traces(textposition="inside", textinfo="percent+label")
fig_3.show()

In [None]:
px.bar(death_con, x="Country/Region", y="Reported Deaths", title="Deaths reported in every impacted country (Bar Chart)", color="Reported Deaths")

In [None]:
# Deaths graph w.r.t date

death_graph = pd.read_sql('select Date, sum("Number of deaths") as "Deaths Reported" from sars group by Date order by Date;', conn)
px.line(death_graph, x="Date", y="Deaths Reported", title="Deaths graph w.r.t date")

In [None]:
# Deaths graph w.r.t date and country

death_conda = pd.read_sql('select Country, Date, sum("Number of deaths") as "Reported Deaths" from sars group by Country, Date order by Date;', conn)
px.line(death_conda, x="Date", y="Reported Deaths", line_group="Country", hover_name="Country", color="Country", title="Deaths reported w.r.t date and country")

In [None]:
# Probable first reported cases

st_cases = pd.read_sql('select "Country/Region" as "Country", "Date onset first probable case" as "Probable first case" from sars_summ;', conn)
px.scatter(st_cases, x="Probable first case", y="Country", title="First reported case around the world", color="Country")

In [None]:
# Probable last reported case

en_cases = pd.read_sql('select "Country/Region" as "Country", "Date onset last probable case" as "Probable last case" from sars_summ;', conn)
px.scatter(en_cases, x="Probable last case", y="Country", title="Last reported case around the world", color="Country")

In [None]:
# Time taken by countries to eradicate SARS from their territory
# (days between the onset of the first and the last probable case)

time_err = pd.read_sql('select "Country/Region", cast((JulianDay("Date onset last probable case")-JulianDay("Date onset first probable case")) as Integer) as "Days taken to eradicate SARS" from sars_summ order by "Days taken to eradicate SARS";', conn)
px.bar(time_err, x="Country/Region", y="Days taken to eradicate SARS", color="Days taken to eradicate SARS", title="Days taken by every country to eradicate SARS from their territory")
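
The day count above relies on SQLite's `julianday()` function, which can only parse date text in ISO-8601 form (for example `YYYY-MM-DD`) and returns NULL otherwise. Here is a minimal sketch of that behaviour, assuming a hypothetical `demo` table with made-up dates. The format of the real date columns should be checked before trusting the result.

```python
import sqlite3

conn_demo = sqlite3.connect(":memory:")
conn_demo.execute('create table demo ("Country/Region" text, first_case text, last_case text);')
conn_demo.executemany(
    "insert into demo values (?, ?, ?);",
    [
        ("A", "2003-02-15", "2003-05-31"),  # ISO-8601 text: julianday() can parse it
        ("B", "15-Feb-03", "31-May-03"),    # non-ISO text: julianday() returns NULL
    ],
)

# Same pattern as the query above: difference of Julian day numbers, cast to whole days
rows = conn_demo.execute(
    'select "Country/Region", '
    "cast(julianday(last_case) - julianday(first_case) as integer) "
    "from demo;"
).fetchall()
print(rows)  # [('A', 105), ('B', None)]

conn_demo.close()
```

If the bar chart shows missing values for some countries, a non-ISO or empty date string in one of the two columns is a likely cause.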

In [None]:
# Recovery graph with respect to time

rec_graph = pd.read_sql('select Date, sum("Number recovered") as "Recoveries" from sars group by Date order by Date;', conn)
px.line(rec_graph, x="Date", y="Recoveries", title="Recoveries graph w.r.t date")

In [None]:
# Number of recoveries w.r.t date and country

rec_conda = pd.read_sql('select Country, Date, sum("Number recovered") as "Recoveries" from sars group by Country, Date order by Date;', conn)
px.line(rec_conda, x="Date", y="Recoveries", line_group="Country", hover_name="Country", color="Country", title="Recoveries w.r.t date and country")

In [None]:
# Ending the connection
conn.close()