# Exploratory Data Analysis on COVID-19 in Brazil

This notebook is an exploratory data analysis (EDA) of COVID-19 in Brazil, examining the pandemic's impact on the country.

<center><img alt="covid-19-exploratory-data-analysis-pedro-henrique-figueiredo-magalhaes" width="50%" src="https://static.poder360.com.br/2022/12/coronavirus-imagem-1-848x477.jpg"></center>

The dataset is a collection of COVID-19 data maintained by [Our World in Data](https://ourworldindata.org/coronavirus) (OWID). For more information on the dataset, see the [OWID repository](https://github.com/owid/covid-19-data/tree/master/public/data).

## Introduction

COVID-19, caused by the SARS-CoV-2 virus, has had profound effects globally since its emergence in late 2019. The virus spreads primarily through respiratory droplets and produces a wide range of clinical outcomes, from mild symptoms to severe illness and, in some cases, death. Over the past years, extensive research and data collection have improved our understanding of the disease, though challenges remain because the virus continues to evolve.

### Context

According to sources such as the World Health Organization (WHO), Johns Hopkins University, and the Brazilian Ministry of Health, approximately 80% of COVID-19 infections are asymptomatic or mild. Around 15% of infected individuals develop severe symptoms requiring oxygen support, and about 5% become critically ill, requiring mechanical ventilation and intensive care.

Brazil has been one of the countries most affected by the COVID-19 pandemic, experiencing multiple waves of infection of varying intensity. To improve situational awareness and help guide public health responses, this notebook performs an exploratory data analysis (EDA) of publicly available COVID-19 data for Brazil. The analysis focuses on the spread, severity, and trends of the disease within the country, with the aim of providing insights that could support policy-making and healthcare management.

## Conclusions and Insights

### Key Findings

1. KF 1
2. KF 2
3. KF 3

### Implications and Recommendations

1. asd
2. asd
3. asd

--- 

## Structure of the Notebook

1. [**Data Collection and Import**](#data-collection-and-import)
    - Importing necessary libraries
    - Loading the dataset

2. [**Data Preprocessing**](#data-preprocessing)
    - Handling missing values
    - Data cleaning and formatting

3. [**Exploratory Data Analysis**](#exploratory-data-analysis)
    - Summary statistics
    - Visualization of data
    - Analysis of trends and patterns

4. [**Conclusions and Insights**](#conclusions-and-insights)
    - Key findings from the analysis
    - Implications and recommendations

---

## Credits

The raw data was collected, aggregated, and documented by Edouard Mathieu, Hannah Ritchie, Lucas Rodés-Guirao, Cameron Appel, Daniel Gavrilov, Charlie Giattino, Joe Hasell, Bobbie Macdonald, Saloni Dattani, Diana Beltekian, Esteban Ortiz-Ospina, and Max Roser.

Mathieu, E., Ritchie, H., Ortiz-Ospina, E. et al. A global database of COVID-19 vaccinations. Nat Hum Behav (2021). https://doi.org/10.1038/s41562-021-01122-8

Hasell, J., Mathieu, E., Beltekian, D. et al. A cross-country database of COVID-19 testing. Sci Data 7, 345 (2020). https://doi.org/10.1038/s41597-020-00688-8

## Data Collection and Import

Importing necessary libraries and loading the dataset.


In [9]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [13]:
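# URL of the full OWID COVID-19 dataset (CSV, all countries)
data = "https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv"

# low_memory=False avoids mixed-type inference caused by chunked parsing
df_raw = pd.read_csv(data, low_memory=False)

# Display the first and last rows of the raw frame
df_raw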

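| | iso_code | continent | location | date | total_cases | new_cases | new_cases_smoothed | total_deaths | new_deaths | new_deaths_smoothed | ... | male_smokers | handwashing_facilities | hospital_beds_per_thousand | life_expectancy | human_development_index | population | excess_mortality_cumulative_absolute | excess_mortality_cumulative | excess_mortality | excess_mortality_cumulative_per_million |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Asia | Afghanistan | 2020-01-05 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | ... | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772.0 | NaN | NaN | NaN | NaN |
| 1 | AFG | Asia | Afghanistan | 2020-01-06 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | ... | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772.0 | NaN | NaN | NaN | NaN |
| 2 | AFG | Asia | Afghanistan | 2020-01-07 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | ... | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772.0 | NaN | NaN | NaN | NaN |
| 3 | AFG | Asia | Afghanistan | 2020-01-08 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | ... | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772.0 | NaN | NaN | NaN | NaN |
| 4 | AFG | Asia | Afghanistan | 2020-01-09 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | ... | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772.0 | NaN | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 409772 | ZWE | Africa | Zimbabwe | 2024-06-12 | 266365.0 | 0.0 | 0.0 | 5740.0 | 0.0 | 0.0 | ... | 30.7 | 36.791 | 1.7 | 61.49 | 0.571 | 16320539.0 | NaN | NaN | NaN | NaN |
| 409773 | ZWE | Africa | Zimbabwe | 2024-06-13 | 266365.0 | 0.0 | 0.0 | 5740.0 | 0.0 | 0.0 | ... | 30.7 | 36.791 | 1.7 | 61.49 | 0.571 | 16320539.0 | NaN | NaN | NaN | NaN |
| 409774 | ZWE | Africa | Zimbabwe | 2024-06-14 | 266365.0 | 0.0 | 0.0 | 5740.0 | 0.0 | 0.0 | ... | 30.7 | 36.791 | 1.7 | 61.49 | 0.571 | 16320539.0 | NaN | NaN | NaN | NaN |
| 409775 | ZWE | Africa | Zimbabwe | 2024-06-15 | 266365.0 | 0.0 | 0.0 | 5740.0 | 0.0 | 0.0 | ... | 30.7 | 36.791 | 1.7 | 61.49 | 0.571 | 16320539.0 | NaN | NaN | NaN | NaN |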


## Data Preprocessing

### Handling Missing Values
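
The raw output above shows many empty cells, so a first step could be to measure how much of each column is missing. The following is a minimal sketch, not the notebook's final method: the helper names `missing_share` and `drop_empty_columns` are illustrative, and the small frame is synthetic data used only to make the example runnable.

```python
import numpy as np
import pandas as pd


def missing_share(df: pd.DataFrame) -> pd.Series:
    """Return the fraction of missing values per column, largest first."""
    return df.isna().mean().sort_values(ascending=False)


def drop_empty_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns that contain no data at all."""
    return df.dropna(axis="columns", how="all")


# Synthetic frame (not real data) that mimics the OWID column names.
example = pd.DataFrame(
    {
        "date": ["2020-03-01", "2020-03-02", "2020-03-03", "2020-03-04"],
        "new_cases": [1.0, np.nan, 3.0, 4.0],
        "male_smokers": [np.nan, np.nan, np.nan, np.nan],
    }
)

shares = missing_share(example)        # fraction missing per column
cleaned = drop_empty_columns(example)  # "male_smokers" is removed
```

On the real data, the same helpers could be applied to the Brazil rows only, for example `missing_share(df_raw[df_raw["iso_code"] == "BRA"])`, assuming Brazil uses the ISO code `BRA` like the three-letter codes shown in the output above.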

### Data Cleaning and Formatting
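
Because `df_raw` covers every country and stores `date` as text, the analysis needs a Brazil-only frame with parsed dates. The sketch below shows one way this could be done; `prepare_country` is a hypothetical helper, the default code `"BRA"` is an assumption based on the ISO codes visible in the raw output, and the example rows are synthetic.

```python
import pandas as pd


def prepare_country(df: pd.DataFrame, iso_code: str = "BRA") -> pd.DataFrame:
    """Filter one country, parse dates, and index the rows chronologically."""
    out = df.loc[df["iso_code"] == iso_code].copy()
    out["date"] = pd.to_datetime(out["date"])
    # Keep one row per day and make sure the rows are in date order.
    out = out.drop_duplicates(subset="date").sort_values("date")
    return out.set_index("date")


# Synthetic rows (not real data), deliberately out of order.
example = pd.DataFrame(
    {
        "iso_code": ["BRA", "AFG", "BRA"],
        "date": ["2020-03-02", "2020-03-01", "2020-03-01"],
        "new_cases": [5.0, 0.0, 2.0],
    }
)

brazil = prepare_country(example)
```

Indexing by date is a design choice that makes later time-based operations (rolling windows, resampling) straightforward; on the real data the call would be `prepare_country(df_raw)`.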

## Exploratory Data Analysis

### Summary Statistics
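
As a starting point for this section, descriptive statistics and a crude case fatality ratio could be computed as sketched below. The helper names are illustrative, the input is assumed to hold one country's rows in chronological order with the OWID columns `total_cases` and `total_deaths`, and the numbers in the example are synthetic.

```python
import pandas as pd


def summarize(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Descriptive statistics (count, mean, quartiles, ...) for the given columns."""
    return df[columns].describe()


def case_fatality_ratio(df: pd.DataFrame) -> float:
    """Cumulative deaths divided by cumulative cases at the last complete row.

    Assumes the rows are already sorted by date.
    """
    latest = df.dropna(subset=["total_cases", "total_deaths"]).iloc[-1]
    return float(latest["total_deaths"] / latest["total_cases"])


# Synthetic cumulative counts (not real data).
example = pd.DataFrame(
    {
        "new_cases": [100.0, 100.0, 200.0],
        "total_cases": [100.0, 200.0, 400.0],
        "total_deaths": [1.0, 4.0, 10.0],
    }
)

stats = summarize(example, ["new_cases"])
cfr = case_fatality_ratio(example)  # 10 deaths / 400 cases
```

The ratio is crude because it depends on how many infections were actually detected, so it is better read as a descriptive figure than as an estimate of the true fatality risk.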

### Visualization of Data

#### Time Series Analysis
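
Daily counts are noisy, so a time series view usually benefits from smoothing. The sketch below computes a 7-day rolling mean and plots it next to the raw series. It assumes a frame indexed by date with a `new_cases` column; `add_rolling_mean` and `plot_series` are hypothetical helpers and the example values are synthetic. The dataset also ships its own `new_cases_smoothed` column, which could be used instead.

```python
import pandas as pd


def add_rolling_mean(df: pd.DataFrame, column: str = "new_cases", window: int = 7) -> pd.DataFrame:
    """Return a copy of df with a trailing rolling mean of `column` added."""
    out = df.copy()
    # min_periods=window leaves the first window-1 rows as NaN instead of
    # averaging over an incomplete window.
    out[f"{column}_{window}d"] = out[column].rolling(window, min_periods=window).mean()
    return out


def plot_series(df: pd.DataFrame, column: str = "new_cases", window: int = 7):
    """Plot the raw and smoothed series; requires matplotlib to be installed."""
    smoothed = add_rolling_mean(df, column, window)
    ax = smoothed[[column, f"{column}_{window}d"]].plot(figsize=(12, 4))
    ax.set_xlabel("Date")
    ax.set_ylabel(column)
    return ax


# Synthetic daily counts (not real data).
example = pd.DataFrame(
    {"new_cases": [0.0, 7.0, 14.0, 21.0, 28.0, 35.0, 42.0, 49.0]},
    index=pd.date_range("2020-03-01", periods=8, freq="D"),
)

smoothed = add_rolling_mean(example)
```

Calling `plot_series(example)` in a notebook cell would draw both lines on one axis.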

#### Distribution of Cases
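
One way to look at how cases are distributed over time is to aggregate daily counts per calendar month. This is a sketch under assumptions: the frame is indexed by date, `new_cases` holds daily counts, and `monthly_totals` and `plot_monthly_totals` are illustrative names. The example data is synthetic.

```python
import pandas as pd


def monthly_totals(df: pd.DataFrame, column: str = "new_cases") -> pd.Series:
    """Sum a daily column per calendar month; expects a DatetimeIndex."""
    # "MS" labels each bucket with the first day of its month.
    return df[column].resample("MS").sum()


def plot_monthly_totals(df: pd.DataFrame, column: str = "new_cases"):
    """Bar chart of the monthly totals; requires matplotlib to be installed."""
    ax = monthly_totals(df, column).plot.bar(figsize=(12, 4))
    ax.set_xlabel("Month")
    ax.set_ylabel(f"Monthly sum of {column}")
    return ax


# Synthetic daily counts spanning a month boundary (not real data).
example = pd.DataFrame(
    {"new_cases": [1.0, 2.0, 3.0, 4.0]},
    index=pd.date_range("2020-03-30", periods=4, freq="D"),
)

totals = monthly_totals(example)  # March: 1 + 2, April: 3 + 4
```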

### Analysis of Trends and Patterns
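
To describe waves of infection, two simple indicators could be used: the week-over-week change in cases and the date of the highest daily count in each year. The sketch below is one possible approach, not an established wave-detection method; it assumes a date-indexed frame with a `new_cases` column, the helper names are hypothetical, and the example values are synthetic.

```python
import pandas as pd


def weekly_change(df: pd.DataFrame, column: str = "new_cases") -> pd.Series:
    """Relative change of the trailing 7-day total versus the 7 days before it."""
    weekly = df[column].rolling(7, min_periods=7).sum()
    return weekly / weekly.shift(7) - 1


def yearly_peaks(df: pd.DataFrame, column: str = "new_cases") -> pd.DataFrame:
    """Date and value of the highest daily count in each calendar year."""
    series = df[column].dropna()
    peak_dates = series.groupby(series.index.year).idxmax()
    return pd.DataFrame(
        {"peak_date": peak_dates, "peak_value": series.loc[peak_dates].to_numpy()}
    )


# Synthetic daily counts crossing a year boundary (not real data).
example = pd.DataFrame(
    {"new_cases": [1.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0,
                   2.0, 2.0, 4.0, 2.0, 2.0, 2.0, 2.0]},
    index=pd.date_range("2020-12-25", periods=14, freq="D"),
)

change = weekly_change(example)  # last value compares 16 cases with 8 cases
peaks = yearly_peaks(example)
```

A value of `1.0` in `change` means the 7-day total doubled relative to the previous 7 days; values below zero indicate a decline.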
