# COVID-19 data analysis: case counts

This notebook analyzes COVID-19 cases reported in bulletins from State Health Departments (Secretarias Estaduais de Saúde), with values per municipality and per state.

The first step is to import the Pandas and NumPy libraries.

In [1]:
import pandas as pd
import numpy as np

The second step is to read the data. This analysis uses the "caso_full" dataset, available at <a href="https://brasil.io">Brasil.IO</a>.

In [2]:
dados = pd.read_csv("caso_full.csv")
dados

Unnamed: 0,city,city_ibge_code,date,epidemiological_week,estimated_population,estimated_population_2019,is_last,is_repeated,last_available_confirmed,last_available_confirmed_per_100k_inhabitants,last_available_date,last_available_death_rate,last_available_deaths,order_for_place,place_type,state,new_confirmed,new_deaths
0,São Paulo,3550308.0,2020-02-25,9,12325232.0,12252023.0,False,False,1,0.00811,2020-02-25,0.0000,0,1,city,SP,1,0
1,,35.0,2020-02-25,9,46289333.0,45919049.0,False,False,1,0.00216,2020-02-25,0.0000,0,1,state,SP,1,0
2,São Paulo,3550308.0,2020-02-26,9,12325232.0,12252023.0,False,False,1,0.00811,2020-02-26,0.0000,0,2,city,SP,0,0
3,,35.0,2020-02-26,9,46289333.0,45919049.0,False,False,1,0.00216,2020-02-26,0.0000,0,2,state,SP,0,0
4,São Paulo,3550308.0,2020-02-27,9,12325232.0,12252023.0,False,False,1,0.00811,2020-02-27,0.0000,0,3,city,SP,0,0
5,,35.0,2020-02-27,9,46289333.0,45919049.0,False,False,1,0.00216,2020-02-27,0.0000,0,3,state,SP,0,0
6,São Paulo,3550308.0,2020-02-28,9,12325232.0,12252023.0,False,False,2,0.01623,2020-02-28,0.0000,0,4,city,SP,1,0
7,,35.0,2020-02-28,9,46289333.0,45919049.0,False,False,2,0.00432,2020-02-28,0.0000,0,4,state,SP,1,0
8,São Paulo,3550308.0,2020-02-29,9,12325232.0,12252023.0,False,False,2,0.01623,2020-02-29,0.0000,0,5,city,SP,0,0
9,,35.0,2020-02-29,9,46289333.0,45919049.0,False,False,2,0.00432,2020-02-29,0.0000,0,5,state,SP,0,0
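
Since the "date" and "last_available_date" columns hold ISO-formatted dates, it can help to parse them at load time. The sketch below illustrates this on a two-row in-memory sample that imitates a few columns of `caso_full.csv`. The sample text and the names `amostra_csv` and `amostra` are assumptions made for this example, and it further assumes that the first, unnamed CSV column is a saved row index.

```python
import io
import pandas as pd

# Hypothetical two-row sample imitating a few columns of caso_full.csv.
# The leading empty header field stands in for the saved row index.
amostra_csv = """,city,date,place_type,state,new_confirmed
0,São Paulo,2020-02-25,city,SP,1
1,,2020-02-25,state,SP,1
"""

amostra = pd.read_csv(
    io.StringIO(amostra_csv),
    index_col=0,            # reuse the first column as the index
    parse_dates=["date"],   # convert ISO date strings to datetime64
)

print(amostra.dtypes)
print(amostra["city"].isna().sum())  # number of rows with no city name
```

With the real file, the same two arguments could be passed to `pd.read_csv("caso_full.csv", ...)`, provided the assumption about the first column holds.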


Some rows have an empty "city" column. These records refer to the state as a whole, not to a municipality. To make them easier to identify, we label them with the word "ESTADO" (Portuguese for "state").

In [3]:
# Fill missing city names (state-level rows) with the label "ESTADO"
dados["city"] = dados["city"].fillna("ESTADO")
dados

Unnamed: 0,city,city_ibge_code,date,epidemiological_week,estimated_population,estimated_population_2019,is_last,is_repeated,last_available_confirmed,last_available_confirmed_per_100k_inhabitants,last_available_date,last_available_death_rate,last_available_deaths,order_for_place,place_type,state,new_confirmed,new_deaths
0,São Paulo,3550308.0,2020-02-25,9,12325232.0,12252023.0,False,False,1,0.00811,2020-02-25,0.0000,0,1,city,SP,1,0
1,ESTADO,35.0,2020-02-25,9,46289333.0,45919049.0,False,False,1,0.00216,2020-02-25,0.0000,0,1,state,SP,1,0
2,São Paulo,3550308.0,2020-02-26,9,12325232.0,12252023.0,False,False,1,0.00811,2020-02-26,0.0000,0,2,city,SP,0,0
3,ESTADO,35.0,2020-02-26,9,46289333.0,45919049.0,False,False,1,0.00216,2020-02-26,0.0000,0,2,state,SP,0,0
4,São Paulo,3550308.0,2020-02-27,9,12325232.0,12252023.0,False,False,1,0.00811,2020-02-27,0.0000,0,3,city,SP,0,0
5,ESTADO,35.0,2020-02-27,9,46289333.0,45919049.0,False,False,1,0.00216,2020-02-27,0.0000,0,3,state,SP,0,0
6,São Paulo,3550308.0,2020-02-28,9,12325232.0,12252023.0,False,False,2,0.01623,2020-02-28,0.0000,0,4,city,SP,1,0
7,ESTADO,35.0,2020-02-28,9,46289333.0,45919049.0,False,False,2,0.00432,2020-02-28,0.0000,0,4,state,SP,1,0
8,São Paulo,3550308.0,2020-02-29,9,12325232.0,12252023.0,False,False,2,0.01623,2020-02-29,0.0000,0,5,city,SP,0,0
9,ESTADO,35.0,2020-02-29,9,46289333.0,45919049.0,False,False,2,0.00432,2020-02-29,0.0000,0,5,state,SP,0,0
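
An alternative way to apply the same label is to rely on the "place_type" column rather than on missing values, since in the output above "place_type" equals "state" on exactly the rows where "city" was empty. A minimal sketch on a hypothetical four-row sample (the frame `amostra`, the mask `e_estado`, and the values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample with the same city/place_type pattern as the dataset.
amostra = pd.DataFrame({
    "city": ["São Paulo", np.nan, "São Paulo", np.nan],
    "place_type": ["city", "state", "city", "state"],
    "new_confirmed": [1, 1, 0, 0],
})

# Label state-level rows explicitly, using place_type as the criterion.
e_estado = amostra["place_type"] == "state"
amostra.loc[e_estado, "city"] = "ESTADO"

print(amostra["city"].tolist())
```

This variant would leave a municipality row with a genuinely missing name untouched, which may or may not be desirable depending on the data.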


The next step is to remove columns that are not needed for the analysis, in this case "city_ibge_code".

In [4]:
dados = dados.drop(columns=['city_ibge_code'])
dados

Unnamed: 0,city,date,epidemiological_week,estimated_population,estimated_population_2019,is_last,is_repeated,last_available_confirmed,last_available_confirmed_per_100k_inhabitants,last_available_date,last_available_death_rate,last_available_deaths,order_for_place,place_type,state,new_confirmed,new_deaths
0,São Paulo,2020-02-25,9,12325232.0,12252023.0,False,False,1,0.00811,2020-02-25,0.0000,0,1,city,SP,1,0
1,ESTADO,2020-02-25,9,46289333.0,45919049.0,False,False,1,0.00216,2020-02-25,0.0000,0,1,state,SP,1,0
2,São Paulo,2020-02-26,9,12325232.0,12252023.0,False,False,1,0.00811,2020-02-26,0.0000,0,2,city,SP,0,0
3,ESTADO,2020-02-26,9,46289333.0,45919049.0,False,False,1,0.00216,2020-02-26,0.0000,0,2,state,SP,0,0
4,São Paulo,2020-02-27,9,12325232.0,12252023.0,False,False,1,0.00811,2020-02-27,0.0000,0,3,city,SP,0,0
5,ESTADO,2020-02-27,9,46289333.0,45919049.0,False,False,1,0.00216,2020-02-27,0.0000,0,3,state,SP,0,0
6,São Paulo,2020-02-28,9,12325232.0,12252023.0,False,False,2,0.01623,2020-02-28,0.0000,0,4,city,SP,1,0
7,ESTADO,2020-02-28,9,46289333.0,45919049.0,False,False,2,0.00432,2020-02-28,0.0000,0,4,state,SP,1,0
8,São Paulo,2020-02-29,9,12325232.0,12252023.0,False,False,2,0.01623,2020-02-29,0.0000,0,5,city,SP,0,0
9,ESTADO,2020-02-29,9,46289333.0,45919049.0,False,False,2,0.00432,2020-02-29,0.0000,0,5,state,SP,0,0
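
When several columns need to be removed, `drop` accepts a list, and `errors="ignore"` keeps the call from failing if a column is already gone (for example, when a cell is run twice). The sketch below uses a hypothetical one-row frame. Which columns count as unnecessary is an assumption made here, not a choice taken from the notebook.

```python
import pandas as pd

# Hypothetical one-row sample with a few of the dataset's columns.
amostra = pd.DataFrame({
    "city": ["São Paulo"],
    "city_ibge_code": [3550308.0],
    "estimated_population_2019": [12252023.0],
    "new_confirmed": [1],
})

# Columns assumed, for illustration only, to be unnecessary.
descartar = ["city_ibge_code", "estimated_population_2019"]

reduzido = amostra.drop(columns=descartar, errors="ignore")
# Running the same drop again does not raise, because of errors="ignore".
reduzido = reduzido.drop(columns=descartar, errors="ignore")

print(list(reduzido.columns))
```

Because `drop` returns a new frame, `amostra` itself still holds all four columns after these calls.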
