
This notebook analyzes the evolution of COVID-19 around the world, with emphasis on confirmed cases, deaths, and recovered cases in `Brazil`.
Using a daily-updated [dataset](https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset) from Kaggle, it presents charts and analyses of the impact of the coronavirus on society as a whole.

# Importing Libraries

In [1]:
# Libraries used in the project
import pandas as pd
import numpy as np
import os
from datetime import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns
%matplotlib inline
from warnings import filterwarnings
filterwarnings('ignore')
import urllib.request, json

import plotly as py
import plotly.graph_objs as go
py.offline.init_notebook_mode(connected=True)

# Defining Functions

In [2]:
# Formatting matplotlib axes
def format_spines(ax, right_border=True):
    """
    Style the borders (spines) of a matplotlib axis and set their colors.
    
    Input:
        ax: the axis to format
        right_border: if True, draw the right border in gray; otherwise paint it white
    Returns:
        None; the axis is modified in place
    """    
    # Setting up colors
    ax.spines['bottom'].set_color('#CCCCCC')
    ax.spines['left'].set_color('#CCCCCC')
    ax.spines['top'].set_visible(False)
    if right_border:
        ax.spines['right'].set_color('#CCCCCC')
    else:
        ax.spines['right'].set_color('#FFFFFF')
    ax.patch.set_facecolor('#FFFFFF')

# Reading the Data

In [3]:
# Reading the most recent dataset on the virus
data_path = r'D:\Users\thiagoPanini\github_files\kaggle_challenges\kernels\08_corona_virus\data'
df_corona = pd.read_csv(os.path.join(data_path, 'covid_19_data.csv'))
df_corona.columns = [c.lower().replace(' ', '_').replace('/', '_') for c in df_corona.columns]
df_corona.head()

Unnamed: 0,sno,observationdate,province_state,country_region,last_update,confirmed,deaths,recovered
0,1,01/22/2020,Anhui,Mainland China,1/22/2020 17:00,1.0,0.0,0.0
1,2,01/22/2020,Beijing,Mainland China,1/22/2020 17:00,14.0,0.0,0.0
2,3,01/22/2020,Chongqing,Mainland China,1/22/2020 17:00,6.0,0.0,0.0
3,4,01/22/2020,Fujian,Mainland China,1/22/2020 17:00,1.0,0.0,0.0
4,5,01/22/2020,Gansu,Mainland China,1/22/2020 17:00,0.0,0.0,0.0


According to the dataset documentation on Kaggle, the columns are:

- **sno:** serial number;
- **observationdate:** date of the observation, in MM/DD/YYYY format;
- **province_state:** province or state of the record (may be empty ("") when null);
- **country_region:** country of the record;
- **last_update:** time (UTC) at which the record was updated for the given province/state and country;
- **confirmed:** cumulative number of confirmed cases up to the given date;
- **deaths:** cumulative number of deaths up to the given date;
- **recovered:** cumulative number of recovered patients up to the given date.
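The two date columns use different text formats, which is why they are parsed before any analysis. As a minimal sketch, assuming every row follows the formats seen in the sample output above (`01/22/2020` and `1/22/2020 17:00`), they could be parsed with the standard library alone. The helper names `parse_observation_date` and `parse_last_update` are hypothetical and are not part of the notebook.

```python
from datetime import datetime

# Formats assumed from the sample rows shown above; other rows may differ.
OBSERVATION_FORMAT = "%m/%d/%Y"
LAST_UPDATE_FORMAT = "%m/%d/%Y %H:%M"


def parse_observation_date(text):
    """Parse an MM/DD/YYYY string into a date object."""
    return datetime.strptime(text, OBSERVATION_FORMAT).date()


def parse_last_update(text):
    """Parse an 'M/D/YYYY HH:MM' string into a datetime object."""
    return datetime.strptime(text, LAST_UPDATE_FORMAT)


print(parse_observation_date("01/22/2020"))
print(parse_last_update("1/22/2020 17:00"))
```

In the notebook itself `pd.to_datetime` infers these formats automatically; an explicit format string like the ones above fails loudly when a row deviates, which can help when the source file changes between daily updates.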

In [4]:
# Dataset size
df_corona.shape

(5890, 8)

# DataPrep

This section proposes a preparation flow for the dataset, consisting of:

**1)** Transforming the date columns and removing inconsistencies;

**2)** Creating a new column with `active` cases.
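Step 2 is plain arithmetic: active cases are the confirmed cases that have neither died nor recovered. A small sketch on plain dictionaries follows; the Beijing row uses the values from the sample output above, while the second row and the `active_cases` helper are hypothetical and only illustrate a non-trivial result.

```python
def active_cases(confirmed, deaths, recovered):
    """Active cases = confirmed - deaths - recovered (all cumulative counts)."""
    return confirmed - deaths - recovered


rows = [
    # Values taken from the sample output shown earlier
    {"province_state": "Beijing", "confirmed": 14.0, "deaths": 0.0, "recovered": 0.0},
    # Hypothetical row, for illustration only
    {"province_state": "Example", "confirmed": 100.0, "deaths": 5.0, "recovered": 20.0},
]

for row in rows:
    row["actives"] = active_cases(row["confirmed"], row["deaths"], row["recovered"])

print([row["actives"] for row in rows])
```

Because all three inputs are cumulative, the result is the number of cases still open on that date, not the number of new cases on that date.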

In [5]:
# Cleaning the date columns
df_corona['last_update_cleaned'] = pd.to_datetime(df_corona['last_update']).dt.date
df_corona['obs_date_cleaned'] = pd.to_datetime(df_corona['observationdate']).dt.date
df_corona.drop(['last_update', 'observationdate'], axis=1, inplace=True)
# Names below are assigned by position: after the drop, the last two columns
# are last_update_cleaned and obs_date_cleaned, in that order
df_corona.columns = ['sno', 'province_state', 'country_region', 'confirmed', 
                     'deaths', 'recovered', 'observation_date', 'last_update']

# Creating a column with active cases
df_corona['actives'] = df_corona['confirmed'] - df_corona['deaths'] - df_corona['recovered']

df_corona.head()

Unnamed: 0,sno,province_state,country_region,confirmed,deaths,recovered,observation_date,last_update,actives
0,1,Anhui,Mainland China,1.0,0.0,0.0,2020-01-22,2020-01-22,1.0
1,2,Beijing,Mainland China,14.0,0.0,0.0,2020-01-22,2020-01-22,14.0
2,3,Chongqing,Mainland China,6.0,0.0,0.0,2020-01-22,2020-01-22,6.0
3,4,Fujian,Mainland China,1.0,0.0,0.0,2020-01-22,2020-01-22,1.0
4,5,Gansu,Mainland China,0.0,0.0,0.0,2020-01-22,2020-01-22,0.0


In [6]:
# Data range
print(f'Observation range: from {df_corona["observation_date"].min()} to {df_corona["observation_date"].max()}\n')
print(f'Update range: from {df_corona["last_update"].min()} to {df_corona["last_update"].max()}')

Observation range: from 2020-01-22 to 2020-03-15

Update range: from 2020-01-22 to 2020-03-15


# Graphical Exploration

In this section, we begin the graphical analysis with plots built in `matplotlib`, `seaborn` and `plotly`. The goal is to extract insights from the dataset and give a broad view of the impact of COVID-19.

In [18]:
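# Grouping the data by date, summing only the numeric case columns
case_cols = ['confirmed', 'deaths', 'recovered', 'actives']
corona_sum = df_corona.groupby(by='last_update', as_index=False)[case_cols].sum()
china_sum = df_corona.query('country_region == "Mainland China"').groupby(by='last_update', as_index=False)[case_cols].sum()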
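The grouping in the cell above collapses one row per province and date into one total per date. As a rough illustration of what that aggregation computes, here is a standard-library sketch; the `sum_by_date` helper is hypothetical, the 2020-01-22 values come from the sample output shown earlier, and the 2020-01-23 values are made up for the example.

```python
from collections import defaultdict

# (date, province, cumulative confirmed cases)
records = [
    ("2020-01-22", "Anhui", 1.0),    # from the sample output
    ("2020-01-22", "Beijing", 14.0), # from the sample output
    ("2020-01-23", "Anhui", 9.0),    # hypothetical
    ("2020-01-23", "Beijing", 22.0), # hypothetical
]


def sum_by_date(rows):
    """Add up the confirmed counts of all provinces for each date."""
    totals = defaultdict(float)
    for date, _province, confirmed in rows:
        totals[date] += confirmed
    # ISO date strings sort chronologically, so a plain sort is enough here
    return dict(sorted(totals.items()))


print(sum_by_date(records))
```

Since each province reports a cumulative count, summing across provinces for a single date gives the cumulative total for that date; summing across dates would count the same cases repeatedly.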

In [144]:
# Creating the figure
fig = go.Figure()

# Creating line - confirmed cases worldwide
trace0 = go.Scatter(
    x=corona_sum['last_update'],
    y=corona_sum['confirmed'],
    line=dict(
        color='black',
        width=4
    ),
    name='World'
)

# Creating line - confirmed cases in China
trace1 = go.Scatter(
    x=china_sum['last_update'],
    y=china_sum['confirmed'],
    line=dict(
        color='crimson',
        width=2
    ),
    name='China'
)

# Creating a marker for the most recent data point
trace2 = go.Scatter(
    x=corona_sum['last_update'][-1:],
    y=corona_sum['confirmed'][-1:],
    mode='markers',
    marker=dict(
        color='black',
        size=12
    ),
    showlegend=False
)

# Adding the traces to the figure
fig.add_trace(trace0)
fig.add_trace(trace1)
fig.add_trace(trace2)

# Formatting the plot layout
fig.update_layout(
    
    # X axis
    xaxis=dict(
        showline=True,
        showgrid=False,
        showticklabels=True,
        linecolor='rgb(204, 204, 204)',
        linewidth=2,
        ticks='outside',
        tickfont=dict(
            family='Raleway, sans-serif',
            size=12,
            color='rgb(82, 82, 82)',
        ),
    ),
    
    # Y axis
    yaxis=dict(
        showgrid=True,
        zeroline=True,
        showline=True,
        showticklabels=True,
        linecolor='rgb(204, 204, 204)',
        linewidth=2,
        ticks='outside',
        tickfont=dict(
            family='Raleway, sans-serif',
            size=12,
            color='rgb(82, 82, 82)',
        ),
    ),
    
    
    showlegend=True,
    plot_bgcolor='white',
    
    # Chart title
    title=dict(
        text='COVID-19 Evolution Worldwide',
        font=dict(
            family='Franklin Gothic',
            size=25,
            color='dimgrey'
        )
    ),
    
    # Y-axis title
    yaxis_title=dict(
        text='Confirmed Cases',
        font=dict(
            family='Raleway, sans-serif',
            size=12,
            color='rgb(150,150,150)'
        )
    )
)

# Creating annotations
annotations = []
annotations.append(
    dict(
        xref='paper', 
        yref='paper', 
        x=0.5, 
        y=-0.1,
        xanchor='center', 
        yanchor='top',
        text='Source: <a href="https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset">' +
                                   'Novel COVID-19 Kaggle dataset</a>',
        font=dict(
            family='Raleway, sans-serif',
            size=12,
            color='rgb(150,150,150)'
        ),
        showarrow=False
    )
)

annotations.append(
    dict(
        x=corona_sum['last_update'][31],
        y=corona_sum['confirmed'][31],
        xref='x',
        yref='y',
        text='Drastic spread <br>around the world',
        ax=-20,
        ay=-50,
        showarrow=True,
        arrowhead=2,
        font=dict(
            color='dimgrey'
        )
    )
)

annotations.append(
    dict(
        x=corona_sum['last_update'][len(corona_sum)-1],
        y=corona_sum['confirmed'][len(corona_sum)-1],
        xref='x',
        yref='y',
        text=f'{int(corona_sum["confirmed"].max())}' + 
                '<br>confirmed cases<br> in total',
        ax=-90,
        ay=0,
        showarrow=True,
        arrowhead=2,
        font=dict(
            family='Raleway, sans-serif',
            color='dimgrey'
        )
    )
)
 
fig.update_layout(annotations=annotations)
fig.show()

In [36]:
list(corona_sum['last_update'].values)[25]

datetime.date(2020, 2, 16)

In [136]:
# Get this figure: fig = py.get_figure("https://plot.ly/~Sripada/148/")
# Get this figure's data: data = py.get_figure("https://plot.ly/~Sripada/148/").get_data()
# Add data to this figure: py.plot(Data([Scatter(x=[1, 2], y=[2, 3])]), filename ="plot from API (24)", fileopt="extend")
# Get y data of first trace: y1 = py.get_figure("https://plot.ly/~Sripada/148/").get_data()[0]["y"]

# Get figure documentation: https://plot.ly/python/get-requests/
# Add data documentation: https://plot.ly/python/file-options/

# If you're using unicode in your file, you may need to specify the encoding.
# You can reproduce this figure in Python with the following code!

# Learn about API authentication here: https://plot.ly/python/getting-started
# Find your api_key here: https://plot.ly/settings/api

from plotly.graph_objs import *
py.sign_in('username', 'api_key')
trace1 = {
  "uid": "2021d2", 
  "name": "Perceived Tax Distribution", 
  "type": "bar", 
  "x": ["Bottom 10%", "Average", "Top 10%"], 
  "y": ["23.9", "29.8", "35.9"], 
  "marker": {"color": "rgb(135, 186, 0)"}
}
trace2 = {
  "uid": "e5bb3c", 
  "name": "Preferred Tax Distribution", 
  "type": "bar", 
  "x": ["Bottom 10%", "Average", "Top 10%"], 
  "y": ["14.8", "22.4", "38.6"], 
  "marker": {"color": "rgb(60, 120, 216)"}
}
trace3 = {
  "uid": "2e840f", 
  "name": "Actual Tax Distribution", 
  "type": "bar", 
  "x": ["Bottom 10%", "Average", "Top 10%"], 
  "y": ["43.0", "35.0", "35.0"], 
  "marker": {"color": "rgb(140, 0, 75)"}
}
data = Data([trace1, trace2, trace3])
layout = {
  "font": {
    "size": 12, 
    "color": "#444", 
    "family": "Raleway, sans-serif"
  }, 
  "title": "Sample Data Distribution<br>A Simple % Representation", 
  "width": 800, 
  "xaxis": {
    "type": "category", 
    "dtick": 1, 
    "range": [-0.5, 2.5], 
    "tick0": 0, 
    "ticks": "", 
    "title": "", 
    "anchor": "y", 
    "domain": [0, 1], 
    "mirror": False, 
    "nticks": 0, 
    "ticklen": 5, 
    "position": 0, 
    "showgrid": False, 
    "showline": True, 
    "tickfont": {
      "size": 0, 
      "color": "", 
      "family": ""
    }, 
    "tickmode": "auto", 
    "zeroline": False, 
    "autorange": True, 
    "gridcolor": "#eee", 
    "gridwidth": 1, 
    "linecolor": "#444", 
    "linewidth": 1, 
    "rangemode": "normal", 
    "tickangle": "auto", 
    "tickcolor": "#444", 
    "tickwidth": 1, 
    "titlefont": {
      "size": 0, 
      "color": "", 
      "family": ""
    }, 
    "overlaying": False, 
    "showexponent": "all", 
    "zerolinecolor": "#444", 
    "zerolinewidth": 1, 
    "exponentformat": "B", 
    "showticklabels": True
  }, 
  "yaxis": {
    "type": "linear", 
    "dtick": 10, 
    "range": [0, 45.26315789473684], 
    "tick0": 0, 
    "ticks": "", 
    "title": "", 
    "anchor": "x", 
    "domain": [0, 1], 
    "mirror": False, 
    "nticks": 0, 
    "ticklen": 5, 
    "position": 0, 
    "showgrid": True, 
    "showline": True, 
    "tickfont": {
      "size": 0, 
      "color": "", 
      "family": ""
    }, 
    "tickmode": "auto", 
    "zeroline": False, 
    "autorange": True, 
    "gridcolor": "#eee", 
    "gridwidth": 1, 
    "linecolor": "#444", 
    "linewidth": 1, 
    "rangemode": "normal", 
    "tickangle": "auto", 
    "tickcolor": "#444", 
    "tickwidth": 1, 
    "titlefont": {
      "size": 0, 
      "color": "", 
      "family": ""
    }, 
    "overlaying": False, 
    "showexponent": "all", 
    "zerolinecolor": "#444", 
    "zerolinewidth": 1, 
    "exponentformat": "B", 
    "showticklabels": True
  }, 
  "bargap": 0.2, 
  "boxgap": 0.3, 
  "height": 500, 
  "legend": {
    "x": 0.6796875, 
    "y": 1.09375, 
    "font": {
      "size": 0, 
      "color": "", 
      "family": ""
    }, 
    "bgcolor": "rgba(255, 255, 255, 0)", 
    "xanchor": "left", 
    "yanchor": "top", 
    "traceorder": "normal", 
    "bordercolor": "#444", 
    "borderwidth": 0
  }, 
  "margin": {
    "b": 80, 
    "l": 80, 
    "r": 80, 
    "t": 100, 
    "pad": 0, 
    "autoexpand": True
  }, 
  "barmode": "group", 
  "boxmode": "overlay", 
  "autosize": False, 
  "dragmode": "zoom", 
  "hovermode": "x", 
  "titlefont": {
    "size": 0, 
    "color": "", 
    "family": ""
  }, 
  "separators": ".,", 
  "showlegend": True, 
  "annotations": [
    {
      "x": 0.10000000000000002, 
      "y": 0.3, 
      "ax": -14, 
      "ay": -60.39166259765625, 
      "font": {
        "size": 14, 
        "color": "rgb(255, 255, 255)", 
        "family": ""
      }, 
      "text": "23.9%", 
      "xref": "paper", 
      "yref": "paper", 
      "align": "center", 
      "bgcolor": "rgba(0, 0, 0, 0)", 
      "opacity": 1, 
      "xanchor": "auto", 
      "yanchor": "auto", 
      "arrowhead": 1, 
      "arrowsize": 1, 
      "borderpad": 1, 
      "showarrow": True, 
      "textangle": 0, 
      "arrowcolor": "rgba(68, 68, 68, 0)", 
      "arrowwidth": 0, 
      "bordercolor": "", 
      "borderwidth": 1
    }, 
    {
      "x": 0.10000000000000002, 
      "y": 0.3, 
      "ax": 45, 
      "ay": 6.60833740234375, 
      "font": {
        "size": 14, 
        "color": "rgb(255, 255, 255)", 
        "family": ""
      }, 
      "text": "14.8%", 
      "xref": "paper", 
      "yref": "paper", 
      "align": "center", 
      "bgcolor": "rgba(0, 0, 0, 0)", 
      "opacity": 1, 
      "xanchor": "auto", 
      "yanchor": "auto", 
      "arrowhead": 1, 
      "arrowsize": 1, 
      "borderpad": 1, 
      "showarrow": True, 
      "textangle": 0, 
      "arrowcolor": "rgba(68, 68, 68, 0)", 
      "arrowwidth": 0, 
      "bordercolor": "", 
      "borderwidth": 1
    }, 
    {
      "x": 0.10000000000000002, 
      "y": 0.3, 
      "ax": 100, 
      "ay": -194.39166259765625, 
      "font": {
        "size": 14, 
        "color": "rgb(255, 255, 255)", 
        "family": ""
      }, 
      "text": "43.0%", 
      "xref": "paper", 
      "yref": "paper", 
      "align": "center", 
      "bgcolor": "rgba(0, 0, 0, 0)", 
      "opacity": 1, 
      "xanchor": "auto", 
      "yanchor": "auto", 
      "arrowhead": 1, 
      "arrowsize": 1, 
      "borderpad": 1, 
      "showarrow": True, 
      "textangle": 0, 
      "arrowcolor": "rgba(68, 68, 68, 0)", 
      "arrowwidth": 0, 
      "bordercolor": "", 
      "borderwidth": 1
    }, 
    {
      "x": 0.10000000000000002, 
      "y": 0.3, 
      "ax": 202, 
      "ay": -102.39166259765625, 
      "font": {
        "size": 14, 
        "color": "rgb(255, 255, 255)", 
        "family": ""
      }, 
      "text": "29.8%", 
      "xref": "paper", 
      "yref": "paper", 
      "align": "center", 
      "bgcolor": "rgba(0, 0, 0, 0)", 
      "opacity": 1, 
      "xanchor": "auto", 
      "yanchor": "auto", 
      "arrowhead": 1, 
      "arrowsize": 1, 
      "borderpad": 1, 
      "showarrow": True, 
      "textangle": 0, 
      "arrowcolor": "rgba(68, 68, 68, 0)", 
      "arrowwidth": 0, 
      "bordercolor": "", 
      "borderwidth": 1
    }, 
    {
      "x": 0.10000000000000002, 
      "y": 0.3, 
      "ax": 258, 
      "ay": -48.39166259765625, 
      "font": {
        "size": 14, 
        "color": "rgb(255, 255, 255)", 
        "family": ""
      }, 
      "text": "22.4%", 
      "xref": "paper", 
      "yref": "paper", 
      "align": "center", 
      "bgcolor": "rgba(0, 0, 0, 0)", 
      "opacity": 1, 
      "xanchor": "auto", 
      "yanchor": "auto", 
      "arrowhead": 1, 
      "arrowsize": 1, 
      "borderpad": 1, 
      "showarrow": True, 
      "textangle": 0, 
      "arrowcolor": "rgba(68, 68, 68, 0)", 
      "arrowwidth": 0, 
      "bordercolor": "", 
      "borderwidth": 1
    }, 
    {
      "x": 0.10000000000000002, 
      "y": 0.3, 
      "ax": 315, 
      "ay": -139.39166259765625, 
      "font": {
        "size": 14, 
        "color": "rgb(255, 255, 255)", 
        "family": ""
      }, 
      "text": "35.0%", 
      "xref": "paper", 
      "yref": "paper", 
      "align": "center", 
      "bgcolor": "rgba(0, 0, 0, 0)", 
      "opacity": 1, 
      "xanchor": "auto", 
      "yanchor": "auto", 
      "arrowhead": 1, 
      "arrowsize": 1, 
      "borderpad": 1, 
      "showarrow": True, 
      "textangle": 0, 
      "arrowcolor": "rgba(68, 68, 68, 0)", 
      "arrowwidth": 0, 
      "bordercolor": "", 
      "borderwidth": 1
    }, 
    {
      "x": 0.10000000000000002, 
      "y": 0.3, 
      "ax": 416, 
      "ay": -144.39166259765625, 
      "font": {
        "size": 14, 
        "color": "rgb(255, 255, 255)", 
        "family": ""
      }, 
      "text": "35.9%", 
      "xref": "paper", 
      "yref": "paper", 
      "align": "center", 
      "bgcolor": "rgba(0, 0, 0, 0)", 
      "opacity": 1, 
      "xanchor": "auto", 
      "yanchor": "auto", 
      "arrowhead": 1, 
      "arrowsize": 1, 
      "borderpad": 1, 
      "showarrow": True, 
      "textangle": 0, 
      "arrowcolor": "rgba(68, 68, 68, 0)", 
      "arrowwidth": 0, 
      "bordercolor": "", 
      "borderwidth": 1
    }, 
    {
      "x": 0.10000000000000002, 
      "y": 0.3, 
      "ax": 471, 
      "ay": -163.39166259765625, 
      "font": {
        "size": 14, 
        "color": "rgb(255, 255, 255)", 
        "family": ""
      }, 
      "text": "38.6%", 
      "xref": "paper", 
      "yref": "paper", 
      "align": "center", 
      "bgcolor": "rgba(0, 0, 0, 0)", 
      "opacity": 1, 
      "xanchor": "auto", 
      "yanchor": "auto", 
      "arrowhead": 1, 
      "arrowsize": 1, 
      "borderpad": 1, 
      "showarrow": True, 
      "textangle": 0, 
      "arrowcolor": "rgba(68, 68, 68, 0)", 
      "arrowwidth": 0, 
      "bordercolor": "", 
      "borderwidth": 1
    }, 
    {
      "x": 0.10000000000000002, 
      "y": 0.3, 
      "ax": 528, 
      "ay": -139.39166259765625, 
      "font": {
        "size": 14, 
        "color": "rgb(255, 255, 255)", 
        "family": ""
      }, 
      "text": "35.0%", 
      "xref": "paper", 
      "yref": "paper", 
      "align": "center", 
      "bgcolor": "rgba(0, 0, 0, 0)", 
      "opacity": 1, 
      "xanchor": "auto", 
      "yanchor": "auto", 
      "arrowhead": 1, 
      "arrowsize": 1, 
      "borderpad": 1, 
      "showarrow": True, 
      "textangle": 0, 
      "arrowcolor": "rgba(68, 68, 68, 0)", 
      "arrowwidth": 0, 
      "bordercolor": "", 
      "borderwidth": 1
    }, 
    {
      "x": 0.10000000000000002, 
      "y": 0.3, 
      "ax": -51, 
      "ay": 140.60833740234375, 
      "font": {
        "size": 0, 
        "color": "", 
        "family": ""
      }, 
      "text": "Source: <a href=\"http://www.equalitytrust.org.uk/sites/default/files/attachments/resources/Unfair%20and%20Unclear.pdf\">The Equality Trust</a>", 
      "xref": "paper", 
      "yref": "paper", 
      "align": "center", 
      "bgcolor": "rgba(0, 0, 0, 0)", 
      "opacity": 1, 
      "xanchor": "auto", 
      "yanchor": "auto", 
      "arrowhead": 1, 
      "arrowsize": 1, 
      "borderpad": 1, 
      "showarrow": True, 
      "textangle": 0, 
      "arrowcolor": "rgba(68, 68, 68, 0)", 
      "arrowwidth": 0, 
      "bordercolor": "", 
      "borderwidth": 1
    }
  ], 
  "bargroupgap": 0.05, 
  "boxgroupgap": 0.3, 
  "hidesources": False, 
  "plot_bgcolor": "#fff", 
  "paper_bgcolor": "#fff"
}
fig = Figure(data=data, layout=layout)
plot_url = py.plot(fig)

AttributeError: module 'plotly' has no attribute 'sign_in'