<h1>SARS-CoV-2 in Romania</h1>


I added my dataset on daily county-level confirmed cases for Romania.  

## Load packages

We will use mostly Plotly and Folium for visualization.

In [None]:
import numpy as np 
import pandas as pd
import os
import matplotlib.pyplot as plt
import seaborn as sns 
import datetime as dt
%matplotlib inline
import plotly.graph_objs as go
import plotly.figure_factory as ff
from plotly import tools
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
from shapely.geometry import shape, Point, Polygon
import folium
from folium.plugins import HeatMap, HeatMapWithTime
init_notebook_mode(connected=True)

# Load and process the data

The dataset contains two files, one with county-level data and one with country-level data. It is updated daily.


## Cumulative data

We glimpse the data, looking at its shape and at some samples from the head and tail of the dataset.

In [None]:
county_data_df = pd.read_csv("/kaggle/input/covid19-romania-county-level/ro_covid_19_time_series.csv")
country_data_df = pd.read_csv("/kaggle/input/covid19-romania-county-level/ro_covid_19_country_data_time_series.csv")

In [None]:
county_data_df.shape

In [None]:
county_data_df.head()

In [None]:
country_data_df.shape

In [None]:
country_data_df.head()

In [None]:
country_data_df.tail()

Let's fix the issue with *2020-06-04*. The data is missing. We will just fill in the average of the days before and after.

In [None]:
for feature in ['ati', 'quarantine', 'isolation', 'tests', 'confirmed', 'recovered', 'deaths']:
    country_data_df.loc[country_data_df.date=="2020-06-04",feature] = \
    int((country_data_df.loc[country_data_df.date=="2020-06-05", feature].values[0] +\
         country_data_df.loc[country_data_df.date=="2020-06-03", feature].values[0])/2)

Let's fix the issue with 2020-06-26. The data is missing. We will just fill in the average of the days before and after.

In [None]:
for feature in ['ati', 'quarantine', 'isolation', 'tests', 'confirmed', 'recovered', 'deaths']:
    country_data_df.loc[country_data_df.date=="2020-06-26",feature] = \
    int((country_data_df.loc[country_data_df.date=="2020-06-25", feature].values[0] +\
         country_data_df.loc[country_data_df.date=="2020-06-27", feature].values[0])/2)

Let's fix the issue with 2020-06-29 in the same way: the missing values are filled in with the average of the days before and after.

In [None]:
for feature in ['ati', 'quarantine', 'isolation', 'tests', 'confirmed', 'recovered', 'deaths']:
    country_data_df.loc[country_data_df.date=="2020-06-29",feature] = \
    int((country_data_df.loc[country_data_df.date=="2020-06-28", feature].values[0] +\
         country_data_df.loc[country_data_df.date=="2020-06-30", feature].values[0])/2)
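
The three fixes above repeat the same pattern, so they could be folded into one helper. The sketch below is only an illustration: `fill_with_neighbor_average` is a hypothetical function that is not part of the original notebook, and the toy numbers are made up. It assumes the dates are still stored as `%Y-%m-%d` strings, so it would have to run before the date conversion.

```python
import pandas as pd

# Hypothetical helper (not in the original notebook): overwrite the values of
# one date with the integer average of the previous and next calendar day.
def fill_with_neighbor_average(df, date, features, date_col="date"):
    day = pd.Timestamp(date)
    prev_day = (day - pd.Timedelta(days=1)).strftime("%Y-%m-%d")
    next_day = (day + pd.Timedelta(days=1)).strftime("%Y-%m-%d")
    for feature in features:
        before = df.loc[df[date_col] == prev_day, feature].values[0]
        after = df.loc[df[date_col] == next_day, feature].values[0]
        df.loc[df[date_col] == date, feature] = int((before + after) / 2)
    return df

# Toy data with made-up numbers; the middle day is missing.
toy_df = pd.DataFrame({
    "date": ["2020-06-03", "2020-06-04", "2020-06-05"],
    "confirmed": [100.0, float("nan"), 121.0],
    "deaths": [10.0, float("nan"), 12.0],
})
toy_df = fill_with_neighbor_average(toy_df, "2020-06-04", ["confirmed", "deaths"])
print(toy_df)
```

As in the cells above, `int()` truncates the average, so a mean of 110.5 is stored as 110.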

In [None]:
country_data_df.tail()

Convert the string storing the date to an actual date.

In [None]:
country_data_df['date'] = country_data_df['date'].apply(lambda x: dt.datetime.strptime(x, "%Y-%m-%d"))

Let's calculate the active cases as well.  

The number of current active cases is very important, because this is the number that tests the capacity of the health system to respond to the crisis. This crisis is not only a medical crisis, it is also a resources crisis: supply and logistics resources, human resources, management resources. Limiting the number of current active cases, or finding effective measures to distribute the effort so that the capacity of the health system is not overwhelmed, is the first priority.
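
A minimal sketch of this calculation follows. It assumes the common definition of active cases as confirmed minus recovered minus deaths; the original notebook does not show its own formula. The column names `confirmed`, `recovered` and `deaths` are the ones used above, while the `active` column name and the toy numbers are illustrative only.

```python
import pandas as pd

# Toy cumulative data with made-up numbers, using the notebook's column names.
toy_active_df = pd.DataFrame({
    "date": pd.to_datetime(["2020-06-01", "2020-06-02", "2020-06-03"]),
    "confirmed": [100, 120, 150],
    "recovered": [30, 45, 60],
    "deaths": [5, 6, 8],
})

# Assumption: active = confirmed - recovered - deaths (all cumulative counts).
toy_active_df["active"] = (
    toy_active_df["confirmed"] - toy_active_df["recovered"] - toy_active_df["deaths"]
)
print(toy_active_df)
```

The same expression could be applied to `country_data_df`, provided its cumulative columns are complete for every date.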


## Daily data

Let's use the pandas `diff` function to calculate the daily values from the cumulative data.
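
One way this step could look is sketched below on toy data. The frame names, the `features` list and the numbers are illustrative and not taken from the original notebook. `DataFrame.diff()` subtracts the previous row from each row, so the first row has no predecessor and becomes `NaN`.

```python
import pandas as pd

# Toy cumulative data with made-up numbers, one row per day in date order.
toy_cumulative_df = pd.DataFrame({
    "date": pd.to_datetime(["2020-06-01", "2020-06-02", "2020-06-03"]),
    "confirmed": [100, 120, 150],
    "deaths": [5, 6, 8],
})

features = ["confirmed", "deaths"]

# Daily value = today's cumulative value minus yesterday's.
toy_daily_df = toy_cumulative_df.copy()
toy_daily_df[features] = toy_cumulative_df[features].diff()
print(toy_daily_df)
```

The rows must be sorted by date before calling `diff`, otherwise the differences do not correspond to consecutive days.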


# Daily data

In [None]:
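def plot_bars_time_variation(d_df, feature, title, color='Red'):
    """Plot `feature` over time as a bar chart, one bar per date.

    `d_df` must contain the columns `date`, `cases`, `recovered`,
    `deaths` and `tests`, which are used to build the hover text.
    """
    # Build one hover label per row; the input DataFrame is left unchanged
    hover_text = []
    for index, row in d_df.iterrows():
        hover_text.append(('Date: {}<br>'+
                          'Confirmed cases: {}<br>'+
                          'Recovered cases: {}<br>'+
                          'Deaths: {}<br>'+
                          'Tests: {}').format(row['date'],row['cases'], 
                                                   row['recovered'], row['deaths'], row['tests']))
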
    trace = go.Bar(
        x = d_df['date'],y = d_df[feature],
        name=feature,
        marker=dict(color=color),
        text = hover_text
    )

    data = [trace]
    layout = dict(title = title,
              xaxis = dict(title = 'Date', showticklabels=True), 
              yaxis = dict(title = title),
              hovermode = 'closest'
             )
    fig = dict(data=data, layout=layout)
    iplot(fig, filename='cases-covid19')

## County-level Data - Time Evolution

In [None]:
d_df = county_data_df.copy()
d_df = d_df.loc[d_df['Confirmed']>0]
counties = list(d_df.County.unique())

data = []
for county in counties:
    dc_df = d_df.loc[d_df.County==county]
    traceC = go.Scatter(
        x = dc_df['Date'],y = dc_df['Confirmed'],
        name=county,
        mode = "markers+lines",
        text=dc_df['Confirmed']
    )
    data.append(traceC)

layout = dict(title = 'Confirmed cases per County (log scale)',
          xaxis = dict(title = 'Date', showticklabels=True), 
          yaxis = dict(title = 'Confirmed cases (log scale)', type = 'log'),
          hovermode = 'y',
          height=1000
         )

fig = dict(data=data, layout=layout)
iplot(fig, filename='covid-cases_7')