# Coronavirus propagation visualization and forecast modeling. Part 1: History of Coronavirus.

<img src="https://s5.gifyu.com/images/ezgif.com-crop-27005491cdf6bfa9a.gif" width="1153px">

I recently watched a documentary about virus outbreaks throughout human history. It seems that with every new virus we are better equipped and more ready to fight back. However, there is still more to learn: during the current coronavirus outbreak hundreds of people have already died, and most likely there will be many more.

The big question is: what can we do? What does the Kaggle community have to offer in response to a natural threat of this kind? I believe we can do various things:

- research the genome of the virus in order to classify it and find its weaknesses;
- develop a forecast model that would help us understand where we should be more prepared than elsewhere to face the danger.

As a data scientist, I am more on the forecasting side. This notebook and the next few are about my attempts to understand what the propagation of our new enemy will look like in the near future.

## UPDATES:
1. Added choropleth layer.
2. Updated data to 2 Feb.
3. Updated choropleth color to 'yellow'.
4. Added possible flight paths.

### Imports

In [None]:
import numpy as np
import pandas as pd
import geopandas as gpd
from geopandas.tools import geocode
import math
from collections import namedtuple

import folium
from folium import Choropleth, Circle, Marker
from folium.plugins import HeatMap, MarkerCluster, TimestampedGeoJson

import datetime
import os

### Reading data
Big thanks to [Brenda So](https://www.kaggle.com/brendaso) for her [dataset](https://www.kaggle.com/brendaso/2019-coronavirus-dataset-01212020-01262020). It really saved me a lot of time, as the first thing I found was this spreadsheet, which is not usable in that form: https://docs.google.com/spreadsheets/d/1yZv9w9zRKwrGTaR-YzmAqMefw4wMlaXocejdxZaTs6w/htmlview?usp=sharing&sle=true#

In [None]:
import os
for dirname, _, filenames in os.walk('../input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In [None]:
# loading the data
chinaVectors = "../input/china-regions-map/china.json"
df = pd.read_csv("../input/coronavirus-latlon-dataset/CV_LatLon_21Jan_12Mar.csv", index_col = 0)

## Visualization

In [None]:
# let's see how the big picture of virus propagation looked on the 12th of March
df12OfMarch = df.loc[df.date == '3/12/20', 
                         ['state', 
                          'country', 
                          'confirmed', 
                          'recovered', 
                          'death', 
                          'lat', 
                          'lon']]

### The map using data from the 12th of March

In [None]:
def radiusMinMaxer(radius):
    radiusMin = 2
    radiusMax = 40
    if radius != 0:
        if radius < radiusMin:
            radius = radiusMin
        if radius > radiusMax:
            radius = radiusMax
    return radius

In [None]:
# and now the fun part, getting it all on the map. I borrowed the style and some ideas from this article: https://towardsdatascience.com/visualizing-bike-mobility-in-london-using-interactive-maps-for-absolute-beginners-3b9f55ccb59
colorConfirmed = '#ffbf80'
colorRecovered = '#0A5E2AFF'
colorDead = '#E80018'
circleFillOpacity = 0.2

map = folium.Map(location=[15.632909, 14.911222], 
                 tiles = "CartoDB dark_matter",
                 detect_retina = True,
                 zoom_start=2)

# map layers
layerFlights = folium.FeatureGroup(name='<span style="color: black;">Flights</span>')
layerConfirmed = folium.FeatureGroup(name='<span style="color: #EFEFE8FF;">Confirmed infected</span>')
layerDead = folium.FeatureGroup(name='<span style="color: #E80018;">Dead</span>')
layerRecovered = folium.FeatureGroup(name='<span style="color: #0A5E2AFF;">Recovered from virus</span>')

# the idea of combining the choropleth with circles was suggested by: https://www.kaggle.com/gpreda/tracking-the-spread-of-2019-coronavirus
folium.Choropleth(
                geo_data=chinaVectors,
                name='Choropleth',
                key_on='feature.properties.name',
                fill_color='yellow',
                fill_opacity=0.18,
                line_opacity=0.7
                ).add_to(map)

# coordinates of Hubei province, where the first infected travelers departed from
departurePoint = df12OfMarch.loc[df12OfMarch.state == 'Hubei', 
                                 ['lat', 'lon']].iloc[0].tolist()

for i, row in df12OfMarch.iterrows():
    lat = row.lat
    lon = row.lon
    country = row.country
    province = row.state
    recovered = row.recovered
    death = row.death
    confirmed = row.confirmed

    radiusConfirmed = radiusMinMaxer(np.sqrt(confirmed))
    radiusRecovered = radiusMinMaxer(np.sqrt(recovered))
    radiusDead = radiusMinMaxer(np.sqrt(death))
    
    # coordinates of infected travelers arrivals
    arrivalPoint = [lat, lon]

    if row.state != '0':
        popup = 'Flight:&nbsp;' + 'Hubei,&nbsp;China&nbsp;-&nbsp;' + row.state + ',&nbsp;' + row.country
    else:
        popup = 'Flight:&nbsp;' + 'Hubei,&nbsp;China&nbsp;-&nbsp;' + row.country
        
    folium.PolyLine(locations=[departurePoint, arrivalPoint], 
                      color='white', 
                      weight = 0.5,
                      opacity = 0.3,
                      popup = popup
                       ).add_to(layerFlights)

    popupConfirmed = str(country) + ' ' + str(province) + '(Confirmed='+str(row.confirmed) + ' Deaths=' + str(death) + ' Recovered=' + str(recovered) + ')'

    folium.CircleMarker(location = [lat,lon], 
                        radius = radiusConfirmed, 
                        popup = popupConfirmed, 
                        color = colorConfirmed, 
                        fill_opacity = 0.3,
                          weight = 1, 
                          fill = True, 
                          fillColor = colorConfirmed
                           ).add_to(layerConfirmed)
    
    if row.recovered != 0:
        popupRecovered = str(country) + ' ' + str(province) + '(Confirmed='+str(row.confirmed) + ' Deaths=' + str(death) + ' Recovered=' + str(recovered) + ')'

        folium.CircleMarker(location = [lat,lon], 
                            radius = radiusRecovered, 
                            popup = popupRecovered, 
                            color = colorRecovered, 
                            fill_opacity = circleFillOpacity,
                              weight = 1, 
                              fill = True, 
                              fillColor = colorRecovered
                               ).add_to(layerRecovered) 
        
    if row.death != 0:
        popupDead = str(country) + ' ' + str(province) + '(Confirmed='+str(row.confirmed) + ' Deaths=' + str(death) + ' Recovered=' + str(recovered) + ')'

        folium.CircleMarker(location = [lat,lon], 
                            radius = radiusDead, 
                            popup = popupDead, 
                            color = colorDead, 
                            fill_opacity = circleFillOpacity,
                              weight = 1, 
                              fill = True, 
                              fillColor = colorDead
                               ).add_to(layerDead) 

layerFlights.add_to(map)
layerConfirmed.add_to(map)
layerRecovered.add_to(map)
layerDead.add_to(map)

folium.map.LayerControl('bottomleft', collapsed=False).add_to(map)

map

The orange circles indicate the number of people confirmed to be sick with coronavirus, the green circles the number of those lucky ones who recovered from the deadly virus, and the red circles the number of dead. The map is interactive: you can click on a circle to see the statistics for any territory that interests you.
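
To make the circle sizes easier to read: each radius is the square root of the count (so the circle area grows roughly in proportion to the count), clamped to a fixed range so that small outbreaks stay visible and the largest ones do not cover the map. A minimal stand-alone sketch of that rule follows; the name `circle_radius` is only for illustration, the notebook itself uses `radiusMinMaxer(np.sqrt(...))`.

```python
import math

def circle_radius(count, r_min=2, r_max=40):
    """Square-root scaling clamped to [r_min, r_max]; zero counts stay at 0."""
    if count <= 0:
        return 0
    return min(max(math.sqrt(count), r_min), r_max)

# a few sample counts to see the effect of the clamping
for count in (0, 1, 100, 80000):
    print(count, circle_radius(count))
```

With these limits, 100 cases give a radius of 10, and anything above 1,600 cases is drawn at the maximum radius of 40.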

### Interactive map focused on Asia:

In [None]:
map = folium.Map(location=[32.902807, 101.089332], 
                 tiles = "CartoDB dark_matter",
                 detect_retina = True,
                 zoom_start=4)

# map layers
layerFlights = folium.FeatureGroup(name='<span style="color: black;">Flights</span>')
layerConfirmed = folium.FeatureGroup(name='<span style="color: #EFEFE8FF;">Confirmed infected</span>')
layerDead = folium.FeatureGroup(name='<span style="color: #E80018;">Dead</span>')
layerRecovered = folium.FeatureGroup(name='<span style="color: #0A5E2AFF;">Recovered from virus</span>')

# the idea of combining the choropleth with circles was suggested by: https://www.kaggle.com/gpreda/tracking-the-spread-of-2019-coronavirus
folium.Choropleth(
                geo_data=chinaVectors,
                name='Choropleth',
                key_on='feature.properties.name',
                fill_color='yellow',
                fill_opacity=0.18,
                line_opacity=0.7
                ).add_to(map)



for i, row in df12OfMarch.iterrows():
    lat = row.lat
    lon = row.lon
    country = row.country
    province = row.state
    recovered = row.recovered
    death = row.death
    confirmed = row.confirmed

    radiusConfirmed = radiusMinMaxer(np.sqrt(confirmed))
    radiusRecovered = radiusMinMaxer(np.sqrt(recovered))
    radiusDead = radiusMinMaxer(np.sqrt(death))
    
    # coordinates of infected travelers arrivals
    arrivalPoint = [lat, lon]

    if row.state != '0':
        popup = 'Flight:&nbsp;' + 'Hubei,&nbsp;China&nbsp;-&nbsp;' + row.state + ',&nbsp;' + row.country
    else:
        popup = 'Flight:&nbsp;' + 'Hubei,&nbsp;China&nbsp;-&nbsp;' + row.country
        
    folium.PolyLine(locations=[departurePoint, arrivalPoint], 
                      color='white', 
                      weight = 0.5,
                      opacity = 0.3,
                      popup = popup
                       ).add_to(layerFlights)

    popupConfirmed = str(country) + ' ' + str(province) + '(Confirmed='+str(row.confirmed) + ' Deaths=' + str(death) + ' Recovered=' + str(recovered) + ')'

    folium.CircleMarker(location = [lat,lon], 
                        radius = radiusConfirmed, 
                        popup = popupConfirmed, 
                        color = colorConfirmed, 
                        fill_opacity = 0.3,
                          weight = 1, 
                          fill = True, 
                          fillColor = colorConfirmed
                           ).add_to(layerConfirmed)
    
    if row.recovered != 0:
        popupRecovered = str(country) + ' ' + str(province) + '(Confirmed='+str(row.confirmed) + ' Deaths=' + str(death) + ' Recovered=' + str(recovered) + ')'

        folium.CircleMarker(location = [lat,lon], 
                            radius = radiusRecovered, 
                            popup = popupRecovered, 
                            color = colorRecovered, 
                            fill_opacity = circleFillOpacity,
                              weight = 1, 
                              fill = True, 
                              fillColor = colorRecovered
                               ).add_to(layerRecovered) 
        
    if row.death != 0:
        popupDead = str(country) + ' ' + str(province) + '(Confirmed='+str(row.confirmed) + ' Deaths=' + str(death) + ' Recovered=' + str(recovered) + ')'

        folium.CircleMarker(location = [lat,lon], 
                            radius = radiusDead, 
                            popup = popupDead, 
                            color = colorDead, 
                            fill_opacity = circleFillOpacity,
                              weight = 1, 
                              fill = True, 
                              fillColor = colorDead
                               ).add_to(layerDead) 

layerFlights.add_to(map)
layerConfirmed.add_to(map)
layerRecovered.add_to(map)
layerDead.add_to(map)

folium.map.LayerControl('bottomleft', collapsed=False).add_to(map)

map

### Animation

The animation part is, for some reason, not really obvious in the folium library. There is an option to animate HeatMaps, but it recently stopped working, as well as options to animate routes and marker positions through GeoJSON. So the only way I found to actually show how the situation with the virus develops is to make a bunch of screenshots (one per date) and feed them into a GIF or video animation as frames.

In [None]:
radiusMin = 2
radiusMax = 50
colorConfirmed = '#E80018'
colorRecovered = '#81D8D0'

for date in df.date.unique():
    print('date=', date)
    _df = df.loc[df.date == date, 
                ['state', 
                'country', 
                'confirmed', 
                'recovered', 
                'death', 
                'lat', 
                'lon']]
    _df.reset_index(drop = True, inplace = True)
    _map = folium.Map(location=[15.632909, 14.911222], 
                 tiles = "CartoDB dark_matter", 
                 zoom_start=2)

    folium.Choropleth(
                        geo_data=chinaVectors,
                        name='choropleth',
                        key_on='feature.properties.name',
                        fill_color='yellow',
                        fill_opacity=0.18,
                        line_opacity=0.7).add_to(_map)

    for i, row in _df.iterrows():
        if row.confirmed != 0:
            arrivalPoint = [row.lat, row.lon]

            if row.state != '0':
                popup = 'Flight:&nbsp;' + 'Hubei,&nbsp;China&nbsp;-&nbsp;' + row['state'] + ',&nbsp;' + row['country']
            else:
                popup = 'Flight:&nbsp;' + 'Hubei,&nbsp;China&nbsp;-&nbsp;' + row['country']

            # draw the flight line for every territory with confirmed cases,
            # not only for rows without a state
            folium.PolyLine(locations=[departurePoint, arrivalPoint], 
                            color='white', 
                            weight = 0.5,
                            opacity = 0.4,
                            popup = popup).add_to(_map)

        lat = row.lat
        lon = row.lon
        country = row.country
        province = row.state
        recovered = row.recovered
        death = row.death
        confirmed = row.confirmed

        radiusConfirmed = radiusMinMaxer(np.sqrt(confirmed))

        popupConfirmed = str(country) + ' ' + str(province) + '(Confirmed='+str(row.confirmed) + ' Deaths=' + str(death) + ' Recovered=' + str(recovered) + ')'
        folium.CircleMarker(location = [lat,lon], 
                            radius = radiusConfirmed, 
                            popup = popupConfirmed, 
                            color = colorConfirmed, 
                            fill_opacity = 0.2,
                            weight = 1, 
                            fill = True, 
                            fillColor = colorConfirmed).add_to(_map)
      
        if row.recovered != 0:
            # only draw the recovered circle when there is something to show;
            # otherwise a stale radius from a previous row would be reused
            radiusRecovered = radiusMinMaxer(np.sqrt(recovered))

            popupRecovered = str(country) + ' ' + str(province) + '(Confirmed='+str(row.confirmed) + ' Deaths=' + str(death) + ' Recovered=' + str(recovered) + ')'

            folium.CircleMarker(location = [lat,lon], 
                                radius = radiusRecovered, 
                                popup = popupRecovered, 
                                color = colorRecovered, 
                                fill_opacity = 0.2,
                                weight = 1, 
                                fill = True, 
                                fillColor = colorRecovered).add_to(_map) 

    path = '/kaggle/working/'
    f = path + 'map' + str(date).replace('/', '-') + '.html'
    _map.save(f)


<img src="https://s5.gifyu.com/images/ezgif.com-crop-27005491cdf6bfa9a.gif" width="1153px">

As you can see, there is no more interactivity; this is just a scary pandemic GIF animation. It still might be useful for some kinds of presentations, though. If you know a different way to animate it, I would really appreciate your advice in the comments below this post.
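
One detail to watch when assembling the frames: the dates in this dataset look like `2/10/20`, so files named `map` plus the date with dashes (for example `map2-10-20.html`) do not sort chronologically as plain strings. A minimal sketch of sorting them by the date embedded in the name; the helper names `frame_date` and `sort_frames` are hypothetical, and the file list is made up.

```python
from datetime import datetime

def frame_date(filename):
    """Parse the date out of a name like 'map2-10-20.html' (M-D-YY)."""
    stem = filename.rsplit('.', 1)[0]            # drop the extension
    return datetime.strptime(stem[len('map'):], '%m-%d-%y')

def sort_frames(filenames):
    """Return frame file names in chronological order."""
    return sorted(filenames, key=frame_date)

# made-up list of frame files in arbitrary order
frames = ['map2-10-20.html', 'map1-21-20.html', 'map2-2-20.html', 'map3-12-20.html']
print(sort_frames(frames))
```

A plain `sorted(frames)` would put `map2-10-20.html` before `map2-2-20.html`, which would make the animation jump back and forth in time.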

### Combo map: animation, all three statuses (confirmed/death/recovered), choropleth.

The function below creates a mighty GeoJSON that carries our timestamped layers to the folium map.

In [None]:
dfConfirmed = df.loc[df.confirmed != 0, 
                         ['state', 
                          'country', 
                          'confirmed', 
                          'lat', 
                          'lon',
                          'date']]

dfRecovered = df.loc[df.recovered != 0, 
                         ['state', 
                          'country', 
                          'recovered', 
                          'lat', 
                          'lon',
                         'date']]

dfDead = df.loc[df.death != 0, 
                         ['state', 
                          'country', 
                          'death', 
                          'lat', 
                          'lon',
                         'date']]
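
Note that after this filtering a territory only has a feature on the dates for which it has a row. I have not verified whether this dataset has missing territory/date pairs, but if it does, circles would vanish on those days, which could show up as flicker in a day-by-day animation. A minimal sketch of filling such gaps by carrying the last known count forward, assuming the counts are cumulative; it uses plain Python for one territory at a time, and the name `fill_missing_days` and the sample numbers are made up.

```python
from datetime import date, timedelta

def fill_missing_days(counts_by_day):
    """Carry the last known count forward over missing days.

    counts_by_day maps datetime.date -> count for ONE territory.
    Returns a new dict with an entry for every day between the first
    and the last known date.
    """
    if not counts_by_day:
        return {}
    day, last_day = min(counts_by_day), max(counts_by_day)
    filled = {}
    last_count = 0
    while day <= last_day:
        # use the reported value if there is one, else repeat the previous one
        last_count = counts_by_day.get(day, last_count)
        filled[day] = last_count
        day += timedelta(days=1)
    return filled

# made-up counts with a two-day gap
sparse = {date(2020, 3, 1): 5, date(2020, 3, 4): 9}
dense = fill_missing_days(sparse)
print(dense)
```

The same idea could be applied per territory before building the features, so that every circle exists on every day of its date range.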

In [None]:
def create_geojson_features(dfConfirmed,
                            dfRecovered, 
                            dfDead,
                            radiusMax = 40, 
                            radiusMin = 2, 
                            colorConfirmed = colorConfirmed,
                            colorRecovered = colorRecovered,
                            colorDead = colorDead,
                            weight = 1,
                            fillOpacity = 0.2
                            ):

    print('> Creating GeoJSON features...')
    features = []

    # one entry per status: (data, count column, circle color, popup label)
    layers = [(dfConfirmed, 'confirmed', colorConfirmed, 'Confirmed'),
              (dfDead, 'death', colorDead, 'Deaths'),
              (dfRecovered, 'recovered', colorRecovered, 'Recovered')]

    for data, column, color, label in layers:
        for _, row in data.iterrows():
            count = row[column]
            if count == 0:
                # nothing to draw for this territory on this date
                continue

            # square-root scaling, clamped to [radiusMin, radiusMax]
            radius = float(min(max(np.sqrt(count), radiusMin), radiusMax))
            popup = str(row.country) + ' ' + str(row.state) + '(' + label + '=' + str(count) + ')'

            features.append({
                'type': 'Feature',
                'geometry': {
                    'type': 'Point',
                    'coordinates': [row.lon, row.lat]
                },
                'properties': {
                    'time': str(row.date),
                    'style': {'color': color},
                    'icon': 'circle',
                    # the popup belongs to the feature properties;
                    # inside iconstyle it is ignored
                    'popup': popup,
                    'iconstyle': {
                        'fillColor': color,
                        'fillOpacity': fillOpacity,
                        'stroke': True,
                        'radius': radius,
                        'weight': weight
                    }
                }
            })

    return features
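
For reference, here is a stand-alone sketch of a single feature in the shape the function above produces. It does one thing differently, which is an assumption on my side: it converts the `M/D/YY` date string to ISO 8601 (for example `2020-03-12`), since as far as I understand the time slider expects ISO strings or epoch milliseconds. The helper name `point_feature` and the sample values are made up for illustration.

```python
import json
from datetime import datetime

def point_feature(lat, lon, date, radius, color, fill_opacity=0.2, weight=1):
    """Build one timestamped GeoJSON point styled as a circle."""
    # '3/12/20' -> '2020-03-12'
    iso_date = datetime.strptime(date, '%m/%d/%y').date().isoformat()
    return {
        'type': 'Feature',
        # GeoJSON coordinate order is longitude first, then latitude
        'geometry': {'type': 'Point', 'coordinates': [lon, lat]},
        'properties': {
            'time': iso_date,
            'style': {'color': color},
            'icon': 'circle',
            'iconstyle': {
                'fillColor': color,
                'fillOpacity': fill_opacity,
                'stroke': True,
                'radius': radius,
                'weight': weight,
            },
        },
    }

# sample values, not taken from the dataset
feature = point_feature(lat=30.97, lon=112.27, date='3/12/20', radius=40, color='#E80018')
print(json.dumps(feature, indent=2))
```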

This function puts the GeoJSON onto the folium map.

In [None]:
def make_map(features, caption):
    print('> Making map...')
    coordinates=[15.632909, 14.911222]
    map = folium.Map(location=coordinates, 
                               control_scale=True, 
                               zoom_start=2,
                               tiles = 'CartoDB dark_matter',
                               detect_retina = True
                              )
    
    folium.Choropleth(
        geo_data=chinaVectors,
        name='Choropleth',
        key_on='feature.properties.name',
        fill_color='yellow',
        fill_opacity=0.18,
        line_opacity=0.7
        ).add_to(map)


    TimestampedGeoJson(
        {'type': 'FeatureCollection',
        'features': features}
        , period='P1D'
        , duration='P1D'
        , add_last_point=True
        , auto_play=False
        , loop=False
        , max_speed=1
        , loop_button=True
        , date_options='YYYY/MM/DD'
        , time_slider_drag_update=True
        , transition_time = 500
    ).add_to(map)
    
    map.caption = caption
    print('> Done.')
    return map

In [None]:
features = create_geojson_features(dfConfirmed, dfRecovered, dfDead, fillOpacity=0.3, weight = 1)
make_map(features, caption = "Coronavirus propagation 21Jan–13March, 2020.")

There are three issues that I wasn't able to fix this time: 
1. You can see that the animation is kind of breathing. I think this is because the data is unevenly spread across the timestamps. Let me know if you have an idea how to fix it without too much effort.
2. The caption is not displayed.
3. There is no legend.
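
For the caption and the legend, one possible workaround is to inject a small block of HTML into the page that folium generates, since assigning `map.caption` does not appear to render anything. The sketch below only builds the HTML. The function name `legend_html`, the layout, and the colors are my own assumptions; the commented line at the end shows how such a block is commonly attached to a map.

```python
from html import escape

def legend_html(title, entries):
    """Return an HTML block with a title and one colored dot per entry.

    entries is a list of (label, css_color) pairs.
    """
    rows = ''.join(
        '<div><span style="color: {};">&#9679;</span> {}</div>'.format(
            escape(color, quote=True), escape(label))
        for label, color in entries
    )
    return (
        '<div style="position: fixed; bottom: 30px; right: 10px; z-index: 9999; '
        'background: rgba(0, 0, 0, 0.6); color: white; padding: 8px; font-size: 13px;">'
        '<b>{}</b>{}</div>'
    ).format(escape(title), rows)

# example title and colors; adjust them to the colors used on the map
snippet = legend_html(
    'Coronavirus propagation',
    [('Confirmed', '#ffbf80'), ('Recovered', '#0A5E2A'), ('Dead', '#E80018')],
)
print(snippet)

# To attach it to a folium map `m` (not executed here):
# m.get_root().html.add_child(folium.Element(snippet))
```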

### End of Part 1.

This post covers the past and current situation with the coronavirus. In part two I am going to look into the future and try to get some clues about how the virus would migrate to new territories if the current situation does not change.

I would really appreciate any comments on how to produce better map animations. I am also looking for a way to visualize flight information, in order to get a better idea of how flights impact the propagation of the virus.