### Here we address the first of the few issues we recognized in our 'Folium' document.

I recognized that the blue line at the top and the missing country bubbles are actually the same issue. When using .groupby on our dataset I called .sum() to total the number of cases for each country that was split by province/state.

This also summed 'Lat' and 'Long', resulting in ridiculously large coordinates for certain countries and causing them to appear on the edge of our map instead of at their appropriate latitudes and longitudes.
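A minimal sketch of the problem on made-up numbers (the toy frame below is an assumption for illustration, not our real data):

```python
import pandas as pd

# Toy data: one country split into two provinces.
toy = pd.DataFrame({
    "Country/Region": ["Australia", "Australia"],
    "Lat": [-35.0, -34.0],
    "Long": [149.0, 151.0],
    "Confirmed": [100, 3000],
})

# Summing every column adds up the coordinates along with the case counts.
summed = toy.groupby("Country/Region").sum()

print(summed.loc["Australia", "Confirmed"])  # 3100, the total we wanted
print(summed.loc["Australia", "Lat"])        # -69.0, not where Australia is
print(summed.loc["Australia", "Long"])       # 300.0, outside the valid -180..180 range
```

The case total is correct, but the summed coordinates no longer describe a real place, which is why the affected bubbles drift to the edge of the map.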

In [1]:
import folium
from folium import plugins
import ipywidgets
import numpy as np
import pandas as pd
import vega_datasets as vds
from vega_datasets import data

In [2]:
df = pd.read_csv('covid_19_clean_complete.csv', parse_dates = ['Date'])

In [5]:
df.head()

Unnamed: 0,Province/State,Country/Region,Lat,Long,Date,Confirmed,Deaths,Recovered
0,,Afghanistan,33.0,65.0,2020-01-22,0,0,0
1,,Albania,41.1533,20.1683,2020-01-22,0,0,0
2,,Algeria,28.0339,1.6596,2020-01-22,0,0,0
3,,Andorra,42.5063,1.5218,2020-01-22,0,0,0
4,,Angola,-11.2027,17.8739,2020-01-22,0,0,0


### Here I recognize that we have two groups of countries, those split by province/state and those which are not.

This poses the issue highlighted above, so I decided to split our circle markers into these 2 respective groups. The countries not split by province/state have a single row each, so .groupby().sum() would leave their coordinates untouched and they can be plotted as they are.

The second group, on the other hand, must be modified to accurately reflect their latitudes and longitudes.
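The split can be sketched on a toy frame (the Victoria row is made up for illustration). Taking the second group as the complement of the same mask guarantees that every row lands in exactly one group:

```python
import pandas as pd

# Toy data: one country without provinces, one country split into two.
toy = pd.DataFrame({
    "Province/State": [None, "Queensland", "Victoria"],
    "Country/Region": ["Albania", "Australia", "Australia"],
    "Confirmed": [1004, 1057, 1605],
})

no_province = toy["Province/State"].isnull()

whole_countries = toy[no_province]    # one row per country, coordinates usable as they are
split_countries = toy[~no_province]   # several rows per country, need aggregating

print(len(whole_countries), len(split_countries))  # 1 2
```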

In [6]:
latest = df['Date'] == max(df['Date'])

In [8]:
df = df[latest]

In [9]:
df.head()

Unnamed: 0,Province/State,Country/Region,Lat,Long,Date,Confirmed,Deaths,Recovered
32860,,Afghanistan,33.0,65.0,2020-05-25,11173,219,1097
32861,,Albania,41.1533,20.1683,2020-05-25,1004,32,795
32862,,Algeria,28.0339,1.6596,2020-05-25,8503,609,4747
32863,,Andorra,42.5063,1.5218,2020-05-25,763,51,663
32864,,Angola,-11.2027,17.8739,2020-05-25,70,4,18


In [22]:
unmodified = df['Province/State'].isnull()

In [23]:
# .copy() so the in-place set_index below works on its own frame, not on a slice of df
df1 = df[unmodified].copy()

In [96]:
df1.set_index(['Country/Region'], inplace = True)

In [97]:
df1.head()

Country/Region,Province/State,Lat,Long,Date,Confirmed,Deaths,Recovered
Afghanistan,,33.0,65.0,2020-05-25,11173,219,1097
Albania,,41.1533,20.1683,2020-05-25,1004,32,795
Algeria,,28.0339,1.6596,2020-05-25,8503,609,4747
Andorra,,42.5063,1.5218,2020-05-25,763,51,663
Angola,,-11.2027,17.8739,2020-05-25,70,4,18


Our unmodified dataset.

In [17]:
df2 = df.dropna()

In [25]:
df2.head()

Unnamed: 0,Province/State,Country/Region,Lat,Long,Date,Confirmed,Deaths,Recovered
32868,Australian Capital Territory,Australia,-35.4735,149.0124,2020-05-25,107,3,104
32869,New South Wales,Australia,-33.8688,151.2093,2020-05-25,3092,48,2661
32870,Northern Territory,Australia,-12.4634,130.8456,2020-05-25,29,0,29
32871,Queensland,Australia,-28.0167,153.4,2020-05-25,1057,6,1039
32872,South Australia,Australia,-34.9285,138.6007,2020-05-25,439,4,435


Our dataset which requires manual changes.

In [32]:
df2['Country/Region'].unique()

array(['Australia', 'Canada', 'China', 'Denmark', 'France', 'Netherlands',
       'United Kingdom'], dtype=object)

In [48]:
# numeric_only=True keeps the text and date columns out of the sum
df3 = df2.groupby(['Country/Region']).sum(numeric_only=True)

In [29]:
geo = pd.read_csv('worldgeodata.csv')

In [31]:
geo.head()

Unnamed: 0,name,country,latitude,longitude
0,Andorra,AD,42.546245,1.601554
1,United Arab Emirates,AE,23.424076,53.847818
2,Afghanistan,AF,33.93911,67.709953
3,Antigua and Barbuda,AG,17.060816,-61.796428
4,Anguilla,AI,18.220554,-63.068615


### Here I obtained and cleaned an online dataset containing the Lat and Long of each country.

In [36]:
geo.drop('country',axis =1, inplace = True)

In [40]:
geo.rename(columns= {'name' : 'Country/Region', 'latitude' : 'Lat', 'longitude':'Long'},inplace = True)

In [45]:
geo.set_index(['Country/Region'], inplace = True)

In [46]:
geo.head()

Country/Region,Lat,Long
Andorra,42.546245,1.601554
United Arab Emirates,23.424076,53.847818
Afghanistan,33.93911,67.709953
Antigua and Barbuda,17.060816,-61.796428
Anguilla,18.220554,-63.068615


In [75]:
df3

Country/Region,Lat,Long,Confirmed,Deaths,Recovered
Australia,-255.9695,1129.8623,7126,102,6552
Canada,671.7607,-1237.6289,87119,6655,0
China,1083.3367,3684.4197,84102,4638,76331
Denmark,133.5995,-49.5161,199,0,198
France,45.1348,57.5055,2901,53,1775
Netherlands,42.7307,-202.0806,196,19,168
United Kingdom,214.6518,-479.4887,1363,82,1161


### Figuring out a way to update Lat and Longs of this dataset without hard coding.

In [84]:
geo.loc['China', 'Lat']

35.86166

In [54]:
countries = df3.index.tolist()

In [55]:
countries

['Australia',
 'Canada',
 'China',
 'Denmark',
 'France',
 'Netherlands',
 'United Kingdom']

In [88]:
# look up each country's coordinates in geo, in the same order as df3's index
Lat = geo.loc[countries, 'Lat'].tolist()
Long = geo.loc[countries, 'Long'].tolist()

print(Lat)
print(Long)

[-25.274398, 56.130366, 35.86166, 56.26392, 46.227638, 52.132633, 55.378051]
[133.775136, -106.34677099999999, 104.195397, 9.501785, 2.213749, 5.291266, -3.435973]


In [89]:
df3['Lat'] = Lat

In [91]:
df3['Long'] = Long

In [92]:
df3

Country/Region,Lat,Long,Confirmed,Deaths,Recovered
Australia,-25.274398,133.775136,7126,102,6552
Canada,56.130366,-106.346771,87119,6655,0
China,35.86166,104.195397,84102,4638,76331
Denmark,56.26392,9.501785,199,0,198
France,46.227638,2.213749,2901,53,1775
Netherlands,52.132633,5.291266,196,19,168
United Kingdom,55.378051,-3.435973,1363,82,1161


Tada!
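The same update can also be written as a single index-aligned assignment. This is only a sketch on toy stand-ins (`df3_toy` and `geo_toy` are hypothetical names, with values copied from the tables above), not a replacement for the cells that were actually run:

```python
import pandas as pd

# Toy stand-in for df3: coordinates wrongly summed across provinces.
df3_toy = pd.DataFrame(
    {"Lat": [-255.9695, 1083.3367],
     "Long": [1129.8623, 3684.4197],
     "Confirmed": [7126, 84102]},
    index=pd.Index(["Australia", "China"], name="Country/Region"),
)

# Toy stand-in for geo: one row per country, in a different order, with an extra country.
geo_toy = pd.DataFrame(
    {"Lat": [35.86166, -25.274398, 42.546245],
     "Long": [104.195397, 133.775136, 1.601554]},
    index=pd.Index(["China", "Australia", "Andorra"], name="Country/Region"),
)

# Select the matching rows by country name and overwrite both columns at once.
df3_toy[["Lat", "Long"]] = geo_toy.loc[df3_toy.index, ["Lat", "Long"]]

print(df3_toy)
```

Because the rows are matched by country name, the order of the rows in the lookup table does not matter, and the case counts are left alone.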

In [132]:

# map
world_circle = folium.Map(location=[40, 40], zoom_start=2)

# plugin for mini map
minimap = plugins.MiniMap(toggle_display=True)

# add minimap to map
world_circle.add_child(minimap)

# add scroll zoom toggler to map
plugins.ScrollZoomToggler().add_to(world_circle)

# add full screen button to map
plugins.Fullscreen(position='topright').add_to(world_circle)

# draw one circle per country, with a radius (in metres) equal to its confirmed case count
df1.apply(lambda row: folium.Circle(popup=row['Confirmed'], 
                                         tooltip = row.name,
                                         radius=row['Confirmed'], 
                                         location=[row['Lat'], row['Long']],
                                         fill = True,
                                         fill_color = '#1386cc'
                                        ).add_to(world_circle), axis=1)

df3.apply(lambda row: folium.Circle(popup=row['Confirmed'], 
                                         tooltip = row.name,
                                         radius=row['Confirmed'], 
                                         location=[row['Lat'], row['Long']],
                                         fill = True,
                                         fill_color = '#1386cc'
                                        ).add_to(world_circle), axis=1)


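# layer-control experiment: this group is added to the map empty,
# because the circles above were added straight onto world_circle
fg = folium.FeatureGroup('Confirmed Cases')
world_circle.add_child(fg)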


folium.LayerControl().add_to(world_circle)

world_circle

## Notes:

So here we are, our problem seems to be solved. China and Australia have gotten their bubbles back and there isn't a long blue line at the edge of our map!

There seems to be a small issue with the UK: given the way its entries were recorded, it has 2 bubbles, one small and one large. The large bubble reflects the case count of the mainland UK itself, but the small bubble is the case count of all its small overseas territories. Given that these numbers are so small, and that we are here for practice rather than to create a fully presentable visualization, I have decided to leave it in.

Regarding the issue of countries with small case counts having markers that are obnoxious to click, I have some ideas, but they will require some experimentation and tweaking in a future document.
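One possible direction, offered purely as an assumption and not something tried in this notebook: folium.Circle takes its radius in metres, while folium.CircleMarker takes it in pixels, so a pixel radius with a floor would keep small countries clickable at any zoom level. The helper name `marker_radius` and its `scale` and `minimum` defaults are hypothetical and would need tuning:

```python
import math

def marker_radius(cases, scale=0.15, minimum=4.0):
    """Pixel radius that grows with the square root of the case count,
    but never drops below `minimum` so tiny counts stay clickable."""
    return max(minimum, scale * math.sqrt(cases))

# Case counts taken from the tables above.
print(marker_radius(70))     # Angola: clamped to the minimum, 4.0
print(marker_radius(84102))  # China: roughly 43.5
```

The square root keeps the largest bubbles from swamping the map, since bubble area rather than radius then scales with the case count.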

I have also just realized that Canada's case count has a single decimal place, which is rather strange considering we are dealing with 'Whole Humans'. It may be due to the way its data was recorded, or to the counts being converted to floats somewhere along the way.
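One explanation worth checking, stated here as an assumption rather than something verified against the notebook: when a DataFrame holds only float and integer columns, .apply(..., axis=1) hands each row over as a float Series, so an integer count picks up a ".0" on its way into the popup. A toy reproduction:

```python
import pandas as pd

# Toy frame with only numeric columns, like df3 (values from the table above).
toy = pd.DataFrame(
    {"Lat": [56.130366], "Long": [-106.346771], "Confirmed": [87119]},
    index=["Canada"],
)

# Each row mixes floats and ints, so pandas upcasts the whole row to float.
as_float = toy.apply(lambda row: str(row["Confirmed"]), axis=1)
print(as_float["Canada"])  # 87119.0

# Casting back to int before building the label removes the decimal place.
as_int = toy.apply(lambda row: str(int(row["Confirmed"])), axis=1)
print(as_int["Canada"])    # 87119
```

This would also fit with the decimal showing up for a country from the province-split group, whose frame contains only numeric columns.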

The eagle-eyed may also have realised that I have dabbled in some layer control features. The original idea was to layer control 'Confirmed', 'Deaths' and 'Recovered' for a seamless holistic map. Unfortunately, the way I have implemented my circle markers prevents me from using such a feature (I'm not 100% sure on this but I have my suspicions).
