# Data Science Capstone Project - European COVID-19 Analysis

## Introduction

After a recent surge in public attacks against medical laboratories due to COVID-19, the World Health Organization (WHO) have engaged the services of a wannabe data scientist.

## Business Problem

The WHO require an investigation into the conspiracy theory linking the prevalence of medical laboratories in a country to that country's COVID-19 death rate.

Specifically, they require:

1. Each European country to be assigned to a category based on its COVID-19 death rate
2. A determination of whether there is a correlation between the number of medical laboratories and the death rate
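
The two requirements above could be approached roughly as follows. This is a minimal sketch, not the method used later in the project: it assumes a table with one row per country, and the column names (`deaths_per_million`, `labs`) and all figures are made-up placeholders. It bins countries into three equal-sized groups with `pd.qcut` and computes a Pearson correlation with `Series.corr`.

```python
import pandas as pd

# Hypothetical placeholder data: NOT real COVID-19 or Foursquare figures.
df = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F"],
    "deaths_per_million": [10.0, 50.0, 120.0, 300.0, 450.0, 600.0],
    "labs": [5, 8, 12, 20, 18, 30],
})

# Requirement 1: assign each country to a death-rate category (terciles).
df["category"] = pd.qcut(
    df["deaths_per_million"], q=3, labels=["low", "medium", "high"]
)

# Requirement 2: Pearson correlation between lab count and death rate.
correlation = df["labs"].corr(df["deaths_per_million"])

print(df[["country", "category"]])
print(f"Pearson r = {correlation:.2f}")
```

A correlation coefficient on its own would not show that laboratories cause deaths; it only measures how closely the two columns move together.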




## Data

In [9]:
# import packages required

import requests
import pandas as pd
import numpy as np
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated since pandas 1.0
from geopy.geocoders import Nominatim

### Foursquare Medical Facility information

Medical lab information is available on Foursquare and gives an indication of the density of labs, which can be compared to a country's COVID-19 death rate. An example is provided below:

In [6]:
# Foursquare credentials


CLIENT_ID = 'RGSEAE3TO55N3VLHU4RYF3C5A5PWCKZUOBQ5JLQAWUTMBNLF' # your Foursquare ID
CLIENT_SECRET = 'TWFMJ0UAR5ILWQ1VGL5LFAEMJ44A4DXAXRM0YQTBG3BY2U5K' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 100
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: RGSEAE3TO55N3VLHU4RYF3C5A5PWCKZUOBQ5JLQAWUTMBNLF
CLIENT_SECRET:TWFMJ0UAR5ILWQ1VGL5LFAEMJ44A4DXAXRM0YQTBG3BY2U5K


In [18]:
# example of medical lab information

address = 'Melbourne, Australia'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
#print(latitude, longitude)

search_query = 'Medical'
radius = 1000

url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
#url

results = requests.get(url).json()
#results

# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

Unnamed: 0,categories,hasPerk,id,location.address,location.cc,location.city,location.country,location.crossStreet,location.distance,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.neighborhood,location.postalCode,location.state,name,referralId,venuePage.id
0,"[{'id': '4bf58dd8d48988d177941735', 'name': 'D...",False,4cbbbe54bac937047b30f47c,463 Bourke St.,AU,Melbourne,Australia,,171,"[463 Bourke St., Melbourne VIC 3000, Australia]","[{'label': 'display', 'lat': -37.8151724400127...",-37.815172,144.961627,,3000,VIC,Town Medical Centre,v-1589432462,
1,"[{'id': '4bf58dd8d48988d177941735', 'name': 'D...",False,4be0a28d4f15c92841e5cb0b,"Level 3, 23 QV Terrace, 292 Swanston Street",AU,Melbourne,Australia,Cnr Swanston & Lonsdale St,457,"[Level 3, 23 QV Terrace, 292 Swanston Street (...","[{'label': 'display', 'lat': -37.8106192550542...",-37.810619,144.965673,,3000,VIC,Medical One,v-1589432462,
2,[],False,4b75f872f964a5201d332ee3,45 Collins Street,AU,Melbourne,Australia,,838,"[45 Collins Street, Melbourne VIC 3000, Austra...","[{'label': 'display', 'lat': -37.814379, 'lng'...",-37.814379,144.972691,,3000,VIC,Cosmetic & Laser Medical Centre,v-1589432462,557211494.0
3,"[{'id': '4bf58dd8d48988d177941735', 'name': 'D...",False,51b67ceb498ee6c810a8ea54,"Level 4,250, Collins Street",AU,Melbourne,Australia,,265,"[Level 4,250, Collins Street, Melbourne VIC 30...","[{'label': 'display', 'lat': -37.81573, 'lng':...",-37.81573,144.96549,,3000,VIC,Medical Specialists on Collins,v-1589432462,
4,"[{'id': '4bf58dd8d48988d104941735', 'name': 'M...",False,5035b48fe4b0e3f1e5b123d6,7/267 Collins Street,AU,Melbourne,Australia,,273,"[7/267 Collins Street, Melbourne VIC 3000, Aus...","[{'label': 'display', 'lat': -37.81614, 'lng':...",-37.81614,144.9651,,3000,VIC,Collins Street Medical Centre,v-1589432462,
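
To compare lab density across countries, the search above could be wrapped in small helpers and repeated per location. The sketch below assumes the same v2 `venues/search` endpoint and the same response shape (`results['response']['venues']`) as the cell above; `build_search_url` and `count_venues` are hypothetical helper names, and the demonstration uses a fake response so that it runs offline.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, lat, lng,
                     query="Medical", radius=1000, limit=100,
                     version="20180604"):
    """Build a venue search URL with properly encoded parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",
        "v": version,
        "query": query,
        "radius": radius,
        "limit": limit,
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def count_venues(results):
    """Count venues in a parsed search response; 0 if the keys are missing."""
    return len(results.get("response", {}).get("venues", []))


# Offline demonstration with a fake response of the same shape (not real data).
fake_results = {"response": {"venues": [{"name": "Lab 1"}, {"name": "Lab 2"}]}}
print(count_venues(fake_results))
```

In the notebook, the URL returned by `build_search_url` would be passed to `requests.get(url).json()` as in the cell above, and the parsed result handed to `count_venues`.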


### Wikipedia Country COVID-19 information

The total number of COVID-19 deaths is available from the link below. These figures can be used to calculate each country's death rate (by population). The table is updated daily. Example below:

In [5]:
# pull capital city info from nationsonline.org

url_capitals = 'https://www.nationsonline.org/oneworld/capitals_europe.htm'
html_capitals = requests.get(url_capitals).content
df_list = pd.read_html(html_capitals)
df_capitals = df_list[2]

df_capitals.head()

Unnamed: 0,0,1,2,3,4
0,Capital Cities and States of Europe,Capital Cities and States of Europe,Capital Cities and States of Europe,Capital Cities and States of Europe,Capital Cities and States of Europe
1,Capital City,Satellite View and Map,Citizens,Country,"Tower Bridge, London (adsbygoogle = window.ads..."
2,Amsterdam The Hague (Den Haag; seat of govt),Amsterdam Map The Hague,"863,000 540,000",Netherlands,"Tower Bridge, London (adsbygoogle = window.ads..."
3,Andorra la Vella,Andorra la Vella Map,23000,Andorra,"Tower Bridge, London (adsbygoogle = window.ads..."
4,Athens (Athína),Athens Map,664000,Greece,"Tower Bridge, London (adsbygoogle = window.ads..."
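
The scraped table above carries two header rows, a map-link column, and a column of advert text, so it needs tidying before use. The following is a minimal sketch of one way to do that, assuming the column layout shown in the output; `raw` is a small stand-in for `df_capitals`, and the output column names (`capital`, `citizens`, `country`) are illustrative choices.

```python
import pandas as pd

# Small stand-in reproducing rows from the output above ("ad text" replaces
# the truncated advert residue in the last column).
raw = pd.DataFrame([
    ["Capital Cities and States of Europe"] * 5,
    ["Capital City", "Satellite View and Map", "Citizens", "Country", "ad text"],
    ["Andorra la Vella", "Andorra la Vella Map", "23000", "Andorra", "ad text"],
    ["Athens (Athína)", "Athens Map", "664000", "Greece", "ad text"],
])

# Skip the two header rows; keep capital (col 0), citizens (col 2), country (col 3).
clean = raw.iloc[2:, [0, 2, 3]].copy()
clean.columns = ["capital", "citizens", "country"]
clean = clean.reset_index(drop=True)

# Drop bracketed alternative names such as "(Athína)" so geocoding gets a plain name.
clean["capital"] = (
    clean["capital"].str.replace(r"\s*\(.*?\)", "", regex=True).str.strip()
)

print(clean)
```

Rows that list two cities in one cell, such as the Netherlands entry, would still need handling by hand.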


### Wikipedia Country Populations

Country population figures are available from the link below. An example is provided below:

In [4]:
# pull country population stats from wikipedia

url_worldPop = 'https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population'
html_worldPop = requests.get(url_worldPop).content
df_list = pd.read_html(html_worldPop)
df_worldPop = df_list[0]

df_worldPop.head()

Unnamed: 0,Rank,Country (or dependent territory),Population,% of worldpopulation,Date,Source
0,1,China[b],1402640000,,14 May 2020,National population clock[3]
1,2,India[c],1362227886,,14 May 2020,National population clock[4]
2,3,United States[d],329685899,,14 May 2020,National population clock[5]
3,4,Indonesia,266911900,,1 Jul 2019,National annual projection[6]
4,5,Pakistan[e],220892331,,1 Jul 2020,UN Projection[2]
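
Country names in this table carry footnote markers such as `[b]`, which would prevent a clean join against other tables. The sketch below shows one way to strip them and derive a death rate by population. It assumes the column names shown in the output above; `pop` is a stand-in for `df_worldPop`, and the `deaths` values are placeholders, not real figures.

```python
import pandas as pd

# Stand-in for the first rows of df_worldPop shown above.
pop = pd.DataFrame({
    "Country (or dependent territory)": ["China[b]", "India[c]", "Indonesia"],
    "Population": [1402640000, 1362227886, 266911900],
})

pop = pop.rename(columns={
    "Country (or dependent territory)": "country",
    "Population": "population",
})

# Strip footnote markers such as "[b]" so names match other tables.
pop["country"] = pop["country"].str.replace(r"\[.*?\]", "", regex=True).str.strip()

# Placeholder death counts: made up for illustration, NOT real figures.
deaths = pd.DataFrame({
    "country": ["China", "India", "Indonesia"],
    "deaths": [1000, 2000, 500],
})

# Join on the cleaned name and compute deaths per million inhabitants.
merged = pop.merge(deaths, on="country", how="inner")
merged["deaths_per_million"] = merged["deaths"] / merged["population"] * 1_000_000

print(merged)
```

An inner join silently drops countries whose names differ between sources, so comparing row counts before and after the merge is a useful check.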
