![covid](https://www.bandondental.ie/uploads/LTClELt1/banner.png)

# Overview of Coronavirus in the European Union
https://ourworldindata.org/coronavirus#coronavirus-country-profiles
***

# Introduction
***

There are 27 member states in the [European Union](https://en.wikipedia.org/wiki/Demographics_of_the_European_Union) (EU) as of Dec 2021. In this notebook I will:


1. Investigate 4 variables of the Coronavirus Pandemic within the EU.


2. Simulate data to represent how these variables have affected countries in the EU. 

# references for end

https://ourworldindata.org/explorers/coronavirus-data-explorer?facet=none&pickerSort=asc&pickerMetric=location&Metric=People+fully+vaccinated&Interval=7-day+rolling+average&Relative+to+Population=true&Align+outbreaks=false&country=AUT~BEL~BGR~HRV~CYP~CZE~DNK~EST~FIN~FRA~DEU~GRC~HUN~IRL~ITA~LVA~LTU~LUX~MLT~NLD~POL~PRT~ROU~SVK~SVN~ESP~SWE


https://ourworldindata.org/covid-vaccinations


https://www.worldometers.info/population/countries-in-the-eu-by-population/

# Variables
***

### Investigate the types of variables involved, their likely distributions, and their relationships with each other.

1. supply

2. demand/country/population

3. type of vaccines available

4. distribution of vaccine



**Columns**

- `country`
- `population`
- `reported_cases`
- `reported_deaths`
- `fully_vaccinated`
- `partially_vaccinated`
- `unvaccinated`
- `cases_percentage`
- `death_percentage`

## Reported cases

## Reported covid related deaths

## Percentage of population fully vaccinated

## Percentage of population partially vaccinated

## Percentage of population unvaccinated
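
The figures collected for this notebook suggest that the unvaccinated share is whatever remains once the fully and partially vaccinated shares are subtracted from 100. A minimal sketch of that arithmetic, using the Austria, Belgium and Bulgaria percentages gathered on 13 Dec 2021 (the `vax` name is illustrative only, and the remainder rule is an assumption about how the source defines the groups):

```python
import pandas as pd

# Percentages collected on 13 Dec 2021 for three member states.
vax = pd.DataFrame({
    'country': ['Austria', 'Belgium', 'Bulgaria'],
    'fully_vaccinated': [68.38, 75.06, 26.67],
    'partially_vaccinated': [3.11, 1.09, 0.80],
})

# Assumption: everyone who is not fully or partially vaccinated is unvaccinated.
vax['unvaccinated'] = (
    100 - vax['fully_vaccinated'] - vax['partially_vaccinated']
).round(2)

print(vax)
```

The three results agree with the `unvaccinated` values recorded alongside the collected data.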

# Relationship between variables
***
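
One way to quantify the pairings listed under this heading is a correlation coefficient. The sketch below uses the first five rows of the figures collected on 13 Dec 2021. The `sample` name and the choice of Pearson correlation are assumptions for illustration, not a method this notebook has settled on.

```python
import pandas as pd

# First five member states from the collected figures.
sample = pd.DataFrame({
    'country': ['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus'],
    'population': [8772865, 11351727, 7101859, 4154213, 854802],
    'reported_cases': [1230000, 1960000, 714688, 654655, 141566],
    'reported_deaths': [13218, 27631, 29536, 11666, 611],
})

# Pairwise Pearson correlation between the numeric columns.
corr = sample[['population', 'reported_cases', 'reported_deaths']].corr()

print(corr.round(2))
```

With only five rows the coefficients are indicative at best; the same call could be run on all 27 countries.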

### cases vs deaths

### cases vs population

### vaccinated vs partially vs unvaccinated

### vaccinated vs cases

### unvaccinated vs cases

### unvaccinated vs deaths


# Simulate Data using Pandas
***

#### Detail your research and implement the simulation in a Jupyter notebook – the data set itself can simply be displayed in an output cell within the notebook.

#### Synthesise/simulate a data set as closely matching their properties as possible
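
One property worth preserving in a simulated data set is that reported cases cannot exceed a country's population, and reported deaths cannot exceed reported cases. A minimal sketch of one way to respect that is to draw rates and scale them, rather than drawing each column independently. The rate bounds and the `sim` name are illustrative assumptions, not values taken from the sources, and `np.random.default_rng` is used here as an alternative to `np.random.seed`.

```python
import numpy as np
import pandas as pd

# Seeded generator so the draws are repeatable.
rng = np.random.default_rng(1000)
n = 27  # number of EU member states

# Population drawn uniformly between roughly the smallest and largest member states.
population = rng.integers(470000, 83000000, size=n)

# Hypothetical bounds: 3%-25% of the population reported as cases,
# and 0.3%-4% of reported cases recorded as deaths.
case_rate = rng.uniform(0.03, 0.25, size=n)
fatality_rate = rng.uniform(0.003, 0.04, size=n)

sim = pd.DataFrame({'population': population})
sim['reported_cases'] = (sim['population'] * case_rate).astype(int)
sim['reported_deaths'] = (sim['reported_cases'] * fatality_rate).astype(int)

# Index starting at 1, as elsewhere in this notebook.
sim.index = np.arange(1, n + 1)

print(sim.head())
```

Because each column is derived from the previous one, larger countries tend to receive more cases and deaths, which independent draws would not guarantee.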

## Importing Modules
***

In [1]:
# For dataframes.
import pandas as pd

# For numerical arrays. 
import numpy as np

# For plotting.
import matplotlib.pyplot as plt

## Design & Style
***

In [2]:
# Using magic command to set plots to display inline.
%matplotlib inline

# Setting plot style. 
plt.style.use('fivethirtyeight')

# Set figure size. 
plt.rcParams['figure.figsize'] = [18, 14]

### references for end
https://stackoverflow.com/questions/20167930/start-index-at-1-for-pandas-dataframe/20168416

Population as of 13 Dec 2021, from https://en.wikipedia.org/wiki/Demographics_of_the_European_Union

Vaccination figures as of 13 Dec 2021, from https://ourworldindata.org/explorers/coronavirus-data-explorer?tab=table&facet=none&pickerSort=asc&pickerMetric=location&Metric=People+fully+vaccinated&Interval=7-day+rolling+average&Relative+to+Population=true&Align+outbreaks=false&country=~AUT

Deaths as of 14 Dec 2021, from https://ourworldindata.org/coronavirus-data

Types of vaccine in Ireland: https://covid-19.geohive.ie/pages/vaccinations

## The Dataframe
***

In [3]:
country = {'country':['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czech Republic', 'Denmark',
           'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary', 'Ireland', 'Italy', 'Latvia',
           'Lithuania', 'Luxembourg', 'Malta', 'Netherlands', 'Poland', 'Portugal', 'Romania', 'Slovakia',
           'Slovenia', 'Spain', 'Sweden']}

# Load data to df.
df = pd.DataFrame(country)
# Name column.
df.columns = ['country']
# Set index to start at 1. 
df.index = np.arange(1, len(df)+1)
df

Unnamed: 0,country
1,Austria
2,Belgium
3,Bulgaria
4,Croatia
5,Cyprus
6,Czech Republic
7,Denmark
8,Estonia
9,Finland
10,France


In [4]:
# Setting initial seed so the same random numbers are generated on each run.
np.random.seed(1000)

# Creating random integers and floats. 
data = {'population': np.random.randint(470000, 83000000, 27), 
        'reported_cases': np.random.randint(40624, 400000, 27),
        'reported_deaths': np.random.randint(471, 135000, 27),
        'fully_vaccinated': np.around(np.random.uniform(26.67, 88.81, 27), 2),
        'partially_vaccinated': np.around(np.random.uniform(0.21, 6.28, 27), 2)}
    
# Inputting data into df2. 
df2 = pd.DataFrame(data)
# Setting column names. 
df2.columns = ['population', 'reported_cases', 'reported_deaths', 'fully_vaccinated',
              'partially_vaccinated']
# Setting same index as df.
df2.index = np.arange(1, len(df2)+1)
df2

Unnamed: 0,population,reported_cases,reported_deaths,fully_vaccinated,partially_vaccinated
1,77872711,301136,108393,33.37,4.18
2,71908016,348642,49625,46.33,2.39
3,55372011,61381,45202,52.12,3.38
4,44528974,374730,31425,69.37,0.48
5,58200396,388884,4218,70.31,0.78
6,73707249,156343,127640,54.4,5.8
7,27265648,61248,72560,52.79,0.65
8,58866205,289700,101288,76.55,5.85
9,41098776,57863,124063,60.14,4.56
10,2573145,83980,52057,84.06,3.99


In [5]:
# Joining df2 to df.
df = df.join(df2)
df

Unnamed: 0,country,population,reported_cases,reported_deaths,fully_vaccinated,partially_vaccinated
1,Austria,77872711,301136,108393,33.37,4.18
2,Belgium,71908016,348642,49625,46.33,2.39
3,Bulgaria,55372011,61381,45202,52.12,3.38
4,Croatia,44528974,374730,31425,69.37,0.48
5,Cyprus,58200396,388884,4218,70.31,0.78
6,Czech Republic,73707249,156343,127640,54.4,5.8
7,Denmark,27265648,61248,72560,52.79,0.65
8,Estonia,58866205,289700,101288,76.55,5.85
9,Finland,41098776,57863,124063,60.14,4.56
10,France,2573145,83980,52057,84.06,3.99


In [6]:
# Unvaccinated share: what remains of 100% after the fully and partially vaccinated shares.
unvaccinated = np.around(100 - df2['fully_vaccinated'] - df2['partially_vaccinated'], 2)

# Load df3. 
df3 = pd.DataFrame(unvaccinated)
# Setting column name. 
df3.columns = ['unvaccinated']
# Set same index again.
df3.index = np.arange(1, len(df3)+1)


In [7]:
df = df.join(df3)

In [8]:
# Reported cases as a percentage of population.
cases_percentage = np.divide(df2['reported_cases'], df2['population'])
cases_percentage = np.around((np.multiply(cases_percentage, 100)), 2)

# Load df4. 
df4 = pd.DataFrame(cases_percentage)
# Setting column name. 
df4.columns = ['cases_percentage']
# Set same index again.
df4.index = np.arange(1, len(df4)+1)


In [9]:
df = df.join(df4)

In [10]:
# Reported deaths as a percentage of population.
death_percentage = np.divide(df2['reported_deaths'], df2['population'])
death_percentage = np.around((np.multiply(death_percentage, 100)), 2)

# Load df5. 
df5 = pd.DataFrame(death_percentage)
# Setting column name. 
df5.columns = ['death_percentage']
# Set same index again.
df5.index = np.arange(1, len(df5)+1)

In [11]:
df = df.join(df5)

In [12]:
df

Unnamed: 0,country,population,reported_cases,reported_deaths,fully_vaccinated,partially_vaccinated,unvaccinated,cases_percentage,death_percentage
1,Austria,77872711,301136,108393,33.37,4.18,62.45,0.39,0.14
2,Belgium,71908016,348642,49625,46.33,2.39,51.28,0.48,0.07
3,Bulgaria,55372011,61381,45202,52.12,3.38,44.5,0.11,0.08
4,Croatia,44528974,374730,31425,69.37,0.48,30.15,0.84,0.07
5,Cyprus,58200396,388884,4218,70.31,0.78,28.91,0.67,0.01
6,Czech Republic,73707249,156343,127640,54.4,5.8,39.8,0.21,0.17
7,Denmark,27265648,61248,72560,52.79,0.65,46.56,0.22,0.27
8,Estonia,58866205,289700,101288,76.55,5.85,17.6,0.49,0.17
9,Finland,41098776,57863,124063,60.14,4.56,35.3,0.14,0.3
10,France,2573145,83980,52057,84.06,3.99,11.95,3.26,2.02
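
The percentage columns can also be derived without intermediate dataframes and joins. A sketch using `DataFrame.assign` on the first three simulated rows shown above (`small` is an illustrative name, not part of the notebook):

```python
import pandas as pd

# First three simulated rows, copied from the table above.
small = pd.DataFrame({
    'country': ['Austria', 'Belgium', 'Bulgaria'],
    'population': [77872711, 71908016, 55372011],
    'reported_cases': [301136, 348642, 61381],
    'reported_deaths': [108393, 49625, 45202],
})

# Each lambda receives the dataframe and returns the new column.
small = small.assign(
    cases_percentage=lambda d: (d['reported_cases'] / d['population'] * 100).round(2),
    death_percentage=lambda d: (d['reported_deaths'] / d['population'] * 100).round(2),
)

print(small)
```

This keeps every column in one dataframe, so there is no index to realign between steps.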


## MIGHT USE AT END
***

In [13]:

'''
df.plot.barh(x='country', y=['fully_vaccinated', 'partially_vaccinated', 'unvaccinated'], width=0.65)
plt.gca().invert_yaxis()
plt.xlabel('Percentage of Population')
plt.ylabel('Countries')
plt.title('Vaccination Percentages of European Union')
plt.show()
'''

In [14]:
# Actual data collected on 13 Dec 2021.
'''
# Creating data for df columns. 
dta = {'country':['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czech Republic', 'Denmark',
           'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary', 'Ireland', 'Italy', 'Latvia',
           'Lithuania', 'Luxembourg', 'Malta', 'Netherlands', 'Poland', 'Portugal', 'Romania', 'Slovakia',
           'Slovenia', 'Spain', 'Sweden'],
        'population':[8772865, 11351727, 7101859, 4154213, 854802, 10578820, 5748769, 1315635, 5503297, 
                      66989083, 82521653, 10768193, 9797561, 4904226, 60589445, 1950116, 2847904, 590667, 
                      467988, 17081507, 37972964, 10309573, 19644350, 5435343, 2065895, 46528966, 9995153],
        'reported_cases':[1230000, 1960000, 714688, 654655, 141566, 2340000, 568477, 228596, 205357, 8360000, 
                          6580000, 1010000, 120000, 628306, 5240000, 262215, 491953, 94300, 40624, 2940000, 
                          3840000, 120000, 1790000, 1290000, 440633, 5340000, 1230000],
        'reported_deaths':[13218, 27631, 29536, 11666, 611, 34551, 3023, 1865, 1442, 121000, 106000, 19345, 36884,
                       5788, 135000, 4376, 6989, 895, 471, 20140, 88508, 18673, 57741, 15415, 5425, 88484, 15191],
        'fully_vaccinated':[68.38, 75.06, 26.67, 49.92, 65.88, 60.49, 77.18, 60.50, 73.56, 70.80, 68.99, 65.21, 
                      61.17, 76.43, 73.37, 65.78, 67.06, 67.69, 84.08, 74.37, 54.73, 88.81, 39.88, 43.32,
                      56.30, 80.66, 71.46],
        'partially_vaccinated':[3.11, 1.09, 0.80, 4.60, 4.53, 2.28, 3.00, 2.38, 4.29, 6.28, 3.01, 4.74, 3.03, 1.23,
                                5.53, 3.36, 3.19, 2.51, 0.58, 2.93, 1.29, 0.21, 1.43, 5.69, 3.23, 1.70, 3.96],
       'unvaccinated':[28.51, 23.85, 72.53, 45.48, 29.59, 37.23, 19.82, 37.12, 22.15, 22.92, 28.00, 30.05, 
                        35.80, 22.34, 21.10, 30.86, 29.75, 29.80, 15.34, 22.70, 43.98, 10.98, 58.69, 50.99, 
                        40.47, 17.64, 24.58]}

# Inputting data into df. 
dff = pd.DataFrame(dta)
# Setting column names. 
dff.columns = ['country', 'population', 'reported_cases', 'reported_deaths', 'fully_vaccinated', 
              'partially_vaccinated', 'unvaccinated']
# Setting index to np array starting at 1. 
dff.index = np.arange(1, len(dff)+1)
#df.set_index('country', inplace=True)
dff
'''

# Conclusion

# References
***
All references and code used in this notebook were sourced in Dec 2021 from the following webpages:


- 


- 



***
# End