# Import the libraries

**Libraries Required**

* _Numpy_ `pip install numpy`
* _Pandas_ `pip install pandas` 
* _Seaborn_ `pip install seaborn`
* _Matplotlib_ `pip install matplotlib`

In [None]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Importing COVID-19 Dataset

**Download the dataset from the link given below:**

* COVID-19 Confirmed Cases - https://bit.ly/39hXM8s

In [None]:
covid_dataset = pd.read_csv('../input/confirmed-covid-cases/covid_confirmed.csv')
covid_dataset.head()

**Check the shape of the dataframe**

In [None]:
covid_dataset.shape

## Delete the useless columns

In [None]:
covid_dataset.drop(['Lat','Long'], axis=1, inplace=True)
covid_dataset.head()

## Aggregate the rows by the country

In [None]:
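# Sum only the numeric date columns; text columns such as Province/State are skipped
covid_aggregated = covid_dataset.groupby('Country/Region').sum(numeric_only=True)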
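
To make the aggregation step concrete, here is a minimal sketch on a made-up frame. The layout (one row per province, one column per date) is assumed to mirror the dataset; the country names `A` and `B` and all values are hypothetical. `numeric_only=True` is used so that the text column is left out of the sum.

```python
import pandas as pd

# Hypothetical rows: country "A" is split into two provinces, "B" has none.
toy = pd.DataFrame(
    {
        "Province/State": ["X", "Y", None],
        "Country/Region": ["A", "A", "B"],
        "1/22/20": [1, 2, 3],
        "1/23/20": [4, 5, 6],
    }
)

# One row per country; only the numeric date columns are summed.
toy_aggregated = toy.groupby("Country/Region").sum(numeric_only=True)
print(toy_aggregated)
```

After this step the country name becomes the index, which is why later cells can select a country with `.loc[...]`.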

In [None]:
covid_aggregated.head()

**Check the new shape of the aggregated dataset**

In [None]:
covid_aggregated.shape

## Visualizing data related to some countries like India, China, US

In [None]:
covid_aggregated.loc['India'].plot()
covid_aggregated.loc['China'].plot()
covid_aggregated.loc['US'].plot()
plt.legend()

## Calculating a good measure

**We need a single number that describes how fast the virus spread in each country.**

In [None]:
covid_aggregated.loc['India'].plot()

## Calculating the first derivative of the curve

In [None]:
covid_aggregated.loc['India'].diff().plot()
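
As a small illustration of what `diff()` does to a cumulative series, the sketch below uses made-up counts (not taken from the dataset). The day-to-day difference gives the number of new cases per day, and its maximum is the largest single-day rise.

```python
import pandas as pd

# Hypothetical cumulative confirmed-case counts for one country.
cumulative = pd.Series(
    [0, 2, 5, 5, 11, 20],
    index=["1/22/20", "1/23/20", "1/24/20", "1/25/20", "1/26/20", "1/27/20"],
)

# diff() subtracts the previous day's total, giving new cases per day.
# The first entry is NaN because there is no earlier day to compare with.
daily_new = cumulative.diff()

# max() skips NaN by default, so this is the largest single-day rise.
peak = daily_new.max()
print(daily_new)
print(peak)
```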

### Find Maximum Infection Rate for India

In [None]:
covid_aggregated.loc['India'].diff().max()

### Find Maximum Infection Rate for China

In [None]:
covid_aggregated.loc['China'].diff().max()

### Find Maximum Infection Rate for US

In [None]:
covid_aggregated.loc['US'].diff().max()

### Find Maximum Infection Rate for all the countries

In [None]:
countries = list(covid_aggregated.index)
max_infection_rates = []
for country in countries:
    max_infection_rates.append(covid_aggregated.loc[country].diff().max())
covid_aggregated['Maximum Infection Rate'] = max_infection_rates
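
The same per-country maximum can also be computed without an explicit loop. The sketch below is one possible alternative, shown on a made-up frame (country names `A`, `B` and all values are hypothetical): `diff(axis=1)` takes differences across the date columns of every row at once, and `max(axis=1)` then picks the largest daily rise per row.

```python
import pandas as pd

# Hypothetical aggregated frame: one row per country, one column per date.
toy = pd.DataFrame(
    {"1/22/20": [0, 1], "1/23/20": [3, 1], "1/24/20": [4, 6]},
    index=pd.Index(["A", "B"], name="Country/Region"),
)

# Row-wise differences between consecutive dates, then the row-wise maximum.
max_rate = toy.diff(axis=1).max(axis=1)

# The loop-based version gives the same numbers.
looped = [toy.loc[country].diff().max() for country in toy.index]
print(max_rate)
```

Computing the result first and assigning it as a new column afterwards keeps the new column out of the differencing.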

In [None]:
covid_aggregated.head()

### Create a new dataframe with only Maximum Infection Rate column 

In [None]:
covid_data = pd.DataFrame(covid_aggregated['Maximum Infection Rate'])
covid_data

# Import World Happiness Report Dataset

**Download the dataset from the link given below:**

* World Happiness Report 2019 - https://bit.ly/2ZQFSXg

In [None]:
happiness_report = pd.read_csv('../input/world-happiness/2019.csv')
happiness_report.head()

**Check the shape of the dataset**

In [None]:
happiness_report.shape

**Drop unnecessary columns such as Overall rank, Score, Generosity, and Perceptions of corruption**

In [None]:
columns_to_drop = ['Overall rank','Score','Generosity','Perceptions of corruption']
happiness_report.drop(columns_to_drop, axis=1, inplace=True)
happiness_report.head()

**Change the indices of the dataframe**

In [None]:
happiness_report.set_index('Country or region', inplace=True)
happiness_report.head()

### COVID-19 final dataset

In [None]:
covid_data.head()

### World Happiness Report final dataset

In [None]:
happiness_report.head()

### Join the two final datasets we have prepared

In [None]:
data = happiness_report.join(covid_data).copy()
data.head()
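
`DataFrame.join` matches rows on the index and performs a left join by default, so every country in the happiness report is kept, and any country whose name has no exact match in the COVID data gets `NaN`. The sketch below shows this on made-up frames. All values are hypothetical, and the spelling mismatch (`US` versus `United States`) is an assumed example, not something verified against the two files.

```python
import pandas as pd

# Hypothetical happiness data indexed by country name.
happiness = pd.DataFrame(
    {"GDP per capita": [1.0, 2.0, 3.0]},
    index=pd.Index(["Finland", "United States", "India"], name="Country or region"),
)

# Hypothetical COVID data; note the different spelling "US".
covid = pd.DataFrame(
    {"Maximum Infection Rate": [100.0, 200.0, 300.0]},
    index=pd.Index(["Finland", "US", "India"], name="Country/Region"),
)

# Left join (the default): all happiness rows are kept.
joined = happiness.join(covid)

# Countries that found no match in the COVID data.
unmatched = joined.index[joined["Maximum Infection Rate"].isna()].tolist()

# An inner join keeps only the countries present in both frames.
inner = happiness.join(covid, how="inner")
print(unmatched, len(inner))
```

Listing the unmatched names after the join is a quick way to see how many countries are silently excluded from the correlation and the plots.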

### Correlation Matrix

In [None]:
data.corr()
# Shows the pairwise correlation between every two columns of the dataset
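
Since only the relationship with the maximum infection rate matters here, it can help to read a single column of the correlation matrix. The following is a minimal sketch on made-up, perfectly linear data; the column `Other factor` and all values are hypothetical.

```python
import pandas as pd

# Hypothetical data: one column rises with the rate, the other falls.
toy = pd.DataFrame(
    {
        "GDP per capita": [1.0, 2.0, 3.0, 4.0],
        "Maximum Infection Rate": [2.0, 4.0, 6.0, 8.0],
        "Other factor": [4.0, 3.0, 2.0, 1.0],
    }
)

# Pearson correlation by default; pick the column of interest,
# drop its trivial self-correlation, and sort from strongest positive down.
against_rate = (
    toy.corr()["Maximum Infection Rate"]
    .drop("Maximum Infection Rate")
    .sort_values(ascending=False)
)
print(against_rate)
```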

# Visualization of Results

**Our analysis isn't finished until we visualize the results as figures and graphs, so that everyone can understand what the analysis shows.**

In [None]:
data.head()

### Plotting GDP vs Maximum Infection Rate

In [None]:
x = data['GDP per capita']
y = data['Maximum Infection Rate']
# Recent seaborn versions require x and y to be passed as keyword arguments
sns.scatterplot(x=x, y=np.log(y))

In [None]:
sns.regplot(x=x, y=np.log(y))
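
One caveat with the log scale: `np.log` of zero is negative infinity, and missing values stay `NaN`. If any country has a maximum infection rate of zero or no match in the join, it may be worth filtering before taking the log. This is a minimal sketch on made-up values, offered as one possible precaution and not as a step the notebook itself performs.

```python
import numpy as np
import pandas as pd

# Hypothetical rates, including a zero and a missing value.
rates = pd.Series([0.0, 10.0, 100.0, np.nan])

# Keep only strictly positive values; this drops both the zero and the NaN.
positive = rates[rates > 0]

# The natural log is now finite for every remaining entry.
log_rates = np.log(positive)
print(log_rates)
```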

### Plotting Social support vs Maximum Infection Rate 

In [None]:
x = data['Social support']
y = data['Maximum Infection Rate']
sns.scatterplot(x=x, y=np.log(y))

In [None]:
sns.regplot(x=x, y=np.log(y))

### Plotting Healthy life expectancy vs Maximum Infection Rate

In [None]:
x = data['Healthy life expectancy']
y = data['Maximum Infection Rate']
sns.scatterplot(x=x, y=np.log(y))

In [None]:
sns.regplot(x=x, y=np.log(y))

### Plotting Freedom to make life choices vs Maximum Infection Rate 

In [None]:
x = data['Freedom to make life choices']
y = data['Maximum Infection Rate']
sns.scatterplot(x=x, y=np.log(y))

In [None]:
sns.regplot(x=x, y=np.log(y))

# Conclusion

**From the above analysis, we conclude that people living in more developed countries appear more prone to infection by the novel coronavirus than those living in less developed countries.**
**This may instead be due to a lack of coronavirus testing in less developed countries. To check whether that is the case, we can perform a similar analysis on a dataset of cumulative deaths.**