# 🦠 COVID-19 Data Analysis Project  
### Module 12 – Assignment 5  
This notebook analyzes a COVID-19 dataset together with the World Happiness dataset.  
We perform:

- Data loading  
- Data cleaning  
- Exploratory Data Analysis (EDA)  
- Visualization  
- Country-wise trends  
- Merging datasets  
- Correlation between Happiness factors & COVID-19 outcomes  

---


1. Import Required Libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Use matplotlib's default style; no colors are set explicitly (as per instructions)
plt.style.use('default')


2. Load Datasets (Replace Filenames as Needed)


In [None]:
covid = pd.read_csv("covid_data.csv")
happy = pd.read_csv("world_happiness.csv")

covid.head(), happy.head()
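
The later cells rely on specific column names (`Country`, `Confirmed`, `Deaths`, `Recovered`). A minimal sketch of a check that could run right after loading is shown below. The helper `missing_columns` and the toy frame `covid_demo` are hypothetical and only stand in for the real CSV, whose layout may differ.

```python
import pandas as pd

def missing_columns(df, required):
    """Return the required column names that are absent from df (hypothetical helper)."""
    return [col for col in required if col not in df.columns]

# Toy frame standing in for covid_data.csv; the real file may have other columns.
covid_demo = pd.DataFrame({
    "Country": ["A", "B"],
    "Confirmed": [100, 50],
    "Deaths": [5, 1],
})

covid_required = ["Country", "Confirmed", "Deaths", "Recovered"]
missing = missing_columns(covid_demo, covid_required)
print("Missing columns:", missing)
```

Running the same check on `covid` and `happy` before the grouping and merge steps would surface a renamed column early instead of as a `KeyError` later.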


3. Inspect & Clean Data

In [None]:
print("COVID-19 Dataset Info:")
covid.info()

print("\nHappiness Dataset Info:")
happy.info()

# Remove missing values
covid = covid.dropna()
happy = happy.dropna()

print("\nAfter cleaning:")
covid.info()
happy.info()
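
Calling `dropna()` without arguments removes every row that has a missing value in any column, which can discard rows that are complete in the columns the analysis actually uses. The sketch below contrasts it with `dropna(subset=...)` on toy data. The `Notes` column is a made-up example of an unrelated, sparsely filled field.

```python
import numpy as np
import pandas as pd

# Toy data: "Notes" is a hypothetical column that is mostly empty.
demo = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Confirmed": [100, np.nan, 30],
    "Deaths": [5, 2, 1],
    "Notes": [np.nan, "x", np.nan],
})

# Drop rows with a missing value in any column.
dropped_any = demo.dropna()

# Drop rows only when a column needed for the analysis is missing.
dropped_subset = demo.dropna(subset=["Country", "Confirmed", "Deaths"])

print(len(dropped_any), len(dropped_subset))
```

Whether the stricter or the narrower rule is appropriate depends on the assignment's requirements and on how many rows each one removes.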


4. COVID-19 Summary (Group by Country)

In [None]:
covid_summary = covid.groupby("Country")[
    ["Confirmed", "Deaths", "Recovered"]
].sum().reset_index()

covid_summary.head()
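
Summing per country is correct when each row holds new cases for a period. If the file instead stores cumulative totals per date (an assumption; the file's layout is not shown here), a sum counts the same cases repeatedly, and the latest value per country is the better total. A small sketch on toy data with a hypothetical `Date` column:

```python
import pandas as pd

# Toy cumulative series: each row is the running total on that date.
daily = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B"],
    "Date": ["2020-03-01", "2020-03-02", "2020-03-03",
             "2020-03-01", "2020-03-02"],
    "Confirmed": [10, 25, 40, 3, 7],
})

# Summing cumulative totals overcounts.
summed = daily.groupby("Country")["Confirmed"].sum()

# Taking the last value per country (after sorting by date) gives the total.
latest = daily.sort_values("Date").groupby("Country")["Confirmed"].last()

print(summed.to_dict(), latest.to_dict())
```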


5. Plot Top 10 Countries by Confirmed Cases

In [None]:
top10 = covid_summary.sort_values("Confirmed", ascending=False).head(10)

plt.figure(figsize=(10,5))
plt.bar(top10["Country"], top10["Confirmed"])
plt.title("Top 10 Countries by Confirmed COVID-19 Cases")
plt.xlabel("Country")
plt.ylabel("Confirmed Cases")
plt.xticks(rotation=45)
plt.show()


6. Death Rate Calculation

In [None]:
# Treat zero confirmed cases as missing so the ratio is NaN rather than inf
covid_summary["DeathRate"] = (
    covid_summary["Deaths"] / covid_summary["Confirmed"].replace(0, np.nan)
)

covid_summary.sort_values("DeathRate", ascending=False).head()
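
The death rate here is deaths divided by confirmed cases. A short worked example on toy numbers shows how the ratio behaves when a country has zero confirmed cases: a plain division yields `NaN` or `inf`, while replacing zeros in the denominator with `NaN` keeps the result consistently missing. The frame `summary_demo` is illustrative only.

```python
import numpy as np
import pandas as pd

summary_demo = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Confirmed": [200, 0, 0],
    "Deaths": [10, 0, 1],
})

# Plain division: 0/0 gives NaN and 1/0 gives inf.
naive = summary_demo["Deaths"] / summary_demo["Confirmed"]

# Guarded division: zero denominators become NaN first.
guarded = summary_demo["Deaths"] / summary_demo["Confirmed"].replace(0, np.nan)

print(naive.tolist())
print(guarded.tolist())
```

Infinite values sort to the top of a descending ranking, so the guarded form also keeps such rows from dominating the `head()` output.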


7. Histogram: COVID-19 Death Rates

In [None]:
plt.figure()
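# Keep only finite rates; NaN or inf values make the bin range undefined
finite_rates = covid_summary["DeathRate"].replace([np.inf, -np.inf], np.nan).dropna()
plt.hist(finite_rates, bins=40)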
plt.title("Distribution of COVID-19 Death Rates")
plt.xlabel("Death Rate")
plt.ylabel("Frequency")
plt.show()


8. Select Relevant Happiness Columns

In [None]:
happy_small = happy[[
    "Country",
    "Happiness Score",
    "Economy (GDP per Capita)",
    "Health (Life Expectancy)",
    "Freedom"
]]

happy_small.head()


9. Merge COVID-19 and Happiness Datasets

In [None]:
merged = pd.merge(
    covid_summary,
    happy_small,
    on="Country",
    how="inner"
)

merged.head()
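
An inner merge silently drops countries whose names are spelled differently in the two files. One way to see which names fail to match is an outer merge with `indicator=True`, sketched below. The frames and all values in them are invented for illustration.

```python
import pandas as pd

# Toy frames with one deliberately mismatched country name.
left = pd.DataFrame({
    "Country": ["United States", "France", "India"],
    "Confirmed": [300, 200, 100],
})
right = pd.DataFrame({
    "Country": ["USA", "France", "India"],
    "Happiness Score": [1.0, 2.0, 3.0],
})

# The outer merge keeps every row; "_merge" records where each row came from.
check = pd.merge(left, right, on="Country", how="outer", indicator=True)
unmatched = check.loc[check["_merge"] != "both", ["Country", "_merge"]]

print(unmatched)
```

Names that show up as `left_only` or `right_only` can then be harmonized, for example with `Series.replace`, before running the inner merge.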


10. Correlation Analysis

In [None]:
corr = merged.corr(numeric_only=True)
corr
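
The full matrix contains every pairwise correlation, while the question in this notebook concerns how the happiness factors relate to COVID-19 outcomes. A minimal sketch of pulling out one column of the matrix and sorting it, on toy data with invented values:

```python
import pandas as pd

merged_demo = pd.DataFrame({
    "Country": list("ABCDE"),
    "DeathRate": [0.01, 0.02, 0.03, 0.04, 0.05],
    "Happiness Score": [7.0, 6.0, 5.0, 4.0, 3.0],
    "Freedom": [0.2, 0.5, 0.1, 0.4, 0.3],
})

# Pearson correlation over the numeric columns only.
corr_demo = merged_demo.corr(numeric_only=True)

# Correlation of every other numeric column with DeathRate, most negative first.
death_corr = corr_demo["DeathRate"].drop("DeathRate").sort_values()

print(death_corr)
```

Applied to `merged`, the same two lines would give a compact ranking of the happiness factors against `DeathRate`. A correlation describes association only and does not by itself indicate a causal link.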


11. Correlation Plot (Heatmap with Default Colormap)

In [None]:
plt.figure(figsize=(8,6))
plt.imshow(corr, aspect='auto')
plt.title("Correlation Matrix")
plt.xticks(range(len(corr.columns)), corr.columns, rotation=90)
plt.yticks(range(len(corr.columns)), corr.columns)
plt.colorbar()
plt.show()
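
A heatmap is easier to read when each cell also shows its value. The sketch below annotates a small correlation matrix using `ax.text` and fixes the scale to the range -1 to 1 so that cell shades are comparable across plots. The matrix `corr_small` and its values are invented, and no colors are specified, in line with the notebook's constraint.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy 2x2 correlation matrix standing in for `corr`.
labels = ["DeathRate", "Freedom"]
corr_small = pd.DataFrame([[1.0, -0.4], [-0.4, 1.0]], index=labels, columns=labels)

fig, ax = plt.subplots(figsize=(4, 3))
im = ax.imshow(corr_small.to_numpy(), aspect="auto", vmin=-1, vmax=1)

ax.set_xticks(range(len(labels)))
ax.set_xticklabels(labels, rotation=90)
ax.set_yticks(range(len(labels)))
ax.set_yticklabels(labels)

# Write each coefficient in the center of its cell.
for i in range(len(labels)):
    for j in range(len(labels)):
        ax.text(j, i, f"{corr_small.iloc[i, j]:.2f}", ha="center", va="center")

fig.colorbar(im, ax=ax)
ax.set_title("Correlation Matrix")
fig.tight_layout()
```

In a notebook the figure renders at the end of the cell; in a script, a call to `plt.show()` would follow.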


12. Save Clean & Merged Data

In [None]:
merged.to_csv("covid_happiness_merged.csv", index=False)
print("Merged dataset saved.")
