# 🌍 COVID-19 Global Data Analysis

**Date:** 2025-05-15

This notebook analyzes COVID-19 trends using data from Our World in Data. It explores cases, deaths, and vaccinations for a selected set of countries (Kenya, the United States, and India) and visualizes the results.

## 1️⃣ Data Collection

- Download `owid-covid-data.csv` from [Our World in Data](https://ourworldindata.org/covid-deaths)
- Save the file in your working directory.
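Before loading, it can help to confirm that the file really is where the notebook expects it. The following is a minimal sketch, not part of the original notebook; the helper name `find_dataset` is hypothetical and it assumes the file keeps its default name.

```python
from pathlib import Path


def find_dataset(filename="owid-covid-data.csv", directory="."):
    """Hypothetical helper: return the dataset path or raise a clear error."""
    path = Path(directory) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{filename} not found in {Path(directory).resolve()}; "
            "download it and save it there first."
        )
    return path
```

Calling `find_dataset()` with no arguments checks the current working directory, which is where the loading cell looks for the file.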

## 2️⃣ Data Loading & Exploration

In [1]:
import pandas as pd

# Load the dataset
df = pd.read_csv('owid-covid-data.csv')

# Display basic info
print(df.columns)
df.head()

ModuleNotFoundError: No module named 'pandas'
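The error above means pandas is not installed in the active environment. A minimal sketch for checking the packages this notebook imports before running it; the helper name `missing_packages` and the `REQUIRED` list are assumptions made for illustration.

```python
import importlib.util

# Packages imported by the cells in this notebook (assumed list)
REQUIRED = ["pandas", "matplotlib", "seaborn"]


def missing_packages(names):
    """Hypothetical helper: return the names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
    print("Install with: pip install " + " ".join(missing))
```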

In [None]:
# Check for missing values
df.isnull().sum()
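Raw counts are hard to judge on a wide dataset, so a percentage view is often easier to read. A minimal sketch on toy data (the frame and the helper name `missing_summary` are illustrative, not from the dataset):

```python
import pandas as pd


def missing_summary(frame):
    """Hypothetical helper: count and percentage of missing values per column."""
    summary = pd.DataFrame({
        "missing": frame.isnull().sum(),
        "percent": (frame.isnull().mean() * 100).round(1),
    })
    # Keep only columns that actually have gaps, worst first
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Toy data standing in for the real file
toy = pd.DataFrame({
    "location": ["Kenya", "Kenya", "India", "India"],
    "total_cases": [1.0, None, 5.0, 7.0],
    "total_vaccinations": [None, None, None, 10.0],
})
print(missing_summary(toy))
```

On the real data, `missing_summary(df)` would apply the same logic to every column.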

## 3️⃣ Data Cleaning

In [None]:
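# Filter selected countries (copy so later assignments do not trigger SettingWithCopyWarning)
countries = ['Kenya', 'United States', 'India']
df = df[df['location'].isin(countries)].copy()

# Convert date column to datetime and order rows chronologically within each country
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['location', 'date'])

# Forward-fill the cumulative metrics within each country so values never carry
# over from one country to the next (fillna(method='ffill') is deprecated in pandas 2.x)
cumulative_cols = ['total_cases', 'total_deaths', 'total_vaccinations']
df[cumulative_cols] = df.groupby('location')[cumulative_cols].ffill()

# Drop rows with missing critical values
df = df.dropna(subset=['date', 'total_cases', 'total_deaths'])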
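One thing to watch with forward fill: when several countries are stacked in one frame, a plain fill can carry the last value of one country into the first rows of the next. A minimal sketch on toy data comparing a plain fill with one grouped by `location`:

```python
import pandas as pd

# Toy data: Kenya's first value is missing
toy = pd.DataFrame({
    "location": ["India", "India", "Kenya", "Kenya"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-01", "2021-01-02"]
    ),
    "total_cases": [100.0, 120.0, None, 8.0],
}).sort_values(["location", "date"])

# Plain forward fill: Kenya's gap inherits India's last value
global_fill = toy["total_cases"].ffill()

# Grouped forward fill: the gap stays missing because Kenya has no earlier value
per_country = toy.groupby("location")["total_cases"].ffill()

print(pd.DataFrame({"global": global_fill, "per_country": per_country}))
```

The grouped version leaves the leading gap as missing, which a later `dropna` can then handle explicitly.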

## 4️⃣ Exploratory Data Analysis (EDA)

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

# Plot total cases over time
plt.figure(figsize=(12,6))
for country in countries:
    data = df[df['location'] == country]
    plt.plot(data['date'], data['total_cases'], label=country)
plt.title('Total COVID-19 Cases Over Time')
plt.xlabel('Date')
plt.ylabel('Total Cases')
plt.legend()
plt.show()

In [None]:
# Plot total deaths over time
plt.figure(figsize=(12,6))
for country in countries:
    data = df[df['location'] == country]
    plt.plot(data['date'], data['total_deaths'], label=country)
plt.title('Total COVID-19 Deaths Over Time')
plt.xlabel('Date')
plt.ylabel('Total Deaths')
plt.legend()
plt.show()
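The case and death plots differ only in the column and the labels, so the loop could be wrapped in a small function. A minimal sketch, assuming a frame with `location` and `date` columns; the name `plot_metric` is hypothetical and the toy data is illustrative.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_metric(frame, column, title, ylabel):
    """Hypothetical helper: draw one line per location for the given column."""
    fig, ax = plt.subplots(figsize=(12, 6))
    for country, data in frame.groupby("location"):
        ax.plot(data["date"], data[column], label=country)
    ax.set_title(title)
    ax.set_xlabel("Date")
    ax.set_ylabel(ylabel)
    ax.legend()
    return ax


# Toy data standing in for the cleaned frame
toy = pd.DataFrame({
    "location": ["India", "India", "Kenya", "Kenya"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-01", "2021-01-02"]
    ),
    "total_cases": [100.0, 120.0, 5.0, 8.0],
})
ax = plot_metric(toy, "total_cases", "Total COVID-19 Cases Over Time", "Total Cases")
```

With the real data, the same call with `'total_deaths'` or `'total_vaccinations'` would produce the other charts; call `plt.show()` afterwards if the figure does not render automatically.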

In [None]:
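# Calculate and plot death rate (deaths per confirmed case; left undefined where total_cases is 0)
df['death_rate'] = df['total_deaths'] / df['total_cases'].where(df['total_cases'] > 0)
plt.figure(figsize=(12,6))
sns.lineplot(data=df, x='date', y='death_rate', hue='location')
plt.title('COVID-19 Death Rate Over Time')
plt.xlabel('Date')
plt.ylabel('Deaths per Confirmed Case')
plt.show()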
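Dividing cumulative deaths by cumulative cases gives a crude case fatality ratio, and early rows with zero cases would otherwise produce infinities or invalid values. A minimal sketch of a guarded division on toy numbers; the helper name `case_fatality_rate` is hypothetical.

```python
import pandas as pd


def case_fatality_rate(deaths, cases):
    """Hypothetical helper: deaths / cases, with NaN where cases is zero or missing."""
    safe_cases = cases.where(cases > 0)  # zero or negative counts become NaN
    return deaths / safe_cases


# Toy numbers, not taken from the dataset
toy = pd.DataFrame({
    "total_cases": [0.0, 50.0, 200.0],
    "total_deaths": [0.0, 1.0, 5.0],
})
toy["death_rate"] = case_fatality_rate(toy["total_deaths"], toy["total_cases"])
print(toy)
```

Rows where the rate is undefined stay as NaN, so plotting functions simply leave a gap instead of drawing a spike.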

## 5️⃣ Vaccination Progress

In [None]:
# Plot cumulative vaccinations over time
plt.figure(figsize=(12,6))
for country in countries:
    data = df[df['location'] == country]
    plt.plot(data['date'], data['total_vaccinations'], label=country)
plt.title('Cumulative COVID-19 Vaccinations Over Time')
plt.xlabel('Date')
plt.ylabel('Total Vaccinations')
plt.legend()
plt.show()

## 6️⃣ Optional: Choropleth Map

In [None]:
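# Requires plotly (pip install plotly)
# Note: if df was filtered to the selected countries above, only those countries are colored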
# import plotly.express as px

# latest_date = df['date'].max()
# latest_df = df[df['date'] == latest_date]

# fig = px.choropleth(latest_df, locations='iso_code',
#                     color='total_cases',
#                     hover_name='location',
#                     color_continuous_scale='Reds',
#                     title='Total COVID-19 Cases by Country')
# fig.show()
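Filtering on the single latest date in the frame assumes every country reported on that day, which may not hold. A minimal sketch, on toy data, of taking the most recent row for each location instead; the helper name `latest_per_location` is hypothetical.

```python
import pandas as pd


def latest_per_location(frame):
    """Hypothetical helper: keep the most recent row for each location."""
    idx = frame.groupby("location")["date"].idxmax()
    return frame.loc[idx].reset_index(drop=True)


# Toy data: Kenya's last report is one day earlier than India's
toy = pd.DataFrame({
    "location": ["India", "India", "Kenya", "Kenya"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-03", "2021-01-01", "2021-01-02"]
    ),
    "total_cases": [10, 30, 1, 2],
})

latest = latest_per_location(toy)
same_day_only = toy[toy["date"] == toy["date"].max()]
print(latest)
```

Here `same_day_only` contains India alone, while `latest` keeps one row for each country and could be passed to the choropleth in place of `latest_df`.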

## 7️⃣ Insights & Reporting

**Key Insights:**
1. 
2. 
3. 

Use this section to summarize major findings and interesting trends. You can also export this notebook to PDF or PowerPoint for presentation.