# COVID-19 Global Data Tracker

This notebook is the main analysis and reporting tool for tracking COVID-19 trends in the Our World in Data dataset, focusing on Kenya, the United States, and India. It covers data loading, cleaning, exploratory data analysis, visualization, and a summary of insights.

## 1. Data Loading

In this section, we load the COVID-19 dataset from Our World in Data (`owid-covid-data.csv`) and preview its columns and first rows.

In [None]:
import pandas as pd

# Load dataset
df = pd.read_csv("../data/owid-covid-data.csv")

# Preview structure
print(df.columns)
print(df.head())
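Before moving on, it can help to confirm that the file contains the columns the later sections use. The sketch below is one way to do that. The helper name `load_covid_csv`, the `REQUIRED_COLUMNS` list, and the tiny in-memory sample are assumptions made for illustration; they are not part of the original notebook.

```python
import io

import pandas as pd

# Columns that the cleaning, plotting, and map sections rely on
REQUIRED_COLUMNS = [
    "iso_code", "location", "date",
    "total_cases", "total_deaths", "total_vaccinations",
]


def load_covid_csv(source):
    """Read the CSV, fail early if a needed column is missing, and parse `date`."""
    frame = pd.read_csv(source)
    missing = [col for col in REQUIRED_COLUMNS if col not in frame.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    frame["date"] = pd.to_datetime(frame["date"])
    return frame


# Made-up two-row sample standing in for ../data/owid-covid-data.csv
sample_csv = io.StringIO(
    "iso_code,location,date,total_cases,total_deaths,total_vaccinations\n"
    "KEN,Kenya,2021-03-01,100,2,\n"
    "KEN,Kenya,2021-03-02,120,3,50\n"
)
sample = load_covid_csv(sample_csv)
print(sample.dtypes)
```

With the real file, the same helper could be called as `load_covid_csv("../data/owid-covid-data.csv")`.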

## 2. Data Cleaning

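This section prepares the dataset for analysis: it keeps the three countries of interest, converts `date` to a datetime type, drops rows that lack `total_cases` or `total_deaths`, and fills the remaining gaps with 0.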

In [None]:
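# Keep only the countries of interest; .copy() avoids SettingWithCopyWarning below
countries = ['Kenya', 'United States', 'India']
df = df[df['location'].isin(countries)].copy()

# Convert the date column to datetime
df['date'] = pd.to_datetime(df['date'])

# Drop rows with missing critical data
df = df.dropna(subset=['total_cases', 'total_deaths'])

# Fill remaining missing numeric values with 0 (text columns are left unchanged).
# Note: this also zeroes unreported cumulative values such as total_vaccinations.
num_cols = df.select_dtypes(include='number').columns
df[num_cols] = df[num_cols].fillna(0)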
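Filling every gap with 0 makes cumulative columns such as `total_vaccinations` fall to zero on days without a report. One alternative, sketched here on a made-up sample rather than the real data, is to carry the last reported value forward within each country and use 0 only before the first report. Treat this as an option to consider, not as the method the notebook uses.

```python
import pandas as pd

# Made-up sample: cumulative vaccinations with unreported days (None)
sample = pd.DataFrame({
    "location": ["Kenya"] * 4 + ["India"] * 3,
    "date": pd.to_datetime([
        "2021-03-01", "2021-03-02", "2021-03-03", "2021-03-04",
        "2021-03-01", "2021-03-02", "2021-03-03",
    ]),
    "total_vaccinations": [None, 100.0, None, 150.0, 200.0, None, None],
})

# Forward fill only makes sense when rows are in date order within each country
sample = sample.sort_values(["location", "date"])

# Carry the last reported value forward, separately for each country
sample["total_vaccinations"] = (
    sample.groupby("location")["total_vaccinations"].ffill()
)

# Days before the first report have nothing to carry forward; treat them as 0
sample["total_vaccinations"] = sample["total_vaccinations"].fillna(0)

print(sample)
```

Grouping by `location` keeps one country's values from leaking into another's when the rows sit next to each other.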

## 📊 3. Exploratory Data Analysis (EDA)

In this section, we plot cumulative total cases over time for each selected country to compare how their outbreaks developed.

In [None]:
import matplotlib.pyplot as plt

# Plot total cases
plt.figure(figsize=(12, 6))
for country in countries:
    data = df[df['location'] == country]
    plt.plot(data['date'], data['total_cases'], label=country)

plt.title("Total COVID-19 Cases Over Time")
plt.xlabel("Date")
plt.ylabel("Total Cases")
plt.legend()
plt.tight_layout()
plt.show()
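Cumulative curves hide how fast cases were rising at any given time. A common complement is the daily increase smoothed over seven days. The sketch below derives it from `total_cases` on a made-up sample; the column names `daily_new_cases` and `new_cases_7d_avg` are illustrative and do not come from the dataset.

```python
import pandas as pd

# Made-up cumulative case counts for one country over eight days
sample = pd.DataFrame({
    "location": ["Kenya"] * 8,
    "date": pd.date_range("2021-03-01", periods=8, freq="D"),
    "total_cases": [100, 110, 130, 130, 160, 200, 210, 240],
})
sample = sample.sort_values(["location", "date"])

# Daily increase = difference between consecutive cumulative totals, per country
sample["daily_new_cases"] = sample.groupby("location")["total_cases"].diff()

# 7-day rolling mean of the daily increase, computed within each country.
# The first rows stay NaN until seven daily values are available.
sample["new_cases_7d_avg"] = (
    sample.groupby("location")["daily_new_cases"]
    .transform(lambda s: s.rolling(window=7).mean())
)

print(sample[["date", "total_cases", "daily_new_cases", "new_cases_7d_avg"]])
```

Plotting `new_cases_7d_avg` against `date` in the same loop as above would show surges as peaks rather than as changes in slope.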

## 💉 4. Vaccination Analysis

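This section plots total vaccinations over time for the selected countries. Days without a reported `total_vaccinations` value were filled with 0 during cleaning, so the lines can dip to zero on those dates.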

In [None]:
# Plot total vaccinations
plt.figure(figsize=(12, 6))
for country in countries:
    data = df[df['location'] == country]
    plt.plot(data['date'], data['total_vaccinations'], label=country)

plt.title("COVID-19 Vaccinations Over Time")
plt.xlabel("Date")
plt.ylabel("Total Vaccinations")
plt.legend()
plt.tight_layout()
plt.show()
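Raw dose totals are hard to compare across countries with very different populations. A per-capita view is one way to put them on the same scale. The sketch below assumes the dataset provides a `population` column and uses made-up numbers; `doses_per_hundred` is a hypothetical column name introduced for this example.

```python
import pandas as pd

# Made-up totals and populations, for illustration only
sample = pd.DataFrame({
    "location": ["Kenya", "India"],
    "total_vaccinations": [5_000_000, 700_000_000],
    "population": [50_000_000, 1_400_000_000],
})

# Doses administered per 100 people
sample["doses_per_hundred"] = (
    sample["total_vaccinations"] / sample["population"] * 100
)

print(sample[["location", "doses_per_hundred"]])
```

Because one person can receive several doses, this figure can exceed 100 and is not the same as the share of people vaccinated.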

## 🗺️ 5. (Optional) Choropleth Map

In this section, we create a choropleth map of the latest total COVID-19 cases. Because `df` was filtered in Section 2, only Kenya, the United States, and India are shaded on the map.

In [None]:
import plotly.express as px

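# Latest available row for each country (the most recent date can differ by country)
latest_df = df.sort_values('date').groupby('location').tail(1)

# Plot map; iso_code holds ISO-3 codes, which is plotly's default location mode
fig = px.choropleth(latest_df,
                    locations="iso_code",
                    color="total_cases",
                    hover_name="location",
                    title="Total COVID-19 Cases by Country",
                    color_continuous_scale="Reds")
fig.show()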

## 📝 6. Key Insights

This section summarizes the key insights derived from the analysis.

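- India's total case count rose sharply in mid-2021.
- The United States ramped up vaccinations quickly in early 2021.
- Kenya's total case count remained low relative to the United States and India.
- A link between vaccine coverage and death rate is plausible, but this notebook does not compute either quantity, so it remains a hypothesis to test rather than a finding.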
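
The last point can be checked rather than asserted. The sketch below shows one way to do so: compute a crude death rate as `total_deaths / total_cases` and correlate it with a coverage measure. All numbers are made up, the locations are placeholders, and `doses_per_hundred` is a hypothetical column, so the output says nothing about the real data.

```python
import pandas as pd

# Made-up snapshot: one row per (placeholder) location
sample = pd.DataFrame({
    "location": ["A", "B", "C", "D"],
    "total_cases": [1000, 2000, 4000, 5000],
    "total_deaths": [40, 60, 80, 50],
    "doses_per_hundred": [10, 50, 100, 150],
})

# Crude death rate: deaths as a share of confirmed cases
sample["death_rate"] = sample["total_deaths"] / sample["total_cases"]

# Pearson correlation between coverage and death rate (pandas default method)
corr = sample["doses_per_hundred"].corr(sample["death_rate"])
print(f"Correlation between coverage and death rate: {corr:.3f}")
```

With only three countries in this notebook, a correlation coefficient would be weak evidence at best, and it would not account for differences in testing, age structure, or reporting.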