# COVID-19 Global Data Tracker
**Author**: Akafingo Germanus  
**Date**: 2025-05-05  

This project analyzes global COVID-19 trends, including cases, deaths, and vaccinations, using pandas, Matplotlib, and seaborn.

## 1️⃣ Data Collection
**Source**: [Our World in Data - COVID-19 Dataset](https://ourworldindata.org/coronavirus-source-data)  
**File**: `owid-covid-data.csv`
- Save this file in your working directory before proceeding.
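
A missing file is the most common reason the loading step fails, so it can help to check for it first. The sketch below is one way to do that; the helper name `require_dataset` is hypothetical and not part of the original notebook.

```python
from pathlib import Path


def require_dataset(path="owid-covid-data.csv"):
    """Return the dataset path, or raise a clear error if the file is missing.

    `require_dataset` is an illustrative helper, not part of the original project.
    """
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"{csv_path} not found (working directory: {Path.cwd()}). "
            "Download it from the source above before proceeding."
        )
    return csv_path
```

Calling `require_dataset()` before `pd.read_csv` turns a long traceback into a short message that names the directory being searched.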

## 2️⃣ Data Loading & Exploration

In [1]:
import pandas as pd

# Load data
df = pd.read_csv('owid-covid-data.csv')

# Preview
print(df.shape)
df.head()

(429435, 67)


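|  | iso_code | continent | location | date | total_cases | new_cases | new_cases_smoothed | total_deaths | new_deaths | new_deaths_smoothed | ... | male_smokers | handwashing_facilities | hospital_beds_per_thousand | life_expectancy | human_development_index | population | excess_mortality_cumulative_absolute | excess_mortality_cumulative | excess_mortality | excess_mortality_cumulative_per_million |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Asia | Afghanistan | 2020-01-05 | 0.0 | 0.0 | NaN | 0.0 | 0.0 | NaN | ... | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772 | NaN | NaN | NaN | NaN |
| 1 | AFG | Asia | Afghanistan | 2020-01-06 | 0.0 | 0.0 | NaN | 0.0 | 0.0 | NaN | ... | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772 | NaN | NaN | NaN | NaN |
| 2 | AFG | Asia | Afghanistan | 2020-01-07 | 0.0 | 0.0 | NaN | 0.0 | 0.0 | NaN | ... | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772 | NaN | NaN | NaN | NaN |
| 3 | AFG | Asia | Afghanistan | 2020-01-08 | 0.0 | 0.0 | NaN | 0.0 | 0.0 | NaN | ... | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772 | NaN | NaN | NaN | NaN |
| 4 | AFG | Asia | Afghanistan | 2020-01-09 | 0.0 | 0.0 | NaN | 0.0 | 0.0 | NaN | ... | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772 | NaN | NaN | NaN | NaN |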


In [2]:
# Count missing values per column and list the 10 columns with the most gaps
df.isnull().sum().sort_values(ascending=False).head(10)

weekly_icu_admissions_per_million          418442
weekly_icu_admissions                      418442
excess_mortality_cumulative_per_million    416024
excess_mortality                           416024
excess_mortality_cumulative                416024
excess_mortality_cumulative_absolute       416024
weekly_hosp_admissions_per_million         404938
weekly_hosp_admissions                     404938
icu_patients                               390319
icu_patients_per_million                   390319
dtype: int64
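
Raw counts are easier to judge as a share of the 429,435 rows: 418,442 missing values means `weekly_icu_admissions` is empty in roughly 97% of rows. A minimal sketch of that conversion follows; the helper `missing_share` and the toy frame are illustrative assumptions, and the toy values are made up rather than taken from the dataset.

```python
import numpy as np
import pandas as pd


def missing_share(frame: pd.DataFrame) -> pd.Series:
    """Percentage of missing values per column, largest first."""
    # isnull() gives booleans; their mean is the fraction of missing entries
    return (frame.isnull().mean() * 100).sort_values(ascending=False)


# Toy frame for illustration only (values are made up)
toy = pd.DataFrame({
    "location": ["A", "A", "B", "B"],
    "total_cases": [1.0, 2.0, np.nan, 4.0],
    "icu_patients": [np.nan, np.nan, np.nan, 7.0],
})
print(missing_share(toy))
```

On the real data, `missing_share(df).head(10)` would list the same columns as above, expressed as percentages.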

## 3️⃣ Data Cleaning
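Convert the `date` column to datetime format, restrict the data to a few selected countries, keep only the columns needed for the analysis, and drop rows that lack a date or a total case count.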

In [None]:
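# Parse dates so they sort and plot chronologically
df['date'] = pd.to_datetime(df['date'])

# Example: filter data for Kenya, the United States, and India
countries = ['Kenya', 'United States', 'India']
columns = ['date', 'location', 'total_cases', 'total_deaths', 'new_cases', 'new_deaths', 'total_vaccinations']

# .copy() makes df_filtered an independent frame, so later column
# assignments on it do not raise SettingWithCopyWarning
df_filtered = df.loc[df['location'].isin(countries), columns].copy()

# Drop rows that lack a date or a cumulative case count
df_filtered = df_filtered.dropna(subset=['date', 'total_cases'])
df_filtered.head()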

## 4️⃣ Exploratory Data Analysis

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(12,6))
sns.lineplot(data=df_filtered, x='date', y='total_cases', hue='location')
plt.title('Total COVID-19 Cases Over Time')
plt.ylabel('Total Cases')
plt.xlabel('Date')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
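
Cumulative totals hide short-term waves, so a smoothed view of `new_cases` is a common companion plot. The dataset ships a `new_cases_smoothed` column, but `df_filtered` does not keep it; the sketch below recomputes a comparable rolling mean. The helper `add_rolling_mean` is hypothetical, the toy values are made up, and the window counts rows, so it assumes one row per country per day.

```python
import pandas as pd


def add_rolling_mean(frame, column="new_cases", window=7):
    """Add a per-country rolling mean of `column` over `window` rows."""
    out = frame.sort_values(["location", "date"]).copy()
    # transform keeps the original index, so the result aligns row by row
    out[f"{column}_avg{window}"] = (
        out.groupby("location")[column]
        .transform(lambda s: s.rolling(window, min_periods=1).mean())
    )
    return out


# Toy frame for illustration only (values are made up)
toy = pd.DataFrame({
    "location": ["A", "A", "A", "B", "B"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-01", "2021-01-02"]
    ),
    "new_cases": [0.0, 2.0, 4.0, 10.0, 20.0],
})
smoothed = add_rolling_mean(toy, window=2)
print(smoothed)
```

With the real data, `add_rolling_mean(df_filtered)` adds a `new_cases_avg7` column that can be passed as `y=` to `sns.lineplot` in the same way as `total_cases`.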

## 5️⃣ Visualizing Vaccination Progress

In [None]:
plt.figure(figsize=(12,6))
sns.lineplot(data=df_filtered, x='date', y='total_vaccinations', hue='location')
plt.title('Vaccination Rollout Over Time')
plt.ylabel('Total Vaccinations')
plt.xlabel('Date')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
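
Cumulative vaccination figures are typically not reported on every date for every country, which leaves gaps in `total_vaccinations`. One option, sketched below, is to carry the last reported value forward within each country. The helper `fill_vaccination_gaps` is hypothetical and the toy values are made up; forward-filling is a modelling choice, not something the original notebook does.

```python
import numpy as np
import pandas as pd


def fill_vaccination_gaps(frame):
    """Forward-fill total_vaccinations within each country, in date order."""
    out = frame.sort_values(["location", "date"]).copy()
    # Grouping first stops one country's values leaking into the next
    out["total_vaccinations"] = (
        out.groupby("location")["total_vaccinations"].ffill()
    )
    return out


# Toy frame for illustration only (values are made up)
toy = pd.DataFrame({
    "location": ["A", "A", "A", "A", "B", "B"],
    "date": pd.to_datetime([
        "2021-03-01", "2021-03-02", "2021-03-03", "2021-03-04",
        "2021-03-01", "2021-03-02",
    ]),
    "total_vaccinations": [np.nan, 10.0, np.nan, 30.0, 5.0, np.nan],
})
filled = fill_vaccination_gaps(toy)
print(filled)
```

Rows before a country's first report stay missing, which is appropriate for a cumulative series: there is no earlier value to carry forward.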

## 6️⃣ Optional: Choropleth Map
*You can use Plotly Express or GeoPandas to show global case/vaccination distributions.*
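
A minimal sketch of the Plotly Express route follows. It takes the most recent non-missing value per country and maps it by ISO code. The helper names `latest_by_country` and `make_choropleth` are hypothetical, the toy values are made up, and the filter assumes that aggregate rows (such as world or continent totals) carry an `iso_code` starting with `OWID_`.

```python
import numpy as np
import pandas as pd


def latest_by_country(frame, value="total_cases"):
    """Most recent non-missing `value` per ISO code, aggregates excluded."""
    valid = frame.dropna(subset=[value])
    # Assumption: aggregate rows use iso_code values prefixed with "OWID_"
    valid = valid[~valid["iso_code"].str.startswith("OWID_", na=False)]
    latest_idx = valid.groupby("iso_code")["date"].idxmax()
    cols = ["iso_code", "location", "date", value]
    return valid.loc[latest_idx, cols].reset_index(drop=True)


def make_choropleth(snapshot, value="total_cases"):
    """Build a world map coloured by `value` (requires Plotly)."""
    # Imported here so the data step above works without Plotly installed
    import plotly.express as px

    return px.choropleth(
        snapshot,
        locations="iso_code",  # ISO-3 codes are Plotly's default location mode
        color=value,
        hover_name="location",
        color_continuous_scale="Reds",
        title=f"Latest {value} by country",
    )


# Toy frame for illustration only (values are made up)
toy = pd.DataFrame({
    "iso_code": ["KEN", "KEN", "IND", "IND", "OWID_WRL"],
    "location": ["Kenya", "Kenya", "India", "India", "World"],
    "date": pd.to_datetime(
        ["2020-01-01", "2020-01-02", "2020-01-01", "2020-01-02", "2020-01-02"]
    ),
    "total_cases": [1.0, 5.0, 3.0, np.nan, 100.0],
})
snapshot = latest_by_country(toy)
print(snapshot)
```

With the real data, `make_choropleth(latest_by_country(df)).show()` renders the map. Use the full `df` (with `date` already parsed) rather than `df_filtered`, since the filtered frame drops `iso_code`.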

## 7️⃣ Insights & Reporting
**Write your insights below:**

- Insight 1: 
- Insight 2: 
- Insight 3: 

Use markdown cells to explain visualizations and findings.