# COVID-19 Global Data Tracker

This notebook analyzes global COVID-19 trends in cases, deaths, and vaccinations using the [Our World in Data COVID-19 dataset](https://github.com/owid/covid-19-data/tree/master/public/data).

## Objectives
- Import and clean COVID-19 data
- Analyze time trends for cases, deaths, vaccinations
- Compare metrics across countries
- Visualize data using charts and maps
- Summarize findings


In [None]:
# Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import warnings
warnings.filterwarnings('ignore')


## Load and Explore Dataset

In [None]:
# Load the dataset
df = pd.read_csv('owid-covid-data.csv')
df.head()

In [None]:
# Check data info and missing values
df.info()
df.isnull().sum().sort_values(ascending=False).head(10)

## Data Cleaning

In [None]:
# Convert date to datetime
df['date'] = pd.to_datetime(df['date'])

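# Filter selected countries; .copy() avoids SettingWithCopyWarning on later edits
countries = ['Kenya', 'India', 'United States']
df_filtered = df[df['location'].isin(countries)].copy()

# Forward-fill missing numeric values within each country, in date order,
# so one country's values never spill into another country's rows
df_filtered = df_filtered.sort_values(['location', 'date'])
num_cols = df_filtered.select_dtypes('number').columns.tolist()
df_filtered[num_cols] = df_filtered.groupby('location')[num_cols].ffill()
df_filtered.head()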
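
Forward filling a frame that stacks several countries can carry one country's last value into the next country's first missing row. The sketch below illustrates the difference between a plain and a group-wise forward fill. The frame `toy` and its values are made up for illustration and are not taken from the OWID file.

```python
import pandas as pd

# Toy frame with made-up values, NOT real OWID data.
toy = pd.DataFrame({
    "location": ["A", "A", "B", "B"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-01", "2021-01-02"]
    ),
    "total_cases": [10.0, None, None, 7.0],
})
toy = toy.sort_values(["location", "date"])

# Plain forward fill: B's first row inherits A's last value.
naive = toy["total_cases"].ffill()

# Group-wise forward fill: B's first row stays missing because
# there is no earlier observation for B to carry forward.
grouped = toy.groupby("location")["total_cases"].ffill()

print(pd.DataFrame({"naive": naive, "grouped": grouped}))
```

Leading gaps that remain missing after a group-wise fill are usually the honest answer: the country had not reported that metric yet.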

## Exploratory Data Analysis (EDA)

In [None]:
# Plot total cases over time
plt.figure(figsize=(10,6))
for country in countries:
    country_df = df_filtered[df_filtered['location'] == country]
    plt.plot(country_df['date'], country_df['total_cases'], label=country)
plt.title('Total COVID-19 Cases Over Time')
plt.xlabel('Date')
plt.ylabel('Total Cases')
plt.legend()
plt.show()

In [None]:
# Compare daily new cases
plt.figure(figsize=(10,6))
sns.lineplot(data=df_filtered, x='date', y='new_cases', hue='location')
plt.title('Daily New COVID-19 Cases')
plt.xlabel('Date')
plt.ylabel('New Cases')
plt.show()
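
Daily counts tend to be noisy, so a rolling mean computed separately for each country can make the trend easier to compare. The following is a minimal sketch on made-up values. The column name `new_cases_avg` and the window of 2 are assumptions chosen to keep the toy example small; a 7-day window is a common choice for real daily data.

```python
import pandas as pd

# Toy frame with made-up values, NOT real OWID data.
toy = pd.DataFrame({
    "location": ["A"] * 3 + ["B"] * 3,
    "date": list(pd.date_range("2021-01-01", periods=3)) * 2,
    "new_cases": [3.0, 6.0, 9.0, 10.0, 20.0, 30.0],
})
toy = toy.sort_values(["location", "date"])

# Rolling mean within each country; min_periods=1 keeps the first
# rows instead of turning them into NaN.
toy["new_cases_avg"] = (
    toy.groupby("location")["new_cases"]
    .transform(lambda s: s.rolling(window=2, min_periods=1).mean())
)

print(toy)
```

The same `groupby(...).transform(...)` pattern should apply to `df_filtered` with `window=7`, after which `new_cases_avg` can be passed as `y` to `sns.lineplot`.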

## Vaccination Progress

In [None]:
# Plot vaccinations over time
plt.figure(figsize=(10,6))
for country in countries:
    country_df = df_filtered[df_filtered['location'] == country]
    plt.plot(country_df['date'], country_df['total_vaccinations'], label=country)
plt.title('Total Vaccinations Over Time')
plt.xlabel('Date')
plt.ylabel('Vaccinations')
plt.legend()
plt.show()

## Insights & Observations

In [None]:
# Summarize observations from the plots above
insights = [
    "1. The USA had the highest total number of cases, followed by India.",
    "2. Kenya has experienced fewer total cases than the USA and India, but its new cases have fluctuated widely.",
    "3. The USA had a faster vaccine rollout than Kenya and India in early 2021.",
    "4. Reported death rates in the USA have been higher than in India and Kenya, which may reflect differences in healthcare systems, reporting, and virus management."
]

for insight in insights:
    print(insight)
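
The death-rate comparison in the insights is not computed in any of the cells above. One way to check it is to divide the latest `total_deaths` by the latest `total_cases` for each country. The sketch below uses made-up values, and the names `latest` and `cfr_pct` are hypothetical helpers introduced here; it assumes the frame has `location`, `date`, `total_cases`, and `total_deaths` columns.

```python
import pandas as pd

# Toy frame with made-up values, NOT real OWID data.
toy = pd.DataFrame({
    "location": ["A", "A", "B", "B"],
    "date": pd.to_datetime(["2021-01-01", "2021-01-02"] * 2),
    "total_cases": [100.0, 200.0, 50.0, 80.0],
    "total_deaths": [1.0, 4.0, 1.0, 2.0],
})

# Keep the most recent row per country.
latest = (
    toy.sort_values("date")
    .groupby("location")
    .tail(1)
    .set_index("location")
)

# Deaths per 100 confirmed cases at the latest date.
cfr_pct = (latest["total_deaths"] / latest["total_cases"] * 100).round(2)

print(cfr_pct)
```

This ratio depends heavily on how much testing and reporting each country does, so it is best read as a rough comparison and not as a measure of the true fatality risk.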


