
# 🦠 COVID-19 Global Data Tracker

This notebook analyzes global COVID-19 trends including cases, deaths, and vaccinations. It uses real-world data to generate insights and visualizations.



## 1️⃣ Data Collection

Download the dataset from [Our World in Data](https://github.com/owid/covid-19-data/tree/master/public/data).
Save the file as `owid-covid-data.csv` in your working directory.


In [None]:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
df = pd.read_csv("owid-covid-data.csv")
df.head()
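The full file is large and has many columns that this notebook never touches. As a minimal sketch, the loader below reads only the columns used in later steps and parses `date` while reading. The helper name `load_owid` and the `COLUMNS` list are illustrative assumptions, and the sample rows are made-up stand-ins for the real file.

```python
import io
import pandas as pd

# Columns used later in this notebook; extend the list if your analysis needs more.
COLUMNS = ["iso_code", "location", "date",
           "total_cases", "total_deaths", "total_vaccinations"]

def load_owid(source):
    """Load only the needed columns and parse `date` while reading."""
    return pd.read_csv(source, usecols=COLUMNS, parse_dates=["date"])

# Tiny in-memory stand-in for owid-covid-data.csv (made-up numbers).
sample_csv = io.StringIO(
    "iso_code,continent,location,date,total_cases,total_deaths,total_vaccinations\n"
    "KEN,Africa,Kenya,2021-03-01,100,2,\n"
    "KEN,Africa,Kenya,2021-03-02,110,3,50\n"
)
sample = load_owid(sample_csv)
print(sample.dtypes)
```

With the real file, the call would be `load_owid("owid-covid-data.csv")`.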



## 2️⃣ Data Exploration

Preview the dataset and check for missing values.


In [None]:

print(df.columns)
print(df.isnull().sum())
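Raw null counts are hard to compare across columns. One possible refinement is to report the share of missing values per column instead. The helper `missing_report` and the toy frame are hypothetical and only illustrate the idea.

```python
import pandas as pd

def missing_report(frame):
    """Return the percentage of missing values per column, largest first."""
    share = frame.isnull().mean().mul(100).round(1)
    # Hide columns that are fully populated to keep the report short.
    return share[share > 0].sort_values(ascending=False)

# Made-up data: one gap in total_cases, three gaps in total_vaccinations.
toy = pd.DataFrame({
    "location": ["Kenya", "Kenya", "India", "India"],
    "total_cases": [1.0, 2.0, None, 4.0],
    "total_vaccinations": [None, None, None, 8.0],
})
report = missing_report(toy)
print(report)
```

On the real data this would be called as `missing_report(df)`.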



## 3️⃣ Data Cleaning

Filter countries of interest and handle missing values.


In [None]:

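countries = ["Kenya", "United States", "India"]
# .copy() avoids a SettingWithCopyWarning when columns are modified below
df = df[df['location'].isin(countries)].copy()
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['location', 'date'])
# Cumulative totals: carry the last reported value forward within each country,
# and use 0 only for dates before the first report
cumulative_cols = ['total_cases', 'total_deaths', 'total_vaccinations']
df[cumulative_cols] = df.groupby('location')[cumulative_cols].ffill().fillna(0)
df.head()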
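How missing values are handled matters most for cumulative columns. The toy comparison below, with made-up numbers, contrasts a blanket zero fill with a forward fill within each country. It is a sketch of the trade-off, not a prescription for every column.

```python
import pandas as pd

toy = pd.DataFrame({
    "location": ["Kenya"] * 4 + ["India"] * 4,
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"] * 2),
    "total_vaccinations": [None, 10.0, None, 30.0, None, None, 5.0, None],
})

# Blanket fill: the gap on 2021-01-03 becomes 0, so Kenya's running total
# appears to fall from 10 to 0 and then jump to 30.
zero_filled = toy["total_vaccinations"].fillna(0)

# Forward fill within each country: carry the last reported total forward,
# then use 0 only for days before the first report.
toy = toy.sort_values(["location", "date"])
forward_filled = toy.groupby("location")["total_vaccinations"].ffill().fillna(0)

print(pd.DataFrame({"zero_filled": zero_filled, "forward_filled": forward_filled}))
```

The forward-filled series never decreases, which is what a cumulative total should look like when plotted.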



## 4️⃣ Exploratory Data Analysis (EDA)

Visualize total cases and deaths over time.


In [None]:

plt.figure(figsize=(12,6))
for country in countries:
    country_data = df[df['location'] == country]
    plt.plot(country_data['date'], country_data['total_cases'], label=country)
plt.title("Total COVID-19 Cases Over Time")
plt.xlabel("Date")
plt.ylabel("Total Cases")
plt.legend()
plt.show()


In [None]:

plt.figure(figsize=(12,6))
for country in countries:
    country_data = df[df['location'] == country]
    plt.plot(country_data['date'], country_data['total_deaths'], label=country)
plt.title("Total COVID-19 Deaths Over Time")
plt.xlabel("Date")
plt.ylabel("Total Deaths")
plt.legend()
plt.show()
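The two plots above differ only in the column and the labels, so the loop could be wrapped in a small function. This is one possible refactor: `plot_metric` is a hypothetical helper, and the toy numbers are made up to show the call.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_metric(frame, countries, column, title, ylabel):
    """Draw one line per country for `column` and return the Axes."""
    fig, ax = plt.subplots(figsize=(12, 6))
    for country in countries:
        country_data = frame[frame["location"] == country]
        ax.plot(country_data["date"], country_data[column], label=country)
    ax.set_title(title)
    ax.set_xlabel("Date")
    ax.set_ylabel(ylabel)
    ax.legend()
    return ax

# Made-up numbers, only to demonstrate the call.
toy = pd.DataFrame({
    "location": ["Kenya", "Kenya", "India", "India"],
    "date": pd.to_datetime(["2021-01-01", "2021-01-02"] * 2),
    "total_cases": [10, 12, 100, 130],
    "total_deaths": [1, 1, 5, 6],
})
ax = plot_metric(toy, ["Kenya", "India"], "total_cases",
                 "Total COVID-19 Cases Over Time", "Total Cases")
```

In the notebook, the same function would be called with `df` and `countries`, once per column, followed by `plt.show()`.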



## 5️⃣ Visualizing Vaccination Progress

Compare cumulative vaccinations over time.


In [None]:

plt.figure(figsize=(12,6))
for country in countries:
    country_data = df[df['location'] == country]
    plt.plot(country_data['date'], country_data['total_vaccinations'], label=country)
plt.title("Total Vaccinations Over Time")
plt.xlabel("Date")
plt.ylabel("Total Vaccinations")
plt.legend()
plt.show()
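Raw dose counts favor populous countries, so a per-capita view can make the comparison fairer. The sketch below assumes the frame has a `population` column alongside `total_vaccinations`. The helper `doses_per_hundred` is hypothetical, the numbers are made up, and the dataset may already ship a comparable per-hundred column.

```python
import pandas as pd

def doses_per_hundred(frame):
    """Cumulative doses administered per 100 people."""
    # Assumption: `population` is present and non-zero for every row.
    return frame["total_vaccinations"] / frame["population"] * 100

# Two made-up countries with the same dose count but different populations.
toy = pd.DataFrame({
    "location": ["A", "B"],
    "total_vaccinations": [500_000.0, 500_000.0],
    "population": [1_000_000, 10_000_000],
})
toy["doses_per_hundred"] = doses_per_hundred(toy)
print(toy)
```

Plotting the resulting column instead of `total_vaccinations` puts all three countries on the same scale.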



## 6️⃣ Optional: Choropleth Map

Use Plotly Express to map total cases by country. A global map needs the full dataset, not the three-country subset from step 3.


In [None]:

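import plotly.express as px

# Reload the full dataset: df was narrowed to three countries in step 3
world = pd.read_csv("owid-covid-data.csv", parse_dates=["date"])
# Use each country's most recent reported total rather than one fixed date,
# since not every country has a value on the last date in the file
latest_df = (world.dropna(subset=["total_cases"])
                  .sort_values("date")
                  .groupby("iso_code")
                  .tail(1))
fig = px.choropleth(latest_df,
                    locations="iso_code",
                    color="total_cases",
                    hover_name="location",
                    title="Global COVID-19 Cases",
                    color_continuous_scale="Viridis")
fig.show()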
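The row selection for the map can be isolated and checked on its own. The sketch below assumes that aggregate rows (world, continents, income groups) carry `iso_code` values starting with `OWID_`. Filtering on that prefix may also drop a few territories without a standard ISO code. The helper `latest_per_country` is hypothetical and the numbers are made up.

```python
import pandas as pd

def latest_per_country(frame, column="total_cases"):
    """Most recent row with a reported `column` for each country code."""
    # Assumption: aggregate rows use iso_code values that start with "OWID_".
    is_aggregate = frame["iso_code"].str.startswith("OWID_", na=False)
    reported = frame[~is_aggregate].dropna(subset=[column])
    return reported.sort_values("date").groupby("iso_code").tail(1)

toy = pd.DataFrame({
    "iso_code": ["KEN", "KEN", "IND", "IND", "OWID_WRL"],
    "location": ["Kenya", "Kenya", "India", "India", "World"],
    "date": pd.to_datetime(["2021-01-01", "2021-01-02",
                            "2021-01-01", "2021-01-02", "2021-01-02"]),
    "total_cases": [10.0, 12.0, 100.0, None, 500.0],
})
latest = latest_per_country(toy)
print(latest)
```

The result can be passed to `px.choropleth` in place of `latest_df`. India keeps its 2021-01-01 row here because the later value is missing.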



## 7️⃣ Insights & Reporting

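- The United States reported the highest total cases among the selected countries, followed by India and Kenya.
- Vaccination started earliest in the United States, while India went on to administer the largest number of doses.
- Death rates (total deaths as a share of total cases) varied considerably across the three countries.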

Use markdown cells to summarize your findings and add narrative explanations.
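The death-rate comparison is not computed anywhere in the notebook, so a short calculation can back it up. The sketch below divides the latest total deaths by the latest total cases for each country. The helper `case_fatality` is hypothetical and the toy numbers are made up. Reported cases depend heavily on testing, so the ratio is a rough indicator at best.

```python
import pandas as pd

def case_fatality(frame):
    """Latest total deaths as a percentage of total cases, per country."""
    latest = (frame.sort_values("date")
                   .groupby("location")
                   .tail(1)
                   .set_index("location"))
    # Assumption: total_cases is non-zero in each country's latest row.
    return (latest["total_deaths"] / latest["total_cases"] * 100).round(2)

toy = pd.DataFrame({
    "location": ["Kenya", "Kenya", "India", "India"],
    "date": pd.to_datetime(["2021-01-01", "2021-01-02"] * 2),
    "total_cases": [100.0, 150.0, 400.0, 500.0],
    "total_deaths": [2.0, 3.0, 4.0, 5.0],
})
rates = case_fatality(toy)
print(rates)
```

Calling `case_fatality(df)` on the cleaned data gives one percentage per selected country to cite in the summary.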
