# COVID-19 Data Analysis Project
This notebook analyzes global COVID-19 trends using real-world data.

## 1. Data Collection
We use the [Our World in Data COVID-19 Dataset](https://covid.ourworldindata.org/data/owid-covid-data.csv), downloaded and saved in the working directory as `owid-covid-data.csv`.

In [None]:
import pandas as pd

df = pd.read_csv('owid-covid-data.csv')
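
The load step can also parse dates and keep only the columns the later sections use. The following is a minimal sketch under assumptions: `load_covid_data` and `COLUMNS` are hypothetical names, the column list assumes the OWID schema referenced in this notebook, and the in-memory CSV is a made-up stand-in for the real file.

```python
import io
import pandas as pd

# Columns used later in this notebook (assumed to match the OWID schema).
COLUMNS = ['location', 'date', 'total_cases', 'total_deaths', 'total_vaccinations']

def load_covid_data(source, columns=COLUMNS):
    """Read the CSV from a path, URL, or file-like object and parse the date column."""
    return pd.read_csv(source, usecols=columns, parse_dates=['date'])

# Tiny in-memory stand-in for the real file, for illustration only.
sample_csv = io.StringIO(
    "location,date,total_cases,total_deaths,total_vaccinations,extra\n"
    "Kenya,2021-01-01,100,2,,x\n"
    "Kenya,2021-01-02,120,3,10,y\n"
)
sample = load_covid_data(sample_csv)
print(sample.dtypes)
```

With the real data, the same call would be `load_covid_data('owid-covid-data.csv')`.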

## 2. Data Exploration
Explore structure and missing values.

In [None]:
df.columns

In [None]:
df.head()

In [None]:
df.isnull().sum()
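
Raw missing-value counts are hard to compare across columns, so a percentage view can help decide which columns are usable. A small sketch, assuming a hypothetical helper named `missing_report` and a toy frame in place of `df`:

```python
import numpy as np
import pandas as pd

def missing_report(frame):
    """Return the count and percentage of missing values per column, worst first."""
    counts = frame.isnull().sum()
    report = pd.DataFrame({
        'missing': counts,
        'percent': (counts / len(frame) * 100).round(1),
    })
    return report.sort_values('missing', ascending=False)

# Toy data for illustration only.
toy = pd.DataFrame({
    'total_cases': [1.0, 2.0, np.nan, 4.0],
    'total_vaccinations': [np.nan, np.nan, np.nan, 8.0],
})
report = missing_report(toy)
print(report)
```

On the real dataset, `missing_report(df).head(10)` would list the ten sparsest columns.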

## 3. Data Cleaning
Filter relevant countries and prepare columns.

In [None]:
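countries = ['Kenya', 'United States', 'India']
# .copy() avoids SettingWithCopyWarning when columns are modified below
df = df[df['location'].isin(countries)].copy()
df['date'] = pd.to_datetime(df['date'])
df = df.dropna(subset=['date', 'total_cases', 'total_deaths'])
# Forward-fill within each country so values never leak across countries;
# fillna(method='ffill') is deprecated in recent pandas releases
df = df.sort_values(['location', 'date'])
filled = df.groupby('location').ffill()
df[filled.columns] = filled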
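
Forward-filling the whole frame at once can carry one country's last value into the first rows of the next country. The toy example below (made-up numbers, not taken from the dataset) sketches the difference between a global fill and a per-country fill:

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only.
toy = pd.DataFrame({
    'location': ['India', 'India', 'Kenya', 'Kenya'],
    'date': pd.to_datetime(['2021-01-01', '2021-01-02', '2021-01-01', '2021-01-02']),
    'total_vaccinations': [500.0, np.nan, np.nan, 7.0],
})
toy = toy.sort_values(['location', 'date'])

# Global forward-fill: Kenya's first row inherits India's last value.
global_fill = toy['total_vaccinations'].ffill()

# Group-wise forward-fill: gaps are filled only from the same country's earlier rows.
grouped_fill = toy.groupby('location')['total_vaccinations'].ffill()

print(pd.DataFrame({'global': global_fill, 'grouped': grouped_fill}))
```

In the grouped version, Kenya's first row stays missing because there is no earlier Kenyan value to carry forward.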

## 4. Exploratory Data Analysis (EDA)
Visualize total cases, deaths and compute death rate.

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

# Total cases over time
plt.figure(figsize=(12, 6))
for country in countries:
    temp = df[df['location'] == country]
    plt.plot(temp['date'], temp['total_cases'], label=country)
plt.title('Total COVID-19 Cases Over Time')
plt.xlabel('Date')
plt.ylabel('Total cases')
plt.legend()
plt.show()

In [None]:
# Total deaths over time
plt.figure(figsize=(12, 6))
for country in countries:
    temp = df[df['location'] == country]
    plt.plot(temp['date'], temp['total_deaths'], label=country)
plt.title('Total COVID-19 Deaths Over Time')
plt.xlabel('Date')
plt.ylabel('Total deaths')
plt.legend()
plt.show()

In [None]:
# Death rate: cumulative deaths divided by cumulative cases
df['death_rate'] = df['total_deaths'] / df['total_cases']
df[['location', 'date', 'death_rate']].tail()
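
The ratio above is undefined for rows where `total_cases` is zero. A minimal sketch that guards against this and expresses the result as a percentage; `case_fatality_rate` is a hypothetical helper and the numbers are made up:

```python
import numpy as np
import pandas as pd

def case_fatality_rate(frame):
    """Cumulative deaths as a percentage of cumulative cases; NaN where cases are 0."""
    cases = frame['total_cases'].replace(0, np.nan)
    return frame['total_deaths'] / cases * 100

# Made-up values for illustration only.
toy = pd.DataFrame({
    'total_cases': [0.0, 200.0, 1000.0],
    'total_deaths': [0.0, 4.0, 15.0],
})
toy['death_rate_pct'] = case_fatality_rate(toy)
print(toy)
```

Replacing zero with `NaN` before dividing keeps such rows out of later means and plots.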

## 5. Visualizing Vaccination Progress
Plot the cumulative number of vaccine doses administered in each country.

In [None]:
plt.figure(figsize=(12, 6))
for country in countries:
    temp = df[df['location'] == country]
    plt.plot(temp['date'], temp['total_vaccinations'], label=country)
plt.title('Cumulative Vaccinations Over Time')
plt.xlabel('Date')
plt.ylabel('Total vaccine doses')
plt.legend()
plt.show()
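
Absolute dose counts favor populous countries, so a per-capita view may make rollout speed easier to compare. The sketch below assumes the frame carries a `population` column (an assumption about the dataset schema). `doses_per_hundred` is a hypothetical helper, and the figures are round made-up numbers rather than real data:

```python
import pandas as pd

def doses_per_hundred(frame):
    """Cumulative doses administered per 100 people."""
    return frame['total_vaccinations'] / frame['population'] * 100

# Round, made-up numbers for illustration only.
toy = pd.DataFrame({
    'location': ['Country A', 'Country B'],
    'total_vaccinations': [5_000_000.0, 700_000_000.0],
    'population': [50_000_000.0, 1_400_000_000.0],
})
toy['doses_per_hundred'] = doses_per_hundred(toy)
print(toy[['location', 'doses_per_hundred']])
```

Plotting the per-hundred column instead of `total_vaccinations` would put all three countries on a comparable scale.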

## 6. Choropleth Map (Optional)
Map the most recent total case count. Only the three countries kept in Section 3 are colored.

In [None]:
import plotly.express as px

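# Take each country's most recent row; filtering on the global maximum date
# drops any country that has no row on that exact day
latest = df.sort_values('date').groupby('location').tail(1)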
fig = px.choropleth(latest,
                    locations='location',
                    locationmode='country names',
                    color='total_cases',
                    title='Total COVID-19 Cases by Country')
fig.show()

## 7. Insights & Conclusion
1. The United States had the highest total cases and deaths among the three selected countries.
2. India's case count rose sharply during its second wave, which peaked around April to May 2021.
3. Kenya had a slower vaccine rollout than the United States and India.
4. Death rates (total deaths divided by total cases) differed across the three countries, possibly because of differences in healthcare capacity and early detection.
5. Vaccination drives appear to coincide with a drop in new cases after mid-2021, although this notebook plots only cumulative totals and does not measure that relationship directly.
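
Insight 5 could be checked numerically rather than by eye. The sketch below computes, per country, the Pearson correlation between vaccination coverage and smoothed daily new cases. It assumes the dataset provides `people_vaccinated_per_hundred` and `new_cases` columns. `vaccination_case_correlation` is a hypothetical helper, and the data is synthetic. A correlation does not by itself show that vaccination caused a decline.

```python
import pandas as pd

def vaccination_case_correlation(frame, vacc_col='people_vaccinated_per_hundred',
                                 cases_col='new_cases', window=7):
    """Per-country Pearson correlation between coverage and rolling-mean new cases."""
    results = {}
    for country, group in frame.sort_values('date').groupby('location'):
        smoothed = group[cases_col].rolling(window, min_periods=1).mean()
        results[country] = group[vacc_col].corr(smoothed)
    return pd.Series(results, name='correlation')

# Synthetic data: coverage rises linearly while new cases fall linearly.
toy = pd.DataFrame({
    'location': ['Exampleland'] * 10,
    'date': pd.date_range('2021-06-01', periods=10),
    'people_vaccinated_per_hundred': [float(i) for i in range(10)],
    'new_cases': [100.0 - 10 * i for i in range(10)],
})
correlations = vaccination_case_correlation(toy, window=1)
print(correlations)
```

A value near -1 would indicate that new cases fell as coverage rose, while a value near 0 would indicate no linear relationship.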

**Conclusion:**
This analysis compares the course of COVID-19 in Kenya, the United States, and India, highlighting disparities in vaccination and outcomes. Visualizing these trends makes them easier to communicate and can help inform health policy.