# COVID-19 Vaccination Progress Analysis

This notebook analyzes COVID-19 case and vaccination data from the Our World in Data (OWID) dataset. It first looks at daily and monthly reported cases, then at the vaccination rollout across selected countries (Kenya, Uganda, Tanzania, and South Africa), focusing on cumulative doses and the percentage of the population vaccinated over time.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('owid-covid-data.csv', parse_dates=['date'])
print(df.columns)
# print(df.head())
# print(df.describe())

# Keep country-level rows only; OWID aggregate rows such as 'World' have no continent
df = df[df['continent'].notnull()]

# Filter for relevant columns
columns_to_keep = [
    'location', 'date', 'total_cases', 'new_cases', 'total_deaths',
    'new_deaths', 'total_vaccinations', 'people_fully_vaccinated',
    'people_vaccinated_per_hundred', 'population',
]

df = df[columns_to_keep]
df = df.dropna(subset=['new_cases'])

## Daily COVID-19 Cases in Kenya (March 2021)

In [None]:
# Parameters: edit these to inspect a different country or month
country = 'Kenya'
month = '2021-03'

# Filter data
daily_month = df[(df['location']==country) & (df['date'].dt.strftime('%Y-%m')== month)]

# plot
plt.figure(figsize=(12,6))
plt.plot(daily_month['date'], daily_month['new_cases'],marker='o')
plt.title(f'Daily COVID-19 Cases in {country} - {month}')
plt.xlabel('Date')
plt.ylabel('New Cases')
plt.xticks(rotation=45)
plt.grid(True)
plt.tight_layout()
plt.show()

### Insight:
- Kenya experienced a noticeable surge in daily reported cases in March 2021.
- The trend shows an upward trajectory starting mid-March, indicating the onset of a COVID-19 wave.
- This period corresponds to the start of Kenya’s third wave, driven by community transmission and the emergence of variants.
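
Daily counts are noisy, so the mid-March upturn described above is easier to confirm on a smoothed series. The following is a minimal sketch, assuming a frame with the `location`, `date`, and `new_cases` columns kept earlier; the helper name `rolling_mean_cases` and the 7-day window are illustrative choices, not part of the original analysis.

```python
import pandas as pd

def rolling_mean_cases(frame: pd.DataFrame, country: str, window: int = 7) -> pd.Series:
    """Trailing rolling mean of new_cases for one country, indexed by date."""
    series = (
        frame.loc[frame['location'] == country, ['date', 'new_cases']]
        .sort_values('date')
        .set_index('date')['new_cases']
    )
    # min_periods=window leaves the first window-1 days as NaN instead of
    # averaging over fewer days than requested
    return series.rolling(window=window, min_periods=window).mean()
```

Plotting `rolling_mean_cases(df, 'Kenya')` next to the raw daily line would show whether the rise persists once day-of-week reporting effects are averaged out.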

## Monthly COVID-19 Cases in Kenya (2021)

In [None]:
# Monthly Cases in a Year in One Country
year = 2021

# Filter and group
monthly = df[(df['location']== country) & (df['date'].dt.year==year)]
monthly_grouped = monthly.groupby(monthly['date'].dt.to_period('M'))['new_cases'].sum().reset_index()
monthly_grouped['date'] = monthly_grouped['date'].dt.to_timestamp()

# Plot
plt.figure(figsize=(12, 6))
plt.bar(monthly_grouped['date'], monthly_grouped['new_cases'], width=20)
plt.title(f'Monthly COVID-19 Cases in {country} - {year}')
plt.xlabel('Month')
plt.ylabel('New Cases')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

### Insight:
- Peaks were observed in March and August 2021, reflecting Kenya's third and fourth waves.
- The highest number of monthly cases was in August, likely due to the Delta variant.
- Cases dropped significantly after September, likely influenced by increased public awareness and initial vaccination efforts.
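
The peak months named above can be read off the data rather than the chart. This is a small sketch under the same column assumptions (`location`, `date`, `new_cases`); `top_months` is a hypothetical helper introduced here for illustration.

```python
import pandas as pd

def top_months(frame: pd.DataFrame, country: str, year: int, n: int = 2) -> pd.Series:
    """Return the n months with the highest summed new_cases for one country and year."""
    rows = frame[(frame['location'] == country) & (frame['date'].dt.year == year)]
    # Group by calendar month (a monthly Period) and total the daily counts
    monthly = rows.groupby(rows['date'].dt.to_period('M'))['new_cases'].sum()
    return monthly.nlargest(n)
```

For example, `top_months(df, 'Kenya', 2021)` returns the two largest monthly totals with their months as the index, which can be compared directly with the bars in the chart.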


## Daily COVID-19 Cases in Selected Countries (March 1–31, 2021)

In [None]:
# Daily Comparison Between Two or More Countries
# Parameters
countries = ['Kenya', 'Uganda', 'Tanzania', 'South Africa']
start_date = '2021-03-01'
end_date = '2021-03-31'

compare_daily = df[
    (df['location'].isin(countries))&
    (df['date']>= start_date) &
    (df['date']<= end_date)
]

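pivot_daily = compare_daily.pivot(index='date', columns='location', values='new_cases')
# Ensure every requested country has a column, even if it reported nothing in the window
pivot_daily = pivot_daily.reindex(columns=countries)

# Plot
plt.figure(figsize=(12,6))
# Use a separate loop variable so the `country` parameter set earlier is not overwritten
for name in countries:
    plt.plot(pivot_daily.index, pivot_daily[name], label=name)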
plt.title(f'Daily COVID-19 Cases: {start_date} to {end_date}')
plt.xlabel('Date')
plt.ylabel('New Cases')
plt.legend()
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

### Insight:
- South Africa consistently reported the highest number of daily cases compared to Kenya, Uganda, and Tanzania.
- Kenya's daily cases increased steadily, showing a sharper rise than Uganda or Tanzania.
- Tanzania showed few or no reported cases, likely because of missing or incomplete reporting during that period.
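
Whether a flat line means "no cases" or "no data" can be checked by counting reported days per country. The sketch below assumes the same `location`, `date`, and `new_cases` columns; `reporting_summary` and its output column names are hypothetical, introduced only for this illustration.

```python
import pandas as pd

def reporting_summary(frame: pd.DataFrame, countries: list, start: str, end: str) -> pd.DataFrame:
    """Per-country count of reported days, days with at least one case, and total cases."""
    in_window = frame['date'].between(pd.Timestamp(start), pd.Timestamp(end))
    window = frame[frame['location'].isin(countries) & in_window]
    grouped = window.groupby('location')['new_cases']
    summary = pd.DataFrame({
        'days_reported': grouped.count(),  # rows with a non-missing value
        'days_with_cases': window['new_cases'].gt(0).groupby(window['location']).sum(),
        'total_cases': grouped.sum(),
    })
    # Countries with no rows at all in the window appear as all-NaN rows
    return summary.reindex(countries)
```

A country that shows up as an all-NaN row had no rows in the window at all, which points to missing reporting rather than a genuine absence of cases.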


## Monthly COVID-19 Case Comparison (2021)

In [None]:
# Monthly comparison between two or more countries (uses `year` set above)
# Filter to the selected countries and year; copy so adding a column does not touch `df`
monthly_compare = df[df['location'].isin(countries) & (df['date'].dt.year == year)].copy()
monthly_compare['month'] = monthly_compare['date'].dt.to_period('M')
monthly_group = monthly_compare.groupby(['month','location'])['new_cases'].sum().reset_index()
monthly_group['month'] = monthly_group['month'].dt.to_timestamp()

# Pivot
pivot_monthly = monthly_group.pivot(index='month', columns='location', values='new_cases')

# Plot
plt.figure(figsize=(12,6))
# Separate loop variable so the `country` parameter is not overwritten
for name in countries:
    plt.plot(pivot_monthly.index, pivot_monthly[name], marker='o', label=name)
plt.title(f'Monthly COVID-19 Case Comparison - {year}')
plt.xlabel('Month')
plt.ylabel('New Cases')
plt.legend()
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

### Insight:
- Kenya's line repeats the pattern from its single-country chart: peaks in March and August 2021, reflecting its third and fourth waves.
- Kenya's highest monthly total was in August, likely due to the Delta variant.
- Kenya's cases dropped significantly after September, likely influenced by increased public awareness and initial vaccination efforts.
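
Raw monthly totals favour larger countries, so a per-capita view can make the comparison fairer. This is a minimal sketch that uses the `population` column kept in the first cell; the helper `monthly_cases_per_100k` is hypothetical, and it assumes population is constant per country within the frame.

```python
import pandas as pd

def monthly_cases_per_100k(frame: pd.DataFrame, countries: list, year: int) -> pd.DataFrame:
    """Monthly new cases per 100,000 people, one column per country."""
    rows = frame[frame['location'].isin(countries) & (frame['date'].dt.year == year)]
    month = rows['date'].dt.to_period('M').rename('month')
    monthly = (
        rows.groupby(['location', month])
        .agg(new_cases=('new_cases', 'sum'), population=('population', 'first'))
        .reset_index()
    )
    monthly['cases_per_100k'] = monthly['new_cases'] / monthly['population'] * 100_000
    # Timestamps plot more conveniently than Period values
    monthly['month'] = monthly['month'].dt.to_timestamp()
    return monthly.pivot(index='month', columns='location', values='cases_per_100k')
```

The result has the same shape as the pivot plotted above, so it can be passed through the same plotting loop with the y-axis relabelled as cases per 100,000.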


## Cumulative Vaccination Progress Over Time

In [None]:
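# List of countries to compare
countries = ['Kenya', 'Uganda', 'Tanzania', 'South Africa']

# Keep only the selected countries and the rows with a reported vaccination total.
# A new variable is used so `df` stays intact for the cells that follow.
vacc_data = df[df['location'].isin(countries)].dropna(subset=['total_vaccinations'])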

# Pivot the data for easy plotting
pivot_vacc = vacc_data.pivot(index='date', columns='location', values='total_vaccinations')

# Plotting
plt.figure(figsize=(14,7))

# Separate loop variable so the `country` parameter is not overwritten
for name in countries:
    plt.plot(pivot_vacc.index, pivot_vacc[name], label=name)
plt.title('Cumulative COVID-19 Vaccination Over Time')
plt.xlabel('Date')
plt.ylabel('Total Vaccinations')
plt.legend()
plt.xticks(rotation=45)
plt.grid(True)
plt.tight_layout()
plt.show()

### Insight:
- South Africa had the most consistent and rapid vaccine rollout among the selected countries.
- Kenya followed with moderate progress, while Uganda and Tanzania administered fewer total doses.
- Acceleration in mid-2021 likely corresponds with global vaccine shipments and COVAX efforts.
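
Cumulative totals are not reported every day, so a pivoted table has gaps wherever a country skipped a date, and the plotted lines break there. One possible way to handle this, sketched below, is to carry the last reported total forward. The helper `cumulative_vaccinations` is hypothetical, and forward-filling is an assumption that suits a cumulative series but would not suit daily counts.

```python
import pandas as pd

def cumulative_vaccinations(frame: pd.DataFrame, countries: list) -> pd.DataFrame:
    """Wide table of total_vaccinations by date, with gaps forward-filled per country."""
    rows = frame[frame['location'].isin(countries)].dropna(subset=['total_vaccinations'])
    wide = rows.pivot(index='date', columns='location', values='total_vaccinations').sort_index()
    # reindex guarantees a column per requested country; ffill carries the last
    # reported total forward and leaves dates before the first report as NaN
    return wide.reindex(columns=countries).ffill()
```

Dates before a country's first report stay empty, so a late start such as Tanzania's remains visible in the chart instead of being drawn as zero.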

## Percentage of Population Vaccinated

In [None]:
# Compare % of population vaccinated (at least one dose)
# Choose countries
countries = ['Kenya', 'Uganda', 'Tanzania', 'South Africa']

# Filter into a new frame so `df` is not overwritten
df_vax = df.loc[df['location'].isin(countries),
                ['location', 'date', 'people_vaccinated_per_hundred']]
# `subset` takes a list of column labels
df_vax = df_vax.dropna(subset=['people_vaccinated_per_hundred'])

# Pivot for plotting
pivot_percent = df_vax.pivot(index='date', columns='location', values='people_vaccinated_per_hundred')

plt.figure(figsize=(14,7))

# Separate loop variable so the `country` parameter is not overwritten
for name in countries:
    plt.plot(pivot_percent.index, pivot_percent[name], label=name)

plt.title('COVID-19 Vaccination Progress (% of Population Vaccinated)')
plt.xlabel('Date')
plt.ylabel('% Vaccinated (At Least One Dose)')
plt.legend()
plt.grid(True)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

### Insight:
- South Africa leads in % vaccinated, reaching over 40%.
- Tanzania had a delayed and minimal uptake, staying under 10% for much of the period shown.
- Kenya had a rapid rise between August and October 2021, possibly tied to new public health campaigns.
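
The coverage levels quoted above can be checked against the most recent reported value for each country. This is a small sketch assuming the `location`, `date`, and `people_vaccinated_per_hundred` columns; `latest_percent_vaccinated` is a hypothetical helper name.

```python
import pandas as pd

def latest_percent_vaccinated(frame: pd.DataFrame, countries: list) -> pd.DataFrame:
    """Most recent non-missing people_vaccinated_per_hundred value per country."""
    column = 'people_vaccinated_per_hundred'
    rows = frame[frame['location'].isin(countries)].dropna(subset=[column])
    # After sorting by date, the last row of each group is the latest report
    latest = rows.sort_values('date').groupby('location').tail(1)
    return latest.set_index('location')[['date', column]].reindex(countries)
```

Keeping the `date` column alongside the percentage shows how current each figure is, which matters when countries stopped reporting at different times.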

## Anomalies & Notable Patterns

### Anomalies:
- Tanzania reported no data early on, reflecting national policies.
- Uganda had flat lines likely due to inconsistent data collection.
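
Flat stretches and late starts like these can be quantified by measuring the time between consecutive reports. The sketch below is one way to do that, assuming `location`, `date`, and `total_vaccinations` columns; `longest_reporting_gap` is a hypothetical helper, not something the dataset provides.

```python
import pandas as pd

def longest_reporting_gap(frame: pd.DataFrame, countries: list,
                          column: str = 'total_vaccinations') -> pd.Series:
    """Longest interval between consecutive non-missing reports, per country."""
    rows = (
        frame[frame['location'].isin(countries)]
        .dropna(subset=[column])
        .sort_values('date')
    )
    # Difference between each report date and the previous one in the same country
    gaps = rows.groupby('location')['date'].diff()
    # Countries with fewer than two reports come back as NaT
    return gaps.groupby(rows['location']).max().reindex(countries)
```

A long maximum gap suggests irregular data collection, while a country missing from the data entirely comes back as `NaT`.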

### Interesting Pattern:
- Spikes in vaccination often came just before or during COVID-19 waves.
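
To look at this pattern more closely, monthly doses can be placed next to monthly cases. The following is a rough sketch, assuming a frame that still carries `location`, `date`, `new_cases`, and `total_vaccinations`. The helper `monthly_cases_and_doses` is hypothetical, and it treats the cumulative total as 0 before a country's first report, which is an assumption rather than something the data states.

```python
import pandas as pd

def monthly_cases_and_doses(frame: pd.DataFrame, country: str) -> pd.DataFrame:
    """Monthly new cases next to monthly doses derived from cumulative totals."""
    rows = frame[frame['location'] == country].sort_values('date')
    month = rows['date'].dt.to_period('M').rename('month')
    cases = rows.groupby(month)['new_cases'].sum()
    # Last reported cumulative total in each month (NaN if nothing was reported)
    month_end_total = rows.groupby(month)['total_vaccinations'].last()
    # Carry totals forward across silent months, assume 0 before the first report,
    # then difference to get doses per month (the first month is left undefined)
    doses = month_end_total.ffill().fillna(0).diff()
    return pd.DataFrame({'new_cases': cases, 'new_doses': doses})
```

Plotting the two columns on twin axes, or shifting one by a month before comparing, would be one way to see whether dose increases lead or coincide with case waves.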

## Conclusion

Vaccination strategies and outcomes varied by country. South Africa led in both speed and scale. Kenya and Uganda improved steadily. Tanzania lagged in both timing and coverage, likely due to initial vaccine hesitancy.

Understanding these trends can help shape better preparedness and equitable vaccine distribution in future public health crises.