# COVID-19 Data Analysis

This notebook explores a dataset of global COVID-19 case and death counts: summary statistics, totals by continent, and correlations between the numeric columns.

In [None]:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
file_path = 'COVID-19 Coronavirus.csv'
data = pd.read_csv(file_path)

# Summary statistics
print(data.describe())

# Cases and deaths by continent
continent_impact = data.groupby('Continent')[['Total Cases', 'Total Deaths']].sum()
continent_impact.plot(kind='bar', figsize=(10, 6))
plt.title('COVID-19 Total Cases and Deaths by Continent')
plt.ylabel('Count')
plt.xlabel('Continent')
plt.xticks(rotation=45)
plt.show()

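# Correlation heatmap (numeric columns only; text columns such as 'Continent' cannot be correlated)
correlation_matrix = data.corr(numeric_only=True)
# sns.heatmap has no figsize argument, so size the figure first
plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")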
plt.title('Correlation Matrix')
plt.show()
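
The grouping and correlation steps above can be checked without the plots. The following is a minimal sketch, not the notebook's actual output: it runs the same `groupby` and a numeric-only correlation on a small made-up DataFrame. The helper names `continent_totals` and `numeric_correlation`, the `Country` column, and all values in `sample` are assumptions for illustration; only `Continent`, `Total Cases`, and `Total Deaths` come from the code above.

```python
import pandas as pd


def continent_totals(df: pd.DataFrame) -> pd.DataFrame:
    """Sum 'Total Cases' and 'Total Deaths' for each continent."""
    return df.groupby('Continent')[['Total Cases', 'Total Deaths']].sum()


def numeric_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation over numeric columns only (text columns are dropped)."""
    return df.select_dtypes(include='number').corr()


# Made-up illustrative rows, NOT taken from the real CSV.
sample = pd.DataFrame({
    'Country': ['A', 'B', 'C', 'D'],  # hypothetical column name
    'Continent': ['Africa', 'Africa', 'Europe', 'Europe'],
    'Total Cases': [100, 300, 1000, 2000],
    'Total Deaths': [1, 3, 10, 20],
})

totals = continent_totals(sample)
corr = numeric_correlation(sample)

print(totals)
print(corr)
```

Selecting numeric columns explicitly keeps the correlation step from failing when the file contains text columns such as country or continent names.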
    

## Key Findings


1. **Summary Statistics**:
   - Total cases per country range up to 81.8 million.
   - Cases per million population vary widely across countries.

2. **Continent Analysis**:
   - Africa has the most countries in the dataset (58).
   - Total cases and deaths are higher in regions with larger populations and more testing.

3. **Correlation Insights**:
   - Strong correlation between Total Cases and Total Deaths.
   - Population shows a moderate correlation with cases and deaths.
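
Some of the findings above (countries per continent, cases per million, the correlation with population) are not computed by the code cell shown earlier. As a hedged sketch of how they could be derived, the snippet below uses a small made-up DataFrame. The column names `Country` and `Population` are assumptions about the CSV, and the numbers are illustrative only, not the dataset's values.

```python
import pandas as pd

# Made-up illustrative rows, NOT taken from the real CSV.
# 'Country' and 'Population' are assumed column names.
sample = pd.DataFrame({
    'Country': ['A', 'B', 'C', 'D', 'E'],
    'Continent': ['Africa', 'Africa', 'Africa', 'Europe', 'Europe'],
    'Population': [1_000_000, 2_000_000, 4_000_000, 5_000_000, 10_000_000],
    'Total Cases': [500, 1_000, 2_000, 50_000, 100_000],
    'Total Deaths': [5, 10, 20, 500, 1_000],
})

# Number of distinct countries per continent, largest first.
countries_per_continent = (
    sample.groupby('Continent')['Country']
    .nunique()
    .sort_values(ascending=False)
)

# Cases per million population, to compare countries of different sizes.
sample['Cases per Million'] = sample['Total Cases'] * 1_000_000 / sample['Population']

# Largest single-country case count (the "range up to" figure in the summary).
max_cases = sample['Total Cases'].max()

# Correlation of population with cases and deaths.
population_corr = (
    sample[['Population', 'Total Cases', 'Total Deaths']]
    .corr()
    .loc['Population', ['Total Cases', 'Total Deaths']]
)

print(countries_per_continent)
print(sample[['Country', 'Cases per Million']])
print(max_cases)
print(population_corr)
```

Run against the real file, the same expressions would apply to `data` in place of `sample`, provided the assumed column names match the CSV header.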
