# COVID-19 Temporal Analysis

This notebook analyzes the temporal evolution of COVID-19 cases by country using data visualization techniques with Pandas and Plotly.

## Objective
- Load and process COVID-19 time series data
- Create monthly aggregations by country
- Visualize the evolution of cases over time
- Compare trends between different countries

## Requirements
- pandas
- plotly

## Step 1: Import Required Libraries

In [None]:
import pandas as pd
import plotly.express as px

print("Libraries imported successfully")

## Step 2: Load COVID-19 Data

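Load the CSV file containing the COVID-19 time series data. The code expects a file named `covid.csv` in the same directory as this notebook, with at least the columns `country`, `date`, and `total_cases`, which the later steps rely on.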

In [None]:
# Load the CSV file
df = pd.read_csv("covid.csv")

# Display basic information about the dataset
print(f"Dataset shape: {df.shape}")
print(f"Columns: {df.columns.tolist()}")

# Display first few rows
df.head()
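
Because every later step depends on the `country`, `date`, and `total_cases` columns, it can help to check for them right after loading. The following is a minimal sketch, not part of the original notebook: the helper `missing_columns` and the tiny in-memory frame are hypothetical stand-ins for the real CSV.

```python
import pandas as pd

# Columns used by the later steps of this notebook.
REQUIRED_COLUMNS = {"country", "date", "total_cases"}


def missing_columns(frame, required=REQUIRED_COLUMNS):
    """Return the required column names absent from `frame`, sorted (hypothetical helper)."""
    return sorted(set(required) - set(frame.columns))


# Made-up stand-in for the loaded CSV; note the wrongly named 'cases' column.
sample = pd.DataFrame({"country": ["A"], "date": ["2020-03-01"], "cases": [5]})

missing = missing_columns(sample)
if missing:
    print(f"Missing required columns: {missing}")
```

In the notebook itself, the same check could be run as `missing_columns(df)` before moving on to Step 3.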

## Step 3: Data Preparation

Convert the `date` column to datetime format and add a `year_month` period column that the monthly grouping in the next step uses.

In [None]:
# Convert the 'date' column to datetime format
df['date'] = pd.to_datetime(df['date'])

# Create column for grouping by year and month
df['year_month'] = df['date'].dt.to_period('M')

print("Dates converted successfully")
print(f"Date range: {df['date'].min()} to {df['date'].max()}")

# Verify the new columns
print(f"Updated columns: {df.columns.tolist()}")
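
If the CSV might contain malformed dates, one option is to coerce them to `NaT` and drop those rows before building the period column. This is a sketch under that assumption, using made-up values; whether dropping rows is appropriate depends on the actual data.

```python
import pandas as pd

# Made-up raw values; the last entry is deliberately not a valid date.
raw = pd.DataFrame({"date": ["2020-03-15", "2020-04-01", "not a date"]})

# errors="coerce" turns unparseable values into NaT instead of raising.
raw["date"] = pd.to_datetime(raw["date"], errors="coerce")
n_bad = int(raw["date"].isna().sum())
print(f"Rows with unparseable dates: {n_bad}")

# Keep only rows with a valid date, then derive the monthly period.
clean = raw.dropna(subset=["date"]).copy()
clean["year_month"] = clean["date"].dt.to_period("M")
print(clean)
```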

## Step 4: Monthly Aggregation

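Group data by country and month to create monthly summaries of COVID-19 cases. Assuming `total_cases` is a cumulative count, the monthly maximum corresponds to the running total at the end of each month.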

In [None]:
# Group by country and month, taking the maximum value of cases in that month
df_monthly = df.groupby(['country', 'year_month'])['total_cases'].max().reset_index()

# Convert 'year_month' to timestamp for proper plotting
df_monthly['date'] = df_monthly['year_month'].dt.to_timestamp()

print(f"Month range: {df_monthly['year_month'].min()} to {df_monthly['year_month'].max()}")
print(f"Monthly data shape: {df_monthly.shape}")

# Example: Show data for El Salvador
example_sv = df_monthly[df_monthly['country'] == 'El Salvador']
print("\nExample - El Salvador data:")
print(example_sv[['country', 'date', 'year_month', 'total_cases']].head())
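
The monthly maximum gives a cumulative total per month. If new cases per month are also of interest, they could be derived by differencing within each country. The sketch below uses made-up numbers and assumes `total_cases` is cumulative; the `new_cases` column is a hypothetical addition, and treating the first month's total as that month's new cases is also an assumption.

```python
import pandas as pd

# Made-up cumulative counts for two placeholder countries.
daily = pd.DataFrame({
    "country": ["A", "A", "A", "A", "B", "B"],
    "date": pd.to_datetime([
        "2020-03-10", "2020-03-31", "2020-04-15", "2020-04-30",
        "2020-03-31", "2020-04-30",
    ]),
    "total_cases": [10, 50, 80, 120, 5, 5],
})
daily["year_month"] = daily["date"].dt.to_period("M")

# Same aggregation as in the notebook: cumulative total at month end.
monthly = (
    daily.groupby(["country", "year_month"])["total_cases"]
    .max()
    .reset_index()
    .sort_values(["country", "year_month"])
)

# Month-over-month difference within each country.
monthly["new_cases"] = monthly.groupby("country")["total_cases"].diff()
# The first month of each country has no predecessor; use its total (assumption).
monthly["new_cases"] = monthly["new_cases"].fillna(monthly["total_cases"])
print(monthly)
```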

## Step 5: Country Selection and Filtering

Select specific countries for analysis and verify they exist in the dataset.

In [None]:
# Countries to analyze (you can modify these)
countries_to_analyze = ['El Salvador', 'Guatemala', 'Honduras']

# Verify that the countries exist in the dataset
print("Verifying selected countries:")
available_countries = set(df_monthly['country'].unique())  # computed once, outside the loop
found_countries = []

for country in countries_to_analyze:
    if country in available_countries:
        found_countries.append(country)
        print(f"Found: {country}")
    else:
        print(f"Not found: {country}")

# Filter data for found countries
df_countries = df_monthly[df_monthly['country'].isin(found_countries)].copy()

print(f"\nFiltered data: {len(df_countries)} records for {len(found_countries)} countries")
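
When a country is reported as not found, the cause is often a spelling difference between the requested name and the dataset. One way to surface likely candidates is the standard library's `difflib.get_close_matches`. This is a sketch only: `suggest_countries` is a hypothetical helper, and the country list below stands in for `df_monthly['country'].unique()`.

```python
import difflib


def suggest_countries(requested, available, cutoff=0.6):
    """Map each requested name missing from `available` to its close matches (hypothetical helper)."""
    available = list(available)
    return {
        name: difflib.get_close_matches(name, available, n=3, cutoff=cutoff)
        for name in requested
        if name not in available
    }


# Stand-in for the country names present in the dataset.
available = ["El Salvador", "Guatemala", "Honduras"]

# 'El Salvdor' is a deliberate typo; 'Atlantis' has no plausible match.
suggestions = suggest_countries(["El Salvdor", "Guatemala", "Atlantis"], available)
for name, candidates in suggestions.items():
    print(f"Not found: {name}. Close matches: {candidates}")
```

Names that match exactly are skipped, so the result only lists entries that need attention.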

## Step 6: Create Visualization

Generate an interactive line chart showing the monthly evolution of COVID-19 cases by country.

In [None]:
# Create the monthly evolution chart with Plotly Express
fig = px.line(df_countries,
              x='date',
              y='total_cases',
              color='country',
              title='Monthly Evolution of Total COVID-19 Cases by Country',
              labels={'date': 'Month', 'total_cases': 'Total Cases', 'country': 'Country'})

fig.update_layout(xaxis_title="Month",
                  yaxis_title="Total Cases",
                  hovermode="x unified")

fig.show()