## World Map of COVID-19 Cases

This section builds an interactive **world map** of the COVID-19 data.  
The goal is to show each country's total reported cases relative to its population.

### Key steps:
- Converted the date column to a proper datetime format.
- Removed aggregate rows (`iso_code` starting with `OWID_`) so that only individual countries remain.
- Forward-filled cumulative metrics within each country, so the latest row carries the last reported value.
- Selected the latest available observation for each country.
- Computed **cases per capita** to normalize values by population.
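
The steps above can be sketched on a small toy table before running them on the full dataset. This is a minimal illustration, not the notebook's actual data: the ISO codes `AAA` and `BBB` and all numbers are hypothetical, and only the column names are taken from the dataset.

```python
import pandas as pd

# Toy data with hypothetical values; column names follow the notebook's dataset.
toy = pd.DataFrame({
    "iso_code": ["AAA", "AAA", "AAA", "BBB", "BBB", "OWID_WRL"],
    "date": ["2021-01-01", "2021-01-02", "2021-01-03",
             "2021-01-01", "2021-01-02", "2021-01-02"],
    "total_cases": [10.0, 20.0, None, 5.0, None, 25.0],
    "population": [1000, 1000, 1000, 500, 500, 1500],
})
toy["date"] = pd.to_datetime(toy["date"], errors="coerce")

# Drop aggregate rows; na=False treats a missing iso_code as "not an aggregate".
toy = toy[~toy["iso_code"].str.startswith("OWID_", na=False)].copy()
toy = toy.sort_values(["iso_code", "date"])

# Carry the last reported cumulative value forward within each country.
toy["total_cases"] = toy.groupby("iso_code")["total_cases"].ffill()

# Keep the most recent row per country and normalize by population.
latest = toy.loc[toy.groupby("iso_code")["date"].idxmax()].copy()
latest["cases_per_capita"] = latest["total_cases"] / latest["population"]

print(latest[["iso_code", "total_cases", "cases_per_capita"]])
```

Without the forward fill, the latest row of both toy countries would have a missing `total_cases` value and would be dropped from the map.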

In [1]:
# --- Imports ---
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
import plotly.express as px
import plotly.graph_objects as go

In [8]:
# --- Load data ---
file_path = "../Data/Covid_19_dataset.csv"
df = pd.read_csv(file_path)

# --- Convert date to datetime ---
df["date"] = pd.to_datetime(df["date"], errors="coerce")

# Project color palette (same for Seaborn & Plotly)
project_palette = ["#0474ed","#91f0fa","#08a29e","#a2b458","#a6cabd","#326164"]

# Seaborn will use this
palette_seaborn = project_palette * 2

# Plotly will use this
palette_plotly = project_palette

In [13]:
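# --- Latest snapshot per country ---
# Drop OWID aggregate rows; na=False keeps the mask boolean
# even if some iso_code values are missing.
df_c = df[~df["iso_code"].str.startswith("OWID_", na=False)].copy()
df_c = df_c.sort_values(["iso_code","date"])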

# Cumulative metrics only. hosp_patients is a daily count, not a running
# total, so it is not forward-filled.
cum_cols = [
    "total_cases","total_deaths","total_tests","total_vaccinations",
    "people_vaccinated","people_fully_vaccinated","total_boosters"
]
present = [c for c in cum_cols if c in df_c.columns]
df_c[present] = df_c.groupby("iso_code")[present].ffill()

# Index label of the most recent dated row in each country
idx = df_c.groupby("iso_code")["date"].idxmax()
last_rows = df_c.loc[idx].copy()

countries = last_rows[[
    "iso_code","continent","location","total_cases","population","latitude","longitude"
]].copy()
# Keep countries with a positive population and a reported case total
countries = countries[(countries["population"] > 0) & countries["total_cases"].notna()]
# Cumulative cases as a fraction of the population
countries["cases_per_capita"] = countries["total_cases"] / countries["population"]

# --- Map plot ---
fig = px.scatter_geo(
    countries,
    lat="latitude", lon="longitude",
    color="continent",
    hover_name="location",
    size="cases_per_capita", size_max=45,
    hover_data={
        "total_cases":":,", 
        "cases_per_capita":":.2%", 
        "population":":,",
        "latitude":False, "longitude":False
    },
    title="COVID-19: Total Reported Cases per Capita by Country",
    color_discrete_sequence=palette_plotly,
    labels={
        "continent": "Continent",
        "cases_per_capita": "Cases per Person",
        "total_cases": "Total Cases",
        "population": "Population"
    }
)

fig.update_geos(
    scope="world",
    projection_type="natural earth",
    showcountries=True, countrycolor="#555",
    showland=True, landcolor="#222",
    coastlinecolor="#555",
    bgcolor="#111"  # geo maps ignore plot_bgcolor, so set the map background here
)

fig.update_layout(
    width=1200, height=600,
    paper_bgcolor="#111", plot_bgcolor="#111",
    font_color="#eee",
    legend_title_text="Continent",
    margin=dict(l=0, r=0, t=60, b=0)
)

fig.show()

### Visualization details:
- **Plotly Express** `scatter_geo` was used to build the interactive map.
- Each country is drawn as a circle whose size reflects its *cases per capita*.
- Colors represent **continents** and are taken from the project palette.
- The tooltip shows the country name, total cases, cases per person, and population.
- The map uses a **dark theme** and displays country borders for better readability.
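
The tooltip formats `":,"` and `":.2%"` are d3-format specifiers. For these two cases they behave like Python's own format mini-language, so the expected tooltip text can be previewed without drawing the map. The sketch below uses hypothetical numbers for a single country.

```python
# Hypothetical values for one country, used only to illustrate the formats.
total_cases = 1234567
population = 10000000
cases_per_capita = total_cases / population

formatted_cases = format(total_cases, ",")        # thousands separators
formatted_share = format(cases_per_capita, ".2%")  # fraction shown as a percentage

print(formatted_cases)
print(formatted_share)
```

Because `cases_per_capita` is stored as a fraction, the percent format multiplies it by 100 for display and the underlying column stays unchanged.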

### Result:
The map provides a clear, interactive overview of the pandemic’s impact across countries, normalized by population size.  
It closes the third notebook with a comprehensive global perspective, setting the stage for further analysis.
