# 📊 Datawrapper — Creating Charts from USGS Earthquake Data

This notebook generates **two interactive charts** using the [Datawrapper Python API](https://datawrapper.readthedocs.io/en/latest/user-guide/api.html), based on seismic data retrieved from the USGS.

---

## 🔧 Configuration Parameters

- **`folder_id = "332403"`**  
  Datawrapper folder ID where both charts will be stored. Use this to organize outputs for this project within your Datawrapper workspace.

- **`input_dir = "data_input/USGS"`**  
  Directory containing USGS input data (e.g., shakemaps, QuakeML metadata, impact JSON). This is typically generated by the `get_usgs_data` script.

- **`distance_to_fault = 100`**  
  Maximum distance (in kilometers) from the fault line used to filter relevant aftershocks for visualization.

---

## 📈 Charts Produced

1. **Population Exposure by Shake Intensity and Country**  
   Shows how population is distributed across different levels of ground shaking.  
   → [Example: Myanmar earthquake – Datawrapper chart](https://datawrapper.dwcdn.net/dQpEY/17/)

2. **Aftershocks Timeline and Magnitude**  
   A time series showing the number and size of aftershocks following the main event.  
   → [Example: Myanmar earthquake – Datawrapper chart](https://www.datawrapper.de/_/UWXHh/?v=20)

---

## 📚 Resources on the Datawrapper Python Library

- [📘 Official documentation](https://datawrapper.readthedocs.io/en/latest/user-guide/api.html)  
  Comprehensive guide for authentication, chart creation, updates, and more.

- [📰 Introductory blog post](https://www.datawrapper.de/blog/datawrapper-python-package)  
  Quick overview of how to use the Datawrapper Python package with practical examples.

---

*Tip: You’ll need a Datawrapper API token in your environment to run this notebook. Make sure it's available under the `DATAWRAPPER_API_KEY` variable.*


In [1]:
import pandas as pd
from datawrapper import Datawrapper
import os
import geopandas as gpd

# User paramaters

### Configuration Parameters

- **`folder_id = "332403"`**  
  The ID of the Datawrapper folder where charts related to this analysis will be created or updated. This folder organizes all visualizations linked to the current project.

- **`input_dir = "data_input/USGS"`**  
  Path to the input directory containing USGS data files, from script "get_usgs_data".

- **`distance_to_fault = 100`**  
  Distance threshold (in kilometers) used to filter aftershocks near the main fault. Only events within 100 km of the main rupture are considered for specific analyses or visualizations.


In [2]:
# Datawrapper folder ID
folder_id = "332403"

# Set input directory (here, directory of the "get_usgs_data" script)
input_dir = "data_input/USGS"

# Filtering distance to fault for aftershocks
distance_to_fault = 100

# Set Datawrapper

## Connect with the API key

In [3]:
# Récupère ta clé API depuis les variables d'environnement
api_key = os.getenv("DATAWRAPPER_API_KEY")  # remplace par le nom exact de ta variable

# Initialise Datawrapper
dw = Datawrapper(api_key)


# # Tu peux maintenant utiliser dw.account_info()
# info = dw.account_info()
# print(info)

# Data population exposure

In [4]:
# Import data
df_exposure = pd.read_csv(f"{input_dir}/exposure_population_by_country_mmi.csv")
label_exposure = pd.read_csv(f"{input_dir}/mmi_intensity_category.csv")
# print(df_exposure.head())
# print(label_exposure.head())

df_exposure = pd.merge(
    df_exposure,
    label_exposure,
    how="left",
    left_on="mmi",
    right_on="Intensity"
)

df_exposure.head()

Unnamed: 0,country,mmi,pop_exposure,country_name_en,country_name_fr,country_name_de,Intensity,Shaking_EN,Damage_EN,Shaking_DE,Damage_DE,Shaking_FR,Damage_FR
0,ALL,1,0,Total,Total,Total,1,Not felt,,Nicht fühlbar,Keine,Non ressenti,Aucun
1,ALL,2,0,Total,Total,Total,2,Weak,,Schwach,Keine,Faible,Aucun
2,ALL,3,32989235,Total,Total,Total,3,Weak,,Schwach,Keine,Faible,Aucun
3,ALL,4,163708636,Total,Total,Total,4,Light,,Leicht,Keine,Léger,Aucun
4,ALL,5,24656409,Total,Total,Total,5,Moderate,Very light,Mässig,Sehr gering,Modéré,Très léger


### Cleaning exposure, compute share by countries, sort countries by exposure to highest MMI level, and transform to wide

In [5]:
# 1. Pivot large
df_wide = (
    df_exposure[["mmi", "Shaking_EN", "country_name_en", "pop_exposure"]]
    .pivot(index=["mmi", "Shaking_EN"], columns="country_name_en", values="pop_exposure")
    .reset_index()
)

# 2. Supprimer les intensités faibles
df_wide = df_wide[~df_wide["mmi"].isin([1, 2])]

# 3. Liste des pays (hors colonnes de base)
pays_cols = df_wide.columns.difference(["mmi", "Shaking_EN", "Total"])

# 4. Créer colonnes pourcentages
pct_col_names = {col: col.capitalize() for col in pays_cols}
for col, new_col in pct_col_names.items():
    df_wide[new_col] = (
        df_wide[col] / df_wide["Total"].replace(0, pd.NA) * 100
    ).clip(upper=100).round(1)

# 5. Pondérer l’exposition aux fortes intensités (e.g. mmi * pourcentage)
df_scores = df_wide[["mmi"] + list(pct_col_names.values())].copy()
for col in pct_col_names.values():
    df_scores[col] = df_scores["mmi"] * df_scores[col]

# 6. Score final par pays (somme pondérée), sauf Birmanie
exposure_scores = (
    df_scores[df_scores["mmi"] >= 4]  # facultatif : focus haute intensité
    .drop(columns=["mmi"])
    .sum()
    .sort_values(ascending=False)
)

# 7. Ordre des colonnes de % : Myanmar d’abord, puis autres selon score
pct_cols_ordered = ["Myanmar (burma)"] + [
    col for col in exposure_scores.index if col != "Myanmar (burma)"
]

# 8. Colonnes absolues renommées avec suffixe _abs
for col in pays_cols:
    df_wide[col + "_abs"] = df_wide[col]

# 9. Colonnes absolues dans le même ordre
abs_cols_ordered = [
    key + "_abs"
    for key, val in pct_col_names.items()
    if val in pct_cols_ordered
]

# 10. Colonnes finales dans le bon ordre
final_cols = ["mmi", "Shaking_EN", "Total"] + pct_cols_ordered + abs_cols_ordered
df_wide = df_wide[final_cols]

# 11. Trier les lignes : MMI décroissant, % Myanmar décroissant
df_wide = df_wide.sort_values(by=["mmi", "Myanmar (burma)"], ascending=[False, False])

### Styling for datawrapper (same colors as in map)

In [6]:
# Mapping couleur : du jaune clair au rouge foncé
intensity_colors = {
    "Weak":        "#fff5f0",
    "Light":       "#fee0d2",
    "Moderate":    "#fcbba1",
    "Strong":      "#fc9272",
    "Very strong": "#fb6a4a",
    "Severe":      "#ef3b2c",
    "Violent":     "#cb181d",
    "Extreme":     "#99000d"
}

# Mapping couleur de texte
def text_color(val):
    if val in ["Weak", "Light"]:
        return "#333"
    else:
        return "white"

# Crée directement Intensity_html basé sur Shaking_EN + couleur
df_wide["Intensity_html"] = df_wide["Shaking_EN"].map(
    lambda val: f'<span style="background-color: {intensity_colors.get(val, "#cccccc")}; color: {text_color(val)}; padding: 2px 6px; border-radius: 4px;">{val}</span>'
)



In [7]:
df_wide.head()

country_name_en,mmi,Shaking_EN,Total,Myanmar (burma),Thailand,Bangladesh,India,China,Laos,Bhutan,...,Bangladesh_abs,Bhutan_abs,Cambodia_abs,China_abs,India_abs,Laos_abs,Myanmar (Burma)_abs,Thailand_abs,Vietnam_abs,Intensity_html
9,10,Extreme,414753,100.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,414753,0.0,0.0,"<span style=""background-color: #99000d; color:..."
8,9,Violent,5800931,100.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,6215684,0.0,0.0,"<span style=""background-color: #cb181d; color:..."
7,8,Severe,3636635,100.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,3636635,0.0,0.0,"<span style=""background-color: #ef3b2c; color:..."
6,7,Very strong,8787774,100.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,8787774,0.0,0.0,"<span style=""background-color: #fb6a4a; color:..."
5,6,Strong,20277188,98.7,0.7,0.0,0.0,0.6,0.0,0.0,...,0.0,0.0,0.0,0.6,0.0,0.0,20015152,0.7,0.0,"<span style=""background-color: #fc9272; color:..."


### Create dw exposure chart (run one time)

In [None]:
# # The function create_chart() can take four arguments: title, chart_type, data, and folder_id.

# dw_exposure = dw.create_chart(
#     title="Population exposure at the earthquake",
#     chart_type="tables",
#     data = df_wide,
#     folder_id=folder_id
# )

### Assign ID to chart

In [None]:
# dw_exposure_id = dw_exposure['id']
# print(dw_exposure_id)

dw_exposure_id = "dQpEY" #<--- !!! Delete this for you usage !!!

### Update data

In [8]:
# Mettre Intensity_html en première colonne (à la place de mmi)
cols = ["Intensity_html"] + [col for col in df_wide.columns if col not in ["mmi", "Intensity_html"]]
df_wide = df_wide[cols]
dw.add_data(dw_exposure_id, df_wide)


NameError: name 'dw_exposure_id' is not defined

### Optional publish and get iframe

In [29]:
# dw.publish_chart(dw_exposure_id)
# dw_exposure_iframe = dw.get_iframe_code(dw_exposure_id)

## Aftershocks data

In [None]:
# Import
dw_aftershocks = gpd.read_file(f"{input_dir}/aftershock_with_distance.gpkg")
dw_aftershocks.head()

Unnamed: 0,id,time,magnitude,depth,latitude,longitude,magtype,gap,rms,horizontal_error,vertical_error,distance_to_fault_m,distance_to_fault_km,geometry
0,us7000pn9s,2025-03-28 06:20:52.715000+00:00,7.7,10.0,22.011,95.9363,,,,,,4841.911795,4.841912,POINT (95.9363 22.011)
1,us7000pn9z,2025-03-28 06:32:04.777000+00:00,6.7,10.0,21.6975,95.969,,,,,,2036.881791,2.036882,POINT (95.969 21.6975)
2,us7000pncy,2025-03-28 06:39:14.645000+00:00,4.8,10.0,19.8979,95.8026,,,,,,36294.032829,36.294033,POINT (95.8026 19.8979)
3,us7000pncv,2025-03-28 06:42:24.760000+00:00,4.9,10.0,21.8377,95.8747,,,,,,11571.227728,11.571228,POINT (95.8747 21.8377)
4,us7000pnb6,2025-03-28 06:45:44.906000+00:00,4.9,10.0,19.1284,96.2075,,,,,,6474.853617,6.474854,POINT (96.2075 19.1284)


### Filter earthquakes with minimum distance to fault

In [36]:
dw_aftershocks_filtered = dw_aftershocks[dw_aftershocks["distance_to_fault_km"] <= distance_to_fault]

### Clean time and accentuate sizes differences

In [32]:

df_aftershocks_filtered = dw_aftershocks_filtered.drop(columns="geometry").copy()

# Conversion en datetime, si nécessaire
df_aftershocks_filtered["time"] = pd.to_datetime(df_aftershocks_filtered["time"])

# Chronologie fine pour DW
df_aftershocks_filtered["datetime"] = df_aftershocks_filtered["time"].dt.strftime('%Y-%m-%d %H:%M')
df_aftershocks_filtered["iso8601"] = df_aftershocks_filtered["time"].dt.strftime("%Y-%m-%dT%H:%M:%SZ")

# Ajout du radius avec une fonction quadratique
def mag_to_radius(mag, base=2):
    return base + (mag - 3.5)**2.5

df_aftershocks_filtered["radius"] = df_aftershocks_filtered["magnitude"].apply(mag_to_radius)

# Sélection des colonnes clés
cols = ["id", "magnitude", "radius", "distance_to_fault_km", "datetime", "iso8601"]
df_aftershock_for_dw = df_aftershocks_filtered[cols].copy()



### Create chart for aftershocks

In [33]:
# # Chart creation (run one time)
# dw_aftershock = dw.create_chart(
#     title="Aftershocks",
#     chart_type="tables",
#     data = df_aftershock_for_dw,
#     folder_id=folder_id
# )

In [None]:
# dw_aftershock_id = dw_aftershock['id']
# print(dw_aftershock_id)

dw_aftershock_id = "UWXHh" # # !!! Delete this for you usage !!!

### Update chart data

In [35]:
dw.add_data(dw_aftershock_id, df_aftershock_for_dw)

True

### Optional Count aftershocks in a time window

In [37]:
# =============================================================================
# =============================================================================
# --- 1. Paramètres utilisateur ---
days_after = 6
# =============================================================================
# =============================================================================


# S'assurer que le DataFrame est trié par 'time' croissante
df_aftershocks_filtered = df_aftershocks_filtered.sort_values("time")

# Prendre la date du tout premier aftershock comme point de départ
first_time = df_aftershocks_filtered["time"].iloc[0]

# Définir la fenêtre de 6 jours après ce timestamp
window_end = first_time + pd.Timedelta(days=days_after)

# Filtrer les aftershocks dans cette fenêtre
after_6days = df_aftershocks_filtered[
    (df_aftershocks_filtered["time"] >= first_time) &
    (df_aftershocks_filtered["time"] < window_end)
]

count = len(after_6days)
print(f"Nombre de tremblements de terre dans les {days_after} jours à partir du premier : {count}")


Nombre de tremblements de terre dans les 6 jours à partir du premier : 38
