This is the **fifteenth script to run** in the workflow.  

# Merge Harmonised Eurobarometer Surveys (2007–2022)

This script consolidates multiple **harmonised Eurobarometer surveys** into a single dataset covering **2007–2022**.  

---

## Data Source  
The original Eurobarometer datasets were downloaded from the official **GESIS Eurobarometer Data Service**:  
👉 [GESIS – Eurobarometer Data Service](https://www.gesis.org/en/eurobarometer-data-service/survey-series/topics#:~:text=Natural,-Resources%3A%20Energy)  

---

## Harmonisation Process (done in Stata)  
Before merging, each survey wave was **harmonised in Stata** into a standard format with the variable `climate_risk_perception`.  

### Key Details:
The harmonisation was implemented in the do-file: `eurobarometer_harmonization`.
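
Before merging, it can help to confirm that every harmonised file exposes the same columns. The sketch below is one possible check, not part of the original workflow: the helper `missing_columns` and the toy data are hypothetical, and the column names are assumptions taken from the merge and interpolation scripts (`country`, `eb_year`) and from the harmonised variable named above.

```python
import pandas as pd

# Assumed column names: "country" and "eb_year" are used by the scripts in this
# workflow; "climate_risk_perception" is the harmonised variable named above.
REQUIRED_COLUMNS = ["country", "eb_year", "climate_risk_perception"]


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required columns that are absent from df, in order."""
    return [col for col in required if col not in df.columns]


# Toy frame standing in for one harmonised wave (values are made up).
wave = pd.DataFrame({
    "country": ["Austria", "Belgium"],
    "eb_year": [2007, 2007],
    "climate_risk_perception": [0.5, 0.6],
})

print(missing_columns(wave))                            # []
print(missing_columns(wave.drop(columns=["eb_year"])))  # ['eb_year']
```

In practice the same function could be applied to each CSV right after `pd.read_csv`, stopping the merge if any list comes back non-empty.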

In [None]:
import pandas as pd
import os

# === Input folder containing harmonised Eurobarometer CSVs ===
in_path = "insert/your/path/"

# === List of harmonised CSV files to merge ===
# Each file corresponds to one Eurobarometer wave (already harmonised).
file_names = [
    "EB68.2_2007_harmonised.csv",
    "EB71.1_2009_harmonised.csv",
    "EB75.4_2011_harmonised.csv",
    "EB80.2_2013_harmonised.csv",
    "EB83.4_2015_harmonised.csv",
    "EB88.1_2017_harmonised.csv",
    "EB92.4_2019_harmonised.csv",
    "EB95.1_2021_harmonised.csv",
    "EB97.3_2022_harmonised.csv",
    "EB2020_harmonised.csv",   # Special survey from 2020
]

# os.path.join avoids a doubled slash when in_path already ends with "/"
files = [os.path.join(in_path, name) for name in file_names]

# === Load each file into a DataFrame ===
dfs = [pd.read_csv(f) for f in files]

# === Concatenate all waves vertically into one dataset ===
merged = pd.concat(dfs, ignore_index=True)

# === Sort by country and survey year (if these columns exist) ===
sort_cols = [c for c in ["country", "eb_year"] if c in merged.columns]
if sort_cols:  # skip sorting when neither column is present
    merged = merged.sort_values(by=sort_cols).reset_index(drop=True)

# === Save the merged dataset to CSV ===
out_path = os.path.join(in_path, "EB_merged_2007_2022.csv")
merged.to_csv(out_path, index=False)

print(f"Merged dataset saved to {out_path}")

This is the **sixteenth script to run** in the workflow.  

# Eurobarometer Climate Risk Perception – Interpolated Panel (2004–2025)

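This script creates a **harmonised panel dataset** of Eurobarometer climate risk perception indicators, linearly interpolated between survey years and extended back to 2004.

- Interpolation gives every country a value for each year between survey waves, so the panel has continuous time coverage for regression analysis.  
- Years after a country's last observed wave (up to 2025) are not extrapolated: with pandas' default settings, linear interpolation repeats the last observed value.  
- Backfilling 2004–2006 with the 2007 value keeps those rows in the panel, which avoids losing early government/directive matches in the **Stata regressions**.  
- If a stricter design is preferred, these years can be dropped instead.  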
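
To make the behaviour described above concrete, here is a small sketch on made-up data for a single hypothetical country `"A"`. It is an illustration under assumptions rather than the script itself: the values are invented, and `bfill` is used as a stand-in for copying the 2007 value into earlier years (the two coincide here because 2007 is the first observed year).

```python
import numpy as np
import pandas as pd

# Toy panel for one hypothetical country "A"; values are made up.
toy = pd.DataFrame({
    "country": ["A"] * 7,
    "year": [2004, 2005, 2006, 2007, 2008, 2009, 2010],
    "crp_raw": [np.nan, np.nan, np.nan, 0.40, np.nan, 0.60, np.nan],
})

# Linear interpolation within each country. With pandas defaults, gaps between
# observations are filled linearly, trailing gaps repeat the last observation,
# and leading gaps (2004-2006 here) stay missing.
toy["crp_interp"] = toy.groupby("country")["crp_raw"].transform(
    lambda s: s.interpolate(method="linear")
)

# Backfill the leading years with the first available value per country.
toy["crp_filled"] = toy.groupby("country")["crp_interp"].transform(
    lambda s: s.bfill()
)

# Stricter alternative: drop the rows that precede the first observation.
toy_strict = toy.dropna(subset=["crp_interp"])

print(toy)
print(toy_strict["year"].tolist())
```

The `crp_filled` column corresponds to the backfilled design, while `toy_strict` shows what remains if the pre-2007 years are dropped instead.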

In [None]:
import pandas as pd

# === Path to EB merged dataset ===
eb_path = "insert/your/path/EB_merged_2007_2022.csv"

# === Load dataset ===
df_eb = pd.read_csv(eb_path)

# Ensure clean country names
df_eb["country"] = df_eb["country"].str.strip()

# Panel over 2004–2025 for all countries
all_years = pd.DataFrame({"year": range(2004, 2026)})
countries = df_eb["country"].unique()

panel = (
    pd.MultiIndex.from_product([countries, all_years["year"]], names=["country", "year"])
    .to_frame(index=False)
)

# Merge raw EB values
panel = panel.merge(
    df_eb.rename(columns={"eb_year": "year", "climate_risk_perception": "crp_raw"}),
    on=["country", "year"],
    how="left"
)

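# === Variable: interpolation (linear between EB years) ===
# transform keeps the original row index, so the result aligns with panel
# regardless of how the installed pandas version handles group keys in apply.
panel["crp_interp"] = (
    panel.groupby("country")["crp_raw"]
         .transform(lambda s: s.interpolate(method="linear"))
)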

# === Override 2004–2006 with 2007 values ===
# Look up each country's 2007 value and copy it into the three earlier years.
crp_2007 = panel.loc[panel["year"] == 2007].set_index("country")["crp_interp"]
early = panel["year"].isin([2004, 2005, 2006])
panel.loc[early, "crp_interp"] = panel.loc[early, "country"].map(crp_2007)

# === Save ===
out_path = "insert/your/path/EB_CRPs_interp.csv"
panel.to_csv(out_path, index=False)