# 📈 Nowcasting US GDP Growth during COVID-19 (Lab Activity)

**Objective**

+ Build a model to nowcast **U.S. quarterly GDP growth** using **monthly indicators** from the FRED-MD database. 
+ Focus on the period around COVID-19 (2020) to assess how well the models capture the economic shock in real time.
+ The group that produces the **most accurate nowcasts (lowest RMSE)** will win.

---

## 📦 Data Overview

### High-Frequency Predictors

- **Source**: [FRED-MD](https://www.stlouisfed.org/research/economists/mccracken/fred-databases)
- **Frequency**: Monthly
- **Variables**: Over 100 U.S. macroeconomic series
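- **Transformations**: The first row after the header holds a transformation code for each series:
    - "1": No transformation
    - "2": First difference
    - "3": Second difference
    - "4": Log
    - "5": Log first difference
    - "6": Log second difference
    - "7": First difference of the percent change
    - (You can use the `apply_transformation()` helper function from the Helper functions section)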
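
As a minimal sketch of how the code row can be used, the snippet below parses a tiny inline CSV that mimics the assumed FRED-MD layout (a `sasdate` column, a `Transform:` row of codes, then the data). The series names `A` and `B` and all values are invented for illustration, and only codes 1, 2 and 5 are handled here.

```python
import io

import numpy as np
import pandas as pd

# Tiny stand-in for the monthly file; the layout is assumed, the values are invented.
raw_csv = """sasdate,A,B
Transform:,5,2
1/1/2020,100.0,3.5
2/1/2020,110.0,3.6
3/1/2020,121.0,4.4
"""

raw = pd.read_csv(io.StringIO(raw_csv))

# The first data row carries the transformation codes, one per series.
codes = raw.iloc[0, 1:].astype(float).astype(int)

# Everything below that row is the actual data.
data = raw.iloc[1:].copy()
data["sasdate"] = pd.to_datetime(data["sasdate"], format="%m/%d/%Y")
data = data.set_index("sasdate").astype(float)

# Minimal subset of the FRED-MD codes, enough for this illustration.
transforms = {
    1: lambda s: s,                 # no transformation
    2: lambda s: s.diff(),          # first difference
    5: lambda s: np.log(s).diff(),  # log first difference
}

transformed = pd.DataFrame(
    {name: transforms[int(codes[name])](data[name]) for name in data.columns}
)
print(transformed)
```

The first row of `transformed` is NaN for both series because differencing consumes one observation.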

### Low-Frequency Target

- **Variable**: U.S. real GDP QoQ growth (quarterly growth rate in percent). Apply transformation "5" (log first difference) to the real GDP level and multiply by 100.
- **Task**: Predict this variable using high-frequency monthly data (a mixed-frequency regression problem).
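
The target construction can be sketched as follows. The GDP levels are hypothetical numbers chosen for illustration, and the dates follow the convention of the submission example (each quarter labelled by the first day of its last month).

```python
import numpy as np
import pandas as pd

# Hypothetical quarterly real GDP levels (invented values).
gdp_level = pd.Series(
    [100.0, 101.0, 102.01, 96.9095],
    index=pd.to_datetime(["2019-09-01", "2019-12-01", "2020-03-01", "2020-06-01"]),
    name="gdp_level",
)

# Transformation "5" (log first difference), scaled to percent.
gdp_growth = 100 * np.log(gdp_level).diff()
gdp_growth.name = "gdp_growth"
print(gdp_growth.round(2))
```

The first observation is NaN because there is no earlier quarter to difference against.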

---

## 🎯 Forecasting Task

You must produce a **rolling nowcast**:
- **Alignment lag**: 0 (i.e., predict the current quarter using available monthly info)
- **Start rolling**: 2016-12-01
- **End rolling**: 2020-06-01
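
One possible way to organize the rolling exercise is an expanding-window regression, sketched below on synthetic data. This is an illustration under simplifying assumptions, not the required method: there is a single invented indicator, all three months of the nowcast quarter are assumed to be observed, and the model is plain OLS fitted with NumPy.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic monthly indicator (already transformed); values are invented.
months = pd.date_range("2010-01-01", "2020-06-01", freq="MS")
x_monthly = pd.Series(rng.normal(size=len(months)), index=months)

# Average the three months of each quarter, then label each quarter by
# the first day of its last month (e.g. 2020-06-01 for 2020Q2).
x_q = x_monthly.groupby(x_monthly.index.to_period("Q")).mean()
x_q.index = x_q.index.to_timestamp(how="start") + pd.DateOffset(months=2)

# Synthetic quarterly target driven by the indicator plus noise.
y_q = 0.5 + 1.5 * x_q + rng.normal(scale=0.3, size=len(x_q))

records = []
for date in x_q.loc["2016-12-01":"2020-06-01"].index:
    # Expanding window: estimate only on quarters strictly before `date`.
    train = x_q.index < date
    X_train = np.column_stack([np.ones(train.sum()), x_q[train].to_numpy()])
    beta, *_ = np.linalg.lstsq(X_train, y_q[train].to_numpy(), rcond=None)

    # Alignment lag 0: nowcast quarter `date` from that quarter's own months.
    prediction = beta[0] + beta[1] * x_q.loc[date]
    records.append({"date": date, "target": y_q.loc[date], "prediction": prediction})

df_nowcast = pd.DataFrame(records)
print(df_nowcast.tail())
```

Re-estimating the coefficients at every step keeps each nowcast free of information from later quarters, which is what makes the evaluation a real-time exercise.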

Your submission must be a **CSV file** with the following format:
```
date,target,prediction
2016-12-01,0.8,0.92
2017-03-01,1.2,0.59
...
2020-06-01,-4.3,-10.23
```
---

## 📤 Submission Instructions

- Export your final predictions to CSV:

```python
import pandas as pd

# forecast_dates, true_values and model_forecasts come from your rolling nowcast
df_nowcast = pd.DataFrame({
    "date": forecast_dates,         # datetime list or index
    "target": true_values,          # actual GDP growth
    "prediction": model_forecasts   # your nowcasts
})
df_nowcast.to_csv("group_name_nowcast.csv", index=False)
```
- Send this CSV file to renato.vassallo@bse.eu.

---

## 🧪 Evaluation

- Evaluation is based on **RMSE** over the forecast horizon:
    - 2016-12-01 to 2020-06-01
- The team with the **lowest RMSE wins** 🎉
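
To check your own score before submitting, RMSE can be computed as in the sketch below. The helper name `rmse` is hypothetical, the two rows are taken from the submission format example purely as illustrative values, and the exact scoring script used for the lab is not shown in this document.

```python
import numpy as np
import pandas as pd


def rmse(target, prediction):
    """Root mean squared error between two equally long sequences."""
    errors = np.asarray(prediction, dtype=float) - np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean(errors ** 2)))


# Illustrative rows copied from the submission format example.
df = pd.DataFrame({
    "date": pd.to_datetime(["2016-12-01", "2017-03-01"]),
    "target": [0.8, 1.2],
    "prediction": [0.92, 0.59],
})

score = rmse(df["target"], df["prediction"])
print(f"RMSE: {score:.4f}")
```

Because errors are squared before averaging, a single large miss (such as the 2020 quarters) can dominate the score.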

---

## ⏱️ Time Limit

⏰ 25 minutes
- Work in groups (up to 3 people)

---

## 💡 Ideas

- Select variables most correlated with GDP
- Normalize or lag variables
- Try dimensionality reduction
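
The three ideas above can be combined, as in this minimal sketch on synthetic data. The choice of `k = 10`, the series names, and the use of a single principal component are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_quarters, n_series = 60, 20

# Synthetic quarterly predictors: one common factor plus idiosyncratic noise.
factor = rng.normal(size=n_quarters)
loadings = rng.normal(size=n_series)
X = pd.DataFrame(
    np.outer(factor, loadings) + rng.normal(size=(n_quarters, n_series)),
    columns=[f"x{i}" for i in range(n_series)],
)
y = pd.Series(2.0 * factor + rng.normal(scale=0.5, size=n_quarters), name="gdp_growth")

# 1) Keep the k series with the largest absolute correlation with the target.
k = 10
corr = X.corrwith(y).abs().sort_values(ascending=False)
selected = corr.index[:k]

# 2) Standardize the selected series (zero mean, unit variance).
Z = (X[selected] - X[selected].mean()) / X[selected].std()

# 3) Principal components via SVD; keep the first one as a single predictor.
U, S, Vt = np.linalg.svd(Z.to_numpy(), full_matrices=False)
pc1 = U[:, 0] * S[0]
explained = S ** 2 / np.sum(S ** 2)

print("Selected series:", list(selected))
print(f"Share of variance explained by PC1: {explained[0]:.2f}")
```

In a rolling exercise, the selection, the standardization and the SVD should be recomputed on the training window only, otherwise information from the quarter being nowcast leaks into the predictors.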

---

Good luck 🚀

## Helper functions

In [None]:
import numpy as np
import pandas as pd

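def apply_transformation(series, code):
    """
    Apply a FRED-MD transformation code to a pandas Series.

    `code` may be a string, int, or float (e.g. "5", 5, or 5.0).
    Unknown or missing codes return an all-NaN Series.
    """
    try:
        code = str(int(float(code)))  # normalize "5", 5 and 5.0 to "5"
    except (TypeError, ValueError):
        return pd.Series(np.nan, index=series.index)

    if code == "1":
        return series                        # no transformation
    elif code == "2":
        return series.diff()                 # first difference
    elif code == "3":
        return series.diff().diff()          # second difference
    elif code == "4":
        return np.log(series)                # log
    elif code == "5":
        return np.log(series).diff()         # log first difference
    elif code == "6":
        return np.log(series).diff().diff()  # log second difference
    elif code == "7":
        return series.pct_change().diff()    # first difference of the percent change
    else:
        return pd.Series(np.nan, index=series.index)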
    
def keep_fully_populated_last_year(df, min_obs=12):
    """
    Keep only columns (series) with at least `min_obs` non-NaN values 
    in the last 12 months of the data.

    Args:
        df (pd.DataFrame): The time series DataFrame with datetime index
        min_obs (int): Minimum number of non-NaN observations required

    Returns:
        pd.DataFrame: Filtered DataFrame with only complete series
    """
    # Define the last 12 months in the index
    last_date = df.index.max()
    one_year_ago = last_date - pd.DateOffset(months=11)

    recent_data = df.loc[one_year_ago:last_date]
    print(f"Checking series completeness from {one_year_ago.date()} to {last_date.date()}")

    # Keep only columns with at least `min_obs` valid values in the window
    valid_series = [col for col in df.columns if recent_data[col].count() >= min_obs]

    print(f"Keeping {len(valid_series)} of {df.shape[1]} series with full data in last 12 months.")

    return df[valid_series]

## Useful links

In [None]:
# Quarterly FRED-QD data URL (for the GDP target)
url_quart = "https://www.stlouisfed.org/-/media/project/frbstl/stlouisfed/research/fred-md/quarterly/current.csv"

# Monthly FRED-MD data URL (for features)
url_month = "https://www.stlouisfed.org/-/media/project/frbstl/stlouisfed/research/fred-md/monthly/current.csv"