# Economic Data (Exogenous Variables)

This section presents the economic data used as supporting (exogenous) features:

- **Consumer Price Index**: Sourced from the Statistisches Bundesamt. [Link](https://www-genesis.destatis.de/datenbank/online/statistic/61111/table/61111-0002)
- **Deposit Facility Rate**: Sourced from the Deutsche Bundesbank. [Link](https://www.bundesbank.de/dynamic/action/en/statistics/time-series-databases/time-series-databases/759784/759784?listId=www_szista_mb01)
- **Employment Level**: Sourced from the Deutsche Bundesbank. [Link](https://www.bundesbank.de/dynamic/action/en/statistics/time-series-databases/time-series-databases/745582/745582?tsId=BBDL1.M.DE.N.EMP.EBA000.A0000.A00.D00.0.ABA.A&listId=www_siws_mb09_06b&dateSelect=2025)
- **Gross Domestic Product**: Sourced from the Deutsche Bundesbank. [Link](https://www.bundesbank.de/dynamic/action/en/statistics/time-series-databases/time-series-databases/745582/745582?listId=www_ssb_lr_bip&tsId=BBNZ1.Q.DE.N.H.0000.L&dateSelect=2025)
- **Marginal Lending Rate**: Sourced from the Deutsche Bundesbank. [Link](https://www.bundesbank.de/dynamic/action/en/statistics/time-series-databases/time-series-databases/759784/759784?listId=www_szista_mb01)
- **Historical Oil Prices**: Sourced from the European Commission (Weekly Oil Bulletin). [Link](https://energy.ec.europa.eu/data-and-analysis/weekly-oil-bulletin_en)

All raw (`bronze`) data is stored in the folder `data/raw`, with the following structure:

```bash
.
├── bundesbank
│   ├── CB_marginal_lending_facility_rate.csv
│   ├── ECB_deposit_facility_rate.csv
│   ├── employment_level_germany.csv
│   └── germany_GDP.csv
├── european_commission
│   └── Weekly_Oil_Bulletin_Prices_History_maticni_4web.xlsx
└── statistisches_bundesamt
    ├── consumer_price_index.xlsx
    └── population.csv
```

After loading and cleaning, the `silver` data is stored in the folder `data/processed`, with the following structure:

```bash
.
├── historical_consumer_price_index.parquet
├── historical_deposit_rate.parquet
├── historical_employment_level_germany.parquet
├── historical_GDP_germany.parquet
├── historical_lending_rate.parquet
└── historical_oil_prices.parquet
```

In [1]:
import pandas as pd

## Consumer Price Index

The consumer price index data is transformed from bronze to silver using the script `neuralts/data_preparation/load_consumer_price_index.py`.
The dataset contains the following columns:

- **Date**: The date of the observation in datetime format YYYY-MM-DD
- **Year**: The year of the observation
- **Month**: The month of the observation (1-12)
- **consumer_price_index**: The consumer price index value (base year 2020=100)
- **YoY_change**: Year-over-year change in the consumer price index, in percent (e.g. 1.4 means +1.4 %)
- **MoM_change**: Month-over-month change in the consumer price index, as a fraction (e.g. 0.0031 means +0.31 %)

The dataset spans from January 2018 to October 2025, containing 94 monthly observations.
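The two change columns can be derived from the index itself. The following is a minimal sketch, not the code of `load_consumer_price_index.py`: the function name `add_cpi_changes` and the toy values are assumptions made for illustration, and the rounding simply mirrors the precision visible in the outputs of this notebook.

```python
import pandas as pd


def add_cpi_changes(cpi: pd.DataFrame) -> pd.DataFrame:
    """Add YoY (percent) and MoM (fraction) changes to a monthly CPI frame."""
    out = cpi.sort_values("Date").reset_index(drop=True)
    index = out["consumer_price_index"]
    # Year-over-year change in percent: compare with the value 12 rows earlier.
    out["YoY_change"] = (index.pct_change(periods=12, fill_method=None) * 100).round(1)
    # Month-over-month change as a fraction: compare with the previous row.
    out["MoM_change"] = index.pct_change(periods=1, fill_method=None).round(4)
    return out


# Synthetic toy data (NOT real CPI values): 13 month-end observations.
toy = pd.DataFrame(
    {
        "Date": pd.date_range("2018-01-31", periods=13, freq=pd.offsets.MonthEnd()),
        "consumer_price_index": [100.0 + i for i in range(13)],
    }
)
toy_changes = add_cpi_changes(toy)
print(toy_changes.tail(3))
```

The first 12 rows have no year-earlier value, so `YoY_change` is `NaN` there. This assumes one row per month with no gaps; with missing months, a row-based `periods=12` would compare the wrong dates.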

In [25]:
storage_path = "../data/processed/historical_consumer_price_index.parquet"
df = pd.read_parquet(storage_path, engine='pyarrow')
df.head()

Unnamed: 0,Year,Month,consumer_price_index,YoY_change,MoM_change,Date
0,2018,1,96.4,1.4,0.0031,2018-01-31
1,2018,2,96.7,1.2,0.0031,2018-02-28
2,2018,3,97.2,1.5,0.0052,2018-03-31
3,2018,4,97.5,1.4,0.0031,2018-04-30
4,2018,5,98.2,2.1,0.0072,2018-05-31


In [26]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 94 entries, 0 to 93
Data columns (total 6 columns):
 #   Column                Non-Null Count  Dtype         
---  ------                --------------  -----         
 0   Year                  94 non-null     object        
 1   Month                 94 non-null     int32         
 2   consumer_price_index  94 non-null     float64       
 3   YoY_change            94 non-null     float64       
 4   MoM_change            94 non-null     float64       
 5   Date                  94 non-null     datetime64[ns]
dtypes: datetime64[ns](1), float64(3), int32(1), object(1)
memory usage: 4.2+ KB
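
In the `info()` output above, `Year` is stored as `object` while `Month` is `int32`. A minimal sketch of normalizing the column, assuming `Year` holds year strings such as `"2018"` (the toy frame below is illustrative, not the real dataset):

```python
import pandas as pd

# Toy frame mimicking the dtype issue: Year as strings, Month as integers.
cpi = pd.DataFrame(
    {
        "Year": ["2018", "2018"],
        "Month": [1, 2],
        "Date": pd.to_datetime(["2018-01-31", "2018-02-28"]),
    }
)

# Convert to a numeric dtype; errors="raise" surfaces unexpected values early.
cpi["Year"] = pd.to_numeric(cpi["Year"], errors="raise").astype("int32")

# Sanity check: the converted column should agree with the Date column.
assert (cpi["Year"] == cpi["Date"].dt.year).all()
print(cpi.dtypes)
```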


In [27]:
df.describe()

Unnamed: 0,Month,consumer_price_index,YoY_change,MoM_change,Date
count,94.0,94.0,94.0,94.0,94
mean,6.393617,108.289362,3.028723,0.002639,2021-12-14 22:43:24.255319040
min,1.0,96.4,-0.6,-0.0081,2018-01-31 00:00:00
25%,3.25,99.825,1.525,0.0008,2020-01-07 18:00:00
50%,6.0,104.6,2.2,0.0026,2021-12-15 12:00:00
75%,9.0,117.575,4.175,0.004275,2023-11-22 12:00:00
max,12.0,123.0,8.8,0.0198,2025-10-31 00:00:00
std,3.427335,9.074275,2.370138,0.004283,


## Deposit Facility Rate

The deposit facility rate data is transformed from bronze to silver using the script `neuralts/data_preparation/load_deposit_rate.py`. The data contains the following columns:

- **Date**: The date of the observation in datetime format YYYY-MM-DD
- **Deposit_Rate**: The ECB deposit facility rate value
- **Deposit_Rate_Change**: Absolute change in the deposit rate from the previous month
- **Deposit_Rate_Relative_Change**: Relative change in the deposit rate from the previous month, as a fraction (e.g. -0.25 means -25 %)
- **Deposit_Rate_YoY_Change**: Year-over-year absolute change in the deposit rate
- **Deposit_Rate_YoY_Relative_Change**: Year-over-year relative change in the deposit rate, as a fraction

The dataset spans from January 1999 to October 2025, containing 322 monthly observations.

In [33]:
storage_path = "../data/processed/historical_deposit_rate.parquet"
df = pd.read_parquet(storage_path, engine='pyarrow')
df.head()

Unnamed: 0,Date,Deposit_Rate,Deposit_Rate_Change,Deposit_Rate_Relative_Change,Deposit_Rate_YoY_Change,Deposit_Rate_YoY_Relative_Change
0,1999-01-31,2.0,0.0,0.0,0.0,0.0
1,1999-02-28,2.0,0.0,0.0,0.0,0.0
2,1999-03-31,2.0,0.0,0.0,0.0,0.0
3,1999-04-30,1.5,-0.5,-0.25,0.0,0.0
4,1999-05-31,1.5,0.0,0.0,0.0,0.0


In [31]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 322 entries, 0 to 321
Data columns (total 6 columns):
 #   Column                            Non-Null Count  Dtype         
---  ------                            --------------  -----         
 0   Date                              322 non-null    datetime64[ns]
 1   Deposit_Rate                      322 non-null    float64       
 2   Deposit_Rate_Change               322 non-null    float64       
 3   Deposit_Rate_Relative_Change      322 non-null    float64       
 4   Deposit_Rate_YoY_Change           322 non-null    float64       
 5   Deposit_Rate_YoY_Relative_Change  322 non-null    float64       
dtypes: datetime64[ns](1), float64(5)
memory usage: 15.2 KB


In [32]:
df.describe()



Unnamed: 0,Date,Deposit_Rate,Deposit_Rate_Change,Deposit_Rate_Relative_Change,Deposit_Rate_YoY_Change,Deposit_Rate_YoY_Relative_Change
count,322,322.0,322.0,322.0,322.0,322.0
mean,2012-06-15 02:00:44.720496896,1.065217,0.0,0.003384,0.026398,
min,1999-01-31 00:00:00,-0.5,-1.0,-1.0,-3.0,-inf
25%,2005-10-07 18:00:00,-0.2,0.0,0.0,-0.2,-0.4
50%,2012-06-15 00:00:00,0.75,0.0,0.0,0.0,0.0
75%,2019-02-21 00:00:00,2.25,0.0,0.0,0.25,0.25
max,2025-10-31 00:00:00,4.0,0.75,1.0,4.0,inf
std,,1.428597,0.159585,0.156666,1.062107,
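
The `-inf` and `inf` extremes and the empty mean of `Deposit_Rate_YoY_Relative_Change` in the table above are what one would expect when a relative change is divided by a base rate of exactly 0.0. The following is a minimal sketch of one way to guard against this, not the logic of `load_deposit_rate.py`: the helper name `safe_relative_change` is hypothetical, and dividing by the absolute value of the base is a design choice so that the sign follows the direction of the change even when the base rate is negative.

```python
import numpy as np
import pandas as pd


def safe_relative_change(series: pd.Series, periods: int = 1) -> pd.Series:
    """Relative change versus `periods` rows earlier, with inf mapped to NaN."""
    previous = series.shift(periods)
    # Divide by the magnitude of the base so a cut from -0.1 to -0.2
    # yields a negative relative change.
    change = (series - previous) / previous.abs()
    # A base of 0.0 produces +/-inf (or NaN for 0/0); keep only finite values.
    return change.replace([np.inf, -np.inf], np.nan)


# Synthetic toy rates (NOT the real series), including a zero base.
rates = pd.Series([0.25, 0.0, 0.0, -0.1, -0.2])
relative = safe_relative_change(rates)
print(relative)
```

With the infinities mapped to `NaN`, `describe()` skips them and reports finite statistics. Whether `NaN`, zero, or a capped value is the right replacement depends on the downstream model.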


## Employment Level

The employment level data is transformed from bronze to silver using the script `neuralts/data_preparation/load_emmployment_level.py`. The data contains the following columns:

- **Date**: The date of the observation in datetime format
- **Employment_Level**: The total employment level in Germany
- **Month**: The month of the observation (1-12)
- **Year**: The year of the observation
- **Employment_Level_Change**: Absolute change in employment level from the previous month
- **Employment_Level_Relative_Change**: Relative change in employment level from the previous month, as a fraction (e.g. 0.00891 means +0.891 %)
- **Employment_Level_YoY_Change**: Year-over-year absolute change in employment level
- **Employment_Level_YoY_Relative_Change**: Year-over-year relative change in employment level, as a fraction
- **Employment_Level_MA_3**: 3-month moving average of employment level
- **Employment_Level_MA_6**: 6-month moving average of employment level

The dataset spans from June 1999 to September 2025, containing 316 monthly observations.
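
The difference and moving-average columns can be sketched as follows. This is an illustration under assumptions, not the code of the preparation script: the function name `add_level_features` is hypothetical, and the back-fill of the warm-up rows is inferred from the first rows of the dataset, where the earliest feature values repeat.

```python
import pandas as pd


def add_level_features(df: pd.DataFrame, col: str = "Employment_Level",
                       windows: tuple = (3, 6)) -> pd.DataFrame:
    """Add month-over-month change and trailing moving averages for `col`."""
    out = df.sort_values("Date").reset_index(drop=True)
    level = out[col].astype("float64")
    out[f"{col}_Change"] = level.diff()
    feature_cols = [f"{col}_Change"]
    for window in windows:
        name = f"{col}_MA_{window}"
        # Trailing mean over the current row and the (window - 1) rows before it.
        out[name] = level.rolling(window=window).mean()
        feature_cols.append(name)
    # Assumption: warm-up rows without enough history are back-filled.
    out[feature_cols] = out[feature_cols].bfill()
    return out


# First five observations as shown in the head of the silver dataset.
employment = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["1999-06-30", "1999-07-31", "1999-08-31", "1999-09-30", "1999-10-31"]
        ),
        "Employment_Level": [27418361, 27294793, 27537965, 27858014, 27874461],
    }
)
# Only five rows are available here, so only the 3-month window is computed.
features = add_level_features(employment, windows=(3,))
print(features)
```

Back-filling copies later values into earlier rows, so the first few rows contain information from the future. If that matters for the model, dropping the warm-up rows is the safer alternative.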

In [34]:
storage_path = "../data/processed/historical_employment_level_germany.parquet"
df = pd.read_parquet(storage_path, engine='pyarrow')
df.head()

Unnamed: 0,Date,Employment_Level,Month,Year,Employment_Level_Change,Employment_Level_Relative_Change,Employment_Level_YoY_Change,Employment_Level_YoY_Relative_Change,Employment_Level_MA_3,Employment_Level_MA_6
0,1999-06-30,27418361,6,1999,-123568.0,-0.00451,423412.0,0.01544,27417040.0,27663250.0
1,1999-07-31,27294793,7,1999,-123568.0,-0.00451,423412.0,0.01544,27417040.0,27663250.0
2,1999-08-31,27537965,8,1999,243172.0,0.00891,423412.0,0.01544,27417040.0,27663250.0
3,1999-09-30,27858014,9,1999,320049.0,0.01162,423412.0,0.01544,27563590.0,27663250.0
4,1999-10-31,27874461,10,1999,16447.0,0.00059,423412.0,0.01544,27756810.0,27663250.0


In [35]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 316 entries, 0 to 315
Data columns (total 10 columns):
 #   Column                                Non-Null Count  Dtype         
---  ------                                --------------  -----         
 0   Date                                  316 non-null    datetime64[ns]
 1   Employment_Level                      316 non-null    int64         
 2   Month                                 316 non-null    int32         
 3   Year                                  316 non-null    int32         
 4   Employment_Level_Change               316 non-null    float64       
 5   Employment_Level_Relative_Change      316 non-null    float64       
 6   Employment_Level_YoY_Change           316 non-null    float64       
 7   Employment_Level_YoY_Relative_Change  316 non-null    float64       
 8   Employment_Level_MA_3                 316 non-null    float64       
 9   Employment_Level_MA_6                 316 non-null    float64       
dtypes: datetime64[ns](1), float64(6), int32(2), int64(1)

In [36]:
df.describe()

Unnamed: 0,Date,Employment_Level,Month,Year,Employment_Level_Change,Employment_Level_Relative_Change,Employment_Level_YoY_Change,Employment_Level_YoY_Relative_Change,Employment_Level_MA_3,Employment_Level_MA_6
count,316,316.0,316.0,316.0,316.0,316.0,316.0,316.0,316.0,316.0
mean,2012-08-14 23:09:52.405063168,30147900.0,6.512658,2012.082278,24335.984177,0.000795,293890.572785,0.009631,30123610.0,30091220.0
min,1999-06-30 00:00:00,26008770.0,1.0,1999.0,-436714.0,-0.01567,-838062.0,-0.03023,26057540.0,26166750.0
25%,2006-01-23 06:00:00,27620200.0,4.0,2005.75,-67052.5,-0.002175,18215.5,0.00052,27635320.0,27663250.0
50%,2012-08-15 12:00:00,29311750.0,7.0,2012.0,29609.0,0.00102,423412.0,0.014325,29306000.0,29217450.0
75%,2019-03-07 18:00:00,33324230.0,9.0,2019.0,117163.0,0.004035,627002.25,0.020755,33325160.0,33305550.0
max,2025-09-30 00:00:00,35236860.0,12.0,2025.0,350527.0,0.01244,808602.0,0.02877,35226530.0,35072940.0
std,,2981013.0,3.439707,7.622411,161127.58558,0.005496,392342.879226,0.013538,2969078.0,2950246.0


## Gross Domestic Product (GDP) Data

The GDP data is transformed from bronze to silver using the script `neuralts/data_preparation/load_gpd.py`. The data contains the following columns:

- **Date**: The date of the observation in datetime format
- **GDP**: The Gross Domestic Product of Germany as an index (base year 2020=100)
- **Month**: The month of the observation (1-12)
- **Year**: The year of the observation

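The dataset spans from June 1970 to September 2025, containing 664 monthly rows. Each GDP value repeats across three consecutive months, which is consistent with a quarterly source series expanded to monthly frequency.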
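
A quarterly series can be expanded to month-end rows by forward-filling each quarter's value. The sketch below is an assumption about how such an expansion might look, not the code of `load_gpd.py`. The helper name `quarterly_to_monthly` is hypothetical, and it assumes the quarterly values are stamped at quarter-end dates.

```python
import pandas as pd


def quarterly_to_monthly(quarterly: pd.Series) -> pd.Series:
    """Repeat each quarter-end value over the following months (forward fill)."""
    quarterly = quarterly.sort_index()
    # Month-end grid spanning the first to the last quarterly observation.
    monthly_index = pd.date_range(
        quarterly.index.min(), quarterly.index.max(), freq=pd.offsets.MonthEnd()
    )
    # Months without an observation take the most recent quarterly value.
    return quarterly.reindex(monthly_index).ffill()


# The two distinct values visible in the head of the silver dataset.
gdp_q = pd.Series(
    [39.73, 41.57],
    index=pd.to_datetime(["1970-06-30", "1970-09-30"]),
    name="GDP",
)
gdp_m = quarterly_to_monthly(gdp_q)
print(gdp_m)
```

Forward filling only uses values that were already known, so it does not leak future information, but it does produce a step-shaped series. Interpolation would be smoother at the cost of using the next quarter's value.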

In [37]:
storage_path = "../data/processed/historical_GDP_germany.parquet"
df = pd.read_parquet(storage_path, engine='pyarrow')
df.head()

Unnamed: 0,Date,GDP,Month,Year
0,1970-06-30,39.73,6,1970
1,1970-07-31,39.73,7,1970
2,1970-08-31,39.73,8,1970
3,1970-09-30,41.57,9,1970
4,1970-10-31,41.57,10,1970


In [38]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 664 entries, 0 to 663
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   Date    664 non-null    datetime64[ns]
 1   GDP     664 non-null    float64       
 2   Month   664 non-null    int32         
 3   Year    664 non-null    int32         
dtypes: datetime64[ns](1), float64(1), int32(2)
memory usage: 15.7 KB


In [39]:
df.describe()

Unnamed: 0,Date,GDP,Month,Year
count,664,664.0,664.0,664.0
mean,1998-02-13 19:59:16.626505984,75.196295,6.506024,1997.582831
min,1970-06-30 00:00:00,38.87,1.0,1970.0
25%,1984-04-22 12:00:00,55.24,4.0,1984.0
50%,1998-02-14 00:00:00,77.24,7.0,1998.0
75%,2011-12-07 18:00:00,92.0875,9.0,2011.0
max,2025-09-30 00:00:00,107.02,12.0,2025.0
std,,20.291426,3.446198,15.989507


## Marginal Lending Rate

The marginal lending rate data is transformed from bronze to silver using the script `neuralts/data_preparation/load_lending_rate.py`. The data contains the following columns:

- **Date**: The date of the observation in datetime format
- **Lending_Rate**: The ECB marginal lending facility rate value
- **Lending_Rate_Change**: Absolute change in the lending rate from the previous month
- **Lending_Rate_Relative_Change**: Relative change in the lending rate from the previous month, as a fraction (e.g. -0.22222 means -22.2 %)
- **Lending_Rate_YoY_Change**: Year-over-year absolute change in the lending rate
- **Lending_Rate_YoY_Relative_Change**: Year-over-year relative change in the lending rate, as a fraction

The dataset spans from January 1999 to October 2025, containing 322 monthly observations.

In [43]:
storage_path = "../data/processed/historical_lending_rate.parquet"
df = pd.read_parquet(storage_path, engine='pyarrow')
df.head()

Unnamed: 0,Date,Lending_Rate,Lending_Rate_Change,Lending_Rate_Relative_Change,Lending_Rate_YoY_Change,Lending_Rate_YoY_Relative_Change
0,1999-01-31,4.5,0.0,0.0,-0.5,-0.11111
1,1999-02-28,4.5,0.0,0.0,-0.5,-0.11111
2,1999-03-31,4.5,0.0,0.0,-0.5,-0.11111
3,1999-04-30,3.5,-1.0,-0.22222,-0.5,-0.11111
4,1999-05-31,3.5,0.0,0.0,-0.5,-0.11111


In [44]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 322 entries, 0 to 321
Data columns (total 6 columns):
 #   Column                            Non-Null Count  Dtype         
---  ------                            --------------  -----         
 0   Date                              322 non-null    datetime64[ns]
 1   Lending_Rate                      322 non-null    float64       
 2   Lending_Rate_Change               322 non-null    float64       
 3   Lending_Rate_Relative_Change      322 non-null    float64       
 4   Lending_Rate_YoY_Change           322 non-null    float64       
 5   Lending_Rate_YoY_Relative_Change  322 non-null    float64       
dtypes: datetime64[ns](1), float64(5)
memory usage: 15.2 KB


In [42]:
df.describe()

Unnamed: 0,Date,Lending_Rate,Lending_Rate_Change,Lending_Rate_Relative_Change,Lending_Rate_YoY_Change,Lending_Rate_YoY_Relative_Change
count,322,322.0,322.0,322.0,322.0,322.0
mean,2012-06-15 02:00:44.720496896,2.364752,-0.006522,0.004136,-0.056522,0.327213
min,1999-01-31 00:00:00,0.25,-1.0,-0.46667,-3.5,-0.7
25%,2005-10-07 18:00:00,0.3,0.0,0.0,-0.5,-0.2
50%,2012-06-15 00:00:00,2.25,0.0,0.0,0.0,0.0
75%,2019-02-21 00:00:00,4.0,0.0,0.0,0.25,0.04762
max,2025-10-31 00:00:00,5.75,0.75,2.0,4.0,16.0
std,,1.79508,0.175228,0.14079,1.118579,2.076971


## Oil Prices

The oil price data is transformed from bronze to silver using the script `neuralts/data_preparation/load_oil_prices.py`. The data contains the following columns:

- **Date**: The date of the observation in datetime format
- **GR_price_with_tax_euro95**: Price of Euro95 gasoline with tax in Germany
- **GR_price_with_tax_diesel**: Price of diesel with tax in Germany
- **GR_price_with_tax_heGRing_oil**: Price of heating oil with tax in Germany
- **GR_price_with_tax_fuel_oil_1**: Price of fuel oil type 1 with tax in Germany

The dataset spans from January 2005 to November 2025, containing 251 monthly observations.
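
The raw file is a weekly bulletin, while the silver dataset is monthly. The sketch below shows one way to aggregate weekly prices to month-end rows. It assumes the monthly value is the mean of the weekly values, which this document does not confirm; the helper name `weekly_to_monthly` and the toy prices are hypothetical.

```python
import pandas as pd


def weekly_to_monthly(weekly: pd.DataFrame, date_col: str = "Date") -> pd.DataFrame:
    """Average weekly rows per calendar month and stamp them at month end."""
    month = weekly[date_col].dt.to_period("M")
    # Assumption: the monthly price is the mean of that month's weekly prices.
    monthly = weekly.drop(columns=[date_col]).groupby(month).mean()
    # Convert the monthly periods to the last calendar day at midnight.
    monthly.index = monthly.index.to_timestamp(how="end").normalize()
    monthly.index.name = date_col
    return monthly.reset_index()


# Synthetic weekly prices (NOT real bulletin values).
weekly_prices = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2005-01-03", "2005-01-10", "2005-01-17", "2005-02-07", "2005-02-14"]
        ),
        "GR_price_with_tax_euro95": [790.0, 800.0, 810.0, 800.0, 820.0],
    }
)
monthly_prices = weekly_to_monthly(weekly_prices)
print(monthly_prices)
```

Taking the last weekly value of each month instead of the mean is an equally plausible choice. The mean is smoother, while the last value reflects the price level at the month-end date.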

In [48]:
storage_path = "../data/processed/historical_oil_prices.parquet"
df = pd.read_parquet(storage_path, engine='pyarrow')
df.head()

Unnamed: 0,Date,GR_price_with_tax_euro95,GR_price_with_tax_diesel,GR_price_with_tax_heGRing_oil,GR_price_with_tax_fuel_oil_1
0,2005-01-31,796.0,782.0,438.0,227.7
1,2005-02-28,806.0,790.0,448.0,238.8
2,2005-03-31,823.0,862.0,492.0,251.03
3,2005-04-30,873.0,870.0,517.0,266.36
4,2005-05-31,852.0,841.0,841.0,267.6


In [46]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 251 entries, 0 to 250
Data columns (total 5 columns):
 #   Column                         Non-Null Count  Dtype         
---  ------                         --------------  -----         
 0   Date                           251 non-null    datetime64[ns]
 1   GR_price_with_tax_euro95       251 non-null    float64       
 2   GR_price_with_tax_diesel       251 non-null    float64       
 3   GR_price_with_tax_heGRing_oil  251 non-null    float64       
 4   GR_price_with_tax_fuel_oil_1   251 non-null    float64       
dtypes: datetime64[ns](1), float64(4)
memory usage: 9.9 KB


In [47]:
df.describe()

Unnamed: 0,Date,GR_price_with_tax_euro95,GR_price_with_tax_diesel,GR_price_with_tax_heGRing_oil,GR_price_with_tax_fuel_oil_1
count,251,251.0,251.0,251.0,251.0
mean,2015-07-01 01:20:19.123505920,1494.89243,1313.792829,1013.274263,490.265737
min,2005-01-31 00:00:00,796.0,782.0,438.0,227.7
25%,2010-04-15 00:00:00,1310.0,1108.0,837.0,375.6
50%,2015-06-30 00:00:00,1562.0,1341.0,1010.0,484.52
75%,2020-09-15 00:00:00,1711.5,1481.0,1231.725,611.69
max,2025-11-30 00:00:00,2400.0,2126.0,1606.0,718.69
std,,330.370096,264.610818,261.70191,134.565496


## Conclusion: Exogenous Variables for Vehicle Registration Forecasting

These economic indicators serve as supporting features for predicting vehicle registration trends. They are intended to capture the macroeconomic forces that influence consumer purchasing behavior and market dynamics.

**Consumer Price Index**: Inflation dynamics directly affect purchasing power and vehicle affordability, with higher inflation potentially dampening registration volumes.

**Deposit Facility Rate**: ECB monetary policy signals borrowing costs and economic conditions, directly impacting auto loan affordability. Rate increases typically correlate with reduced vehicle financing demand, particularly for premium segments.

**Employment Level**: Employment stability is a fundamental driver of consumer confidence and purchasing capability for vehicles. The moving averages smooth seasonal variations and reveal underlying labor market trends that precede registration changes.

**Gross Domestic Product (GDP)**: GDP growth reflects overall economic health and wealth creation, establishing the baseline for automotive market expansion or contraction. Economic cycles captured in GDP are expected to correlate with discretionary spending on vehicles.

**Marginal Lending Rate**: This upper bound of the ECB interest rate corridor influences commercial lending conditions and dealer financing costs.

**Oil Prices**: Rising fuel prices historically accelerate the transition to more efficient or alternative powertrains, making this a critical segmentation predictor.

These six data sources provide macroeconomic context that, when combined with the temporal patterns in the registration data, might enable models to distinguish between cyclical fluctuations, structural shifts, and trend-driven changes in the German automotive market.
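
Combining the silver datasets with registration data amounts to a join on the month-end `Date` column. The following is a minimal sketch under stated assumptions: the function name `join_exogenous`, the target column `registrations`, and the one-month lag are all hypothetical choices made for illustration, and the row-based lag assumes each feature frame has one row per month without gaps.

```python
import pandas as pd


def join_exogenous(target: pd.DataFrame, features: dict,
                   lag_months: int = 1) -> pd.DataFrame:
    """Left-join lagged feature frames onto the target frame by Date."""
    out = target.sort_values("Date").reset_index(drop=True)
    for name, frame in features.items():
        lagged = (
            frame.sort_values("Date")
            .set_index("Date")
            # Shift values down so each row only sees earlier observations.
            .shift(lag_months)
            # Prefix columns to avoid name clashes between sources.
            .add_prefix(f"{name}__")
            .reset_index()
        )
        out = out.merge(lagged, on="Date", how="left")
    return out


dates = pd.to_datetime(["2018-01-31", "2018-02-28", "2018-03-31"])
# Synthetic registration counts (NOT real data).
registrations = pd.DataFrame({"Date": dates, "registrations": [100, 110, 120]})
# CPI values as shown in the head of the silver dataset.
cpi_frame = pd.DataFrame({"Date": dates, "consumer_price_index": [96.4, 96.7, 97.2]})

model_frame = join_exogenous(registrations, {"cpi": cpi_frame}, lag_months=1)
print(model_frame)
```

The lag is a design choice: economic indicators are typically published with a delay, so using the same month's value could give a model information that would not be available at forecast time.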