# 1. Data Extraction

In this notebook we extract our data and analyse it. To do so, we import from our own library the
```bcrp_dataframe``` function, which uses the API of the Central Reserve Bank of Peru (BCRP) to automatically build a pandas dataframe from a list of series codes.

## 1.1 Libraries

We import the necessary libraries, including our own functions from the ```modules``` package.

In [1]:
# Warnings
import warnings
warnings.filterwarnings("ignore")

# Basic Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import requests
import seaborn as sns
from scipy import stats
from functools import reduce

# Statsmodels
import statsmodels.api as sm
import pmdarima as pmd
from pmdarima.arima import auto_arima
from statsmodels.tsa.api import VAR
from statsmodels.tsa.vector_ar.var_model import VARResults
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import STL

# Machine Learning models
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.model_selection import train_test_split, GridSearchCV, TimeSeriesSplit
from sklearn.linear_model import Ridge, Lasso, ElasticNet, ElasticNetCV, LinearRegression
from sklearn.metrics import (
    mean_absolute_error,
    mean_squared_error,
    mean_absolute_percentage_error,
    median_absolute_error,
    r2_score,
    precision_score,
)

from xgboost import XGBRegressor



In [2]:
# We import our own functions
import sys
sys.path.append('../../..')  # Move three levels up to the project root
from modules.functions import *
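
A relative entry such as `'../../..'` is resolved against the directory the kernel was started from, so the import can break if the notebook is launched from elsewhere. A minimal sketch of an alternative follows; the helper `ancestor` and the directory layout are hypothetical and only illustrate the idea.

```python
import sys
from pathlib import Path


def ancestor(path, levels_up):
    """Return the directory `levels_up` levels above `path`."""
    return Path(path).parents[levels_up - 1]


# Hypothetical notebook location, used only to illustrate the arithmetic.
notebook_dir = Path("/project/notebooks/peru/extraction")
root = ancestor(notebook_dir, 3)

# Registering an absolute root keeps the import independent of the
# directory the kernel was started from.
if str(root) not in sys.path:
    sys.path.append(str(root))
```

Inside a notebook, `Path.cwd()` would typically take the place of the hard-coded `notebook_dir`.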

## 1.2 Extraction
We define our inputs and pass them to the ```bcrp_dataframe``` function to obtain a pandas dataframe with the corresponding series.

We define the following inputs:

    series     = the codes of the series we are going to extract
    start_date = the starting date, chosen as the period when the BCRP began using the interest rate as a policy instrument
    end_date   = the last period of the sample (December 2023 for the monthly series)
    freq       = frequency of the series ('Mensual' = monthly, 'Trimestral' = quarterly)
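
The notebook does not show the body of ```bcrp_dataframe```, so the following is only a sketch of how such inputs could be turned into a dataframe. The URL pattern, the JSON layout (`config.series[].name`, `periods[].name`, `periods[].values`), the Spanish month abbreviations and the `"n.d."` placeholder are all assumptions about the BCRP API, and `build_url` and `parse_monthly` are hypothetical names. The payload is written by hand so that the example runs offline.

```python
import pandas as pd

# Assumed URL pattern of the BCRP statistics API (not taken from the notebook).
BCRP_API = "https://estadisticas.bcrp.gob.pe/estadisticas/series/api"

# Spanish month abbreviations assumed to appear in period names like "Ene.2003".
MONTHS = {"Ene": 1, "Feb": 2, "Mar": 3, "Abr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Ago": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dic": 12}


def build_url(series, start_date, end_date, fmt="json"):
    """Join the series codes with '-' and append the format and date range."""
    return f"{BCRP_API}/{'-'.join(series)}/{fmt}/{start_date}/{end_date}"


def parse_monthly(payload):
    """Turn an (assumed) JSON payload into a dataframe indexed by month start."""
    names = [s["name"] for s in payload["config"]["series"]]
    rows = {}
    for period in payload["periods"]:
        month, year = period["name"].split(".")
        date = pd.Timestamp(year=int(year), month=MONTHS[month], day=1)
        # Non-numeric placeholders such as "n.d." become NaN.
        values = pd.to_numeric(pd.Series(period["values"]), errors="coerce")
        rows[date] = values.to_numpy()
    df = pd.DataFrame.from_dict(rows, orient="index", columns=names)
    df.index.name = "Fecha"
    return df


# Hand-written payload in the assumed shape.
sample = {
    "config": {"series": [{"name": "IPC"}, {"name": "IPC Transables"}]},
    "periods": [
        {"name": "Ene.2003", "values": ["0.23", "0.18"]},
        {"name": "Feb.2003", "values": ["0.47", "n.d."]},
    ],
}
demo = parse_monthly(sample)
url = build_url(["PN01271PM", "PN01280PM"], "2003-01", "2003-02")
```

In a live session the payload would come from something like `requests.get(url).json()`, provided the endpoint behaves as assumed here.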

### df_1
We can now create the first dataframe with the ```bcrp_dataframe``` function. This dataframe contains our target variable, headline inflation, together with the other price indices.

In [3]:
series     = ['PN01271PM', 'PN01280PM', 'PN01282PM', 'PN01278PM', 'PN09817PM','PN09816PM', 'PN01276PM', 'PN01313PM', 'PN01314PM',  
             'PN01315PM', 'PN09818PM','PN01286PM']
start_date = '2003-01'
end_date   = '2023-12'
freq       = 'Mensual'

In [4]:
df_1 = bcrp_dataframe( series , start_date , end_date , freq )
df_1.head()

Unnamed: 0_level_0,Índice de precios Lima Metropolitana (var% mensual) - IPC,Índice de precios Lima Metropolitana (var% mensual) - IPC Transables,Índice de precios Lima Metropolitana (var% mensual) - IPC No Transables,Índice de precios Lima Metropolitana (var% mensual) - IPC Subyacente,Índice de precios Lima Metropolitana (var% mensual) - IPC No Subyacente,Índice de precios Lima Metropolitana (var% mensual) - IPC Alimentos y Energía,Índice de precios Lima Metropolitana (var% mensual) - IPC Sin Alimentos y Energía,Índice de precios Lima Metropolitana (var% mensual) - IPC Alimentos y Bebidas,Índice de precios Lima Metropolitana (var% mensual) - IPC sin Alimentos y Bebidas,Índice de precios Lima Metropolitana (var% mensual) - IPC Subyacente Sin Alimentos y Bebidas,Índice de precios Lima Metropolitana (var% mensual) - IPC Importado,Índice de precios Lima Metropolitana (var% mensual) - Índice de Precios al por Mayor
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1
2003-01-01,0.23142,0.179002,0.270381,-0.004192,0.593592,0.506054,-0.111733,0.39801,0.06433,-0.012549,0.70214,-0.161934
2003-02-01,0.468825,0.135374,0.70877,0.128185,0.992255,0.722192,0.228145,0.594648,0.406663,0.070296,0.727582,0.437078
2003-03-01,1.11778,0.400108,1.623551,0.157515,2.564665,0.357386,1.938313,-0.098522,2.136297,0.446856,1.391691,0.825606
2003-04-01,-0.050857,0.112646,-0.182128,0.074363,-0.260881,-0.265865,0.264059,-0.394477,0.305377,0.069202,0.275001,-0.212853
2003-05-01,-0.032025,-0.346022,0.181194,0.011232,-0.1029,-0.139973,0.055604,0.19802,-0.264172,0.177667,-1.113037,-0.141752


In [5]:
df_1 = get_trend(df_1)
df_1.head()

Unnamed: 0_level_0,Índice de precios Lima Metropolitana (var% mensual) - IPC,Índice de precios Lima Metropolitana (var% mensual) - IPC Transables,Índice de precios Lima Metropolitana (var% mensual) - IPC No Transables,Índice de precios Lima Metropolitana (var% mensual) - IPC Subyacente,Índice de precios Lima Metropolitana (var% mensual) - IPC No Subyacente,Índice de precios Lima Metropolitana (var% mensual) - IPC Alimentos y Energía,Índice de precios Lima Metropolitana (var% mensual) - IPC Sin Alimentos y Energía,Índice de precios Lima Metropolitana (var% mensual) - IPC Alimentos y Bebidas,Índice de precios Lima Metropolitana (var% mensual) - IPC sin Alimentos y Bebidas,Índice de precios Lima Metropolitana (var% mensual) - IPC Subyacente Sin Alimentos y Bebidas,Índice de precios Lima Metropolitana (var% mensual) - IPC Importado,Índice de precios Lima Metropolitana (var% mensual) - Índice de Precios al por Mayor
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1
2003-01-01,0.092926,-0.090958,0.221921,0.023684,0.208767,-0.178262,0.421416,-0.272199,0.426209,0.11083,-0.264946,-0.193953
2003-02-01,0.112976,-0.053866,0.230003,0.029531,0.247979,-0.113158,0.387611,-0.195562,0.395215,0.106729,-0.165246,-0.130778
2003-03-01,0.13306,-0.016576,0.238007,0.035395,0.287271,-0.048449,0.354348,-0.120099,0.365339,0.102505,-0.066987,-0.068351
2003-04-01,0.153179,0.020809,0.246004,0.041241,0.326699,0.015669,0.321872,-0.04606,0.336814,0.098193,0.030109,-0.006735
2003-05-01,0.173188,0.058179,0.253822,0.047045,0.365928,0.078897,0.290215,0.02628,0.309608,0.093844,0.126013,0.053942


### df_inflation
We now create a dataframe that contains only our target variable, headline inflation (series ```PN01271PM```).

In [3]:
series     = ['PN01271PM']
start_date = '2003-01'
end_date   = '2023-12'
freq       = 'Mensual'

In [4]:
df_inflation = bcrp_dataframe( series , start_date , end_date , freq )
df_inflation = get_trend(df_inflation)
df_inflation.head()

Unnamed: 0_level_0,Índice de precios Lima Metropolitana (var% mensual) - IPC
Fecha,Unnamed: 1_level_1
2003-01-01,0.092926
2003-02-01,0.112976
2003-03-01,0.13306
2003-04-01,0.153179
2003-05-01,0.173188


### df_4A
We import the variables from Flores and Grandez (2023)

In [110]:
series     = ['PN01302PM']
start_date = '2003-01'
end_date   = '2023-12'
freq       = 'Mensual'

df_4a = bcrp_dataframe( series , start_date , end_date , freq )
df_4a = get_trend(df_4a)
df_4a.tail()

Unnamed: 0_level_0,Índice de precios al consumidor Lima Metropolitana: clasificación sectorial (variación porcentual) - Inflación Subyacente - Servicios - Comidas Fuera del Hogar
Fecha,Unnamed: 1_level_1
2023-08-01,0.493711
2023-09-01,0.466855
2023-10-01,0.439759
2023-11-01,0.412382
2023-12-01,0.384683


### df_4B
We import the variables from Flores and Grandez (2023)

In [113]:
series     = ['PN38916BM', 'PN38920BM']
start_date = '2003-03'
end_date   = '2023-12'
freq       = 'Mensual'

df_4b = bcrp_dataframe( series , start_date , end_date , freq )
#df_4b = df_4b.resample('M').interpolate(method='spline', order=2)
#df_4b = df_4b.pct_change()
df_4b.dropna(inplace=True)
#df_4b.index = df_4b.index.to_period('M').to_timestamp()
df_4b.tail()

Unnamed: 0_level_0,Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Exportaciones,Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Importaciones
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1
2023-08-01,1.222114,0.637752
2023-09-01,2.545351,0.498539
2023-10-01,-1.221888,0.41825
2023-11-01,0.766457,-0.877823
2023-12-01,1.103257,-2.321367


### df_5
We import the variables from Flores and Grandez (2023) that we transform into monthly percentage changes

In [137]:
series     = ['PD39797PM', 'PN00544MM', 'PN00541MM', 'PN00542MM', 'PN37627AM', 'RD13056DM', 'PN01755AM', 'PN01758AM', 'PN01759AM', 'PN01722AM', 'PN01766AM','PN01764AM', 'PN01725AM', 'PN01768AM']
start_date = '2003-01'
end_date   = '2023-12'
freq       = 'Mensual'

df_5 = bcrp_dataframe( series , start_date , end_date , freq )
df_5 = df_5.pct_change()
df_5 = df_5.dropna()
df_5.head()

Unnamed: 0_level_0,Indicadores de coyuntura - Colocaciones de pollos BB (miles de unidades),"Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - ME - Consumo (millones US$)","Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Consumo (millones S/)","Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Hipotecario (millones S/)",Indicadores indirectos de la tasa de utilización de la capacidad instalada del sector manufacturero - Manufactura no primaria - Alimentos y bebidas,Producción de electricidad por departamento - Lima (gwh),Producto bruto interno y demanda interna (índice 2007=100) - Agropecuario,Producto bruto interno y demanda interna (índice 2007=100) - Pesca,Producto bruto interno y demanda interna (índice 2007=100) - Minería e Hidrocarburos,Producto bruto interno y demanda interna (variaciones porcentuales anualizadas) - Manufactura - Manufactura no Primaria,Producto bruto interno y demanda interna (índice 2007=100) - Construcción,Producto bruto interno y demanda interna (índice 2007=100) - Manufactura - Manufactura no Primaria,Producto bruto interno y demanda interna (variaciones porcentuales anualizadas) - Comercio,Producto bruto interno y demanda interna (índice 2007=100) - Otros Servicios
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1
2007-02-01,-0.111833,0.018648,0.019231,0.064859,-0.095231,-0.092203,0.058961,-0.113515,-0.065709,-0.024856,0.009715,-0.049243,0.717578,0.020164
2007-03-01,0.125827,0.024471,0.022813,0.075911,0.099374,0.192426,0.14089,0.016781,0.165048,0.168648,0.095299,0.105331,-0.30758,0.04418
2007-04-01,-0.037132,0.017068,0.023593,0.074806,-0.089545,-0.079114,0.258195,1.041227,-0.056056,0.163413,-0.085174,-0.072004,-2.119808,0.032706
2007-05-01,0.011645,0.037142,0.035926,0.060289,0.133984,0.082652,0.186145,0.720307,0.043615,-0.015815,0.210034,0.091221,-0.179348,0.012181
2007-06-01,0.018449,0.022069,0.029964,0.058422,-0.014951,0.022965,-0.098177,-0.14837,-0.033051,-0.240609,-0.053304,-0.030873,-1.600324,-0.020202
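
The output above starts in 2007-02 even though `start_date` is 2003-01. This is what `pct_change` followed by `dropna` produces when at least one column starts later than the others, which is presumably the case here. The toy frame below is invented to show the mechanism.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2003-01-01", periods=6, freq="MS")
levels = pd.DataFrame(
    {
        "full": [100.0, 102.0, 101.0, 103.0, 104.0, 106.0],
        # This column only starts in the third month.
        "late": [np.nan, np.nan, 50.0, 51.0, 52.0, 53.0],
    },
    index=idx,
)

# fill_method=None avoids forward-filling gaps before taking the change.
changes = levels.pct_change(fill_method=None)

# dropna() removes every row in which any column is missing, so the
# shortest column determines the first date of the whole frame.
clean = changes.dropna()
```

Here `clean` starts in 2003-04: the first change of `late` needs both its March and April values.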


In [138]:
df_5.tail()

Unnamed: 0_level_0,Indicadores de coyuntura - Colocaciones de pollos BB (miles de unidades),"Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - ME - Consumo (millones US$)","Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Consumo (millones S/)","Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Hipotecario (millones S/)",Indicadores indirectos de la tasa de utilización de la capacidad instalada del sector manufacturero - Manufactura no primaria - Alimentos y bebidas,Producción de electricidad por departamento - Lima (gwh),Producto bruto interno y demanda interna (índice 2007=100) - Agropecuario,Producto bruto interno y demanda interna (índice 2007=100) - Pesca,Producto bruto interno y demanda interna (índice 2007=100) - Minería e Hidrocarburos,Producto bruto interno y demanda interna (variaciones porcentuales anualizadas) - Manufactura - Manufactura no Primaria,Producto bruto interno y demanda interna (índice 2007=100) - Construcción,Producto bruto interno y demanda interna (índice 2007=100) - Manufactura - Manufactura no Primaria,Producto bruto interno y demanda interna (variaciones porcentuales anualizadas) - Comercio,Producto bruto interno y demanda interna (índice 2007=100) - Otros Servicios
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1
2023-08-01,0.019147,0.003929,0.008218,0.005268,0.073791,0.0531,-0.168606,0.383577,0.009423,-0.231102,0.049219,0.084509,-0.060579,0.00134
2023-09-01,-0.032237,0.006853,0.0023,0.009073,0.093072,0.00033,-0.024562,-0.235857,0.042027,0.506074,0.028969,-0.008608,-0.330258,0.024949
2023-10-01,0.016049,0.016984,0.003725,0.006669,-0.043358,-0.03196,0.037994,0.655763,0.016131,-0.412871,0.095891,0.026013,-0.282298,-0.016062
2023-11-01,0.008049,0.005182,0.000605,0.006279,0.01583,-0.10217,0.031954,0.674095,0.017945,-0.423602,-0.057156,0.008244,-0.046946,0.003585
2023-12-01,-0.027293,-0.025853,-0.00289,0.003147,-0.080944,-0.195104,0.018158,-0.368982,0.045737,-0.30255,0.415133,-0.044734,0.539088,0.108977


### df_6
We import the variables of expenditure from Flores and Grandez (2023) 

In [140]:
series     = ['PN02528AQ', 'PN02539AQ', 'PN02529AQ', 'PN02533AQ', 'PN02530AQ', 'PN02534AQ']
start_date = '2003-03'
end_date   = '2024-06'
freq       = 'Trimestral'

df_6 = bcrp_dataframe( series , start_date , end_date , freq )
df_6 = df_6.resample('M').interpolate(method='spline', order=2)
df_6 = df_6.pct_change()
df_6 = df_6.dropna()
df_6.index = df_6.index.to_period('M').to_timestamp()
df_6.tail()

Unnamed: 0_level_0,Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna,Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna sin Inventarios,Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna - Consumo Privado,Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna - Inversión Bruta Interna - Inversión Bruta Fija - Privada,Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna - Consumo Público,Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna - Inversión Bruta Interna - Inversión Bruta Fija - Pública
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
2023-08-01,0.006621,-0.009052,-0.019059,0.017419,0.009135,-0.023573
2023-09-01,0.006937,-0.002427,-0.015914,0.012434,0.026666,0.046891
2023-10-01,0.009064,0.007524,-0.010544,0.008746,0.04761,0.138482
2023-11-01,0.010551,0.016856,-0.004317,0.00458,0.063132,0.195929
2023-12-01,0.012693,0.026919,0.001758,0.000801,0.079992,0.236925
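
The quarterly-to-monthly step can be illustrated on a toy series. The sketch below uses linear interpolation so that it runs without SciPy; the cell above uses `method='spline', order=2`, which requires SciPy. It also assumes quarter-start timestamps and resamples to month start (`'MS'`), since recent pandas releases deprecate the `'M'` alias in favour of `'ME'`. The values are invented.

```python
import pandas as pd

# Three quarterly observations stamped at the start of each quarter.
q_idx = pd.date_range("2023-01-01", periods=3, freq="QS")
quarterly = pd.DataFrame({"demand": [100.0, 130.0, 160.0]}, index=q_idx)

# Upsample to month start; the new months are NaN until interpolated.
monthly = quarterly.resample("MS").interpolate(method="linear")

# Monthly percentage change of the interpolated levels.
monthly_change = monthly.pct_change(fill_method=None).dropna()
```

Because two of every three monthly values are interpolated, the resulting monthly changes are smooth by construction.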


### df_7
We import the external variables (mainly commodities) following Flores and Grandez (2023) 

In [141]:
series     = ['PN38766BM', 'PN38779BM', 'PN38784BM', 'PN06484IM', 'PN01661XM', 'PN01662XM', 'PN01664XM', 'PN38810BM']
start_date = '2003-03'
end_date   = '2023-12'
freq       = 'Mensual'

df_7 = bcrp_dataframe( series , start_date , end_date , freq )
df_7 = df_7.pct_change()
df_7 = df_7.dropna()
df_7.tail()

Unnamed: 0_level_0,Exportaciones de productos tradicionales (precios) - Pesqueros - Harina de Pescado - Precio (US$ por toneladas),Exportaciones de productos tradicionales (precios) - Agrícolas - Café - Precio (US$ por toneladas),Exportaciones de productos tradicionales (precios) - Mineros - Cobre - Precio (¢US$ por libras),Liquidez internacional del BCRP - Valuación Contable del Oro (US$ por onzas troy),Cotizaciones de productos (promedio del periodo) - Trigo - EEUU (US$ por toneladas),Cotizaciones de productos (promedio del periodo) - Maíz - EEUU (US$ por toneladas),Cotizaciones de productos (promedio del periodo) - Aceite Soya - EEUU (US$ por toneladas),Exportaciones de productos tradicionales (precios) - Petróleo y Gas Natural - Petróleo Crudo y Derivados - Precio (US$ por barriles)
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
2023-08-01,-0.005454,-0.069119,0.002513,0.002989,-0.127831,-0.076292,0.012297,0.206876
2023-09-01,0.032075,-0.039845,0.010806,-0.047186,-0.035864,-0.10408,-0.068624,-0.005663
2023-10-01,0.042719,-0.024912,-0.036353,0.073004,-0.121192,0.013018,-0.134533,0.014802
2023-11-01,-0.069442,0.013585,0.001803,0.026029,0.01866,-0.044609,-0.06095,-0.109343
2023-12-01,0.068633,0.038128,0.059882,0.014051,0.032591,-0.000897,-0.029272,-0.01735


### df_8
We import the interest rate and exchange rate following Flores and Grandez (2023) 

In [118]:
series     = ['PN01260PM', 'PD04722MM', 'PN00493MM']
start_date = '2003-03'
end_date   = '2023-12'
freq       = 'Mensual'

df_8 = bcrp_dataframe( series , start_date , end_date , freq )
df_8 = df_8.pct_change()
df_8 = df_8.dropna()
df_8.tail()

Unnamed: 0_level_0,Índice del tipo de cambio real (var% mensual) - Multilateral,Tasas de interés del Banco Central de Reserva - Tasa de Referencia de la Política Monetaria,Tasas de interés del Banco Central de Reserva - Tasa de Encaje
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2023-08-01,-2.413133,0.0,-0.085417
2023-09-01,-0.842197,-0.032258,0.02562
2023-10-01,9.82907,-0.033333,0.001447
2023-11-01,-1.377848,-0.034483,0.06604
2023-12-01,-0.861948,-0.035714,0.041179


### df_9
We import the financial variables following Flores and Grandez (2023) 

In [119]:
series     = ['PN01142MM', 'PN01143MM', 'RD13908DM', 'RD13854DM']
start_date = '2003-03'
end_date   = '2023-12'
freq       = 'Mensual'

df_9 = bcrp_dataframe( series , start_date , end_date , freq )
df_9 = df_9.pct_change()
df_9 = df_9.dropna()
df_9.tail()

Unnamed: 0_level_0,Bolsa de Valores de Lima - Índices Bursátiles - Índice General BVL (base 31/12/91 = 100),Bolsa de Valores de Lima - Índices Bursátiles - Índice Selectivo BVL (base 31/12/91 = 100),Ingresos tributarios recaudados por SUNAT - Impuesto General a las Ventas interno según departamento - Total (millones de soles),Ingresos tributarios recaudados por SUNAT - impuesto a la renta según departamento - Total (millones de soles)
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
2023-08-01,-0.015987,-0.005701,0.051729,0.01334
2023-09-01,-0.026158,-0.022564,-0.030609,0.017657
2023-10-01,-0.031559,-0.045676,0.033536,0.53504
2023-11-01,0.005364,0.003497,-0.017229,-0.276218
2023-12-01,0.183527,0.130948,-0.005219,-0.019778


### df_10
We import the 12-month-ahead macroeconomic expectations (inflation, GDP and exchange rate) following Flores and Grandez (2023) 

In [120]:
series     = ['PD12912AM', 'PD38048AM', 'PD38049AM']
start_date = '2003-03'
end_date   = '2023-12'
freq       = 'Mensual'

df_10 = bcrp_dataframe( series , start_date , end_date , freq )
df_10 = df_10.dropna()
df_10.tail()

Unnamed: 0_level_0,Expectativas macroeconómicas - Expectativa de Inflación a 12 meses,Expectativas macroeconómicas - Expectativa de PBI a 12 meses,Expectativas macroeconómicas - Expectativa de TC a 12 meses
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2023-08-01,3.358333,1.966667,3.7375
2023-09-01,3.378125,1.94375,3.7725
2023-10-01,3.329167,1.8625,3.820833
2023-11-01,3.145833,2.058333,3.79875
2023-12-01,2.825,2.25,3.79


### df_11
We import the business expectations indices following Flores and Grandez (2023) 

In [121]:
series     = ['PD38045AM', 'PD38046AM', 'PD38047AM', 'PD38043AM', 'PD38042AM', 'PD38041AM', 'PD38044AM']
start_date = '2003-03'
end_date   = '2023-12'
freq       = 'Mensual'

df_11 = bcrp_dataframe( series , start_date , end_date , freq )
df_11 = df_11.pct_change()
df_11 = df_11.dropna()
df_11.tail()

Unnamed: 0_level_0,Expectativas empresariales totales - Índice de expectativas de la economía a 3 meses,Expectativas empresariales totales - Índice de expectativas del sector a 3 meses,Expectativas empresariales totales - Índice de expectativas de la demanda a 3 meses,Expectativas empresariales totales - Índice de órdenes de compra respecto al mes anterior,Expectativas empresariales totales - Índice de inventarios respecto al mes anterior,Expectativas empresariales totales - Índice de venta respecto al mes anterior,Expectativas empresariales totales - Índice de la situación actual del negocio
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
2023-08-01,-0.009454,0.029787,0.020431,0.059756,-0.12864,0.112357,0.049139
2023-09-01,-0.083033,-0.084305,-0.031007,-0.056778,0.104902,-0.049537,-0.073077
2023-10-01,-0.042286,-0.03236,-0.115888,-0.047224,-0.052989,-0.099,-0.018959
2023-11-01,-0.028548,0.003884,0.042851,0.040926,0.015431,0.059722,-0.016416
2023-12-01,0.095335,0.061404,-0.056807,-0.041719,0.026118,-0.063812,0.028773


### df_12
We import the employment variables following Flores and Grandez (2023) 

In [103]:
series     = ['PN38050GM', 'PN38052GM', 'PN38053GM', 'PN38054GM', 'PN38055GM', 'PN38056GM', 'PN38057GM', 'PN38058GM', 'PN38059GM', 'PN38060GM', 'PN38061GM', 'PN38062GM', 'PN38069GM', 'PN38070GM', 'PN38063GM']
start_date = '2003-03'
end_date   = '2023-12'
freq       = 'Mensual'

df_12 = bcrp_dataframe( series , start_date , end_date , freq )
df_12 = df_12.pct_change()
df_12 = df_12.dropna()
df_12.tail()

Unnamed: 0_level_0,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Edad - 14 a 24 años,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Edad - 25 a 44 años,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Edad - 45 a más años,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Categoría Ocupacional - Independiente,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Categoría Ocupacional - Dependiente,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Categoría Ocupacional - Trabajador no Remunerado,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 1 a 10 trabajadores,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 11 a 50 trabajadores,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 51 y más,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Adecuadamente Empleada,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Subempleada,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - Coeficiente de Ocupación,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - Ingreso Mensual,Empleo en Lima Metropolitana - Promedio móvil tres meses (porcentaje) - Tasa de Desempleo (%)
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
2023-08-01,0.004426,0.001775,-0.001113,0.002015,-0.003387,0.003075,-0.00963,0.003672,0.025094,-0.016357,0.010773,-0.014092,-0.002963,0.007206,0.059598
2023-09-01,-0.002307,-0.019097,-0.008221,0.009354,0.037512,-0.012249,-0.236708,0.000451,-0.022502,-0.003658,-0.002652,-0.004252,-0.006672,-0.003658,0.014112
2023-10-01,0.001795,-0.021443,0.000632,0.017016,0.020163,-0.004473,-0.023207,0.014899,-0.06129,0.005309,0.009151,-0.004133,0.000334,0.005766,-0.026198
2023-11-01,0.012298,0.01907,0.011134,0.011187,0.029782,0.004042,-0.03639,0.024137,-0.017521,-0.003176,-0.012732,0.048607,0.008813,-0.019686,0.001051
2023-12-01,0.003443,-0.025943,0.00564,0.016115,-0.006375,0.002075,0.255287,0.005681,0.010032,0.00284,0.014463,-0.00718,0.002026,0.00621,-0.027032


In [123]:
#df_final = df_inflation.join(df_4a, how='inner').join(df_4b, how='inner').join(df_5, how='inner').join(df_6, how='inner').join(df_7, how='inner').join(df_8, how='inner').join(df_9, how='inner').join(df_10).join(df_11).join(df_12)

In [150]:
df_final = df_inflation.join(df_4a, how='inner').join(df_4b, how='inner').join(df_5, how='inner').join(df_6, how='inner').join(df_7, how='inner').join(df_8, how='inner').join(df_9, how='inner').join(df_10).join(df_11).join(df_12)
df_final.dropna(inplace=True)
df_final.head()

Unnamed: 0_level_0,Índice de precios Lima Metropolitana (var% mensual) - IPC,Índice de precios al consumidor Lima Metropolitana: clasificación sectorial (variación porcentual) - Inflación Subyacente - Servicios - Comidas Fuera del Hogar,Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Exportaciones,Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Importaciones,Indicadores de coyuntura - Colocaciones de pollos BB (miles de unidades),"Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - ME - Consumo (millones US$)","Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Consumo (millones S/)","Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Hipotecario (millones S/)",Indicadores indirectos de la tasa de utilización de la capacidad instalada del sector manufacturero - Manufactura no primaria - Alimentos y bebidas,Producción de electricidad por departamento - Lima (gwh),...,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Categoría Ocupacional - Dependiente,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Categoría Ocupacional - Trabajador no Remunerado,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 1 a 10 trabajadores,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 11 a 50 trabajadores,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 51 y más,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Adecuadamente Empleada,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Subempleada,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - Coeficiente de Ocupación,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - Ingreso Mensual,Empleo en Lima Metropolitana - Promedio móvil tres meses (porcentaje) - Tasa de Desempleo (%)
Fecha,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2012-02-01,0.293213,0.513753,2.084305,0.415673,-0.07056,0.013861,0.012647,0.018248,-0.00385,0.046,...,-0.015026,0.085068,0.014641,-0.061396,-0.006115,-0.000228,0.00094,-0.001258,0.001426,0.072148
2012-03-01,0.281043,0.505105,-1.91835,0.712507,0.117432,0.010176,0.012328,0.024442,0.030024,0.11416,...,0.000473,-0.077565,-0.017944,-0.006126,0.018381,0.020207,-0.044616,-0.00769,0.02964,0.041515
2012-04-01,0.26781,0.498486,-0.974725,0.043185,-0.060808,0.013539,0.008976,0.018552,-0.042243,-0.099706,...,0.006696,-0.095389,0.003692,-0.016509,0.009713,-0.0242,0.046237,0.001891,-0.016161,-0.069863
2012-05-01,0.253922,0.493666,-0.216514,-1.344951,0.063189,0.024991,0.019996,0.020493,0.048016,0.019639,...,0.007881,0.002999,0.003679,0.069158,-0.007347,-0.011218,0.033366,0.005487,-0.013942,-0.103756
2012-06-01,0.240839,0.489829,-6.245486,-1.499753,-0.007178,0.015136,0.015002,0.023488,-0.017617,0.025558,...,0.008321,0.014449,-0.000673,-0.005861,0.004731,-0.009531,0.01406,-0.001138,-0.000283,-0.129309
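
The chained joins can also be written as a fold over a list of frames, using `functools.reduce` (already imported at the top of the notebook). Note that `DataFrame.join` defaults to `how='left'`, so the joins without an explicit `how` in the cell above rely on the later `dropna` to discard unmatched dates. The sketch below uses toy frames and a hypothetical helper `inner_join_all`.

```python
from functools import reduce

import pandas as pd


def inner_join_all(frames):
    """Inner-join a list of dataframes on their index, left to right."""
    return reduce(lambda left, right: left.join(right, how="inner"), frames)


idx = pd.date_range("2012-01-01", periods=4, freq="MS")
a = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0]}, index=idx)
b = pd.DataFrame({"b": [10.0, 20.0, 30.0]}, index=idx[1:])   # starts one month later
c = pd.DataFrame({"c": [5.0, 6.0]}, index=idx[:2])           # ends two months earlier

# Only the dates present in every frame survive the inner joins.
joined = inner_join_all([a, b, c])
```

With the notebook's frames the call would read `inner_join_all([df_inflation, df_4a, df_4b, ...])`; the shortest series then determines the sample.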


## 1.3 Data Inspection
We inspect the final dataframe. We first verify that all values are non-null. Then, we apply the ```describe``` function to obtain summary statistics for each variable.

In [143]:
df_final.isna().sum()

Índice de precios Lima Metropolitana (var% mensual) - IPC                                                                                                          0
Índice de precios al consumidor Lima Metropolitana: clasificación sectorial (variación porcentual) - Inflación Subyacente - Servicios - Comidas Fuera del Hogar    0
Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Exportaciones                                                          0
Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Importaciones                                                          0
Indicadores de coyuntura - Colocaciones de pollos BB (miles de unidades)                                                                                           0
                                                                                                                                                                  ..
Empleo en 
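
Because pandas truncates long outputs, a per-column listing like the one above cannot confirm by eye that every column is complete. A compact check, shown here on an invented toy frame, is to keep only the columns with gaps and to look at the grand total.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({"cpi": [0.2, 0.3, 0.1], "credit": [0.01, np.nan, 0.02]})

missing = toy.isna().sum()          # missing values per column
with_gaps = missing[missing > 0]    # only the columns that have any
total_missing = int(missing.sum())  # a single number for the whole frame
```

For a frame that has already been passed through `dropna`, `with_gaps` is empty and `total_missing` is 0.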

In [144]:
df_final.describe()

Unnamed: 0,Índice de precios Lima Metropolitana (var% mensual) - IPC,Índice de precios al consumidor Lima Metropolitana: clasificación sectorial (variación porcentual) - Inflación Subyacente - Servicios - Comidas Fuera del Hogar,Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Exportaciones,Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Importaciones,Indicadores de coyuntura - Colocaciones de pollos BB (miles de unidades),"Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - ME - Consumo (millones US$)","Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Consumo (millones S/)","Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Hipotecario (millones S/)",Indicadores indirectos de la tasa de utilización de la capacidad instalada del sector manufacturero - Manufactura no primaria - Alimentos y bebidas,Producción de electricidad por departamento - Lima (gwh),...,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Categoría Ocupacional - Dependiente,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Categoría Ocupacional - Trabajador no Remunerado,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 1 a 10 trabajadores,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 11 a 50 trabajadores,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 51 y más,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Adecuadamente Empleada,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Subempleada,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - Coeficiente de Ocupación,Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - Ingreso Mensual,Empleo en Lima Metropolitana - Promedio móvil tres meses (porcentaje) - Tasa de Desempleo (%)
count,143.0,143.0,143.0,143.0,143.0,143.0,143.0,143.0,143.0,143.0,...,143.0,143.0,143.0,143.0,143.0,143.0,143.0,143.0,143.0,143.0
mean,0.283274,0.371246,0.116242,0.06451,0.003797,0.001961,0.008574,0.012522,0.003708,0.011433,...,0.001993,0.004034,0.00413,0.002477,0.001288,0.00292,0.002715,0.001006,0.002862,0.001861
std,0.156216,0.18496,2.588655,1.275334,0.05597,0.022043,0.008219,0.007971,0.067126,0.135077,...,0.042679,0.113991,0.073202,0.057305,0.031069,0.056991,0.052849,0.050537,0.014504,0.08282
min,-0.011845,0.078336,-6.245486,-3.340271,-0.142182,-0.110985,-0.020597,-0.004029,-0.282675,-0.446969,...,-0.26757,-0.37037,-0.377523,-0.326794,-0.176258,-0.342658,-0.242929,-0.303347,-0.052191,-0.204016
25%,0.180763,0.18772,-1.883314,-0.701121,-0.031384,-0.005974,0.004772,0.00666,-0.044531,-0.065135,...,-0.007477,-0.038205,-0.007964,-0.024655,-0.010083,-0.005755,-0.013453,-0.005769,-0.005268,-0.049291
50%,0.254087,0.400757,0.014477,0.078572,0.005239,0.003467,0.010385,0.010979,-0.00385,0.00033,...,0.00317,-0.003544,0.001523,0.003507,0.000968,0.002808,0.000676,-7.9e-05,0.004436,-0.010915
75%,0.311395,0.458847,1.75815,0.756391,0.036463,0.010878,0.013174,0.017501,0.048005,0.065681,...,0.013545,0.050084,0.011357,0.026518,0.012947,0.013333,0.012444,0.006933,0.011792,0.047477
max,0.679208,0.76576,7.039029,4.03757,0.146322,0.151069,0.030034,0.034917,0.240214,0.748028,...,0.22467,0.794118,0.582138,0.285714,0.146204,0.261944,0.409335,0.334168,0.03167,0.461078


We have 143 observations ranging from ```2012-02-01``` to ```2023-12-01```. The mean monthly % change of the CPI variables shown is around 0.3 (0.28 for CPI and 0.37 for Core CPI - Services - Dining Out). The means of the reserve requirement rate and the monetary policy rate are 10.7% and 3.67%, respectively. The three monetary variables have a small mean monthly % change of around 0.01, and the Minimum Wage index has one of around 0.001.
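
The sample size and date range quoted above can be read directly off the dataframe instead of being typed by hand. The following is a minimal sketch, assuming the dataframe has a monthly `DatetimeIndex`; the toy frame `df_demo` and its random values are hypothetical stand-ins for `df_final`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_final: 143 monthly rows starting in 2012-02
idx = pd.date_range("2012-02-01", periods=143, freq="MS", name="Fecha")
rng = np.random.default_rng(0)
df_demo = pd.DataFrame({"CPI": rng.normal(0.28, 0.15, size=len(idx))}, index=idx)

# Number of observations and first/last dates of the sample
n_obs = len(df_demo)
start, end = df_demo.index.min(), df_demo.index.max()
print(f"{n_obs} observations from {start.date()} to {end.date()}")

# Column means, i.e. the `mean` row of describe()
summary_means = df_demo.mean()
print(summary_means.round(3))
```

Printing these values next to `describe()` keeps the written summary in step with the data whenever the extraction window changes.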

In [145]:
df_final.columns

Index(['Índice de precios Lima Metropolitana (var% mensual) - IPC',
       'Índice de precios al consumidor Lima Metropolitana: clasificación sectorial (variación porcentual) - Inflación Subyacente - Servicios - Comidas Fuera del Hogar',
       'Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Exportaciones',
       'Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Importaciones',
       'Indicadores de coyuntura - Colocaciones de pollos BB (miles de unidades)',
       'Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - ME - Consumo (millones US$)',
       'Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Consumo (millones S/)',
       'Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Hipotecario (millones S

## 1.4 Data adjustment
We will rename the columns for easier identification of the variables. We will also create a new dataframe with the lags of the variables. 

In [151]:
# New column names: mapping from the original BCRP series names (in Spanish) to short English labels
columns = {
    'Índice de precios Lima Metropolitana (var% mensual) - IPC': 'CPI',
    'Índice de precios al consumidor Lima Metropolitana: clasificación sectorial (variación porcentual) - Inflación Subyacente - Servicios - Comidas Fuera del Hogar': 'Core CPI - Services - Dining Out',
    'Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Exportaciones': 'Terms of Trade - Exports',
    'Términos de intercambio de comercio exterior (var% mensual) - Índice de Precios Nominales - Importaciones': 'Terms of Trade - Imports',
    'Indicadores de coyuntura - Colocaciones de pollos BB (miles de unidades)': 'Chicken Placements (thousands)',
    'Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - ME - Consumo (millones US$)': 'Private Credit - ME - Consumption (mill US$)',
    'Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Consumo (millones S/)': 'Private Credit - MN - Consumption (mill S/)',
    'Crédito al sector privado de las sociedades creadoras de depósito, por tipo de crédito y por monedas - Saldos - MN - Hipotecario (millones S/)': 'Private Credit - MN - Mortgage (mill S/)',
    'Indicadores indirectos de la tasa de utilización de la capacidad instalada del sector manufacturero - Manufactura no primaria - Alimentos y bebidas': 'Capacity Utilization - Food & Beverages',
    'Producción de electricidad por departamento - Lima (gwh)': 'Electricity Production - Lima (GWh)',
    'Producto bruto interno y demanda interna (índice 2007=100) - Agropecuario': 'GDP - Agriculture',
    'Producto bruto interno y demanda interna (índice 2007=100) - Pesca': 'GDP - Fishing',
    'Producto bruto interno y demanda interna (índice 2007=100) - Minería e Hidrocarburos': 'GDP - Mining',
    'Producto bruto interno y demanda interna (variaciones porcentuales anualizadas) - Manufactura - Manufactura no Primaria': 'GDP Growth - Non-Primary Manufacturing',
    'Producto bruto interno y demanda interna (índice 2007=100) - Construcción': 'GDP - Construction',
    'Producto bruto interno y demanda interna (índice 2007=100) - Manufactura - Manufactura no Primaria': 'GDP - Non-Primary Manufacturing',
    'Producto bruto interno y demanda interna (variaciones porcentuales anualizadas) - Comercio': 'GDP Growth - Commerce (annual %)',
    'Producto bruto interno y demanda interna (índice 2007=100) - Otros Servicios': 'GDP - Other Services (index 2007=100)',
    'Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna': 'GDP by Expenditure - Domestic Demand (mill S/ 2007)',
    'Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna sin Inventarios': 'GDP by Expenditure - Domestic Demand ex. Inventories (mill S/ 2007)',
    'Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna - Consumo Privado': 'Private Consumption (mill S/ 2007)',
    'Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna - Inversión Bruta Interna - Inversión Bruta Fija - Privada': 'Private Fixed Investment (mill S/ 2007)',
    'Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna - Consumo Público': 'Public Consumption (mill S/ 2007)',
    'Producto bruto interno por tipo de gasto (millones S/ 2007) - Demanda Interna - Inversión Bruta Interna - Inversión Bruta Fija - Pública': 'Public Fixed Investment (mill S/ 2007)',
    'Exportaciones de productos tradicionales (precios) - Pesqueros - Harina de Pescado - Precio (US$ por toneladas)': 'Fishmeal Price (US$ per ton)',
    'Exportaciones de productos tradicionales (precios) - Agrícolas - Café - Precio (US$ por toneladas)': 'Coffee Price (US$ per ton)',
    'Exportaciones de productos tradicionales (precios) - Mineros - Cobre - Precio (¢US$ por libras)': 'Copper Price (¢US$ per lb)',
    'Liquidez internacional del BCRP - Valuación Contable del Oro (US$ por onzas troy)': 'Gold (US$ per oz troy)',
    'Cotizaciones de productos (promedio del periodo) - Trigo - EEUU (US$ por toneladas)': 'Wheat Price (US$ per ton)',
    'Cotizaciones de productos (promedio del periodo) - Maíz - EEUU (US$ por toneladas)': 'Corn Price (US$ per ton)',
    'Cotizaciones de productos (promedio del periodo) - Aceite Soya - EEUU (US$ por toneladas)': 'Soybean Oil Price (US$ per ton)',
    'Exportaciones de productos tradicionales (precios) - Petróleo y Gas Natural - Petróleo Crudo y Derivados - Precio (US$ por barriles)': 'Crude Oil Price (US$ per barrel)',
    'Índice del tipo de cambio real (var% mensual) - Multilateral': 'Real Exchange Rate Index (monthly %)',
    'Tasas de interés del Banco Central de Reserva  - Tasa de Referencia de la Política Monetaria': 'Monetary Policy Rate',
    'Tasas de interés del Banco Central de Reserva  - Tasa de Encaje': 'Reserve Requirement Rate',
    'Bolsa de Valores de Lima - Índices Bursátiles - Índice General BVL (base 31/12/91 = 100)': 'General Index (base 31/12/91 = 100)',
    'Bolsa de Valores de Lima - Índices Bursátiles - Índice Selectivo BVL (base 31/12/91 = 100)': 'Selective Index (base 31/12/91 = 100)',
    'Ingresos tributarios recaudados por SUNAT - Impuesto General a las Ventas interno según departamento - Total (millones de soles)': 'Tax Revenue - VAT (mill S/)',
    'Ingresos tributarios recaudados por SUNAT - impuesto a la renta según departamento - Total (millones de soles)': 'Tax Revenue - Income Tax (mill S/)',
    'Expectativas macroeconómicas - Expectativa de Inflación a 12 meses': 'Macroeconomic Expectations - Inflation (12 months)',
    'Expectativas macroeconómicas - Expectativa de PBI a 12 meses': 'Macroeconomic Expectations - GDP (12 months)',
    'Expectativas macroeconómicas - Expectativa de TC a 12 meses': 'Macroeconomic Expectations - Exchange Rate (12 months)',
    'Expectativas empresariales totales - Índice de expectativas de la economía a 3 meses': 'Business Expectations - Economy (3 months)',
    'Expectativas empresariales totales - Índice de expectativas del sector a 3 meses': 'Business Expectations - Sector (3 months)',
    'Expectativas empresariales totales - Índice de expectativas de la demanda a 3 meses': 'Business Expectations - Demand (3 months)',
    'Expectativas empresariales totales - Índice de órdenes de compra respecto al mes anterior': 'Business Expectations - Orders (MoM)',
    'Expectativas empresariales totales - Índice de inventarios respecto al mes anterior': 'Business Expectations - Inventory (MoM)',
    'Expectativas empresariales totales - Índice de venta respecto al mes anterior': 'Business Expectations - Sales (MoM)',
    'Expectativas empresariales totales - Índice de la situación actual del negocio': 'Business Expectations - Current Situation',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA': 'Labor Force (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Edad - 14 a 24 años': 'Employed 14-24 (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Edad - 25 a 44 años': 'Employed 25-44 (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Edad - 45 a más años': 'Employed 45+ (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Categoría Ocupacional - Independiente': 'Independent Worker (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Categoría Ocupacional - Dependiente': 'Dependent Worker (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Categoría Ocupacional - Trabajador no Remunerado': 'Unpaid Worker (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 1 a 10 trabajadores': 'Firms 1-10 Workers (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 11 a 50 trabajadores': 'Firms 11-50 Workers (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Ocupada - Por Tamaño de Empresa - De 51 y más': 'Firms 51+ Workers (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Adecuadamente Empleada': 'Adequately Employed (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - PEA Subempleada': 'Underemployed (3-month MA, thousands)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - Coeficiente de Ocupación': 'Employment Ratio (3-month MA)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (miles de personas) - Ingreso Mensual': 'Monthly Income (3-month MA)',
    'Empleo en Lima Metropolitana - Promedio móvil tres meses (porcentaje) - Tasa de Desempleo (%)': 'Unemployment Rate (3-month MA, %)'
}

# We rename the columns so they are easier to analyse
df_final.rename(columns=columns, inplace=True)
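
With a mapping this long, a single typo in a key means that column is not renamed, and by default `rename` does not report it. The sketch below shows one possible way to detect that, assuming a recent pandas version where `DataFrame.rename` accepts `errors="raise"`; the frame `df_demo` and the dictionary `mapping` are hypothetical examples, not the series used in this notebook.

```python
import pandas as pd

# Hypothetical two-column frame standing in for df_final
df_demo = pd.DataFrame({"Serie A - IPC": [0.1, 0.2], "Serie B - PEA": [1.0, 1.1]})

# Hypothetical mapping; the last key does not exist in df_demo
mapping = {"Serie A - IPC": "CPI", "Serie B - PEA": "Labor Force", "Serie C": "Missing"}

# Keys that rename() would silently skip with the default errors="ignore"
unmatched = sorted(set(mapping) - set(df_demo.columns))
print("Unmatched keys:", unmatched)

# errors="raise" turns the silent skip into a KeyError
try:
    df_demo.rename(columns=mapping, errors="raise")
    raised = False
except KeyError:
    raised = True

# Default behaviour: matching columns are renamed, unmatched keys are ignored
renamed = df_demo.rename(columns=mapping)
```

Listing the unmatched keys first is useful when the series names come from an API and may change in spacing or accents.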

In [152]:
df_final

Fecha,CPI,Core CPI - Services - Dining Out,Terms of Trade - Exports,Terms of Trade - Imports,Chicken Placements (thousands),Private Credit - ME - Consumption (mill US$),Private Credit - MN - Consumption (mill S/),Private Credit - MN - Mortgage (mill S/),Capacity Utilization - Food & Beverages,Electricity Production - Lima (GWh),...,"Dependent Worker (3-month MA, thousands)","Unpaid Worker (3-month MA, thousands)","Firms 1-10 Workers (3-month MA, thousands)","Firms 11-50 Workers (3-month MA, thousands)","Firms 51+ Workers (3-month MA, thousands)","Adequately Employed (3-month MA, thousands)","Underemployed (3-month MA, thousands)",Employment Ratio (3-month MA),Monthly Income (3-month MA),"Unemployment Rate (3-month MA, %)"
2012-02-01,0.293213,0.513753,2.084305,0.415673,-0.070560,0.013861,0.012647,0.018248,-0.003850,0.046000,...,-0.015026,0.085068,0.014641,-0.061396,-0.006115,-0.000228,0.000940,-0.001258,0.001426,0.072148
2012-03-01,0.281043,0.505105,-1.918350,0.712507,0.117432,0.010176,0.012328,0.024442,0.030024,0.114160,...,0.000473,-0.077565,-0.017944,-0.006126,0.018381,0.020207,-0.044616,-0.007690,0.029640,0.041515
2012-04-01,0.267810,0.498486,-0.974725,0.043185,-0.060808,0.013539,0.008976,0.018552,-0.042243,-0.099706,...,0.006696,-0.095389,0.003692,-0.016509,0.009713,-0.024200,0.046237,0.001891,-0.016161,-0.069863
2012-05-01,0.253922,0.493666,-0.216514,-1.344951,0.063189,0.024991,0.019996,0.020493,0.048016,0.019639,...,0.007881,0.002999,0.003679,0.069158,-0.007347,-0.011218,0.033366,0.005487,-0.013942,-0.103756
2012-06-01,0.240839,0.489829,-6.245486,-1.499753,-0.007178,0.015136,0.015002,0.023488,-0.017617,0.025558,...,0.008321,0.014449,-0.000673,-0.005861,0.004731,-0.009531,0.014060,-0.001138,-0.000283,-0.129309
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2023-08-01,0.182324,0.493711,1.222114,0.637752,0.019147,0.003929,0.008218,0.005268,0.073791,0.053100,...,0.003075,-0.009630,0.003672,0.025094,-0.016357,0.010773,-0.014092,-0.002963,0.007206,0.059598
2023-09-01,0.133933,0.466855,2.545351,0.498539,-0.032237,0.006853,0.002300,0.009073,0.093072,0.000330,...,-0.012249,-0.236708,0.000451,-0.022502,-0.003658,-0.002652,-0.004252,-0.006672,-0.003658,0.014112
2023-10-01,0.085365,0.439759,-1.221888,0.418250,0.016049,0.016984,0.003725,0.006669,-0.043358,-0.031960,...,-0.004473,-0.023207,0.014899,-0.061290,0.005309,0.009151,-0.004133,0.000334,0.005766,-0.026198
2023-11-01,0.036736,0.412382,0.766457,-0.877823,0.008049,0.005182,0.000605,0.006279,0.015830,-0.102170,...,0.004042,-0.036390,0.024137,-0.017521,-0.003176,-0.012732,0.048607,0.008813,-0.019686,0.001051


In [53]:
df_lags = df.copy()

# Add lags 1 to 4 of every variable except the first column (CPI, the target)
for variable in df.columns[1:]:
    for lag in range(1, 5):
        df_lags[f'{variable}_lag_{lag}'] = df_lags[variable].shift(lag)

In [54]:
# We drop the contemporaneous (unlagged) regressors, keeping CPI and the lagged variables
df_lags.drop(columns = ['Monetary Policy Rate','Circulating Currency Seasonally Adjusted (mill S/)',
       'Net International Reserves (mill $)', 'Real Minimum Wage (Index)', 'Wheat (US$ per ton)', 'Corn  (US$ per ton)', 
       'Soybean oil (US$ per ton)', 'Crude oil (US$ per barrel)', 'Exchange rate'], inplace = True)

# Shifting leaves missing values in the first rows, so we drop them
df_lags = df_lags.dropna()
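
The lag construction and the removal of contemporaneous regressors can also be done in one step. This is a rough sketch of an alternative, not the code used in this notebook: it assumes the target is the first column, and the toy frame `df_demo` and the names `target`, `regressors` and `max_lag` are hypothetical. The resulting columns are grouped by lag rather than by variable, so their order differs from the loop above.

```python
import pandas as pd

# Hypothetical toy data: target in the first column, one regressor after it
idx = pd.date_range("2023-01-01", periods=8, freq="MS", name="Fecha")
df_demo = pd.DataFrame(
    {"CPI": list(range(8)), "Policy Rate": list(range(10, 18))}, index=idx
)

target = df_demo.columns[0]
regressors = df_demo.columns[1:]
max_lag = 4

# Build one shifted block per lag and concatenate them with the target,
# so the unlagged regressors never enter the final frame
lagged = [
    df_demo[regressors].shift(lag).add_suffix(f"_lag_{lag}")
    for lag in range(1, max_lag + 1)
]
df_lags_demo = pd.concat([df_demo[[target]], *lagged], axis=1).dropna()
print(df_lags_demo)
```

Because only lagged blocks are concatenated, no list of column names has to be maintained for the `drop` call.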

## 1.5 Save Results
We save both dataframes to the ```input``` folder so they can be used for forecasting in the next notebook.

In [55]:
df.to_csv('../../../input/df_raw_test.csv')

In [56]:
df_lags.to_csv('../../../input/df_lags_test.csv')
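
When these files are read back in the next notebook, the date index has to be restored explicitly. A minimal round-trip sketch follows, assuming the index is named `Fecha` as in the outputs of this notebook; the temporary file `df_demo.csv` and the frame `df_demo` are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Hypothetical small frame with a monthly date index named "Fecha"
idx = pd.date_range("2023-08-01", periods=3, freq="MS", name="Fecha")
df_demo = pd.DataFrame({"CPI": [0.18, 0.13, 0.09]}, index=idx)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "df_demo.csv")  # hypothetical file name
    df_demo.to_csv(path)
    # index_col=0 restores the index; parse_dates=True parses it as dates
    df_back = pd.read_csv(path, index_col=0, parse_dates=True)

# Largest absolute difference between saved and reloaded values
max_diff = (df_back["CPI"] - df_demo["CPI"]).abs().max()
print(df_back.index.dtype, max_diff)
```

Without `index_col=0`, the dates would come back as an ordinary column and the time-series methods used later would not see a date index.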

In [57]:
df_lags.tail()

Fecha,CPI,Monetary Policy Rate_lag_1,Monetary Policy Rate_lag_2,Monetary Policy Rate_lag_3,Monetary Policy Rate_lag_4,Exchange rate_lag_1,Exchange rate_lag_2,Exchange rate_lag_3,Exchange rate_lag_4,Circulating Currency Seasonally Adjusted (mill S/)_lag_1,...,Corn (US$ per ton)_lag_3,Corn (US$ per ton)_lag_4,Soybean oil (US$ per ton)_lag_1,Soybean oil (US$ per ton)_lag_2,Soybean oil (US$ per ton)_lag_3,Soybean oil (US$ per ton)_lag_4,Crude oil (US$ per barrel)_lag_1,Crude oil (US$ per barrel)_lag_2,Crude oil (US$ per barrel)_lag_3,Crude oil (US$ per barrel)_lag_4
2023-08-01,0.182324,7.75,7.75,7.75,7.75,3.601255,3.650419,3.688668,3.76545,11.196625,...,5.461808,5.531248,7.342273,7.158705,7.063081,7.124298,4.327658,4.253043,4.271259,4.37587
2023-09-01,0.133933,7.75,7.75,7.75,7.75,3.697768,3.601255,3.650419,3.688668,11.194037,...,5.487035,5.461808,7.354495,7.342273,7.158705,7.063081,4.399204,4.327658,4.253043,4.271259
2023-10-01,0.085365,7.5,7.75,7.75,7.75,3.730995,3.697768,3.601255,3.650419,11.196305,...,5.368204,5.487035,7.283402,7.354495,7.342273,7.158705,4.490772,4.399204,4.327658,4.253043
2023-11-01,0.036736,7.25,7.5,7.75,7.75,3.845759,3.730995,3.697768,3.601255,11.202967,...,5.288844,5.368204,7.138916,7.283402,7.354495,7.342273,4.448655,4.490772,4.399204,4.327658
2023-12-01,-0.011845,7.0,7.25,7.5,7.75,3.760795,3.845759,3.730995,3.697768,11.200187,...,5.17894,5.288844,7.07603,7.138916,7.283402,7.354495,4.351357,4.448655,4.490772,4.399204


In [58]:
df.tail()

Fecha,CPI,Monetary Policy Rate,Exchange rate,Circulating Currency Seasonally Adjusted (mill S/),Net International Reserves (mill $),Real Minimum Wage (Index),Wheat (US$ per ton),Corn (US$ per ton),Soybean oil (US$ per ton),Crude oil (US$ per barrel)
2023-08-01,0.182324,7.75,3.697768,11.194037,11.182374,5.646268,5.714305,5.288844,7.354495,4.399204
2023-09-01,0.133933,7.5,3.730995,11.196305,11.173721,5.646104,5.677783,5.17894,7.283402,4.490772
2023-10-01,0.085365,7.25,3.845759,11.202967,11.172515,5.649334,5.548594,5.191875,7.138916,4.448655
2023-11-01,0.036736,7.0,3.760795,11.200187,11.180961,5.650966,5.567082,5.14624,7.07603,4.351357
2023-12-01,-0.011845,6.75,3.733942,11.186627,11.1709,5.646966,5.599153,5.145342,7.04632,4.276196


In [59]:
df_lags.columns

Index(['CPI', 'Monetary Policy Rate_lag_1', 'Monetary Policy Rate_lag_2',
       'Monetary Policy Rate_lag_3', 'Monetary Policy Rate_lag_4',
       'Exchange rate_lag_1', 'Exchange rate_lag_2', 'Exchange rate_lag_3',
       'Exchange rate_lag_4',
       'Circulating Currency Seasonally Adjusted (mill S/)_lag_1',
       'Circulating Currency Seasonally Adjusted (mill S/)_lag_2',
       'Circulating Currency Seasonally Adjusted (mill S/)_lag_3',
       'Circulating Currency Seasonally Adjusted (mill S/)_lag_4',
       'Net International Reserves (mill $)_lag_1',
       'Net International Reserves (mill $)_lag_2',
       'Net International Reserves (mill $)_lag_3',
       'Net International Reserves (mill $)_lag_4',
       'Real Minimum Wage (Index)_lag_1', 'Real Minimum Wage (Index)_lag_2',
       'Real Minimum Wage (Index)_lag_3', 'Real Minimum Wage (Index)_lag_4',
       'Wheat (US$ per ton)_lag_1', 'Wheat (US$ per ton)_lag_2',
       'Wheat (US$ per ton)_lag_3', 'Wheat (US$ per ton)_l