<a href="https://colab.research.google.com/github/JCaballerot/Consultoria_ASEI/blob/main/Model_deployment/forecasting_ASEI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


<h1 align=center><font size = 5>Implementation of real-estate sales forecasting
</font></h1>

---

## Introduction

This notebook runs the monthly scoring of the real-estate sales forecasting models for the districts of Jesús María and Miraflores, covering each of the 3 zones per district.


## Extracting the macro variables

**1. CPI, food and energy**

> ind_prec_cons_lima_met_alim_ener
>
> https://estadisticas.bcrp.gob.pe/estadisticas/series/mensuales/indice-de-precios-indice-dic-2021-100


**2. GDP expectations**

> exp_PBI
> 
> https://estadisticas.bcrp.gob.pe/estadisticas/series/mensuales/resultados/PN01728AM/html


**3. Core inflation price index**

> ind_prec_inf_suby_bienes
> 
> https://estadisticas.bcrp.gob.pe/estadisticas/series/mensuales/indice-de-precios-al-consumidor-clasificacion-sectorial



**4. Percentage change in domestic demand**

> var_porc_demanda_interna
> 
> https://estadisticas.bcrp.gob.pe/estadisticas/series/mensuales/producto-bruto-interno-y-demanda-interna-variaciones-porcentuales-anualizadas




**5. Electricity production**

> prod_ener_lima
> 
> https://estadisticas.bcrp.gob.pe/estadisticas/series/mensuales/produccion-de-electricidad-por-departamento





**6. Real-estate price index**

> ind_prec_inm
> 
> https://estadisticas.bcrp.gob.pe/estadisticas/series/mensuales/indice-de-precio-de-bienes-inmuebles


## Libraries

We load the libraries used in the scoring process.

In [None]:
import pandas as pd
pd.set_option('display.max_rows', None)

import pickle
import numpy as np

## Periods

We define the months to be scored in this run.

In [None]:
periods = ['Mar-22', 'Abr-22', 'May-22', 'Jun-22', 'Jul-22', 'Ago-22', 'Sept-22', 'Oct-22', 'Nov-22', 'Dic-22']

## Loading the input data

Loading the macro sources. Every macro variable is then lagged by 3 periods.

In [None]:

macro = pd.read_csv('macros_implementacion.csv', delimiter=';', index_col=['codmes'], parse_dates=['codmes'])
variables_macro = macro.columns.tolist()
macro[variables_macro] = macro[variables_macro].shift(3)
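
To make the effect of `shift(3)` concrete, here is a minimal sketch on a toy frame. The values and month labels are invented for illustration and are not real BCRP data. Each value moves three rows down, so a given row carries the value observed three rows earlier, and the first three rows become NaN.

```python
import pandas as pd

# Toy monthly data; values and labels are illustrative only.
toy = pd.DataFrame(
    {"exp_PBI": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=["2022-01", "2022-02", "2022-03", "2022-04", "2022-05"],
)

# shift(3) moves every value three rows down: the 2022-04 row now holds
# the value observed in 2022-01, and the first three rows are NaN.
lagged = toy.shift(3)
print(lagged)
```

The index itself is unchanged; only the values move, which is why the lagged frame can still be joined on the same month key.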



Loading the ASEI data. Every ASEI variable is then lagged by 2 periods.

In [None]:
asei = pd.read_csv('input_ASEI.csv', delimiter = ';', index_col=['codmes'], parse_dates=['codmes'])
variables_asei = asei.columns.tolist()
asei[variables_asei] = asei[variables_asei].shift(2)

In [None]:

pddf = pd.merge(asei, macro, on = 'codmes', how='left').copy()


## Loading the models

Loading the fitted model objects.

In [None]:

def load_model(path):
    # Open the pickle in a context manager so the file handle is always closed
    with open(path, 'rb') as f:
        return pickle.load(f)

arima_mz1 = load_model("./arima_mz1.ml")
arima_mz2 = load_model("./arima_mz2.ml")
arima_mz3 = load_model("./arima_mz3.ml")

arima_jmz1 = load_model("./arima_jmz1.ml")
arima_jmz2 = load_model("./arima_jmz2.ml")
arima_jmz3 = load_model("./arima_jmz3.ml")

var_jmz1 = load_model("./var_jmz1.ml")
var_jmz2 = load_model("./var_jmz2.ml")
var_jmz3 = load_model("./var_jmz3.ml")



## Feature generation

Feature engineering on the variables: 3-month and 6-month rolling averages, plus the ratio between the two.

In [None]:
variables = pddf.columns.tolist()
pddfRes = pddf.copy()

pddfRes[[x + '_avg3' for x in variables]] = pddfRes[variables].rolling(3, min_periods = 3).mean()
pddfRes[[x + '_avg6' for x in variables]] = pddfRes[variables].rolling(6, min_periods = 6).mean()

for x in variables:
  pddfRes[x + '_rat3to6'] = pddfRes[x + '_avg3']/pddfRes[x + '_avg6']
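
As a small illustration of the three derived features, the sketch below applies the same rolling means and ratio to a toy column. The column name `x` and its values are hypothetical. Because `min_periods` equals the window length, each average stays NaN until a full window is available.

```python
import pandas as pd

# Toy column; name and values are invented for illustration.
toy = pd.DataFrame({"x": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]})

# Rolling means over the last 3 and 6 rows (NaN until the window is full).
toy["x_avg3"] = toy["x"].rolling(3, min_periods=3).mean()
toy["x_avg6"] = toy["x"].rolling(6, min_periods=6).mean()

# Ratio of the short-window mean to the long-window mean.
toy["x_rat3to6"] = toy["x_avg3"] / toy["x_avg6"]
print(toy)
```

A ratio above 1 means the recent 3-row average exceeds the 6-row average, so the series has been rising over the window.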

## Variable preprocessing

At this point the macro variables are standardized: each feature is centered and scaled with the hard-coded constants shown in the cells below.

In [None]:

# Miraflores Zone 1 variables

mz1_feat_macro = ['mz1_precio_oferta_zon1_rat3to6_std', 'mz1_exp_PBI_avg3_std', 'mz1_flg_feia_avg3_std']
mz1_feat_arima = 'M_Venta_zona_1_avg3'

pddfRes['mz1_precio_oferta_zon1_rat3to6_std'] = (pddfRes.M_precio_oferta_zon1_rat3to6 - 1.005935)/0.008043
pddfRes['mz1_exp_PBI_avg3_std'] = (pddfRes.exp_PBI_avg3 - 0.045776)/0.016491
pddfRes['mz1_flg_feia_avg3_std'] = (pddfRes.flg_feia_avg3 - 0.340741)/0.194481


In [None]:
# Miraflores Zone 2 variables

mz2_feat_macro = ['mz2_ind_prec_cons_lima_met_alim_ener_avg3_std', 'mz2_exp_PBI_avg3_std', 'mz2_flg_feia_avg3_std']
mz2_feat_arima = 'M_Venta_zona_2_avg3'

pddfRes['mz2_ind_prec_cons_lima_met_alim_ener_avg3_std'] = (pddfRes.ind_prec_cons_lima_met_alim_ener_avg3 - 89.775296)/3.052684
pddfRes['mz2_exp_PBI_avg3_std'] = (pddfRes.exp_PBI_avg3 - 0.045776)/0.016491
pddfRes['mz2_flg_feia_avg3_std'] = (pddfRes.flg_feia_avg3 - 0.340741)/0.194481


In [None]:
# Miraflores Zone 3 variables

mz3_feat_macro = ['mz3_ind_prec_inf_suby_bienes_avg3_std', 'mz3_precio_venta_total_rat3to6_std']
mz3_feat_arima = 'M_Venta_zona_3_avg3'

pddfRes['mz3_ind_prec_inf_suby_bienes_avg3_std'] = (pddfRes.ind_prec_inf_suby_bienes_avg3 - 0.19257)/0.102632
pddfRes['mz3_precio_venta_total_rat3to6_std'] = (pddfRes.M_precio_venta_total_rat3to6 - 1.00568)/0.013322


In [None]:
# Jesus Maria Zone 1 variables

jmz1_feat_macro = ['jmz1_ind_prec_inm_rat3to6_std', 'jmz1_ind_prec_inf_suby_bienes_avg6_std']
jmz1_feat_arima = 'JM_Venta_zona_1_avg3'

pddfRes['jmz1_ind_prec_inm_rat3to6_std'] = (pddfRes.ind_prec_inm_rat3to6 - 1.000698)/0.005181
pddfRes['jmz1_ind_prec_inf_suby_bienes_avg6_std'] = (pddfRes.ind_prec_inf_suby_bienes_avg6 - 0.185542)/0.083822


In [None]:
# Jesus Maria Zone 2 variables

jmz2_feat_macro = ['jmz2_var_porc_demanda_interna_avg3_std', 'jmz2_flg_feia_avg3_std', 'jmz2_exp_PBI_avg3_std']
jmz2_feat_arima = 'JM_Venta_zona_2_avg3'

pddfRes['jmz2_var_porc_demanda_interna_avg3_std'] = (pddfRes.var_porc_demanda_interna_avg3 - 3.355473)/13.253201
pddfRes['jmz2_flg_feia_avg3_std'] = (pddfRes.flg_feia_avg3 - 0.340580)/0.192311
pddfRes['jmz2_exp_PBI_avg3_std'] = (pddfRes.exp_PBI_avg3 - 0.045371)/0.016536


In [None]:
# Jesus Maria Zone 3 variables

jmz3_feat_macro = ['jmz3_ind_prec_inm_avg3_std', 'jmz3_ind_prec_inf_suby_bienes_rat3to6_std', 'jmz3_prod_ener_lima_avg6_std']
jmz3_feat_arima = 'JM_Venta_zona_3_avg3'

pddfRes['jmz3_ind_prec_inm_avg3_std'] = (pddfRes.ind_prec_inm_avg3 - 107.502391)/1.673276
pddfRes['jmz3_ind_prec_inf_suby_bienes_rat3to6_std'] = (pddfRes.ind_prec_inf_suby_bienes_rat3to6 - 1.066234)/0.302180
pddfRes['jmz3_prod_ener_lima_avg6_std'] = (pddfRes.prod_ener_lima_avg6 - 2102.496290)/288.166124


In [None]:
# Copy the selection so that later column assignments do not act on a view of pddfRes
pddf_pred = pddfRes.loc[periods].copy()
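
Selecting with `.loc` and a list of labels raises a `KeyError` if any label is absent from the index. A minimal sketch of a pre-check, using an invented toy frame and the hypothetical names `wanted`, `missing` and `present`:

```python
import pandas as pd

# Toy frame indexed by month label; 'Abr-22' is deliberately absent.
toy = pd.DataFrame(
    {"sales": [5, 7]},
    index=pd.Index(["Mar-22", "May-22"], name="codmes"),
)
wanted = ["Mar-22", "Abr-22", "May-22"]

# Report labels that .loc would fail on before selecting.
missing = [p for p in wanted if p not in toy.index]
print(missing)

# Select only the labels that exist, as an independent copy.
present = toy.loc[[p for p in wanted if p in toy.index]].copy()
print(present)
```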

## Scoring the ARIMA models

Scoring of the ARIMA models, which are based on past sales.

In [None]:

# forecast() returns (point forecast, standard errors, confidence intervals);
# only the point forecast is used later. Missing exog values are filled with
# the last observed value (.iloc[-1] selects it by position).

# Miraflores Zone 1
fc_mz1, se, conf = arima_mz1.forecast(len(periods),
                                   exog = pddf_pred[mz1_feat_arima].fillna(pddf_pred[mz1_feat_arima].dropna().iloc[-1]),
                                   alpha=0.05)

# Miraflores Zone 2
fc_mz2, se, conf = arima_mz2.forecast(len(periods),
                                   exog = pddf_pred[mz2_feat_arima].fillna(pddf_pred[mz2_feat_arima].dropna().iloc[-1]),
                                   alpha=0.05)

# Miraflores Zone 3
fc_mz3, se, conf = arima_mz3.forecast(len(periods),
                                   exog = pddf_pred[mz3_feat_arima].fillna(pddf_pred[mz3_feat_arima].dropna().iloc[-1]),
                                   alpha=0.05)

# Jesus Maria Zone 1
fc_jmz1, se, conf = arima_jmz1.forecast(len(periods),
                                   exog = pddf_pred[jmz1_feat_arima].fillna(pddf_pred[jmz1_feat_arima].dropna().iloc[-1]),
                                   alpha=0.05)

# Jesus Maria Zone 2
fc_jmz2, se, conf = arima_jmz2.forecast(len(periods),
                                   exog = pddf_pred[jmz2_feat_arima].fillna(pddf_pred[jmz2_feat_arima].dropna().iloc[-1]),
                                   alpha=0.05)

# Jesus Maria Zone 3
fc_jmz3, se, conf = arima_jmz3.forecast(len(periods),
                                   exog = pddf_pred[jmz3_feat_arima].fillna(pddf_pred[jmz3_feat_arima].dropna().iloc[-1]),
                                   alpha=0.05)
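
Each `exog` argument above replaces missing values with the last observed value of the series within the forecast window. A minimal sketch of that fill on a toy series with invented values; it uses `.iloc[-1]` for explicit positional access:

```python
import numpy as np
import pandas as pd

# Toy series: the last two months have no observation yet.
s = pd.Series(
    [12.0, 15.0, np.nan, np.nan],
    index=["Mar-22", "Abr-22", "May-22", "Jun-22"],
)

# Take the final non-missing value and use it for every gap.
last_observed = s.dropna().iloc[-1]
filled = s.fillna(last_observed)
print(filled)
```

Unlike a forward fill, this also fills any gap in the middle of the series with the final observed value rather than with the value just before the gap.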





In [None]:

pddf_pred['mz1_venta_predicted'] = fc_mz1
pddf_pred['mz2_venta_predicted'] = fc_mz2
pddf_pred['mz3_venta_predicted'] = fc_mz3

pddf_pred['jmz1_venta_predicted'] = fc_jmz1
pddf_pred['jmz2_venta_predicted'] = fc_jmz2
pddf_pred['jmz3_venta_predicted'] = fc_jmz3


## Scoring the macro models

Computing the macroeconomic models: each one is an intercept plus a weighted sum of the standardized macro features.

In [None]:
for x in list(set(mz1_feat_macro + mz2_feat_macro + mz3_feat_macro + jmz1_feat_macro + jmz2_feat_macro + jmz3_feat_macro)):
  # Fill gaps with the last observed value; .iloc[-1] selects it by position
  pddf_pred[x] = pddf_pred[x].fillna(pddf_pred[x].dropna().iloc[-1])


In [None]:
# Miraflores Zone 1
pddf_pred['mz1_macro_pred'] = 12.7242 -\
                              2.2126*pddf_pred.mz1_precio_oferta_zon1_rat3to6_std +\
                              1.2573*pddf_pred.mz1_exp_PBI_avg3_std +\
                              1.8231*pddf_pred.mz1_flg_feia_avg3_std



In [None]:
# Miraflores Zone 2
pddf_pred['mz2_macro_pred'] = 42.7099 +\
                              5.96310*pddf_pred.mz2_exp_PBI_avg3_std +\
                              15.6759*pddf_pred.mz2_ind_prec_cons_lima_met_alim_ener_avg3_std +\
                              3.87770*pddf_pred.mz2_flg_feia_avg3_std



In [None]:
# Miraflores Zone 3
pddf_pred['mz3_macro_pred'] = 15.5556 +\
                              4.8866*pddf_pred.mz3_ind_prec_inf_suby_bienes_avg3_std -\
                              1.3525*pddf_pred.mz3_precio_venta_total_rat3to6_std


In [None]:
# Jesus Maria Zone 1
pddf_pred['jmz1_macro_pred'] = 57.6329 +\
                               7.6093*pddf_pred.jmz1_ind_prec_inm_rat3to6_std +\
                               5.9194*pddf_pred.jmz1_ind_prec_inf_suby_bienes_avg6_std



In [None]:
# Jesus Maria Zone 2
pddf_pred['jmz2_macro_pred'] = 44.0168 +\
                               9.1532*pddf_pred.jmz2_var_porc_demanda_interna_avg3_std +\
                               5.0141*pddf_pred.jmz2_flg_feia_avg3_std +\
                               7.1417*pddf_pred.jmz2_exp_PBI_avg3_std


In [None]:
# Jesus Maria Zone 3
pddf_pred['jmz3_macro_pred'] = 26.5652 +\
                               3.7640*pddf_pred.jmz3_ind_prec_inm_avg3_std +\
                               2.7606*pddf_pred.jmz3_ind_prec_inf_suby_bienes_rat3to6_std -\
                               2.5381*pddf_pred.jmz3_prod_ener_lima_avg6_std
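
Each macro model above is a linear score: an intercept plus a weighted sum of standardized features. The sketch below writes the Miraflores Zone 3 formula as a coefficient table. The helper name `linear_score` and the toy feature values are hypothetical; the intercept and coefficients are copied from the Miraflores Zone 3 cell.

```python
import pandas as pd

def linear_score(df, intercept, coefs):
    """Intercept plus the weighted sum of the columns named in coefs."""
    score = pd.Series(intercept, index=df.index, dtype=float)
    for col, weight in coefs.items():
        score = score + weight * df[col]
    return score

# Toy standardized features: row 0 is at the center, row 1 is one unit above it.
toy = pd.DataFrame({
    "mz3_ind_prec_inf_suby_bienes_avg3_std": [0.0, 1.0],
    "mz3_precio_venta_total_rat3to6_std": [0.0, 1.0],
})

# Coefficients taken from the Miraflores Zone 3 cell.
mz3_coefs = {
    "mz3_ind_prec_inf_suby_bienes_avg3_std": 4.8866,
    "mz3_precio_venta_total_rat3to6_std": -1.3525,
}

pred = linear_score(toy, 15.5556, mz3_coefs)
print(pred)
```

With every standardized feature at zero the score equals the intercept, which gives a quick sanity check on any of the six formulas.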


## Scoring the VAR/ensemble models

Combining the models: an ensemble method for Miraflores and vector autoregressions for Jesús María.

In [None]:


pddf_pred['mz1_venta_predicted_final'] = (pddf_pred['mz1_venta_predicted'] + pddf_pred['mz1_macro_pred'])/2
pddf_pred['mz2_venta_predicted_final'] = (pddf_pred['mz2_venta_predicted'] + pddf_pred['mz2_macro_pred'])/2
pddf_pred['mz3_venta_predicted_final'] = (pddf_pred['mz3_venta_predicted'] + pddf_pred['mz3_macro_pred'])/2

# VAR Jesus Maria Zone 1
jmz1_ensemble = ['jmz1_venta_predicted', 'jmz1_macro_pred']
fc_jmz1, se, conf  = var_jmz1.forecast(len(periods), exog = pddf_pred[jmz1_ensemble], alpha=0.05)
pddf_pred['jmz1_venta_predicted_final'] = fc_jmz1

# VAR Jesus Maria Zone 2
jmz2_ensemble = ['jmz2_venta_predicted', 'jmz2_macro_pred']
fc_jmz2, se, conf  = var_jmz2.forecast(len(periods), exog = pddf_pred[jmz2_ensemble], alpha=0.05)
pddf_pred['jmz2_venta_predicted_final'] = fc_jmz2

# VAR Jesus Maria Zone 3
jmz3_ensemble = ['jmz3_venta_predicted', 'jmz3_macro_pred']
fc_jmz3, se, conf  = var_jmz3.forecast(len(periods), exog = pddf_pred[jmz3_ensemble], alpha=0.05)
pddf_pred['jmz3_venta_predicted_final'] = fc_jmz3
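
For the Miraflores zones, the final forecast is the plain average of the ARIMA and macro predictions. A minimal sketch with invented numbers and generic column names; `skipna=False` keeps the behaviour of `(a + b) / 2`, where a missing input gives a missing output:

```python
import numpy as np
import pandas as pd

# Toy predictions; the last ARIMA value is missing on purpose.
toy = pd.DataFrame({
    "venta_predicted": [10.0, 14.0, np.nan],
    "macro_pred": [12.0, 18.0, 20.0],
})

# Equal-weight ensemble across the two prediction columns.
toy["venta_predicted_final"] = toy[["venta_predicted", "macro_pred"]].mean(
    axis=1, skipna=False
)
print(toy)
```

Without `skipna=False`, the row with a missing ARIMA value would silently fall back to the macro prediction alone.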


## Exporting the results

Export the final forecasts to a CSV file.

In [None]:

pddf_pred[['mz1_venta_predicted_final', 'mz2_venta_predicted_final', 'mz3_venta_predicted_final',
           'jmz1_venta_predicted_final', 'jmz2_venta_predicted_final', 'jmz3_venta_predicted_final']].to_csv('forecasting_results.csv')
