# **Machine Learning Analysis**
## **Workshop 1**
#### **Andrea Bayona - Juan Pablo Cano**


Russia's real estate sector is currently booming. It offers many exciting opportunities and high returns in terms of both lifestyle and investment. The market has been growing for several years, which means that properties can still be found at very attractive prices, although those prices are very likely to rise in the future. To help understand the market, a Russian real estate agency has provided data on the sale of more than 45 thousand properties between 2018 and 2021. The agency wants to understand which characteristics most influence sale prices, so that it can propose construction plans for the available urban areas that take these characteristics into account.

In [3]:
!shred -u setup_colab_general.py
!wget -q "https://github.com/jpcano1/python_utils/raw/main/setup_colab_general.py" -O setup_colab_general.py
import setup_colab_general as setup_general

setup_general.setup_general()

shred: setup_colab_general.py: failed to open for writing: No such file or directory


100%|██████████| 3/3 [00:00<00:00, 1461.77KB/s]

General Functions Enabled Successfully





In [None]:
!pip install --disable-pip-version-check --progress-bar off -q https://github.com/pandas-profiling/pandas-profiling/archive/master.zip
!pip install --disable-pip-version-check --progress-bar off -q tabulate

## **Importing the required libraries**

In [1]:
import os

import matplotlib.pyplot as plt

plt.style.use("seaborn-deep")

import mlflow

# Additional libraries
import itertools
from typing import Optional

import numpy as np
import pandas as pd
#import pandas_profiling
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from tabulate import tabulate

from utils import general as gen

In [5]:
data_url = (
    "https://raw.githubusercontent.com/"
    "Camilojaravila/202210_MINE-4206_ANAL"
    "ISIS_CON_MACHINE_LEARNING/main/Taller%"
    "201/russian_prices.csv"
)
gen.download_content(data_url, filename="russian_prices.csv")

100%|██████████| 1971/1971 [00:00<00:00, 17044.71KB/s]


## **Data loading and profiling**

### **Data dictionary**
The real estate agency has put together the following data dictionary:

* date - Date the listing was published.
* time - Time the listing was active.
* geo_lat - Latitude.
* geo_lon - Longitude.
* region - Region of Russia. There are 85 regions in total.
* building_type - Facade type. 0 - Other. 1 - Panel. 2 - Monolithic. 3 - Brick. 4 - Blocky. 5 - Wooden.
* object_type - Apartment type. 1 - Secondary real estate market; 2 - New building.
* level - Floor of the apartment.
* levels - Number of floors.
* rooms - Number of rooms. A value of "-1" means the listing is a "studio apartment".
* area - Total area of the apartment (m2).
* kitchen_area - Kitchen area (m2).
* price - Price, in rubles.

Next, the data is read and the first rows are inspected to verify that the load succeeded.

In [2]:
russian_prices_df = pd.read_csv("data/russian_prices.csv")

In [3]:
russian_prices_df.head()

|   | Unnamed: 0 | price | date | time | geo_lat | geo_lon | region | building_type | level | levels | rooms | area | kitchen_area | object_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 14040 | 3900000 | 2018-09-10 | 12:40:14 | 55.78648 | 49.223459 | 2922.0 | 1.0 | 10.0 | 11.0 | 3.0 | 67.0 | 8.8 | 1.0 |
| 1 | 24608 | 4250000 | 2018-09-11 | 17:26:15 | 55.905045 | 37.393578 | 81.0 | 1.0 | 25.0 | 25.0 | 1.0 | 39.0 | 10.5 | 1.0 |
| 2 | 76636 | 4340360 | 2018-09-18 | 02:35:04 | 59.882717 | 30.451298 | 2661.0 | 0.0 | 4.0 | 27.0 | 1.0 | 57.11 | 11.38 | 1.0 |
| 3 | 31944 | 8000000 | 2018-09-12 | 21:40:17 | 55.640462 | 37.359415 | 3.0 | 1.0 | 1.0 | 17.0 | 3.0 | 74.5 | 10.0 | 1.0 |
| 4 | 82427 | 2750000 | 2018-09-18 | 06:18:38 | 55.042053 | 82.940926 | 9654.0 | 1.0 | 1.0 | 5.0 | 2.0 | 44.6 | 6.0 | 1.0 |


In [4]:
russian_prices_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 67762 entries, 0 to 67761
Data columns (total 14 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Unnamed: 0     67762 non-null  int64  
 1   price          67762 non-null  int64  
 2   date           67762 non-null  object 
 3   time           67762 non-null  object 
 4   geo_lat        67762 non-null  float64
 5   geo_lon        67762 non-null  float64
 6   region         67761 non-null  float64
 7   building_type  67761 non-null  float64
 8   level          67761 non-null  float64
 9   levels         67761 non-null  float64
 10  rooms          67761 non-null  float64
 11  area           67761 non-null  float64
 12  kitchen_area   67761 non-null  float64
 13  object_type    67761 non-null  float64
dtypes: float64(10), int64(2), object(2)
memory usage: 7.2+ MB


In [None]:
import pandas_profiling  # needed here; this import is commented out in the imports cell

profiler = pandas_profiling.ProfileReport(russian_prices_df, dark_mode=True)

- The profiling report is included in the appendices.

In [None]:
# Create the output folder if it does not exist yet
os.makedirs("profiling_reports", exist_ok=True)
profiler.to_file("profiling_reports/russian_prices_profile.html")

- The following columns were dropped on the assumption that they are not needed for the business objective. The first column is a property identifier and therefore carries no useful information. The `time` and `date` columns describe the listing rather than the property per se, so they are not relevant to our model.

In [5]:
columns_to_delete = [
    "Unnamed: 0",
    "time",
    "date",
]

In [6]:
russian_prices_df.drop(columns=columns_to_delete, inplace=True)

- All rows containing null values were removed.

In [7]:
russian_prices_df.dropna(inplace=True)
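
As a quick check of what `dropna` does by default, the sketch below uses a small made-up frame (the values are illustrative, not taken from the dataset). With the default `axis=0`, rows containing a null are discarded and every column is kept, which is consistent with the row count going from 67762 to 67761 in the `info()` outputs.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values): the second row has a missing entry.
toy = pd.DataFrame(
    {
        "price": [3_900_000, 4_250_000, 8_000_000],
        "rooms": [3.0, np.nan, 1.0],
    }
)

cleaned = toy.dropna()  # default axis=0 drops rows, not columns

print(cleaned.shape)          # fewer rows, same number of columns
print(list(cleaned.columns))
```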

In [8]:
russian_prices_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 67761 entries, 0 to 67761
Data columns (total 11 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   price          67761 non-null  int64  
 1   geo_lat        67761 non-null  float64
 2   geo_lon        67761 non-null  float64
 3   region         67761 non-null  float64
 4   building_type  67761 non-null  float64
 5   level          67761 non-null  float64
 6   levels         67761 non-null  float64
 7   rooms          67761 non-null  float64
 8   area           67761 non-null  float64
 9   kitchen_area   67761 non-null  float64
 10  object_type    67761 non-null  float64
dtypes: float64(10), int64(1)
memory usage: 6.2 MB


In [9]:
russian_prices_df = russian_prices_df.apply(lambda x: x.astype("int32"))
russian_prices_df["object_type"] = russian_prices_df["object_type"].apply(
    lambda x: 2 if x == 11 else x
)
russian_prices_df["rooms"] = russian_prices_df["rooms"].apply(lambda x: -1 if x == -2 else x)
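
One side effect of the blanket cast above is worth keeping in mind: converting floats with `astype("int32")` truncates the fractional part, so coordinates and areas lose their decimals. A minimal sketch on a few values copied from the `head()` output (the frame `sample` exists only for this illustration):

```python
import pandas as pd

# Values taken from the first rows shown earlier; `sample` is a hypothetical helper frame.
sample = pd.DataFrame(
    {
        "geo_lat": [55.78648, 59.882717],
        "area": [67.0, 57.11],
        "kitchen_area": [8.8, 11.38],
    }
)

# Same column-wise cast as in the notebook cell.
as_int = sample.apply(lambda col: col.astype("int32"))
print(as_int)
```

If that precision turns out to matter for the model, casting only the categorical and count columns would be one alternative to consider.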

In [10]:
rows_to_drop = russian_prices_df.query(
    "kitchen_area + 5 >= area | area <= 10 | price <= 2000"
).index
russian_prices_df = russian_prices_df.drop(rows_to_drop).reset_index(drop=True)
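
The filter above drops listings whose kitchen takes up almost the whole apartment, whose total area is 10 m2 or less, or whose price is 2000 rubles or less. The following sketch applies the same `query` expression to a toy frame with made-up rows, one per rule, to show which rows survive:

```python
import pandas as pd

# Hypothetical listings: row 0 is valid, rows 1-3 each violate one rule.
toy = pd.DataFrame(
    {
        "area": [50, 12, 9, 40],
        "kitchen_area": [10, 8, 2, 8],
        "price": [3_000_000, 1_000_000, 1_000_000, 1_500],
    }
)

rows_to_drop = toy.query("kitchen_area + 5 >= area | area <= 10 | price <= 2000").index
kept = toy.drop(rows_to_drop).reset_index(drop=True)

print(list(rows_to_drop))
print(kept)
```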

In [11]:
X, y = russian_prices_df.drop("price", axis=1), russian_prices_df["price"]

In [12]:
full_X_train, X_test, full_y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1234
)
X_train, X_val, y_train, y_val = train_test_split(
    full_X_train, full_y_train, test_size=0.2, random_state=1234
)

In [13]:
X_train.shape, y_train.shape

((43323, 10), (43323,))

In [14]:
X_val.shape, y_val.shape

((10831, 10), (10831,))

In [15]:
X_test.shape, y_test.shape

((13539, 10), (13539,))
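
Two successive 80/20 splits leave roughly 64% of the rows for training, 16% for validation and 20% for testing, which matches the shapes above. A small sketch on synthetic data (the 1000-row arrays are an assumption, chosen only to make the proportions easy to read):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for X and y.
X_demo = np.arange(1000).reshape(-1, 1)
y_demo = np.arange(1000)

# First split: hold out 20% as the test set.
full_X_tr, X_te, full_y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=1234
)
# Second split: hold out 20% of the remainder as the validation set.
X_tr, X_va, y_tr, y_va = train_test_split(
    full_X_tr, full_y_tr, test_size=0.2, random_state=1234
)

print(len(X_tr), len(X_va), len(X_te))
```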

## **Modeling**

### **Polynomial Regression**
#### **Training (without standardization)**

A class is defined to apply the polynomial transformation to our variables.

In [18]:
class ToPolynomial(BaseEstimator, TransformerMixin):
    def __init__(self, k: int = 2) -> None:
        self.k = k

    def fit(self, X, y=None):
        return self

    def transform(
        self, 
        X: pd.DataFrame, 
        y: Optional[pd.Series] = None,
    ) -> pd.DataFrame:
        columns = X.columns
        X_train_pol = pd.concat(
            [X ** (i + 1) for i in range(self.k)], axis=1
        )  # Powers of each column, without interactions
        X_train_pol.columns = np.reshape(
            [[i + " " + str(j + 1) for i in columns] for j in range(self.k)], -1
        )
        temp = pd.concat(
            [X[i[0]] * X[i[1]] for i in list(itertools.combinations(columns, 2))], axis=1
        )  # Pairwise products of first-degree terms only
        temp.columns = [" ".join(i) for i in list(itertools.combinations(columns, 2))]
        X_train_pol = pd.concat([X_train_pol, temp], axis=1)
        return X_train_pol
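
To see what the transformer produces, the sketch below restates the same logic as a standalone function (`expand` is a hypothetical name used only here) and runs it on a tiny made-up frame. For `n` input columns and degree `k`, the output has `k * n` power columns plus one column per pair of inputs.

```python
import itertools

import pandas as pd


def expand(X: pd.DataFrame, k: int = 2) -> pd.DataFrame:
    """Standalone restatement of the transform logic, for illustration only."""
    # Powers 1..k of every column, named "<column> <power>".
    powers = pd.concat([X ** (i + 1) for i in range(k)], axis=1)
    powers.columns = [f"{c} {j + 1}" for j in range(k) for c in X.columns]
    # Products of every pair of distinct columns, named "<a> <b>".
    pairs = list(itertools.combinations(X.columns, 2))
    products = pd.concat([X[a] * X[b] for a, b in pairs], axis=1)
    products.columns = [" ".join(p) for p in pairs]
    return pd.concat([powers, products], axis=1)


toy = pd.DataFrame({"area": [2, 3], "rooms": [1, 2], "level": [4, 5]})
expanded = expand(toy, k=2)
print(list(expanded.columns))
print(expanded)
```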

A pipeline is created to encapsulate the training steps of our model. First the polynomial transformation is applied to the variables, and the transformed variables are then used to train the linear regression model.

In [None]:
estimators = [("polinomial", ToPolynomial()), ("regresion", LinearRegression())]

pipe_pol = Pipeline(estimators)

pipe_pol.fit(X_train, y_train)

Parameters learned by the polynomial regression

In [None]:
reg_lineal = pipe_pol["regresion"]

print("Intercept: ", reg_lineal.intercept_)
print("Coefficients: ", reg_lineal.coef_)

#### **Validation (without standardization)**

In [None]:
y_pred = pipe_pol.predict(X_val)
y_pred

In [None]:
r2_poly = r2_score(y_val, y_pred)
mse_poly = mean_squared_error(y_val, y_pred)
mae_poly = mean_absolute_error(y_val, y_pred)

print("------------ Polynomial Regression ------------")
print(f"R2-score: {r2_poly:.7f}")
print(f"Mean squared error (MSE): {mse_poly:.5f}")
print(f"Mean absolute error: {mae_poly:.5f}")
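
For reference, the three metrics reported above can be written out directly. The sketch below computes them with NumPy on four made-up values (not taken from the dataset); note that the MSE is the residual sum of squares divided by the number of samples, not the sum itself.

```python
import numpy as np

# Hypothetical targets and predictions.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.0, 5.0, 8.0, 10.0])

residuals = y_true - y_hat
ss_res = np.sum(residuals**2)                   # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares

mse = ss_res / len(y_true)        # mean squared error
mae = np.mean(np.abs(residuals))  # mean absolute error
r2 = 1 - ss_res / ss_tot          # coefficient of determination

print(mse, mae, r2)
```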

#### **Actual vs. predicted values**

In [None]:
%matplotlib inline


def draw_chart(y_val_p, y_pred_p, title, legend):
    fig, axs = plt.subplots(1, figsize=(20, 10))

    xvals = list(range(len(y_val_p[:50])))
    axs.plot(xvals, y_pred_p[:50], "bo-", label=legend)
    axs.plot(xvals, y_val_p[:50], "ro-", label="Real")

    axs.set(title=title, ylabel=y_train.name)
    axs.legend()

    plt.tight_layout()
    plt.show()

In [None]:
draw_chart(y_val, y_pred, "Polynomial regression prediction", "Polynomial regression")

#### **Training (with standardization)**

In [19]:
estimators_2 = [
    ("polinomial", ToPolynomial()),
    ("normalizar", StandardScaler()),
    ("regresion", LinearRegression()),
]

pipe_pol_s = Pipeline(estimators_2)

pipe_pol_s.fit(X_train, y_train)

Pipeline(steps=[('polinomial', ToPolynomial()),
                ('normalizar', StandardScaler()),
                ('regresion', LinearRegression())])

Parameters learned by the polynomial regression

In [None]:
reg_lineal_s = pipe_pol_s["regresion"]

print("Intercept: ", reg_lineal_s.intercept_)
print("Coefficients: ", reg_lineal_s.coef_)

#### **Validation (with standardization)**

In [None]:
y_pred_1b = pipe_pol_s.predict(X_val)
y_pred_1b

In [None]:
r2_poly_s = r2_score(y_val, y_pred_1b)
mse_poly_s = mean_squared_error(y_val, y_pred_1b)
mae_poly_s = mean_absolute_error(y_val, y_pred_1b)

print("------------ Polynomial Regression ------------")
print(f"R2-score: {r2_poly_s:.7f}")
print(f"Mean squared error (MSE): {mse_poly_s:.5f}")
print(f"Mean absolute error: {mae_poly_s:.5f}")

#### **Actual vs. predicted values**

In [None]:
%matplotlib inline
draw_chart(
    y_val,
    y_pred_1b,
    "Polynomial regression prediction (with standardization)",
    "Polynomial regression",
)

### **Ridge Regression**
#### **Training (without standardization)**



In [None]:
ridge_reg = Ridge()
ridge_reg.fit(X_train, y_train)

In [None]:
ridge_coef = dict(zip(X_train.columns, ridge_reg.coef_))
ridge_coef

#### **Validation**

In [None]:
y_pred_2 = ridge_reg.predict(X_val)

In [None]:
r2_ridge = r2_score(y_val, y_pred_2)
mse_ridge = mean_squared_error(y_val, y_pred_2)
mae_ridge = mean_absolute_error(y_val, y_pred_2)

print("------------ Ridge ------------")
print(f"R2-score: {r2_ridge:.7f}")
print(f"Mean squared error (MSE): {mse_ridge:.5f}")
print(f"Mean absolute error: {mae_ridge:.5f}")

#### **Actual vs. predicted values**

In [None]:
%matplotlib inline
draw_chart(y_val, y_pred_2, "Ridge regression prediction", "Ridge regression")

#### **Training (with standardization)**

In [None]:
pipeline_ridge = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("regressor", Ridge()),
    ],
)

pipeline_ridge.fit(X_train, y_train)

In [None]:
ridge_coef = dict(zip(X_train.columns, pipeline_ridge.steps[1][1].coef_))
ridge_coef
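
The coefficients with and without standardization are not directly comparable, because the Ridge penalty acts on coefficients whose size depends on each feature's units. The sketch below illustrates this on synthetic data (all names and values are assumptions for the example): two features carry equally strong signals, but one is measured on a scale 1000 times larger.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
x_small = rng.normal(size=n)           # unit scale
x_large = rng.normal(size=n) * 1000.0  # much larger units
X_demo = np.column_stack([x_small, x_large])
# Both features contribute a signal of similar strength to the target.
y_demo = 3.0 * x_small + 0.003 * x_large + rng.normal(scale=0.1, size=n)

raw = Ridge().fit(X_demo, y_demo)
scaled = Pipeline(
    [("scaler", StandardScaler()), ("regressor", Ridge())]
).fit(X_demo, y_demo)

print("raw coefficients:   ", raw.coef_)
print("scaled coefficients:", scaled["regressor"].coef_)
```

On the raw data the second coefficient looks negligible only because of its units; after scaling, both coefficients end up at a similar magnitude and are easier to compare.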

#### **Validation**

In [None]:
y_pred_2b = pipeline_ridge.predict(X_val)

In [None]:
r2_ridge_s = r2_score(y_val, y_pred_2b)
mse_ridge_s = mean_squared_error(y_val, y_pred_2b)
mae_ridge_s = mean_absolute_error(y_val, y_pred_2b)

print("------------ Ridge (with standardization) ------------")
print(f"R2-score: {r2_ridge_s:.7f}")
print(f"Mean squared error (MSE): {mse_ridge_s:.5f}")
print(f"Mean absolute error: {mae_ridge_s:.5f}")

#### **Actual vs. predicted values**

In [None]:
%matplotlib inline
draw_chart(
    y_val, y_pred_2b, "Ridge regression prediction (with standardization)", "Ridge regression"
)

### **Lasso Regression**
#### **Training (without standardization)**

In [None]:
lasso_reg = Lasso()
lasso_reg.fit(X_train, y_train)

In [None]:
lasso_coef = dict(zip(X_train.columns, lasso_reg.coef_))
lasso_coef
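
When reading the Lasso coefficients, it helps to remember that the L1 penalty can set some of them to exactly zero, which acts as a form of feature selection. A minimal sketch on synthetic data (the data and the `alpha=0.5` value are assumptions for the example; the notebook uses the default `alpha`):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
# Only the first two features influence the target.
y_demo = 4.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=200)

lasso_demo = Lasso(alpha=0.5).fit(X_demo, y_demo)

# The irrelevant features get a coefficient of zero,
# while the relevant ones are shrunk toward zero.
print(lasso_demo.coef_)
```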

#### **Validation**

In [None]:
y_pred_3 = lasso_reg.predict(X_val)

In [None]:
r2_lasso = r2_score(y_val, y_pred_3)
mse_lasso = mean_squared_error(y_val, y_pred_3)
mae_lasso = mean_absolute_error(y_val, y_pred_3)

print("------------ Lasso ------------")
print(f"R2-score: {r2_lasso:.7f}")
print(f"Mean squared error (MSE): {mse_lasso:.5f}")
print(f"Mean absolute error: {mae_lasso:.5f}")

#### **Actual vs. predicted values**

In [None]:
%matplotlib inline
draw_chart(y_val, y_pred_3, "Lasso regression prediction", "Lasso regression")

#### **Training (with standardization)**

In [None]:
pipeline = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("regressor", Lasso()),
    ],
)

pipeline.fit(X_train, y_train)

In [None]:
lasso_coef = dict(zip(X_train.columns, pipeline.steps[1][1].coef_))
lasso_coef

#### **Validation**

In [None]:
y_pred_3b = pipeline.predict(X_val)

In [None]:
r2_lasso_s = r2_score(y_val, y_pred_3b)
mse_lasso_s = mean_squared_error(y_val, y_pred_3b)
mae_lasso_s = mean_absolute_error(y_val, y_pred_3b)

print("------------ Lasso (with standardization) ------------")
print(f"R2-score: {r2_lasso_s:.7f}")
print(f"Mean squared error (MSE): {mse_lasso_s:.5f}")
print(f"Mean absolute error: {mae_lasso_s:.5f}")

#### **Actual vs. predicted values**

In [None]:
%matplotlib inline
draw_chart(
    y_val, y_pred_3b, "Lasso regression prediction (with standardization)", "Lasso regression"
)

### **Selecting the best model**
Comparison table with the R2, MSE and MAE metrics for the three models, each trained with and without standardization (S).

In [None]:
info = {
    "Model": [
        "Poly Regression",
        "Poly Regression (with S)",
        "Ridge",
        "Ridge (with S)",
        "Lasso",
        "Lasso (with S)",
    ],
    "R2": [r2_poly, r2_poly_s, r2_ridge, r2_ridge_s, r2_lasso, r2_lasso_s],
    # Fifth entry fixed: it previously repeated mse_ridge instead of mse_lasso
    "MSE": [mse_poly, mse_poly_s, mse_ridge, mse_ridge_s, mse_lasso, mse_lasso_s],
    "MAE": [mae_poly, mae_poly_s, mae_ridge, mae_ridge_s, mae_lasso, mae_lasso_s],
}

print(tabulate(info, headers="keys", tablefmt="fancy_grid"))

### **Hyperparameter optimization for the best model**

In [20]:
parameters = {"polinomial__k": [2, 3, 4, 5], "normalizar": [StandardScaler(), "passthrough"]}

grid_search = GridSearchCV(
    pipe_pol_s, parameters, verbose=2, scoring="neg_mean_squared_error", cv=5, n_jobs=-1
)

In [21]:
grid_search.fit(X_train, y_train)

Fitting 5 folds for each of 8 candidates, totalling 40 fits


GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('polinomial', ToPolynomial()),
                                       ('normalizar', StandardScaler()),
                                       ('regresion', LinearRegression())]),
             n_jobs=-1,
             param_grid={'normalizar': [StandardScaler(), 'passthrough'],
                         'polinomial__k': [2, 3, 4, 5]},
             scoring='neg_mean_squared_error', verbose=2)

In [22]:
best_model = grid_search.best_estimator_

pd.DataFrame(grid_search.cv_results_)

|   | mean_fit_time | std_fit_time | mean_score_time | std_score_time | param_normalizar | param_polinomial__k | params | split0_test_score | split1_test_score | split2_test_score | split3_test_score | split4_test_score | mean_test_score | std_test_score | rank_test_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2.230566 | 0.728896 | 0.043791 | 0.025204 | StandardScaler() | 2 | {'normalizar': StandardScaler(), 'polinomial__... | -18822780000000.0 | -599030600000000.0 | -138196700000000.0 | -259860400000000.0 | -83059970000000.0 | -219794100000000.0 | 205508900000000.0 | 2 |
| 1 | 2.4583 | 0.355636 | 0.0385 | 0.018551 | StandardScaler() | 3 | {'normalizar': StandardScaler(), 'polinomial__... | -18526450000000.0 | -596717800000000.0 | -138097600000000.0 | -257995300000000.0 | -98271600000000.0 | -221921800000000.0 | 202711100000000.0 | 5 |
| 2 | 4.64809 | 1.04226 | 0.062044 | 0.026727 | StandardScaler() | 4 | {'normalizar': StandardScaler(), 'polinomial__... | -18385500000000.0 | -596738500000000.0 | -137775100000000.0 | -258733300000000.0 | -97693360000000.0 | -221865200000000.0 | 202870900000000.0 | 3 |
| 3 | 4.713295 | 1.206722 | 0.079715 | 0.023995 | StandardScaler() | 5 | {'normalizar': StandardScaler(), 'polinomial__... | -18416280000000.0 | -596901500000000.0 | -137772400000000.0 | -263393900000000.0 | -97758500000000.0 | -222848500000000.0 | 203094800000000.0 | 7 |
| 4 | 0.909371 | 0.203408 | 0.045597 | 0.007191 | passthrough | 2 | {'normalizar': 'passthrough', 'polinomial__k': 2} | -18721070000000.0 | -598868400000000.0 | -138302300000000.0 | -259867700000000.0 | -83136710000000.0 | -219779200000000.0 | 205450600000000.0 | 1 |
| 5 | 1.057661 | 0.208787 | 0.059519 | 0.004267 | passthrough | 3 | {'normalizar': 'passthrough', 'polinomial__k': 3} | -18526450000000.0 | -596717800000000.0 | -138097600000000.0 | -257995300000000.0 | -98271600000000.0 | -221921800000000.0 | 202711100000000.0 | 6 |
| 6 | 1.272626 | 0.098832 | 0.061997 | 0.007388 | passthrough | 4 | {'normalizar': 'passthrough', 'polinomial__k': 4} | -18385500000000.0 | -596738500000000.0 | -137775100000000.0 | -258733300000000.0 | -97693360000000.0 | -221865200000000.0 | 202870900000000.0 | 4 |
| 7 | 1.048163 | 0.041725 | 0.027402 | 0.007712 | passthrough | 5 | {'normalizar': 'passthrough', 'polinomial__k': 5} | -18416280000000.0 | -596901500000000.0 | -137772400000000.0 | -263393900000000.0 | -97758500000000.0 | -222848500000000.0 | 203094800000000.0 | 8 |


In [23]:
grid_search.best_params_

{'normalizar': 'passthrough', 'polinomial__k': 2}
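
Because the search uses `scoring="neg_mean_squared_error"`, the scores in `cv_results_` are negated MSE values, so a higher (less negative) score is better. Taking the square root of the negated score gives an RMSE in rubles, which is easier to read. The sketch below does this for a few `mean_test_score` values copied from the table (the dictionary layout is just an illustration):

```python
import numpy as np

# mean_test_score values (negated MSE) copied from the cv_results_ table,
# keyed by (scaler step, polynomial degree k).
mean_test_score = {
    ("StandardScaler", 2): -219794100000000.0,
    ("passthrough", 2): -219779200000000.0,
    ("passthrough", 3): -221921800000000.0,
    ("passthrough", 5): -222848500000000.0,
}

# Flip the sign to recover the MSE, then take the square root.
rmse = {key: float(np.sqrt(-score)) for key, score in mean_test_score.items()}
best = min(rmse, key=rmse.get)

for key, value in rmse.items():
    print(key, f"{value:,.0f}")
print("best:", best)
```

The gap between candidates is small compared with the fold-to-fold spread reported in `std_test_score` (around 2.0e14), so the ranking should be read with some caution.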

In [27]:
y_pred_final = best_model.predict(X_val)
y_pred_final_e = best_model.predict(X_train)

In [28]:
r2_final_e = r2_score(y_train, y_pred_final_e)
mse_def_e = mean_squared_error(y_train, y_pred_final_e)
mae_poly_final_e = mean_absolute_error(y_train, y_pred_final_e)

print("------------ Polynomial Regression (training) ------------")
print(f"R2-score: {r2_final_e:.7f}")
print(f"Mean squared error (MSE): {mse_def_e:.5f}")
print(f"Mean absolute error: {mae_poly_final_e:.5f}")

r2_final = r2_score(y_val, y_pred_final)
mse_def = mean_squared_error(y_val, y_pred_final)
mae_poly_final = mean_absolute_error(y_val, y_pred_final)

print("------------ Polynomial Regression (validation) ------------")
print(f"R2-score: {r2_final:.7f}")
print(f"Mean squared error (MSE): {mse_def:.5f}")
print(f"Mean absolute error: {mae_poly_final:.5f}")

------------ Polynomial Regression (training) ------------
R2-score: 0.1258109
Mean squared error (MSE): 210995665535383.34375
Mean absolute error: 2261211.54627
------------ Polynomial Regression (validation) ------------
R2-score: 0.0468936
Mean squared error (MSE): 829070445907963.25000
Mean absolute error: 2711813.51661


#### **Actual vs. predicted values**

In [None]:
draw_chart(y_val, y_pred_final, "Polynomial regression prediction", "Polynomial regression")

Model variables

In [24]:
reg_model = best_model["regresion"]
fake_df = best_model["polinomial"].transform(X_val)
print(f"Intercepto: {reg_model.intercept_}")
coef = list(
    zip(["Intercepto"] + list(fake_df.columns), [reg_model.intercept_] + list(reg_model.coef_))
)
coef = pd.DataFrame(coef, columns=["Variable", "Parámetro"])
coef

Intercepto: 25817566.72375145


|   | Variable | Parámetro |
|---|---|---|
| 0 | Intercepto | 2.581757e+07 |
| 1 | geo_lat 1 | -8.920764e+05 |
| 2 | geo_lon 1 | 5.642950e+04 |
| 3 | region 1 | -1.082217e+03 |
| 4 | building_type 1 | 2.765057e+06 |
| ... | ... | ... |
| 61 | rooms kitchen_area | 1.113114e+05 |
| 62 | rooms object_type | 7.938914e+05 |
| 63 | area kitchen_area | -8.627138e+02 |
| 64 | area object_type | -1.151075e+05 |


In [None]:
coef.sort_values("Parámetro")

In [None]:
coef[coef["Parámetro"].between(-1, 1)]

In [33]:
mlflow.sklearn.log_model(best_model, "taller_1_model")

ModelInfo(artifact_path='taller_1_model', flavors={'python_function': {'model_path': 'model.pkl', 'loader_module': 'mlflow.sklearn', 'python_version': '3.9.5', 'env': 'conda.yaml'}, 'sklearn': {'pickled_model': 'model.pkl', 'sklearn_version': '1.0.2', 'serialization_format': 'cloudpickle'}}, model_uri='runs:/08d1e242de90475a99139f391851cf4b/taller_1_model', model_uuid='459716fad5904814b75c15b9fdd3a0b7', run_id='08d1e242de90475a99139f391851cf4b', saved_input_example_info=None, signature_dict=None, utc_time_created='2022-02-23 03:32:54.581765')