# MODELING

## LSTM


<hr>

<code> **Data Project II** </code>

## Contents

- [Data import](#importación-de-los-datos)
- [Preprocessing](#preprocesamiento)
- [Training](#entrenamiento)
- [Model analysis](#análisis-del-modelo)
- [Registering the model in MLflow](#registro-del-modelo-en-mlflow)


In [1]:
import time
import mlflow
import pandas as pd
from evaluation.evaluator import Evaluator

SEED = 22  # reproducibility

## Data import

In [2]:
# Start the Spark session
import findspark
findspark.init()

In [3]:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("Spark en local") \
    .config("spark.master", "local[*]") \
    .config("spark.hadoop.fs.defaultFS", "file:///") \
    .config("spark.sql.warehouse.dir", "file:///tmp/spark-warehouse") \
    .config("spark.driver.extraJavaOptions", "-Dderby.system.home=/tmp/derby") \
    .getOrCreate()

sc = spark.sparkContext

25/04/27 22:39:19 WARN Utils: Your hostname, neutron.local resolves to a loopback address: 127.0.0.1; using 10.8.63.80 instead (on interface en0)
25/04/27 22:39:19 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
25/04/27 22:39:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


## Splitting into two models

In [4]:
df_train = spark.read.parquet("/Users/maria/Downloads/train_spark.parquet")
df_test = spark.read.parquet("/Users/maria/Downloads/test_spark.parquet")

# Drop the ICAO, callsign and timestamp columns
col_to_drop = ['timestamp', 'icao', 'callsign']
df_train2 = df_train.drop(*col_to_drop)
df_test2 = df_test.drop(*col_to_drop)

# Separate the features from the target variable
X_train, y_train = df_train2.drop("takeoff_time"), df_train2.select("takeoff_time")
X_test, y_test = df_test2.drop("takeoff_time"), df_test2.select("takeoff_time")

In [5]:
(X_train.count(), len(X_train.columns)), (X_test.count(), len(X_test.columns))

((123733, 58), (27791, 58))

We split the DataFrames into two groups according to how long the aircraft take to take off. A small but significant group of aircraft takes much longer, which forces a single model to cover too wide a range and makes it fail far more often when predicting the longer times.

In [6]:
df_train.columns

['takeoff_time',
 'timestamp',
 'icao',
 'callsign',
 'holding_point',
 'runway',
 'operator',
 'turbulence_category',
 'last_min_takeoffs',
 'last_min_landings',
 'last_event_turb_cat',
 'time_since_last_event_seconds',
 'time_before_holding_point',
 'time_at_holding_point',
 'hour',
 'weekday',
 'is_holiday',
 'Z1',
 'KA6',
 'KA8',
 'K3',
 'K2',
 'K1',
 'Y1',
 'Y2',
 'Y3',
 'Y7',
 'Z6',
 'Z4',
 'Z2',
 'Z3',
 'LF',
 'L1',
 'LA',
 'LB',
 'LC',
 'LD',
 'LE',
 '36R_18L',
 '32R_14L',
 '36L_18R',
 '32L_14R',
 'temperature_2m (°C)',
 'relative_humidity_2m (%)',
 'dew_point_2m (°C)',
 'precipitation (mm)',
 'snowfall (cm)',
 'weather_code (wmo code)',
 'surface_pressure (hPa)',
 'cloud_cover (%)',
 'cloud_cover_low (%)',
 'cloud_cover_mid (%)',
 'cloud_cover_high (%)',
 'is_day ()',
 'wind_speed_10m (km/h)',
 'wind_direction_10m (°)',
 'wind_direction_100m (°)',
 'soil_moisture_0_to_7cm (m³/m³)',
 'soil_temperature_100_to_255cm (°C)',
 'soil_moisture_100_to_255cm (m³/m³)',
 'et0_fao_evapotra

In [7]:
# Find the cutoff used to split the data and train 2 models
from pyspark.sql import functions as F

TAKEOFF_TIME_Q3 = (
    df_train
    .withColumn("day", F.dayofyear("timestamp"))
    .groupBy("callsign", "day")
    .agg(F.first("takeoff_time").alias("takeoff_time"))
    .agg(F.expr('percentile(takeoff_time, array(0.75))')[0].alias('Q3_takeoff_time'))
)
TAKEOFF_TIME_Q3.show()

                                                                                

+---------------+
|Q3_takeoff_time|
+---------------+
|          237.0|
+---------------+



In [8]:
TAKEOFF_TIME_CUTOFF = TAKEOFF_TIME_Q3.first()["Q3_takeoff_time"]
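
As a rough illustration of the cutoff computation above, the sketch below repeats the same two steps on a toy list of records: keep one `takeoff_time` per (`callsign`, `day`) pair, then take the 75th percentile. The records and the helper names `percentile_linear` and `q3_cutoff` are hypothetical. Linear interpolation between order statistics is assumed here to mirror Spark's exact `percentile`.

```python
from collections import OrderedDict


def percentile_linear(values, q):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def q3_cutoff(records):
    """records: iterable of (callsign, day, takeoff_time), assumed to be in time order."""
    first_per_flight = OrderedDict()
    for callsign, day, takeoff_time in records:
        # Keep only the first takeoff_time seen for each (callsign, day) pair.
        first_per_flight.setdefault((callsign, day), takeoff_time)
    return percentile_linear(list(first_per_flight.values()), 0.75)


# Hypothetical records, not taken from the real dataset.
toy_records = [
    ("IBE001", 100, 120.0),
    ("IBE001", 100, 110.0),
    ("RYR002", 100, 200.0),
    ("VLG003", 101, 300.0),
    ("AEA004", 101, 400.0),
    ("AEA004", 101, 390.0),
]
toy_cutoff = q3_cutoff(toy_records)
print(toy_cutoff)
```

Deduplicating first keeps flights with many rows from weighing more than flights with few rows when the quartile is computed.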

In [14]:
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# First, create the 'day' column from timestamp
df_train = df_train.withColumn("day", F.dayofyear("timestamp"))

# Define a window for each 'callsign' and 'day', ordered by 'timestamp'
window_spec = Window.partitionBy("callsign", "day").orderBy("timestamp")

# Add a column with the first takeoff_time of the group
df_train = df_train.withColumn(
    "first_takeoff_time",
    F.first("takeoff_time").over(window_spec)
)

# Now the data can be split according to 'first_takeoff_time'
df_train_low = df_train.filter(F.col("first_takeoff_time") <= TAKEOFF_TIME_CUTOFF)
df_train_high = df_train.filter(F.col("first_takeoff_time") > TAKEOFF_TIME_CUTOFF)

# Drop the auxiliary column
df_train_low = df_train_low.drop("first_takeoff_time")
df_train_high = df_train_high.drop("first_takeoff_time")

In [15]:
# For the test data, to decide which model each row is sent to, we cut the waiting time at the
# same value used to split the training data between the two models

# Split the test data according to time_at_holding_point

# Create the 'day' column from timestamp
df_test = df_test.withColumn("day", F.dayofyear("timestamp"))

# Define a window for each 'callsign' and 'day', ordered by 'timestamp'
window_spec = Window.partitionBy("callsign", "day").orderBy("timestamp")

# Add the running maximum of 'time_at_holding_point' within the group
# (an ordered window spans from the start of the partition to the current row by default)
df_test = df_test.withColumn(
    "max_time_at_holding_point",
    F.max("time_at_holding_point").over(window_spec)
)

# Split the data according to that maximum of 'time_at_holding_point'
df_test_low = df_test.filter(F.col("max_time_at_holding_point") <= TAKEOFF_TIME_CUTOFF)
df_test_high = df_test.filter(F.col("max_time_at_holding_point") > TAKEOFF_TIME_CUTOFF)
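
A detail of the test split may be worth illustrating. When a Spark window has an `orderBy` and no explicit frame, the frame runs from the start of the partition to the current row, so `F.max` over it is a running maximum rather than one maximum for the whole group. The plain-Python sketch below contrasts the two on one hypothetical flight; the times are invented and `CUTOFF` simply reuses the Q3 value shown earlier.

```python
from itertools import accumulate

CUTOFF = 237.0  # same value as the Q3 printed earlier in the notebook

# Hypothetical holding-point times for one flight, in timestamp order.
times_at_holding_point = [30.0, 120.0, 250.0, 260.0]

# Ordered window with the default frame: maximum up to the current row.
running_max = list(accumulate(times_at_holding_point, max))

# Whole-partition window: a single maximum shared by every row of the flight.
group_max = [max(times_at_holding_point)] * len(times_at_holding_point)

route_running = ["low" if value <= CUTOFF else "high" for value in running_max]
route_group = ["low" if value <= CUTOFF else "high" for value in group_max]

print(route_running)  # early rows go to "low", later rows to "high"
print(route_group)    # every row of the flight goes to the same model
```

With the running maximum, early rows of a flight can be routed to the Low model and later rows to the High model. That only uses information available at each timestamp. A whole-group maximum would route the flight consistently but would look ahead in time.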


# Low model (fast takeoffs)

In [16]:
MODEL_NAME = "LSTM-Low" 

In [17]:
# Separate the features from the target variable
df_train_low_2 = df_train_low.drop(*col_to_drop)
df_test_low_2 = df_test_low.drop(*col_to_drop)

X_train, y_train = df_train_low_2.drop("takeoff_time"), df_train_low_2.select("takeoff_time")
X_test, y_test = df_test_low_2.drop("takeoff_time"), df_test_low_2.select("takeoff_time")

# High model (slow takeoffs)

## Preprocessing

In [18]:
# =====================================
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler, MinMaxScaler, StringIndexer
from pyspark.sql.types import StringType

# 1. Find the categorical (string) columns
string_cols = [
    field.name for field in X_train.schema.fields
    if isinstance(field.dataType, StringType)
]

# 2. Create one indexer per categorical column
indexers = [
    StringIndexer(inputCol=col, outputCol=f"{col}_index", handleInvalid="keep")
    for col in string_cols
]

# 3. Define the input columns for the assembler
# (indexed columns for categoricals, original columns for numerics)
assembler_inputs = [
    f"{col}_index" if col in string_cols else col
    for col in X_train.columns
]

# 4. Build the pipeline: indexing -> assembling -> scaling
pipeline = Pipeline(stages=[
    *indexers,
    VectorAssembler(inputCols=assembler_inputs, outputCol="features_raw"),
    MinMaxScaler(inputCol="features_raw", outputCol="features")
])

# 5. Fit the pipeline ONLY on X_train
pipeline_model = pipeline.fit(X_train)

# 6. Transform X_train and X_test with the same fitted pipeline
X_train_prepared = pipeline_model.transform(X_train)
X_test_prepared = pipeline_model.transform(X_test)

# =====================================
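
To make the effect of fitting only on the training data concrete, here is a small plain-Python sketch of what the indexing and scaling stages do to a single categorical column. It is an approximation under stated assumptions, not Spark's implementation. Labels are indexed by descending frequency with ties broken alphabetically, unseen labels share one extra index (as with `handleInvalid="keep"`), and min-max bounds come from the training values only. The runway labels and helper names are hypothetical.

```python
from collections import Counter


def fit_string_index(values):
    """Map each label to an index, most frequent first (ties broken alphabetically)."""
    counts = Counter(values)
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: float(i) for i, label in enumerate(ordered)}


def apply_string_index(mapping, values):
    """Labels not seen during fit share one extra index."""
    unseen_index = float(len(mapping))
    return [mapping.get(label, unseen_index) for label in values]


def fit_min_max(values):
    """Return the (min, max) bounds learned from the training values."""
    return min(values), max(values)


def apply_min_max(bounds, values):
    """Rescale with the training bounds; a constant column maps to 0.5."""
    low, high = bounds
    if high == low:
        return [0.5 for _ in values]
    return [(value - low) / (high - low) for value in values]


# Hypothetical labels for illustration only.
train_runways = ["36R", "36L", "36R", "32L"]
test_runways = ["36L", "18R"]  # "18R" never appears in training

mapping = fit_string_index(train_runways)
train_indexed = apply_string_index(mapping, train_runways)
test_indexed = apply_string_index(mapping, test_runways)

bounds = fit_min_max(train_indexed)
train_scaled = apply_min_max(bounds, train_indexed)
test_scaled = apply_min_max(bounds, test_indexed)

print(train_scaled)
print(test_scaled)
```

Because the bounds are learned from the training set alone, a test value outside the training range (here the unseen label) can be scaled above 1.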

## Training

In [None]:
# ========================================
import time
import numpy as np
import joblib
import pandas as pd

from tensorflow.keras import regularizers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, Input, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, LearningRateScheduler

from scikeras.wrappers import KerasRegressor

from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.preprocessing import MinMaxScaler

start_time = time.time()

# 1. Convert the Spark DataFrame to NumPy for the features (already prepared in X_train_prepared)
pdf_train = X_train_prepared.select("features").toPandas()
X = np.stack(pdf_train["features"].values)

# 2. Define the target preprocessing functions
def preprocess_target(y_train):
    y_train_np = y_train.toPandas().values.reshape(-1, 1)
    y_log = np.log1p(y_train_np)  # Log transform because the target is skewed
    scaler = MinMaxScaler()
    y_scaled = scaler.fit_transform(y_log)
    return y_scaled, scaler

def inverse_preprocess_target(y_scaled, scaler):
    y_log = scaler.inverse_transform(y_scaled)
    y_original = np.expm1(y_log)
    return y_original


# Preprocess y_train
y_train_scaled, y_scaler = preprocess_target(y_train)

# Reshape X to match the LSTM input (samples, time steps, features)
TIME_STEPS = 1
X_reshaped = X.reshape((X.shape[0], TIME_STEPS, X.shape[1]))

# 3. Define the LSTM model
def build_model(units=64, dropout_rate=0.2, l2_reg=0.01):
    model = Sequential([
        Input(shape=(TIME_STEPS, X.shape[1])),
        LSTM(units, kernel_regularizer=regularizers.l2(l2_reg), return_sequences=True),
        LSTM(units, kernel_regularizer=regularizers.l2(l2_reg)),
        BatchNormalization(),
        Dropout(dropout_rate),
        Dense(1, kernel_regularizer=regularizers.l2(l2_reg))
    ])
    model.compile(optimizer='adam', loss='mse')
    return model


# 4. Define the regressor with KerasRegressor
regressor = KerasRegressor(
    model=build_model,
    units=64,
    dropout_rate=0.3,
    verbose=1
)

# 5. EarlyStopping on the training loss (no validation data is passed to fit)
early_stop = EarlyStopping(monitor='loss', patience=5, restore_best_weights=True)

# 6. Adjust the learning rate with a LearningRateScheduler
def lr_schedule(epoch, lr):
    if epoch < 10:
        return lr
    else:
        return lr * 0.9  # From epoch 10 onwards, reduce the learning rate by 10% every epoch

# 7. Define the hyperparameter grid for the search
param_grid = {
    "units": [64],
    "dropout_rate": [0.3],
    "epochs": [50],
    "batch_size": [32]
}

# 8. Define cross-validation with TimeSeriesSplit
tscv = TimeSeriesSplit(n_splits=2)

grid = GridSearchCV(
    estimator=regressor,
    param_grid=param_grid,
    scoring='neg_root_mean_squared_error',
    cv=tscv,
    verbose=1
)

# 9. Train the model with GridSearchCV
grid_result = grid.fit(X_reshaped,
                       y_train_scaled,
                       callbacks=[early_stop, LearningRateScheduler(lr_schedule)])


# ========================================

end_time = time.time()
execution_time = end_time - start_time

print(f"Execution time: {execution_time} seconds")

# 10. Print the best result of the search
print(f"Best parameters found: {grid_result.best_params_}")
print(f"Best score obtained: {grid_result.best_score_}")



Fitting 2 folds for each of 1 candidates, totalling 2 fits
Epoch 1/50


2025-04-27 22:42:34.247174: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1
2025-04-27 22:42:34.247199: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 16.00 GB
2025-04-27 22:42:34.247204: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 5.33 GB
2025-04-27 22:42:34.247220: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2025-04-27 22:42:34.247229: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2025-04-27 22:42:34.922812: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.


815/815 ━━━━━━━━━━━━━━━━━━━━ 22s 25ms/step - loss: 0.5837
Epoch 2/50
815/815 ━━━━━━━━━━━━━━━━━━━━ 20s 25ms/step - loss: 0.0264
Epoch 3/50
815/815 ━━━━━━━━━━━━━━━━━━━━ 20s 24ms/step - loss: 0.0225
Epoch 4/50
815/815 ━━━━━━━━━━━━━━━━━━━━ 20s 24ms/step - loss: 0.0211
Epoch 5/50
815/815 ━━━━━━━━━━━━━━━━━━━━ 21s 25ms/step - loss: 0.0199
Epoch 6/50
815/815 ━━━━━━━━━━━━━━━━━━━━ 19s 24ms/step - loss: 0.0196
Epoch 7/50
815/815 ━━━━━━━━━━━━━━━━━━━━ 19s 24ms/step - loss: 0.0189
Epoch 8/50
815/815 ━━━━━━━━━━━━━━━━━━━━ 20s 25ms/step - loss: 0.0190
Epoch 9/50
815/815 ━━━━━━━━━━━━━━━━━━━━ 20s 24ms/step - loss: 0.0190
Epoch 10/50
815/815 ━━━━━━━━━━━━━━━━━━━━ 20s 24ms/

1630/1630 ━━━━━━━━━━━━━━━━━━━━ 41s 25ms/step - loss: 0.0191
Epoch 29/50
1630/1630 ━━━━━━━━━━━━━━━━━━━━ 40s 25ms/step - loss: 0.0191
Epoch 30/50
1630/1630 ━━━━━━━━━━━━━━━━━━━━ 40s 25ms/step - loss: 0.0193
Epoch 31/50
1630/1630 ━━━━━━━━━━━━━━━━━━━━ 41s 25ms/step - loss: 0.0191
Epoch 32/50
1630/1630 ━━━━━━━━━━━━━━━━━━━━ 41s 25ms/step - loss: 0.0192
Epoch 33/50
1630/1630 ━━━━━━━━━━━━━━━━━━━━ 44s 27ms/step - loss: 0.0192
Epoch 34/50
1630/1630 ━━━━━━━━━━━━━━━━━━━━ 40s 24ms/step - loss: 0.0193
Epoch 35/50
1630/1630 ━━━━━━━━━━━━━━━━━━━━ 39s 24ms/step - loss: 0.0190
Epoch 36/50
1630/1630 ━━━━━━━━━━━━━━━━━━━━ 40s 25ms/step - loss: 0.0192
Epoch 37/50
1630/1630 ━━━━━━━━━━━━━━━━━━━━

In [17]:
# 1. Get the best model (the one with the best hyperparameters)
best_model = grid_result.best_estimator_

# 2. Get the best hyperparameters
best_params = grid_result.best_params_

# 3. Get the validation score of the selected model
best_score = grid_result.best_score_
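
The validation score retrieved above is the mean score over the cross-validation folds, so it helps to see how those folds are laid out. The sketch below is a plain-Python approximation of the expanding-window scheme behind `TimeSeriesSplit` with default settings (no gap, default test size). The function name `expanding_window_splits` is hypothetical and the 10-sample example is purely illustrative.

```python
def expanding_window_splits(n_samples, n_splits):
    """Return (train_indices, test_indices) pairs with an expanding training window."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    splits = []
    for test_start in range(first_test_start, n_samples, test_size):
        # Training always starts at index 0 and ends right before the test block.
        train = list(range(0, test_start))
        test = list(range(test_start, test_start + test_size))
        splits.append((train, test))
    return splits


splits = expanding_window_splits(n_samples=10, n_splits=2)
for fold, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold}: train={train} test={test}")
```

Each fold trains on everything before its test block, so the second fold sees roughly twice as many training rows as the first, and no fold is evaluated on rows that precede its training data.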

In [18]:
best_model
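
The learning-rate rule used in the training cell compounds, because Keras passes the current learning rate back into the schedule function at every epoch. The sketch below simulates that feedback loop without TensorFlow. The starting value 0.001 is assumed to be the Adam default, and `simulate_schedule` is a hypothetical helper.

```python
def lr_schedule(epoch, lr):
    """Same rule as in the training cell: hold for 10 epochs, then decay 10% per epoch."""
    if epoch < 10:
        return lr
    return lr * 0.9


def simulate_schedule(initial_lr, n_epochs):
    """Feed each epoch's output back in as the next epoch's input learning rate."""
    rates = []
    lr = initial_lr
    for epoch in range(n_epochs):
        lr = lr_schedule(epoch, lr)
        rates.append(lr)
    return rates


rates = simulate_schedule(initial_lr=0.001, n_epochs=13)
for epoch, rate in enumerate(rates):
    print(f"epoch {epoch:2d}: lr = {rate:.6f}")
```

Under this rule the rate stays flat for epochs 0 to 9 and is then multiplied by 0.9 at every later epoch, so by epoch 49 it would be about 0.9^40 of its initial value.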

## Model analysis

In [20]:
# ===============================================================

from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np


# 1. Predictions on the training set
y_train_pred_scaled = grid_result.predict(X_reshaped)
y_train_pred = inverse_preprocess_target(y_train_pred_scaled.reshape(-1, 1), y_scaler).flatten()

# 2. Convert the true y_train to a NumPy array
y_train_np = y_train.toPandas().values.flatten()

# 3. Compute MAE and RMSE
mae_train = mean_absolute_error(y_train_np, y_train_pred)
rmse_train = np.sqrt(mean_squared_error(y_train_np, y_train_pred))


2862/2862 ━━━━━━━━━━━━━━━━━━━━ 11s 4ms/step


In [21]:
mae_train, rmse_train

(32.2800141746465, 39.16603774220215)
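
These errors are in the original units of `takeoff_time` because the predictions are mapped back through the inverse of the target transform before scoring. The sketch below shows that round trip (log1p, min-max scaling, and their inverses) with the standard library only. The function names and the three sample values are hypothetical stand-ins for `preprocess_target` and `inverse_preprocess_target`.

```python
import math


def fit_target_transform(y):
    """Learn the min and max of log1p(y), as a min-max scaler would."""
    logs = [math.log1p(value) for value in y]
    return min(logs), max(logs)


def transform_target(y, bounds):
    """log1p followed by min-max scaling to [0, 1] on the fitted range."""
    low, high = bounds
    return [(math.log1p(value) - low) / (high - low) for value in y]


def inverse_transform_target(y_scaled, bounds):
    """Undo the scaling first, then undo the log with expm1."""
    low, high = bounds
    return [math.expm1(scaled * (high - low) + low) for scaled in y_scaled]


# Hypothetical target values for illustration.
y = [10.0, 60.0, 237.0]
bounds = fit_target_transform(y)
scaled = transform_target(y, bounds)
restored = inverse_transform_target(scaled, bounds)

print(scaled)
print(restored)
```

The inverse has to be applied in the reverse order of the forward transform, and always with the bounds fitted on the training target.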

In [None]:
# ===============================================================
# Generate predictions on the test set

# 1. Convert the Spark test DataFrame to pandas
pdf_test = X_test_prepared.select("features").toPandas()

# Convert the Spark features (Vector) to a NumPy array
X_test = np.stack(pdf_test["features"].values)

# 2. Reshape X_test to match the LSTM input (samples, time steps, features)
X_test = X_test.reshape((X_test.shape[0], 1, X_test.shape[1]))  # (n_samples, 1, n_features)

# 3. Predict on the test set with the best model
y_test_pred_scaled = grid_result.best_estimator_.predict(X_test)

# 4. Undo the scaling of the predictions
y_test_pred = inverse_preprocess_target(y_test_pred_scaled.reshape(-1, 1), y_scaler).flatten()

# ===============================================================
# Convert the Spark df_test_low to pandas
df_test_low = df_test_low.toPandas()

# 5. Add the predictions to the test DataFrame
df_test_low['prediction'] = y_test_pred

# Show the DataFrame with the predictions
print(df_test_low[['prediction']].head())


In [25]:
y_test_np = y_test.toPandas().takeoff_time.to_numpy()

mae_test = mean_absolute_error(y_test_np, y_test_pred)
rmse_test = np.sqrt(mean_squared_error(y_test_np, y_test_pred))
mae_test

198.78595828314727
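
For reference, the two error metrics used in this analysis can be written out directly. The sketch below computes MAE and RMSE in plain Python on a handful of hypothetical values. It is meant only to show the formulas and does not replace the scikit-learn functions used in the cells.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalises large errors more than MAE does."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical values for illustration.
toy_true = [10.0, 20.0, 30.0, 40.0]
toy_pred = [13.0, 16.0, 30.0, 45.0]

toy_mae = mae(toy_true, toy_pred)
toy_rmse = rmse(toy_true, toy_pred)
print(toy_mae, toy_rmse)
```

RMSE is never smaller than MAE on the same predictions, and a wide gap between them points to a few large errors.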

In [28]:
from evaluation.evaluator import Evaluator
# Note: df_test must contain the 'prediction' column
mae_val = None
rmse_val = None

ev = Evaluator(df_test_low, MODEL_NAME, mae_val, rmse_val)
report = ev.getReport()
ev.visualEvaluation()

In [43]:
report

{'global': {'mae': 80.59104347098604,
  'rmse': 106.47107933793535,
  'mse': 11336.090735384925,
  'r2': 0.04250677667514269,
  'mape': 48.33818510797021},
 'by_runway': {'32L/14R': {'mae': 100.50540708528338,
   'rmse': 131.2773149799497},
  '32R/14L': {'mae': 77.11424707775703, 'rmse': 101.26736771369758},
  '36L/18R': {'mae': 79.60461535735402, 'rmse': 105.294016872584},
  '36R/18L': {'mae': 70.43884122541853, 'rmse': 91.11007996910652}},
 'by_holding_point': {'K1': {'mae': 115.45991589946132,
   'rmse': 171.77260458717342},
  'K2': {'mae': 70.87230255734029, 'rmse': 84.6258790846638},
  'K3': {'mae': 151.99290313720704, 'rmse': 187.46256297590537},
  'LA': {'mae': 103.36139083344382, 'rmse': 135.8826663837165},
  'LB': {'mae': 97.80254079889954, 'rmse': 130.46898241341307},
  'LC': {'mae': 106.31957201687794, 'rmse': 151.8946835326238},
  'LE': {'mae': 98.29135130595385, 'rmse': 116.77872076485512},
  'Y1': {'mae': 67.31269910293803, 'rmse': 87.43701914214775},
  'Y2': {'mae': 79.0

### Influence of the variables

In [None]:
# ===============================================================
# INFLUENCE OF THE VARIABLES
# The LSTM model exposes no built-in feature importances, so the influence
# of each variable is not computed here
# ===============================================================
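
One model-agnostic way to estimate the influence of each variable, should it be needed later, is permutation importance: shuffle one feature column at a time and measure how much the error grows. This technique is not part of the notebook; the sketch below is a minimal plain-Python version, assuming a `predict` callable that takes a list of feature rows. `toy_predict` and the toy data are hypothetical stand-ins for the trained network and the real features.

```python
import random


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, y_true, n_repeats=5, seed=22):
    """Mean increase in MAE when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mae(y_true, predict(rows))
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by its shuffled values.
            shuffled = [row[:j] + [value] + row[j + 1:] for row, value in zip(rows, column)]
            increases.append(mae(y_true, predict(shuffled)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(rows):
    # Stand-in for the trained model: it depends on feature 0 only.
    return [3.0 * row[0] for row in rows]


toy_rows = [[float(i), float(i % 3)] for i in range(20)]
toy_target = toy_predict(toy_rows)
toy_importances = permutation_importance(toy_predict, toy_rows, toy_target)
print(toy_importances)
```

Applied to the LSTM, `predict` would have to reshape each batch to (samples, 1, features) before calling the fitted estimator. The importances would be in scaled-target units unless the predictions are inverse-transformed first.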

## Registering the model in MLflow

In [32]:
import mlflow
SEED = 22

In [35]:
mlflow.set_tracking_uri("./mlflow_maria")
mlflow.set_experiment("takeoff_time_prediction")

with mlflow.start_run():

    # - General information -

    # ========================================================================
    mlflow.set_tag("model_type", MODEL_NAME)
    mlflow.set_tag("framework", "tensorflow.keras") # scikit-learn, tensorflow, etc.
    mlflow.set_tag("target_variable", "takeoff_time") # response variable
    mlflow.set_tag("preprocessing", "StringIndexer+VectorAssembler+MinMaxScaler+logParaTarget") # transformations separated by +
    mlflow.set_tag("dataset", "solo takeoff_time bajos <=237") # note whether the dataset has been modified
    mlflow.set_tag("seed", SEED) # seed for reproducibility
    mlflow.set_tag("layers", "Input+LSTM+LSTM+BatchNormalization+Dropout+Dense") # neural network layers
    # ========================================================================

    # - Optimal hyperparameters -

    # =====================================
    # ADD HYPERPARAMETERS
    best_params = grid_result.best_params_
    for param_name, param_value in best_params.items():
        mlflow.log_param(param_name, param_value)

    # Hyperparameters that were fixed in this model
    mlflow.log_param("l2_reg", 0.01)

    mlflow.log_param("model", MODEL_NAME)
    # =====================================

    # - Metrics -

    mlflow.log_metric("execution_time_s", execution_time)

    #mlflow.log_metric("mae_val", mae_val)
    #mlflow.log_metric("rmse_val", rmse_val)

    mlflow.log_metric("mae_train", mae_train)
    mlflow.log_metric("rmse_train", rmse_train)

    # Log the global test metrics
    for metric_name, value in report["global"].items():
        mlflow.log_metric(f"{metric_name}_test", value)

    # Log the per-runway metrics
    for runway, metrics in report["by_runway"].items():
        for metric_name, value in metrics.items():
            mlflow.log_metric(f"{metric_name}_test_runway_{runway}", value)

    # Log the per-holding-point metrics
    for hp, metrics in report["by_holding_point"].items():
        for metric_name, value in metrics.items():
            mlflow.log_metric(f"{metric_name}_test_hp_{hp}", value)

    # - Model -

    # ========================================================================
    # NOTE - This must be changed depending on the framework used to build the model
    mlflow.sklearn.log_model(grid_result.best_estimator_, MODEL_NAME)
    # ========================================================================
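
The three metric loops in the cell above follow one naming scheme, which can be checked without a tracking server by flattening the report into a plain dictionary first. The sketch below does that. `flatten_report` is a hypothetical helper, and the report values are placeholders, not results from this model.

```python
def flatten_report(report):
    """Turn the nested report into {metric_name: value} with the naming scheme used above."""
    flat = {}
    for metric_name, value in report["global"].items():
        flat[f"{metric_name}_test"] = value
    for runway, metrics in report["by_runway"].items():
        for metric_name, value in metrics.items():
            flat[f"{metric_name}_test_runway_{runway}"] = value
    for hp, metrics in report["by_holding_point"].items():
        for metric_name, value in metrics.items():
            flat[f"{metric_name}_test_hp_{hp}"] = value
    return flat


# Placeholder values, for illustration only.
toy_report = {
    "global": {"mae": 1.0, "rmse": 2.0},
    "by_runway": {"36R/18L": {"mae": 3.0, "rmse": 4.0}},
    "by_holding_point": {"K1": {"mae": 5.0, "rmse": 6.0}},
}

flat_metrics = flatten_report(toy_report)
for name, value in sorted(flat_metrics.items()):
    print(name, value)
```

A dictionary in this shape could then be sent in a single call with `mlflow.log_metrics(flat_metrics)` inside the active run.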
    

In [11]:
# Separate the features from the target variable
df_train_high_2 = df_train_high.drop(*col_to_drop)
df_test_high_2 = df_test_high.drop(*col_to_drop)

X_train, y_train = df_train_high_2.drop("takeoff_time"), df_train_high_2.select("takeoff_time")
X_test, y_test = df_test_high_2.drop("takeoff_time"), df_test_high_2.select("takeoff_time")

In [None]:
# - View the experiments -
!mlflow ui --backend-store-uri ./mlflow_maria

* 'schema_extra' has been renamed to 'json_schema_extra'
[2025-04-27 21:38:01 +0200] [74112] [INFO] Starting gunicorn 21.2.0
[2025-04-27 21:38:01 +0200] [74112] [INFO] Listening at: http://127.0.0.1:5000 (74112)
[2025-04-27 21:38:01 +0200] [74112] [INFO] Using worker: sync
[2025-04-27 21:38:01 +0200] [74114] [INFO] Booting worker with pid: 74114
[2025-04-27 21:38:01 +0200] [74115] [INFO] Booting worker with pid: 74115
[2025-04-27 21:38:01 +0200] [74116] [INFO] Booting worker with pid: 74116
[2025-04-27 21:38:01 +0200] [74117] [INFO] Booting worker with pid: 74117
[2025-04-27 22:22:23 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:74114)
[2025-04-27 22:22:23 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:74115)
[2025-04-27 22:22:23 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:74116)
[2025-04-27 22:22:23 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:74117)
[2025-04-27 22:22:23 +0200] [74115] [INFO] Worker exiting (pid: 74115)
[2025-04-27 22:22:23 +0200] [74116] [INFO] Worker exiting (pid:

25/04/27 22:22:23 WARN HeartbeatReceiver: Removing executor driver with no recent heartbeats: 160016 ms exceeds timeout 120000 ms
25/04/27 22:22:23 WARN SparkContext: Killing executors is not supported by current scheduler.


[2025-04-27 22:22:24 +0200] [74112] [ERROR] Worker (pid:74114) exited with code 1
[2025-04-27 22:22:24 +0200] [74112] [ERROR] Worker (pid:74114) exited with code 1.
[2025-04-27 22:22:24 +0200] [74112] [ERROR] Worker (pid:74117) was sent SIGKILL! Perhaps out of memory?
[2025-04-27 22:22:24 +0200] [74112] [ERROR] Worker (pid:74116) was sent SIGKILL! Perhaps out of memory?
[2025-04-27 22:22:24 +0200] [74112] [ERROR] Worker (pid:74115) was sent SIGKILL! Perhaps out of memory?
[2025-04-27 22:22:24 +0200] [75011] [INFO] Booting worker with pid: 75011
[2025-04-27 22:22:24 +0200] [75012] [INFO] Booting worker with pid: 75012
[2025-04-27 22:22:24 +0200] [75013] [INFO] Booting worker with pid: 75013
[2025-04-27 22:22:24 +0200] [75014] [INFO] Booting worker with pid: 75014


25/04/27 22:22:31 ERROR Inbox: Ignoring error
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:110)
	at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:36)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.driverEndpoint$lzycompute(BlockManagerMasterEndpoint.scala:124)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$driverEndpoint(BlockManagerMasterEndpoint.scala:123)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.isExecutorAlive$lzycompute$1(BlockManagerMasterEndpoint.scala:688)
	at org.apache.spark.storage.BlockManagerMasterE

25/04/27 22:22:41 ERROR Inbox: Ignoring error
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:110)
	at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:36)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.driverEndpoint$lzycompute(BlockManagerMasterEndpoint.scala:124)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$driverEndpoint(BlockManagerMasterEndpoint.scala:123)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.isExecutorAlive$lzycompute$1(BlockManagerMasterEndpoint.scala:688)
	at org.apache.spark.storage.BlockManagerMasterE

25/04/27 22:22:51 ERROR Inbox: Ignoring error
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:110)
	at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:36)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.driverEndpoint$lzycompute(BlockManagerMasterEndpoint.scala:124)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$driverEndpoint(BlockManagerMasterEndpoint.scala:123)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.isExecutorAlive$lzycompute$1(BlockManagerMasterEndpoint.scala:688)
	at org.apache.spark.storage.BlockManagerMasterE

25/04/27 22:23:01 ERROR Inbox: Ignoring error
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:110)
	at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:36)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.driverEndpoint$lzycompute(BlockManagerMasterEndpoint.scala:124)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$driverEndpoint(BlockManagerMasterEndpoint.scala:123)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.isExecutorAlive$lzycompute$1(BlockManagerMasterEndpoint.scala:688)
	at org.apache.spark.storage.BlockManagerMasterE

[2025-04-27 22:24:14 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:75011)
[2025-04-27 22:24:14 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:75012)
[2025-04-27 22:24:14 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:75013)
[2025-04-27 22:24:14 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:75014)
[2025-04-27 22:24:14 +0200] [75011] [INFO] Worker exiting (pid: 75011)
[2025-04-27 22:24:14 +0200] [75013] [INFO] Worker exiting (pid: 75013)
[2025-04-27 22:24:14 +0200] [75012] [INFO] Worker exiting (pid: 75012)
[2025-04-27 22:24:14 +0200] [75014] [INFO] Worker exiting (pid: 75014)
[2025-04-27 22:24:14 +0200] [74112] [ERROR] Worker (pid:75011) exited with code 1
[2025-04-27 22:24:14 +0200] [74112] [ERROR] Worker (pid:75011) exited with code 1.
[2025-04-27 22:24:14 +0200] [74112] [ERROR] Worker (pid:75012) exited with code 1
[2025-04-27 22:24:14 +0200] [74112] [ERROR] Worker (pid:75012) exited with code 1.
[2025-04-27 22:24:14 +0200] [75027] [INFO] Booting worker with pid: 75027
[2025-04-27 22:2

25/04/27 22:24:15 WARN Executor: Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:101)
	at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:85)
	at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:80)
	at org.apache.spark.storage.BlockManager.reregister(BlockManager.scala:642)
	at org.apache.spark.executor.Executor.reportHeartBeat(Executor.scala:1223)
	at org.apache.spark.executor.Executor.$anonfun$heartbeater$1(Executor.scala:295)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1928)

25/04/27 22:24:25 WARN Executor: Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:101)
	at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:85)
	at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:80)
	at org.apache.spark.storage.BlockManager.reregister(BlockManager.scala:642)
	at org.apache.spark.executor.Executor.reportHeartBeat(Executor.scala:1223)
	at org.apache.spark.executor.Executor.$anonfun$heartbeater$1(Executor.scala:295)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1928)

25/04/27 22:24:35 ERROR Inbox: Ignoring error
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
	at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:110)
	at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:36)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.driverEndpoint$lzycompute(BlockManagerMasterEndpoint.scala:124)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.org$apache$spark$storage$BlockManagerMasterEndpoint$$driverEndpoint(BlockManagerMasterEndpoint.scala:123)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.isExecutorAlive$lzycompute$1(BlockManagerMasterEndpoint.scala:688)
	at org.apache.spark.storage.BlockManagerMasterE

25/04/27 22:24:45 WARN Executor: Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.SparkThreadUtils$.awaitResult(SparkThreadUtils.scala:56)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:310)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:101)
	at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:85)
	at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:80)
	at org.apache.spark.storage.BlockManager.reregister(BlockManager.scala:642)
	at org.apache.spark.executor.Executor.reportHeartBeat(Executor.scala:1223)
	at org.apache.spark.executor.Executor.$anonfun$heartbeater$1(Executor.scala:295)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1928)

[2025-04-27 22:38:05 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:75027)
[2025-04-27 22:38:05 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:75028)
[2025-04-27 22:38:05 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:75029)
[2025-04-27 22:38:05 +0200] [74112] [CRITICAL] WORKER TIMEOUT (pid:75030)
[2025-04-27 22:38:05 +0200] [75027] [INFO] Worker exiting (pid: 75027)
[2025-04-27 22:38:05 +0200] [75028] [INFO] Worker exiting (pid: 75028)
[2025-04-27 22:38:05 +0200] [75030] [INFO] Worker exiting (pid: 75030)
[2025-04-27 22:38:05 +0200] [75029] [INFO] Worker exiting (pid: 75029)
[2025-04-27 22:38:05 +0200] [74112] [ERROR] Worker (pid:75028) exited with code 1
[2025-04-27 22:38:05 +0200] [74112] [ERROR] Worker (pid:75028) exited with code 1.
[2025-04-27 22:38:05 +0200] [74112] [ERROR] Worker (pid:75027) exited with code 1
[2025-04-27 22:38:05 +0200] [74112] [ERROR] Worker (pid:75027) exited with code 1.
[2025-04-27 22:38:05 +0200] [74112] [ERROR] Worker (pid:75029) exited with code 1
[2025-04
