## Task 12

The goal is to reconstruct the flow field produced by a wind-tunnel run.

**Input format**

You are given a dataset (`train.csv`) with the following fields:

- $x$, $y$, $z$: coordinates of a point in space;
- $u_x$, $u_y$, $u_z$: the flow velocity vector at that point;
- $p$: the pressure at that point.

The dataset `test.csv` contains only the coordinates $(x, y, z)$.

**Output format**

Submit a file `submission.csv` to the testing system. Its $k$-th row must contain the predicted values $u_x$, $u_y$, $u_z$, $p$ for the $k$-th row of `test.csv`. The predicted targets in each row are separated by commas. Row zero must contain the column names (they are ignored when the file is read). The file `sample_submission.csv` is an example submission.

**Notes**

Solutions are scored with the MAE metric using the formula:

$$points = \min\left(10, \max\left(0, \frac{MaxScore - Score}{MaxScore - MinScore}\right)\right)$$

where $Score$ is the MAE of your submission, $MinScore$ is the lower threshold for the metric, and $MaxScore$ is the upper threshold.

The values of $MinScore$ and $MaxScore$ differ by field:

- for $u_x$, $u_y$, $u_z$: $MinScore = 1.1$ and $MaxScore = 1.9$;
- for $p$: $MinScore = 500$ and $MaxScore = 1900$.

After $points$ has been computed for each target separately, the final score is the average of $points$ over all four targets.

**Correction:**
The maximum score is awarded at `MAE` < 0.13
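
The scoring rule can be sketched in a few lines of plain Python. This is only an illustration of the formula as written in the statement, not the grader's actual code; the names `THRESHOLDS`, `target_points`, `total_points` and the example MAE values are hypothetical.

```python
# Thresholds (MinScore, MaxScore) copied from the task statement.
THRESHOLDS = {
    "u_x": (1.1, 1.9),
    "u_y": (1.1, 1.9),
    "u_z": (1.1, 1.9),
    "p": (500.0, 1900.0),
}


def target_points(score, min_score, max_score):
    """Apply the clipping formula from the statement to one target's MAE."""
    ratio = (max_score - score) / (max_score - min_score)
    return min(10.0, max(0.0, ratio))


def total_points(mae_by_target):
    """Average the per-target points over all four targets."""
    points = [
        target_points(mae_by_target[name], lo, hi)
        for name, (lo, hi) in THRESHOLDS.items()
    ]
    return sum(points) / len(points)


# Hypothetical MAE values, for illustration only.
example = {"u_x": 1.5, "u_y": 1.1, "u_z": 2.5, "p": 1200.0}
print(total_points(example))
```

A target whose MAE is at or above $MaxScore$ contributes zero, and the contribution grows linearly as the MAE drops towards $MinScore$.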

In [12]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
%matplotlib inline

import os
import tensorflow as tf
import pandas as pd
import numpy as np

import seaborn as sns
import matplotlib.pyplot as plt

from tensorflow.keras.utils import plot_model
from tensorflow.keras import layers, losses, metrics, regularizers, Sequential
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint


sns.set()

train_path = 'train.csv'
test_path = 'test.csv'
sample_path = 'sample_submission.csv'
submission_path = 'submission.csv'

In [13]:
train_csv = pd.read_csv(train_path)
test_csv = pd.read_csv(test_path)
train_csv

| | x | y | z | u_x | u_y | u_z | p |
|---|---|---|---|---|---|---|---|
| 0 | -2.182809 | -0.029668 | -4.177883 | 410.436506 | 0.013901 | -0.054928 | 100030.436265 |
| 1 | -5.697199 | -1.168055 | 2.704167 | 410.437352 | -0.030757 | 0.021365 | 100020.971632 |
| 2 | -2.388080 | 0.489320 | -4.422072 | 410.443844 | 0.028441 | -0.042093 | 100021.156537 |
| 3 | -6.859628 | 0.177311 | -1.745587 | 410.430337 | 0.008494 | -0.011427 | 100018.526121 |
| 4 | 2.668538 | -0.702350 | 3.033114 | 417.831936 | -0.717926 | 0.116151 | 97710.352064 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 8527 | -4.240097 | -0.190458 | -0.233550 | 406.192347 | 19.419476 | -7.903454 | 102458.496025 |
| 8528 | -2.490715 | -0.443627 | -3.005777 | 410.190644 | -0.012390 | -0.552977 | 100225.470127 |
| 8529 | 1.819966 | -0.805841 | -4.031370 | 409.337590 | -0.234673 | -1.674802 | 100720.371138 |
| 8530 | -1.464362 | -1.064565 | 3.252417 | 410.305496 | -0.110896 | 0.647835 | 100233.838109 |


In [14]:
test_csv

| | x | y | z |
|---|---|---|---|
| 0 | 12.595389 | -0.676479 | 9.448017 |
| 1 | -5.115985 | 0.332546 | -1.952567 |
| 2 | -2.182809 | -0.133158 | 4.463463 |
| 3 | 2.233563 | -0.857583 | -3.054615 |
| 4 | -0.847552 | -0.185422 | -1.177141 |
| ... | ... | ... | ... |
| 3651 | -2.695986 | -0.598862 | -3.103453 |
| 3652 | 12.134517 | -0.883459 | -7.321329 |
| 3653 | -3.927776 | -1.165944 | 0.220514 |
| 3654 | 1.306790 | -0.909331 | 3.745842 |


In [15]:
feat_cols = ['x', 'y', 'z']
target_cols = ['u_x', 'u_y', 'u_z', 'p']

n_feats = len(feat_cols)
n_targets = len(target_cols)

In [16]:
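def generate_train_data():
    # Standardize the targets with statistics of the full training table
    # (pandas' std() is the sample standard deviation, ddof=1).
    mean, std = train_csv[target_cols].mean(), train_csv[target_cols].std()
    for _, data in train_csv.iterrows():
        yield (data[feat_cols], (data[target_cols] - mean) / std)


def prep_train_data():
    # Each element is one (features, scaled targets) pair of 1-D float32 tensors.
    return tf.data.Dataset.from_generator(
        generate_train_data,
        output_signature=(
            tf.TensorSpec(shape=(n_feats,), dtype=tf.float32),
            tf.TensorSpec(shape=(n_targets,), dtype=tf.float32)
        )
    )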
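
The generator scales every target column to zero mean and unit variance, and the same statistics are needed later to map predictions back to physical units. The round trip can be checked in isolation with NumPy. This is a sketch on synthetic data: the means and scales below are made up and are not taken from `train.csv`.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for train_csv[target_cols]: four targets on very different scales.
targets = rng.normal(loc=[410.0, 0.0, 0.0, 100_000.0],
                     scale=[2.0, 1.0, 1.0, 800.0],
                     size=(1000, 4))

# pandas' DataFrame.std() uses ddof=1 by default, so the same is used here.
mean = targets.mean(axis=0)
std = targets.std(axis=0, ddof=1)

scaled = (targets - mean) / std   # what the generator yields as labels
restored = scaled * std + mean    # what is applied to the predictions later

print(scaled.mean(axis=0).round(6), scaled.std(axis=0, ddof=1).round(6))
print(np.allclose(restored, targets))
```

Without this scaling the pressure column, which is several orders of magnitude larger than the velocity components, would dominate a loss averaged over all four outputs.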

In [17]:
def prep_linear(inputs, outputs, layer_sizes, use_dropout=False, use_regularizers=False):
    """Plain MLP: Dense + PReLU blocks with optional L2 penalty and dropout."""
    model = Sequential()
    model.add(layers.InputLayer(input_shape=(inputs,)))

    for layer_size in layer_sizes:
        if use_regularizers:
            model.add(layers.Dense(layer_size, kernel_regularizer=regularizers.L2(1e-4)))
        else:
            model.add(layers.Dense(layer_size))
        model.add(layers.PReLU()) # PReLU > LeakyReLU
        if use_dropout:
            model.add(layers.Dropout(.25))

    model.add(layers.Dense(outputs))
    return model


def prep_unet(inputs, outputs, layer_sizes):
    """U-Net-style MLP: a Dense stack followed by a mirrored stack with skip concatenations."""
    inputs = tf.keras.Input((inputs,))
    x = inputs

    skips = []

    # Encoder: keep every activation for the skip connections.
    for layer_size in layer_sizes:
        x = layers.Dense(layer_size)(x)
        x = layers.PReLU()(x) # PReLU > LeakyReLU
        skips.append(x)

    # Decoder: revisit all but the last encoder width in reverse order.
    for layer_size, skip in zip(layer_sizes[-2::-1], skips[-2::-1]):
        x = layers.Dense(layer_size)(x)
        x = layers.PReLU()(x) # PReLU > LeakyReLU
        x = tf.keras.layers.concatenate([x, skip])

    model = tf.keras.Model(inputs, layers.Dense(outputs)(x))
    return model

In [18]:
def build_regression(model):
    model.compile(
        optimizer='adam', 
        loss=losses.MeanAbsoluteError(),
        metrics=[metrics.RootMeanSquaredError(name='rmse')]
    )


def prep_simple():
    dense_model = prep_linear(n_feats, n_targets, [32 * n_feats, 64 * n_feats, 16 * n_feats], use_dropout=True)
    build_regression(dense_model)
    return dense_model
    

def prep_parallel():
    inputs = tf.keras.Input((n_feats,))
    
    speed_model = prep_linear(n_feats, n_targets - 1, [32 * n_feats, 64 * n_feats, 16 * n_feats], use_dropout=True, use_regularizers=True)
    pressure_model = prep_linear(n_feats, 1, [16 * n_feats, 32 * n_feats, 8 * n_feats], use_dropout=True, use_regularizers=False)

    outputs = layers.concatenate([speed_model(inputs), pressure_model(inputs)])
    parallel_model = tf.keras.Model(inputs, outputs)

    build_regression(parallel_model)
    
    return parallel_model


def prep_parallel_unet():
    inputs = tf.keras.Input((n_feats,))
    
    speed_model = prep_unet(n_feats, n_targets - 1, [8 * n_feats, 16 * n_feats, 32 * n_feats])
    pressure_model = prep_unet(n_feats, 1, [4 * n_feats, 8 * n_feats, 16 * n_feats])

    outputs = layers.concatenate([speed_model(inputs), pressure_model(inputs)])
    parallel_model = tf.keras.Model(inputs, outputs)

    build_regression(parallel_model)
    
    return parallel_model

In [19]:
model = prep_parallel_unet()
plot_model(model, '12.png', show_shapes=True)
model.summary()

You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
Model: "model_5"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
 input_4 (InputLayer)        [(None, 3)]                  0         []                            
                                                                                                  
 model_3 (Functional)        (None, 3)                    13371     ['input_4[0][0]']             
                                                                                                  
 model_4 (Functional)        (None, 1)                    3469      ['input_4[0][0]']             
                                                                                                  
 concatenate_9 (Concatenate  (None, 4)    
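
The two parameter counts in the summary (13371 and 3469) can be cross-checked by hand from the layer widths used in `prep_unet`. The helper below is a sketch in plain Python. It assumes the default `PReLU` configuration, which learns one slope per unit, and the function names are hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def unet_params(n_inputs, n_outputs, layer_sizes):
    """Count trainable parameters of the dense skip-connection model in prep_unet."""
    total, width, skips = 0, n_inputs, []
    # Encoder: Dense + PReLU (one learnable slope per unit); remember each width.
    for size in layer_sizes:
        total += dense_params(width, size) + size
        width = size
        skips.append(size)
    # Decoder: walk back through all but the last encoder layer and
    # concatenate the matching skip, which widens the next layer's input.
    for size, skip in zip(layer_sizes[-2::-1], skips[-2::-1]):
        total += dense_params(width, size) + size
        width = size + skip
    # Final linear head.
    return total + dense_params(width, n_outputs)


n_feats = 3
print(unet_params(n_feats, 3, [8 * n_feats, 16 * n_feats, 32 * n_feats]))  # 13371
print(unet_params(n_feats, 1, [4 * n_feats, 8 * n_feats, 16 * n_feats]))   # 3469
```

Both values agree with the `model_3` and `model_4` rows of the summary, which confirms that the velocity branch is roughly four times larger than the pressure branch.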

In [20]:
BATCH_SIZE = 64
BUFFER_SIZE = 2000
VAL_SIZE = len(train_csv.index) // 7

ds = prep_train_data()

train_ds = ds.skip(VAL_SIZE).shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
val_ds = ds.take(VAL_SIZE).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
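
Because `take` and `skip` are applied before any shuffling, the validation set is simply the first `VAL_SIZE` rows of `train.csv` and the rest is used for training. The sizes involved can be worked out with a few lines of arithmetic. This sketch assumes the table has 8531 rows, which is inferred from the last index (8530) shown in the preview.

```python
import math

n_rows = 8531           # assumption: the index 0..8530 shown in the preview is contiguous
batch_size = 64
val_size = n_rows // 7  # same rule as VAL_SIZE

train_size = n_rows - val_size
print(val_size, train_size)
print(math.ceil(train_size / batch_size), math.ceil(val_size / batch_size))
```

A dataset built with `from_generator` has no known length, which is why Keras reports the step count as `Unknown` during the first epoch.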

In [21]:
training_folder = os.path.join('dnn', '12')
tensorboard_path = os.path.join(training_folder, 'logs_cpu')
model_path = os.path.join(training_folder, 'best_cpu')

model_callbacks = [
    EarlyStopping(patience=20),
    ModelCheckpoint(model_path, monitor='val_loss', save_best_only=True, mode='min')
]

In [22]:
model.fit(train_ds, validation_data=val_ds, epochs=200, callbacks=model_callbacks)

Epoch 1/200
     83/Unknown - 8s 48ms/step - loss: 0.4859 - rmse: 1.0628
INFO:tensorflow:Assets written to: dnn\12\best_cpu\assets
Epoch 2/200
INFO:tensorflow:Assets written to: dnn\12\best_cpu\assets
...
Epoch 159/200
INFO:tensorflow:Assets written to: dnn\12\best_cpu\assets
Epoch 160/200
...
Epoch 179/200


<keras.src.callbacks.History at 0x23356fb7410>
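
In the log, the last checkpoint is written after epoch 159 and training ends at epoch 179, which is what `EarlyStopping(patience=20)` would be expected to produce. The patience rule can be illustrated with a small simulation. This is a simplified model of the callback written for this note, not Keras's implementation, and the loss curve is hypothetical.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops, or None if it never does.

    Mirrors the usual patience rule: stop once `patience` consecutive
    epochs have passed without a new best validation loss.
    """
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical loss curve: improves through epoch 159, then plateaus.
losses = [1.0 / e for e in range(1, 160)] + [1.0] * 41
print(stopping_epoch(losses, patience=20))
```

Since `ModelCheckpoint` is configured with `save_best_only=True`, the weights on disk correspond to the best epoch rather than to the final one, which is why the model is reloaded from `model_path` afterwards.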

In [23]:
model = tf.keras.models.load_model(model_path)







In [24]:
train_metrics = model.evaluate(ds.batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE))
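
The loss reported by `model.evaluate` is computed on standardized targets and averaged over all four outputs, whereas the task thresholds are stated per target in original units. To compare against them, the predictions have to be rescaled first. The sketch below uses NumPy with made-up statistics; `per_target_mae` is a hypothetical helper, not part of the notebook.

```python
import numpy as np


def per_target_mae(y_true, y_pred_scaled, mean, std):
    """MAE per column after undoing the standardization of the predictions."""
    y_pred = y_pred_scaled * std + mean
    return np.abs(y_true - y_pred).mean(axis=0)


# Synthetic illustration: hypothetical statistics, not values from train.csv.
mean = np.array([410.0, 0.0, 0.0, 100_000.0])
std = np.array([2.0, 1.0, 1.0, 800.0])
y_true = np.tile(mean, (5, 1))
# Predictions that are off by exactly 0.1 standard deviations in every column.
y_pred_scaled = np.full((5, 4), 0.1)

print(per_target_mae(y_true, y_pred_scaled, mean, std))
```

A scaled error of 0.1 on a column therefore corresponds to `0.1 * std` in physical units, so the same scaled loss can mean very different absolute errors for velocity and for pressure.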



In [25]:
def generate_test_data():
    for (i, data) in test_csv.iterrows():
        yield data[feat_cols]


def prep_test_data():
    return tf.data.Dataset.from_generator(
        generate_test_data, 
        output_signature=tf.TensorSpec(shape=(n_feats,), dtype=tf.float32)
    )

In [26]:
test_ds = prep_test_data().batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
test_outputs = model.predict(test_ds)

mean, std = train_csv[target_cols].mean(), train_csv[target_cols].std()

submission_df = pd.DataFrame(test_outputs, columns=target_cols)
submission_df = submission_df * std + mean
submission_df



| | u_x | u_y | u_z | p |
|---|---|---|---|---|
| 0 | 410.491489 | 0.032219 | 0.335554 | 100378.855488 |
| 1 | 410.330740 | 0.101628 | -0.070368 | 100008.853789 |
| 2 | 410.667191 | -0.034974 | 0.044498 | 100018.734863 |
| 3 | 412.305513 | -4.186017 | -8.036711 | 100979.736268 |
| 4 | 410.169659 | 2.644421 | -5.074940 | 100893.132631 |
| ... | ... | ... | ... | ... |
| 3651 | 410.344329 | 0.029326 | -0.171527 | 100058.659013 |
| 3652 | 410.377901 | 0.205342 | 0.469247 | 100054.545156 |
| 3653 | 405.690230 | -10.395499 | 1.949544 | 102946.001675 |
| 3654 | 409.296526 | 0.072955 | 1.758138 | 100174.975668 |


In [27]:
submission_df.to_csv(submission_path, index=False)
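
Before uploading, it can be worth checking that the file matches the layout described in the task: a header row followed by one row of four comma-separated numbers per test point. The following sketch uses only the standard library. `check_submission` is a hypothetical helper and the sample rows are illustrative.

```python
import csv
import io


def check_submission(text, n_test_rows, n_targets=4):
    """Raise ValueError if the CSV text does not match the expected layout."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    if len(header) != n_targets:
        raise ValueError(f"expected {n_targets} columns, got {len(header)}")
    if len(body) != n_test_rows:
        raise ValueError(f"expected {n_test_rows} rows, got {len(body)}")
    for k, row in enumerate(body):
        if len(row) != n_targets:
            raise ValueError(f"row {k}: expected {n_targets} values")
        [float(v) for v in row]  # raises ValueError on non-numeric cells
    return True


sample = "u_x,u_y,u_z,p\n410.49,0.03,0.34,100378.86\n410.33,0.10,-0.07,100008.85\n"
print(check_submission(sample, n_test_rows=2))
```

For the real file, the text would be read from `submission_path` and `n_test_rows` set to `len(test_csv)`.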

# Result

* The model reached the required result in **~50** epochs (MAE **0.13**)
* After **~160** epochs the model reached an MAE of **0.08**