# Infer 1DCNN


The training code is [here](https://www.kaggle.com/takamichitoda/ump-train-1dcnn-on-tpu), and the fitted standard scaler is [here](https://www.kaggle.com/takamichitoda/ump-npy-dataset).


`update`

- Version 7: baseline, CV=0.9105 / LB=0.135
- Version 8: add dropout, CV=0.9101 / LB=0.132
- Version 9: dropout ratio 0.2 -> 0.1, CV=0.9135 / LB=0.117
- Version 10: dropout ratio 0.1 -> 0.4, CV=0.9142 / LB=0.124
- Version 11: [MC dropout](https://arxiv.org/pdf/1506.02142.pdf), CV=0.9142 / LB=0.125
- Version 12: remove dropout & [add lag feature](https://www.kaggle.com/takamichitoda/ump-lag-freatures), CV=0.9046
- Version 13: remove lag feature & use small batch, StratifiedKFold, ReduceLROnPlateau
- Version 14: batch=4096, use correlationLoss
- Version 15: use MSE loss, small model(param 1/4)
- Version 16: TimeSeriesSplit
- Version 17: MC Dropout(0.75), large model
- Version 19: MC Dropout(0.75), small model, correlationLoss
- Version 20: use MC Dropout
- Version 21: skip connect model
- Version 22: skip connect model, only fold-4
- Version 23: skip connect model, only fold-3
- Version 24: skip connect model, only fold-2
- Version 25: skip connect model, only fold-1
- Version 26: skip connect model, only fold-0
- Version 27: weight average
- Version 28: early stopping correlationLoss
- Version 29: weight average: https://www.kaggle.com/c/ubiquant-market-prediction/discussion/303916

In [None]:
import gc
import pickle
import numpy as np

from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

import ubiquant

device_name = tf.test.gpu_device_name()
if "GPU" not in device_name:
    print("GPU device not found")
else:
    print('Found GPU at: {}'.format(device_name))

In [None]:
class GCF:
    MODEL_ROOT = "/kaggle/input/k/takamichitoda/ump-train-1dcnn-on-tpu"
    SCALER_PATH = "/kaggle/input/ump-npy-dataset/std_scaler.pkl"
    
    N_FOLDS = 5
    FEAT_COLS = [f"f_{i}" for i in range(300)]

In [None]:
models = []
for fold in range(GCF.N_FOLDS):
    model = tf.keras.models.load_model(f"{GCF.MODEL_ROOT}/ump_1dcnn_f{fold}.h5", compile=False)
    models.append(model)
models[0].summary()

In [None]:
# Load the fitted scaler; the context manager closes the file handle.
with open(GCF.SCALER_PATH, "rb") as f:
    scaler = pickle.load(f)
scaler

In [None]:
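def get_weighted(n):
    """Return n ensemble weights that grow with the fold index.

    The first two folds share the smallest weight, and each later
    fold gets twice the weight of the one before it.
    """
    w = []
    for j in range(1, n + 1):
        j = 2 if j == 1 else j  # fold 1 reuses the weight of fold 2
        w.append(1 / (2**(n + 1 - j)))
    return w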
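
To make the weighting concrete, here is a minimal sketch of what `get_weighted(5)` returns and how `np.average` combines per-fold predictions with it. The function is copied so the snippet runs on its own. The `preds` arrays are toy stand-ins (an assumption for illustration), not real model outputs.

```python
import numpy as np


def get_weighted(n):
    # Copy of the function above, repeated so this snippet is self-contained.
    w = []
    for j in range(1, n + 1):
        j = 2 if j == 1 else j
        w.append(1 / (2 ** (n + 1 - j)))
    return w


weights = get_weighted(5)
print(weights)       # [0.0625, 0.0625, 0.125, 0.25, 0.5]
print(sum(weights))  # 1.0

# Toy stand-in for model.predict output: 5 "models", each returning shape (3, 1).
# Model k predicts the constant k for every row.
preds = [np.full((3, 1), float(k)) for k in range(5)]

stacked = np.hstack(preds)  # shape (3, 5): one column per model
pred_avg = np.average(stacked, weights=weights, axis=1)
print(pred_avg)  # every row equals 3.0625
```

With five folds the weights sum to 1, and the last fold alone contributes half of the final prediction.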

In [None]:
%%time
env = ubiquant.make_env()   # initialize the environment
iter_test = env.iter_test()    # an iterator which loops over the test set and sample submission
for (test_df, sample_prediction_df) in iter_test:
    x = scaler.transform(test_df[GCF.FEAT_COLS].values)
    
    preds = []
    for model in models:
        with tf.device('/GPU:0'):
            pred = model.predict(x)
        preds.append(pred)
    #pred_avg = np.hstack(preds).mean(1)
    #pred_avg = np.average(np.hstack(preds), weights=[1,1,1,3,4], axis=1)
    # One weight per loaded model, so the call stays valid if N_FOLDS changes.
    pred_avg = np.average(np.hstack(preds), weights=get_weighted(len(models)), axis=1)
    
    sample_prediction_df['target'] = pred_avg  # make your predictions here
    env.predict(sample_prediction_df)   # register your predictions

In [None]:
sample_prediction_df