After concluding that changing hyper-parameters is not having the expected impact on performance, and after seeing that neural networks are predicting either negative values or zero, a different approach must be tried. 

An initial hypothesis is that one of the input variables being used is ruining the training of the model **somehow**. In this notebook **one single model** will be tried out:

* MAPE + Leaky ReLU + 20% dropout + linear output

Although the linear output was giving negative values in the past, it might be better to keep it for this run to prevent having dead neurons. With dead neurons most of the models could end up predicting zero which minimizes the error but gives absolutely no information. By having a linear output, models can go crazy and errors could go through the roof. But only **ONE** model needs to perform well.  

One input variable will be dropped for each model, producing 18 different data-sets and thus 18 different models will be trained and evaluated using MSPE to keep things constant. 

For the sake of continuity, only **HC** will be predicted in this run. If something promising is found, other pollutants will be attempted.

In [1]:
from keras.models import Sequential, load_model, Model
from keras.layers import Input, Dense, Dropout, advanced_activations, BatchNormalization, LeakyReLU, PReLU
from keras import losses, optimizers, activations
import keras.backend as K

import h5py

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

from sklearn.externals import joblib
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

import time
import datetime
import os

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


In [2]:
output_path = os.path.join('.','output')

## Load Data

In [3]:
complete_data_scaled_shuffled = pd.read_csv('Dataset_Scaled_Shuffled.csv')
print('Shuffled dataset loaded.')

Shuffled dataset loaded.


## Prepare Data

In [4]:
# Get number of data points
data_points = complete_data_scaled_shuffled.shape[0]

# Set sizes for train, dev, test sets
train_percent = 0.8
train_size = round(train_percent*data_points)

if (data_points-train_size)%2 == 0:
    dev_size = int((data_points-train_size)/2)
    test_size = dev_size
    print('Train Size = {}'.format(train_size))
    print('Dev Size = {}'.format(dev_size))
    print('Test Size = {}'.format(test_size))
    print('Remainder = {}'.format(train_size+dev_size+test_size-data_points))
    
else:
    train_size = train_size-1
    dev_size = int((data_points-train_size)/2)
    test_size = dev_size 
    print('Train Size = {}'.format(train_size))
    print('Dev Size = {}'.format(dev_size))
    print('Test Size = {}'.format(test_size))
    print('Remainder = {}'.format(train_size+dev_size+test_size-data_points))

Train Size = 62511
Dev Size = 7814
Test Size = 7814
Remainder = 0


## Prepare Datasets

Create functions to modify the input variables and thus create different input sets

In [5]:
# Save the names of the columns in a list that can be looped
input_names = complete_data_scaled_shuffled.columns[:-4]

In [6]:
for variable in input_names:
    print('{}'.format(variable))

Year
Vehicle_Code
Manufacturer_Code
Displacement
Fuel_System
Gears
Transmission_Code
ETW
HP
Drive_System_Code
Fuel_Code
V_avg
V_max
V_std
a_pos
a_neg
Peak_pos
Peak_neg


In [1]:
def prepare_data_sets(data, drop_variable):
    
    # Create a local copy of the entire dataset
    data_scaled_shuffled = data.copy()
    
    # Drop the variable that will be ignored during the run
    data_scaled_shuffled.drop(columns=drop_variable, inplace=True)
    print('{} Column Dropped'.format(drop_variable))
    
    print('Preparing Data-sets')
    # Divide data into train, dev, and test sets
    train_set = data_scaled_shuffled[ : train_size]
    dev_set = data_scaled_shuffled[train_size : train_size+dev_size]
    test_set = data_scaled_shuffled[train_size+dev_size : train_size+dev_size+test_size]

    # Reset index for all sets
    train_set = train_set.reset_index(drop=True)
    dev_set = dev_set.reset_index(drop=True)
    test_set = test_set.reset_index(drop=True)

    # Get values
    train_set_values = train_set.values
    dev_set_values = dev_set.values
    test_set_values = test_set.values
    
    # Number of emissions: HC, CO, CO2, NOX
    n_out = 4
    
    print('Splitting into inputs and outputs')
    # SLICING: [start row:end row , start column:end column]
    # Split into inputs and outputs
    x_train = train_set_values[:,:-n_out]
    x_dev = dev_set_values[:,:-n_out]
    x_test = test_set_values[:,:-n_out]
    
    print('Inputs = {}'.format(x_train.shape[1]))
    
    # Get the outputs (only HC)
    HC_train = train_set_values[:,-n_out]
    HC_dev = dev_set_values[:,-n_out]
    HC_test = test_set_values[:,-n_out]
    
    print('Data-sets complete')
    print('----------------------------------')
    
    return x_train, x_dev, x_test, HC_train, HC_dev, HC_test

## Inverse Scaling of Data

* This will be used later in the code to evaluate models

#### Import scalers

In [8]:
# Create an empty list to put all the scalers
scalers = []

for i in range(np.size(complete_data_scaled_shuffled.columns)):
    
    scaler_filename = "Scalers/scaler{}.save".format(i)
    scaler = joblib.load(scaler_filename)
    
    scalers.append(scaler)

#### Inverse Scale Data

In [9]:
# First, inverse transform all original values from the test_set
test_set_scaled = complete_data_scaled_shuffled[train_size+dev_size : train_size+dev_size+test_size]
test_set_inverse = test_set_scaled.copy()

for i in range(np.size(complete_data_scaled_shuffled.columns)):
    
    col_name = complete_data_scaled_shuffled.columns[i]
    
    values = test_set_inverse[col_name].values
    values = values.astype('float64')
    values = values.reshape(values.shape[0],1)
    
    test_set_inverse[col_name] = scalers[i].inverse_transform(values)
    
    print('Success with feature: {}'.format(col_name))

Success with feature: Year
Success with feature: Vehicle_Code
Success with feature: Manufacturer_Code
Success with feature: Displacement
Success with feature: Fuel_System
Success with feature: Gears
Success with feature: Transmission_Code
Success with feature: ETW
Success with feature: HP
Success with feature: Drive_System_Code
Success with feature: Fuel_Code
Success with feature: V_avg
Success with feature: V_max
Success with feature: V_std
Success with feature: a_pos
Success with feature: a_neg
Success with feature: Peak_pos
Success with feature: Peak_neg
Success with feature: HC
Success with feature: CO
Success with feature: CO2
Success with feature: Nox


-----------------
## Models

#### Basics

In [10]:
# Mini-batch size, epochs
batch_size = 64
epochs = 300
dd = 0.2

#### Build Model

In [11]:
def build_model(number, x_train):
    
    # Create model
    model = Sequential(name='Model_{}'.format(number))

    model.add(Dense(256, input_dim=x_train.shape[1]))
    model.add(advanced_activations.LeakyReLU())
    model.add(Dropout(dd))
    model.add(BatchNormalization())

    model.add(Dense(128))
    model.add(advanced_activations.LeakyReLU())
    model.add(Dropout(dd))
    model.add(BatchNormalization())

    model.add(Dense(64))
    model.add(advanced_activations.LeakyReLU())
    model.add(Dropout(dd))
    model.add(BatchNormalization())

    model.add(Dense(32))
    model.add(advanced_activations.LeakyReLU())
    model.add(Dropout(dd))
    model.add(BatchNormalization())

    model.add(Dense(16))
    model.add(advanced_activations.LeakyReLU())
    model.add(Dropout(dd))
    model.add(BatchNormalization())

    model.add(Dense(1))

    #Compile model
    model.compile(loss=losses.mean_absolute_percentage_error, optimizer=optimizers.Adam(), metrics = ['accuracy'])
    
    print('{} Created'.format(model.name))
    print('----------------------------------')
    
    return model

#### Train Model

In [12]:
def train_models(model, x_train, y_train, x_dev, y_dev):
    
    print('{} - Training'.format(model.name))
    print('- Started on {} at {}'.format(str(datetime.datetime.now())[5:-16], str(datetime.datetime.now())[11:-10]))
    # Start timer
    start_time = time.time()

    # fit network
    history = model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size, 
                        validation_data=(x_dev, y_dev), verbose=0, shuffle=True)

    # End timer
    end_time = time.time() - start_time
    print('{} - Training Complete'.format(model.name))
    print('- Time: {:.3f} min'.format(end_time/60))
    print('- Loss = {:.5f}'.format(history.history['loss'][-1]))
    print('- Val Loss = {:.5f}'.format(history.history['val_loss'][-1]))
    print('----------------------------------')
        
    return history

#### Make Predictions and Calculate Error

In [13]:
# Function to define MSPE
def msp_error(true,pred):
    error = 100*np.sum(((true-pred)/true)**2)/np.size(true)
    return error

In [14]:
def predict_get_error(model, x_test):
    
    print('Predicting with {}'.format(model.name))
    scaled_predictions = model.predict(x_test)
    
    print('Inverse Scaling Operation') 
     
    # Inverse the scaling operation on the predictions
    predictions = scalers[-4].inverse_transform(scaled_predictions)
    
    print('- Prediction Mean = {:.5f}'.format(np.mean(predictions)))
    print('- Prediction Min = {:.5f}'.format(np.min(predictions)))
    print('- Prediction Max = {:.5f}'.format(np.max(predictions)))

    print('Calculating HC Error')
    mspe = msp_error(test_set_inverse['HC'].values, predictions)
        
    print('- HC Error  = {:.2e}'.format(mspe))
    print('----------------------------------')
    
    return mspe

#### Process Models and Rank with MSPE

In [15]:
def process_models():
    
    count = 1
    model_list = []
    history_list = []
    HC_error_list = []

    for variable in input_names:

        # Print model variables
        print('Model_{} Variables:'.format(count))
        print('- Loss: MAPE')
        print('- Activation: Leaky ReLU')
        print('- Optimizer: Adam')
        print('- Dropout: 20%')
        print('- Input Variable Dropped: {}'.format(variable))
        print('----------------------------------')

        # Get the dataset WITHOUT the input variable
        x_train, x_dev, x_test, y_train, y_dev, y_test = prepare_data_sets(complete_data_scaled_shuffled, variable)

        # Create model
        model = build_model(count, x_train)

        # Train model
        history = train_models(model, x_train, y_train, x_dev, y_dev)
        history_list.append(history)

        # Make predictions and calculate error
        error = predict_get_error(model, x_test)

        # Add error to error list
        HC_error_list.append([model.name, variable, error])

        # Announce one model process ended
        print('============== MODEL {} PROCESS END =============='.format(count))
        print(' ')

        # Increase counter by 1
        count = count+1

        # Add TRAINED model to list
        model_list.append(model)

    print('Creating DataFrame')                
    HC_error = pd.DataFrame(HC_error_list)

    print('Changing DataFrame column names')
    HC_error.columns = ['Model', 'Dropped Variable', 'MSPE']

    print('Ranking Models')
    HC_error.sort_values(by=['MSPE'], inplace=True)

    count = 0
    
    return HC_error, model_list, history_list

In [26]:
HC_ranking, models, histories = process_models()

Model_1 Variables:
- Loss: MAPE
- Activation: Leaky ReLU
- Optimizer: Adam
- Dropout: 20%
- Input Variable Dropped: Year
----------------------------------
Year Column Dropped
Preparing Data-sets
Splitting into inputs and outputs
Inputs = 17
Data-sets complete
----------------------------------
Model_1 Created
----------------------------------
Model_1 - Training
- Started on 03-27 at 13:37
Model_1 - Training Complete
- Time: 42.687 min
- Loss = 671.63486
- Val Loss = 208.74305
----------------------------------
Predicting with Model_1
Inverse Scaling Operation
- Prediction Mean = -0.00424
- Prediction Min = -0.00465
- Prediction Max = 0.03096
Calculating HC Error
- HC Error  = 7.40e+08
----------------------------------
 
Model_2 Variables:
- Loss: MAPE
- Activation: Leaky ReLU
- Optimizer: Adam
- Dropout: 20%
- Input Variable Dropped: Vehicle_Code
----------------------------------
Vehicle_Code Column Dropped
Preparing Data-sets
Splitting into inputs and outputs
Inputs = 17
Data-sets

Model_11 Created
----------------------------------
Model_11 - Training
- Started on 03-27 at 21:02
Model_11 - Training Complete
- Time: 45.820 min
- Loss = 602.58239
- Val Loss = 101.93962
----------------------------------
Predicting with Model_11
Inverse Scaling Operation
- Prediction Mean = 0.00114
- Prediction Min = 0.00095
- Prediction Max = 0.00117
Calculating HC Error
- HC Error  = 4.32e+07
----------------------------------
 
Model_12 Variables:
- Loss: MAPE
- Activation: Leaky ReLU
- Optimizer: Adam
- Dropout: 20%
- Input Variable Dropped: V_avg
----------------------------------
V_avg Column Dropped
Preparing Data-sets
Splitting into inputs and outputs
Inputs = 17
Data-sets complete
----------------------------------
Model_12 Created
----------------------------------
Model_12 - Training
- Started on 03-27 at 21:48
Model_12 - Training Complete
- Time: 46.127 min
- Loss = 662.58637
- Val Loss = 146.24322
----------------------------------
Predicting with Model_12
Inverse Scal

In [27]:
HC_ranking

Unnamed: 0,Model,Dropped Variable,MSPE
1,Model_2,Vehicle_Code,33937240.0
10,Model_11,Fuel_Code,43244850.0
3,Model_4,Displacement,78609950.0
2,Model_3,Manufacturer_Code,163639600.0
13,Model_14,V_std,178632100.0
12,Model_13,V_max,228125700.0
15,Model_16,a_neg,403547000.0
17,Model_18,Peak_neg,407432300.0
9,Model_10,Drive_System_Code,512892100.0
16,Model_17,Peak_pos,559728500.0


In [None]:
epoch_vector=np.linspace(1,epochs,epochs)

for i in range(len(models)):
    model = models[i]
    variable = input_names[i]
    history = histories[i]
    
    model.save(os.path.join(output_path,'{}_{}'.format(model.name,variable)))
    
    hist_data =[epoch_vector,history.history['loss'],history.history['val_loss']]
    hist_data =pd.DataFrame(hist_data).transpose()
    hist_data.columns=['Epochs','loss','val_loss']
    
    hist_data.to_csv(os.path.join(output_path,'Training_History_{}.csv'.format(model.name)),index=False)

## Load Models and Histories

There was a computer error which required a restart. The models and histories were exported before this.

In [30]:
models = []
histories = []
print(len(models))
for i in range(len(input_names)):
    
    count = i + 1
    
    variable = input_names[i]
    
    print('Model_{}_{}'.format(count,variable))
    model = load_model(os.path.join(output_path,'Model_{}_{}.h5'.format(count,variable)))
    
    history = pd.read_csv(os.path.join(output_path,'Training_History_Model_{}.csv'.format(count)))
    
    models.append(model)
    histories.append(history)
print(len(models))

0
Model_1_Year
Model_2_Vehicle_Code
Model_3_Manufacturer_Code
Model_4_Displacement
Model_5_Fuel_System
Model_6_Gears
Model_7_Transmission_Code
Model_8_ETW
Model_9_HP
Model_10_Drive_System_Code
Model_11_Fuel_Code
Model_12_V_avg
Model_13_V_max
Model_14_V_std
Model_15_a_pos
Model_16_a_neg
Model_17_Peak_pos
Model_18_Peak_neg
18


Make predictions with the loaded models and rank them by MSPE. Include the mean, max, and min values for the predictions to see what models are predicting negative values.

In [33]:
def make_predictions(model_list):
    
    count = 1
    HC_error_list = []

    for i in range(len(model_list)-1):
        
        variable = input_names[i]
        model = model_list[i]

        # Print model variables
        print('{} Variables:'.format(model.name))
        print('- Input Variable Dropped: {}'.format(variable))
        print('----------------------------------')

        # Get the dataset WITHOUT the input variable
        x_train, x_dev, x_test, y_train, y_dev, y_test = prepare_data_sets(complete_data_scaled_shuffled, variable)

        # Make predictions and calculate error
        error = predict_get_error(model, x_test)
        
        # Make predictions and save them
        predict_mean = np.mean(model.predict(x_test))
        predict_max = np.max(model.predict(x_test))
        predict_min = np.min(model.predict(x_test))

        # Add error to error list
        HC_error_list.append([model.name, variable, predict_mean, predict_max, predict_min, error])

        # Announce one model process ended
        print('============== MODEL {} PROCESS END =============='.format(count))
        print(' ')

        # Increase counter by 1
        count = count+1

    print('Creating DataFrame')                
    HC_error = pd.DataFrame(HC_error_list)

    print('Changing DataFrame column names')
    HC_error.columns = ['Model', 'Dropped Variable', 'Mean', 'Max', 'Min', 'MSPE']

    print('Ranking Models')
    HC_error.sort_values(by=['MSPE'], inplace=True)

    count = 0
    
    return HC_error

In [34]:
ranking = make_predictions(models)

Model_1 Variables:
- Input Variable Dropped: Year
----------------------------------
Year Column Dropped
Preparing Data-sets
Splitting into inputs and outputs
Inputs = 17
Data-sets complete
----------------------------------
Predicting with Model_1
Inverse Scaling Operation
- Prediction Mean = -0.00424
- Prediction Min = -0.00465
- Prediction Max = 0.03096
Calculating HC Error
- HC Error  = 7.40e+08
----------------------------------
 
Model_2 Variables:
- Input Variable Dropped: Vehicle_Code
----------------------------------
Vehicle_Code Column Dropped
Preparing Data-sets
Splitting into inputs and outputs
Inputs = 17
Data-sets complete
----------------------------------
Predicting with Model_2
Inverse Scaling Operation
- Prediction Mean = 0.00038
- Prediction Min = -0.01551
- Prediction Max = 0.00061
Calculating HC Error
- HC Error  = 3.39e+07
----------------------------------
 
Model_3 Variables:
- Input Variable Dropped: Manufacturer_Code
----------------------------------
Manufac

Inverse Scaling Operation
- Prediction Mean = 0.00412
- Prediction Min = 0.00247
- Prediction Max = 0.00423
Calculating HC Error
- HC Error  = 5.60e+08
----------------------------------
 
Creating DataFrame
Changing DataFrame column names
Ranking Models


In [35]:
ranking

Unnamed: 0,Model,Dropped Variable,Mean,Max,Min,MSPE
1,Model_2,Vehicle_Code,0.000508,0.000808,-0.020723,33937240.0
10,Model_11,Fuel_Code,0.001521,0.001555,0.001261,43244850.0
3,Model_4,Displacement,0.000915,0.001576,-0.02503,78609950.0
2,Model_3,Manufacturer_Code,-0.002952,-0.0028,-0.006031,163639600.0
13,Model_14,V_std,0.00293,0.003148,-0.016351,178632100.0
12,Model_13,V_max,-0.003395,-0.003077,-0.011228,228125700.0
15,Model_16,a_neg,-0.004658,-0.004614,-0.005854,403547000.0
9,Model_10,Drive_System_Code,0.004786,0.027829,0.003819,512892100.0
16,Model_17,Peak_pos,0.005498,0.005642,0.003292,559728500.0
4,Model_5,Fuel_System,0.002026,0.031656,-0.022835,560910100.0


Same function as above, but this one only saves the entries that predict POSITIVE values. 

In [40]:
def filter_negatives(model_list):
    
    count = 1
    HC_error_list = []

    for i in range(len(model_list)-1):
        
        variable = input_names[i]
        model = model_list[i]

        # Print model variables
        print('{} Variables:'.format(model.name))
        print('- Input Variable Dropped: {}'.format(variable))
        print('----------------------------------')

        # Get the dataset WITHOUT the input variable
        x_train, x_dev, x_test, y_train, y_dev, y_test = prepare_data_sets(complete_data_scaled_shuffled, variable)

        # Make predictions and calculate error
        error = predict_get_error(model, x_test)
        
        # Make predictions and save them
        predict_mean = np.mean(model.predict(x_test))
        predict_max = np.max(model.predict(x_test))
        predict_min = np.min(model.predict(x_test))

        # Add error to error list
        if np.min(model.predict(x_test)) >= 0:
            HC_error_list.append([model.name, variable, predict_mean, predict_max, predict_min, error])

        # Announce one model process ended
        print('============== MODEL {} PROCESS END =============='.format(count))
        print(' ')

        # Increase counter by 1
        count = count+1

    print('Creating DataFrame')                
    HC_error = pd.DataFrame(HC_error_list)

    print('Changing DataFrame column names')
    HC_error.columns = ['Model', 'Dropped Variable', 'Mean', 'Max', 'Min', 'MSPE']

    print('Ranking Models')
    HC_error.sort_values(by=['MSPE'], inplace=True)

    count = 0
    
    return HC_error

In [41]:
ranking_positives = filter_negatives(models)

Model_1 Variables:
- Input Variable Dropped: Year
----------------------------------
Year Column Dropped
Preparing Data-sets
Splitting into inputs and outputs
Inputs = 17
Data-sets complete
----------------------------------
Predicting with Model_1
Inverse Scaling Operation
- Prediction Mean = -0.00424
- Prediction Min = -0.00465
- Prediction Max = 0.03096
Calculating HC Error
- HC Error  = 7.40e+08
----------------------------------
 
Model_2 Variables:
- Input Variable Dropped: Vehicle_Code
----------------------------------
Vehicle_Code Column Dropped
Preparing Data-sets
Splitting into inputs and outputs
Inputs = 17
Data-sets complete
----------------------------------
Predicting with Model_2
Inverse Scaling Operation
- Prediction Mean = 0.00038
- Prediction Min = -0.01551
- Prediction Max = 0.00061
Calculating HC Error
- HC Error  = 3.39e+07
----------------------------------
 
Model_3 Variables:
- Input Variable Dropped: Manufacturer_Code
----------------------------------
Manufac

Inverse Scaling Operation
- Prediction Mean = 0.00412
- Prediction Min = 0.00247
- Prediction Max = 0.00423
Calculating HC Error
- HC Error  = 5.60e+08
----------------------------------
 
Creating DataFrame
Changing DataFrame column names
Ranking Models


In [42]:
ranking_positives

Unnamed: 0,Model,Dropped Variable,Mean,Max,Min,MSPE
3,Model_11,Fuel_Code,0.001521,0.001555,0.001261,43244850.0
2,Model_10,Drive_System_Code,0.004786,0.027829,0.003819,512892100.0
5,Model_17,Peak_pos,0.005498,0.005642,0.003292,559728500.0
0,Model_8,ETW,0.006925,0.022525,0.006542,922221100.0
4,Model_15,a_pos,0.010941,0.047792,0.01022,2267279000.0
1,Model_9,HP,0.011039,0.031111,0.008033,2340374000.0


In [43]:
print(np.mean(test_set_inverse['HC']))
print(np.max(test_set_inverse['HC']))
print(np.min(test_set_inverse['HC']))

0.05058747626745175
0.587195595
2.9825808e-06


## Next Steps

All the data will be moved from the *output* folder to the *Gen_5* folder. For now, all models and histories are saved so they can be loaded later. But eventually only the models that produced positive outputs will be saved with their respective history.

Models with positive predictions still have very big errors. 

Options:
* Try removing two input variables at the same time (only from the variables that, when removed, got positive results). This would result in 30 data-sets and 30 models to be tried out. 
* Out of the dropped variables 2 belong to the driving cycle (Peak_pos which is the number of times positive acceleration exceeds a threshold, and a_pos which is the average positive acceleration during the cycle). Each of these variables actually come in pairs (positive and negative), so it makes sense that if one is removed, the other should also be removed. 
    * Mugdal **might** have shown in his paper that acceleration had no correlation whatsoever to his emissions, so maybe they should all be removed (a_pos, a_neg, Peak_pos, Peak_neg). This would represent an extra data-set to try and an extra model to try. 
    
All the combinations would be:

In [51]:
count = 1
for i in range(len(ranking_positives)):
    
    variable_1 = ranking_positives.iloc[i,:]['Dropped Variable']
    
    for j in range(len(ranking_positives)):
        
        variable_2 = ranking_positives.iloc[j,:]['Dropped Variable']
        
        if variable_1 != variable_2:
        
            print('Combination {}: {} -- {}'.format(count, variable_1,variable_2))
        
            count = count + 1
        
count = 0

Combination 1: Fuel_Code -- Drive_System_Code
Combination 2: Fuel_Code -- Peak_pos
Combination 3: Fuel_Code -- ETW
Combination 4: Fuel_Code -- a_pos
Combination 5: Fuel_Code -- HP
Combination 6: Drive_System_Code -- Fuel_Code
Combination 7: Drive_System_Code -- Peak_pos
Combination 8: Drive_System_Code -- ETW
Combination 9: Drive_System_Code -- a_pos
Combination 10: Drive_System_Code -- HP
Combination 11: Peak_pos -- Fuel_Code
Combination 12: Peak_pos -- Drive_System_Code
Combination 13: Peak_pos -- ETW
Combination 14: Peak_pos -- a_pos
Combination 15: Peak_pos -- HP
Combination 16: ETW -- Fuel_Code
Combination 17: ETW -- Drive_System_Code
Combination 18: ETW -- Peak_pos
Combination 19: ETW -- a_pos
Combination 20: ETW -- HP
Combination 21: a_pos -- Fuel_Code
Combination 22: a_pos -- Drive_System_Code
Combination 23: a_pos -- Peak_pos
Combination 24: a_pos -- ETW
Combination 25: a_pos -- HP
Combination 26: HP -- Fuel_Code
Combination 27: HP -- Drive_System_Code
Combination 28: HP -- Pe

Combination 31: Peak_pos -- Peak_neg -- a_pos -- a_neg

There alse seems to be a big difference between the training loss and the dev loss, but not as expected: the training loss is much higher than the dev loss. Is the model under-fitting? Maybe dropout is too high. Maybe a bigger network needs to be used. Maybe optimization is not working that well. 