## Background

Use the data in the train.csv file to train a model that predicts the **dissolved inorganic carbon (DIC)** content of water samples.

## Setup

In [3]:
# load libraries
import pandas as pd
import numpy as np
import math
from tensorflow.keras.models import Sequential
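from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping  # needed for the early-stopping callback used during tuning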
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l1_l2
from keras_tuner import HyperModel
from keras_tuner import RandomSearch

2024-03-20 23:44:19.559460: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-20 23:44:19.670166: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-03-20 23:44:19.673847: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/R/4.2.2/lib/R/lib:/lib:/usr/local/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/j

In [4]:
# Set the environment variable to change the log level
#os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # 0 = default, 1 = no INFO, 2 = no INFO and WARNING, 3 = no INFO, WARNING, and ERROR

In [5]:
# Turn off scientific notation
#pd.set_option('display.float_format', lambda x: '%.3f' % x)

# Set seed
#np.random.seed(123)

## Import & pre-process training data

In [6]:
# import training data
train_df = pd.read_csv('data/train.csv')
train_df.columns = train_df.columns.str.lower().str.replace(' ', '_') # clean column names

# inspect data
print(train_df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1454 entries, 0 to 1453
Data columns (total 19 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   id                 1454 non-null   int64  
 1   lat_dec            1454 non-null   float64
 2   lon_dec            1454 non-null   float64
 3   no2um              1454 non-null   float64
 4   no3um              1454 non-null   float64
 5   nh3um              1454 non-null   float64
 6   r_temp             1454 non-null   float64
 7   r_depth            1454 non-null   int64  
 8   r_sal              1454 non-null   float64
 9   r_dynht            1454 non-null   float64
 10  r_nuts             1454 non-null   float64
 11  r_oxy_micromol.kg  1454 non-null   float64
 12  unnamed:_12        0 non-null      float64
 13  po4um              1454 non-null   float64
 14  sio3um             1454 non-null   float64
 15  ta1.x              1454 non-null   float64
 16  salinity1          1454 

We will remove column 12 ('unnamed:_12') because it has 0 non-null values. We will also remove the 'id' column because we don't expect it to be a relevant predictor.

In [8]:
# remove 'id' and 'unnamed:_12' columns
train_df = train_df.drop(['id', 'unnamed:_12'], axis=1)
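
The empty column above was spotted by reading the `info()` output. As a minimal sketch, the same check can be done programmatically. The small `demo_df` below is made-up illustration data, not the real training set, and the variable names are assumptions.

```python
import pandas as pd

# hypothetical toy frame standing in for train_df (values are invented)
demo_df = pd.DataFrame({
    "id": [1, 2, 3],
    "r_temp": [10.1, 9.8, 11.2],
    "unnamed:_12": [float("nan")] * 3,  # a column with no non-null values
})

# list the columns in which every value is missing
empty_cols = demo_df.columns[demo_df.isna().all()].tolist()

# drop those columns
cleaned_df = demo_df.drop(columns=empty_cols)

print(empty_cols)
print(cleaned_df.columns.tolist())
```

`demo_df.dropna(axis=1, how="all")` should give the same result in one step; the two-step version just makes it easy to see which columns were removed.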

In [None]:
# define feature matrix for training data
X_train = train_df.drop('dic', axis=1).values

# define target vector for training data
y_train = train_df['dic'].values

## Build & train model

In [65]:
# initialize new HyperModel object
class MyHyperModel(HyperModel):

    def __init__(self, input_shape):
        self.input_shape = input_shape  # store input shape as an instance attribute

    # build model
    def build(self, hp):
        model = Sequential() # create a Sequential model
        
        # add dense layer with ReLU (based on preliminary training results)
        model.add(Dense(units=hp.Int('neurons_0',  # tune units (number of neurons)
                                     min_value=32, 
                                     max_value=512, 
                                     step=32),
                        activation='relu', # select ReLU activator (based on preliminary training results)
                        kernel_regularizer=l1_l2(l1=0.01, l2=0.01), # set L1 and L2
                        input_shape=self.input_shape)) # specify input shape
        
        # add dense layer with ELU activator (based on preliminary training results)
        model.add(Dense(units=hp.Int('neurons_1', # tune units (number of neurons)
                                     min_value=32, 
                                     max_value=512, 
                                     step=32),
                        activation='elu', # select ELU activator (based on preliminary training results)
                        kernel_regularizer=l1_l2(l1=0.01, l2=0.01))) # set L1 and L2
                  
        # add dropout layer
        model.add(Dropout(rate=hp.Float('dropout_1', # tune dropout rate
                                         min_value=0.0,
                                         max_value=0.3,
                                         step=0.1)))
        
        # tune for additional hidden layers
        for i in range(1, hp.Int('num_layers', 2, 4)): 
            
            # add up to 3 additional dense layers with ReLU, ELU, or Sigmoid activators
            # (names use i + 1 so they do not reuse 'neurons_1' and 'dropout_1' defined above)
            model.add(Dense(units=hp.Int('neurons_' + str(i + 1), # tune units (number of neurons)
                                         min_value=32, 
                                         max_value=512, 
                                         step=32),
                            activation=hp.Choice('activation_' + str(i + 1), 
                                                 ['relu', 'elu', 'sigmoid']), # select from ReLU, ELU, and Sigmoid activation functions
                            kernel_regularizer=l1_l2(l1=0.01, l2=0.01))) # set L1 and L2
                  
            # add up to 3 additional dropout layers
            model.add(Dropout(rate=hp.Float('dropout_' + str(i + 1), # tune dropout rate
                                            min_value=0.0,
                                            max_value=0.3,
                                            step=0.1)))
        
        # add output layer with linear activation function
        model.add(Dense(1, activation='linear', kernel_regularizer=l1_l2(l1=0.01, l2=0.01)))

        # set tuning grid for optimizer
        learning_rate = hp.Float('learning_rate', min_value=1e-5, max_value=1e-2, sampling='log')
        beta_1 = hp.Float('beta_1', min_value=0.7, max_value=0.99, step=0.01)
        optimizer = Adam(learning_rate=learning_rate, beta_1=beta_1)
        
        # compile model
        model.compile(optimizer=optimizer, loss='mean_squared_error') # set MSE as loss function
        
        return model
                  
# store HyperModel object with specified input shape based on number of columns in feature matrix
hypermodel = MyHyperModel(input_shape=X_train.shape[1:])
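
Each dense layer above sets `kernel_regularizer=l1_l2(l1=0.01, l2=0.01)`. As a rough illustration of what that adds to the loss, the sketch below computes the penalty for a small list of weights in plain Python: `l1` times the sum of absolute weights plus `l2` times the sum of squared weights. The function name and example weights are hypothetical. As far as we understand, Keras computes the same quantity over each layer's kernel and includes it in the reported `loss`.

```python
def l1_l2_penalty(weights, l1=0.01, l2=0.01):
    """Combined L1 + L2 penalty for a flat list of weights (illustrative only)."""
    l1_term = l1 * sum(abs(w) for w in weights)   # L1: sum of absolute values
    l2_term = l2 * sum(w * w for w in weights)    # L2: sum of squares
    return l1_term + l2_term

# hypothetical weights, not taken from any trained model
example_weights = [1.0, -2.0, 3.0]
penalty = l1_l2_penalty(example_weights)
print(penalty)
```

Because the penalty grows with the number and size of the weights, wide layers can contribute a noticeable share of the loss values shown during tuning.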


In [None]:
# 'polyfill' for math.prod from Python 3.8
# create function that calculates the product of all the elements in an iterable (i.e., a sequence of numbers whose product is to be computed)
if not hasattr(math, 'prod'):
    def prod(iterable, start=1):
        total = start
        for i in iterable:
            total *= i
        return total
    math.prod = prod

# create RandomSearch object for tuning my hypermodel
tuner = RandomSearch(
    hypermodel,
    objective='loss', # set objective to minimize loss function
    max_trials=1000,  # set number of trials to run
    executions_per_trial=1,  # set number of models that are built and fit in each trial
    directory='my_dir22',  # specify directory to save tuning information
    project_name='activation_tuning22'
)

# create EarlyStopping object to use when tuning hypermodel
early_stopping = EarlyStopping(
    monitor='loss',
    min_delta=0.01, # set minimum decrease in loss that counts as an improvement
    patience=10, # set number of epochs without improvement before training stops
    verbose=1,
    mode='min',
    restore_best_weights=True
)

tuner.search(X_train, y_train, epochs=50, callbacks=[early_stopping])

# Get the optimal hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
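
To make the `min_delta` and `patience` settings concrete, here is a simplified sketch of the stopping rule in plain Python. It approximates the behaviour for `mode='min'` and is not the Keras implementation. The function name and loss values are made up for illustration.

```python
def early_stop_epoch(losses, min_delta=0.01, patience=10):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0  # epochs since the last meaningful improvement
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:   # improvement must exceed min_delta
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:      # patience exhausted
                return epoch
    return None

# hypothetical loss curve: improves for 3 epochs, then plateaus
plateau_losses = [10.0, 9.0, 8.0] + [8.0] * 10
stop_epoch = early_stop_epoch(plateau_losses)
print(stop_epoch)
```

With `patience=10` and `epochs=50`, a trial that stops improving early still runs at least 10 more epochs before it is cut off.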

In [66]:


# Polyfill for math.prod in Python versions older than 3.8
if not hasattr(math, 'prod'):
    def prod(iterable, start=1):
        total = start
        for i in iterable:
            total *= i
        return total
    math.prod = prod


input_shape = (X_train.shape[1],)  # one entry per column of the feature matrix X_train

hypermodel = MyHyperModel(input_shape=input_shape)

tuner = RandomSearch(
    hypermodel,
    objective='loss',
    max_trials=1000,  # Number of trials to run
    executions_per_trial=1,  # Number of models that should be built and fit for each trial
    directory='my_dir22',  # Directory to save logs and models
    project_name='activation_tuning22'
)

early_stopping = EarlyStopping(
    monitor='loss', # specify to monitor loss
    min_delta=0.01,
    patience=10,
    verbose=1,
    mode='min',
    restore_best_weights=True
)

tuner.search(X_train, y_train, epochs=50, callbacks=[early_stopping])

# Get the optimal hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]


Trial 239 Complete [00h 00m 13s]
loss: 15961.1875

Best loss So Far: 104.9612808227539
Total elapsed time: 01h 05m 21s

Search: Running Trial #240

Value             |Best Value So Far |Hyperparameter
320               |512               |neurons_0
160               |128               |neurons_1
0                 |0                 |dropout_1
5                 |3                 |num_layers
96                |160               |neurons_2
sigmoid           |relu              |activation_2
0.1               |0                 |dropout_2
0.00085169        |0.00075948        |learning_rate
0.7               |0.77              |beta_1
64                |96                |neurons_3
sigmoid           |elu               |activation_3
0.2               |0.1               |dropout_3
160               |128               |neurons_4
sigmoid           |sigmoid           |activation_4
0.1               |0.2               |dropout_4

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
E

KeyboardInterrupt: 

In [35]:
tuner.results_summary()

Results summary
Results in my_dir8/activation_tuning8
Showing 10 best trials
Objective(name="loss", direction="min")

Trial 17 summary
Hyperparameters:
initial_units: 128
num_layers: 2
units_1: 192
activation_1: elu
dropout_1: 0.0
learning_rate: 0.000711942011908114
beta_1: 0.71
units_2: 480
activation_2: relu
dropout_2: 0.2
units_3: 416
activation_3: sigmoid
dropout_3: 0.4
Score: 86.92713165283203

Trial 08 summary
Hyperparameters:
initial_units: 32
num_layers: 2
units_1: 480
activation_1: elu
dropout_1: 0.0
learning_rate: 0.00026293419425032724
beta_1: 0.86
units_2: 32
activation_2: sigmoid
dropout_2: 0.1
units_3: 224
activation_3: relu
dropout_3: 0.1
Score: 100.59132385253906

Trial 09 summary
Hyperparameters:
initial_units: 224
num_layers: 2
units_1: 320
activation_1: relu
dropout_1: 0.0
learning_rate: 0.002608879298372233
beta_1: 0.8799999999999999
units_2: 320
activation_2: elu
dropout_2: 0.1
units_3: 32
activation_3: tanh
dropout_3: 0.4
Score: 102.07423400878906

Trial 01 summar

## Import & process testing data

In [None]:
# import testing data
test_df = pd.read_csv('data/test.csv')
test_df.columns = test_df.columns.str.lower().str.replace(' ', '_') # clean column names

# define feature matrix for testing data
X_test = test_df.drop('dic', axis=1).values

# define target vector for testing data (currently empty)
y_test = test_df['dic'].values

## Predict DIC for testing data & export submission

In [None]:
# import submission template
submission_df = pd.read_csv('data/sample_submission.csv')
submission_df.columns = submission_df.columns.str.lower().str.replace(' ', '_')

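# generate predictions (assumption: rebuild the best model found by the tuner and refit it on the training data;
# requires best_hps from the tuning step and an X_test with the same feature columns as X_train)
best_model = tuner.hypermodel.build(best_hps)
best_model.fit(X_train, y_train, epochs=50, callbacks=[early_stopping])
predictions = best_model.predict(X_test).flatten()

# bind predictions to 'dic' column
submission_df['dic'] = predictions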
submission_df

In [None]:
# export submission
submission_df.to_csv('linus_submission5.csv', index=False)