# Making FoG predictions through segmentation and ensemble of 1D-CNN Models

### Intro
This notebook is heavily based on the following two notebooks:
* [[Practicum] - CNN skeleton](https://www.kaggle.com/code/ernestglukhov/practicum-cnn-skeleton)
* [Locating steps. Wavelets -> step rate feature](https://www.kaggle.com/code/vrbaryshev/locating-steps-wavelets-step-rate-feature)

In this Kaggle notebook, we predict Freezing of Gait (FoG) events with an ensemble of 1D-CNN models trained on segmented accelerometer data. The segmentation step cuts each recording into fixed-length windows so that the models can pick up distinct patterns in the acceleration data, and the ensemble consists of several 1D-CNN models trained with K-fold cross-validation. Each model stacks multiple Convolutional Blocks, and each block contains convolutional layers with different kernel sizes and dilations. Separate ensembles are trained on the `tdcsfog` and `defog` datasets, so that each ensemble is tuned to its own type of data.

Our best model scored 0.404 on the public leaderboard and 0.246 on the private leaderboard.
We were disappointed by the gap between the public and private scores, which we believe is likely due to differences in the composition of the two test sets. Even so, the project was a valuable, hands-on learning experience.

In [1]:
import warnings
warnings.filterwarnings("ignore")

In [2]:
import gc
import os
import numpy as np
import pandas as pd
import pathlib

import pywt

from sklearn.model_selection import KFold
from sklearn.metrics import average_precision_score
from sklearn.preprocessing import MinMaxScaler

import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

import torch
from torch import nn
from torch.nn import functional as F
from torch.utils.data import Dataset, DataLoader

## Segmentation process and feature engineering

**Segmentation process**:

We segment the raw acceleration data into consecutive, non-overlapping windows, which lets the model capture temporal patterns and variations in the acceleration signals.
To accomplish this, we implemented a custom function that divides each continuous time series into fixed-length segments, each covering a specific time interval. When the length of a recording is not a multiple of the segment length, the final, incomplete segment is padded to full length, so that all segments have the same shape for downstream processing.
We used the following parameters for the segmentation process:

    Window size: 72 seconds (9216 data points) for events from the `tdcsfog` set
    Window size: 10.24 seconds (1024 data points) for events from the `defog` set
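
The segmentation and padding described above can be sketched on a toy array. This is a minimal illustration, not the notebook's `SegmentedDataset`: the function name `segment_and_pad`, the toy sizes, and the `pad_value` parameter are assumptions made for the example.

```python
import numpy as np

def segment_and_pad(features, segment_length, pad_value=0.0):
    """Split a (T, C) array into non-overlapping (segment_length, C) windows.

    The final partial window, if any, is padded at the end with pad_value.
    """
    segments = []
    for start in range(0, len(features), segment_length):
        chunk = features[start:start + segment_length]
        missing = segment_length - len(chunk)
        if missing > 0:
            # Pad along the time axis only; channels are left untouched
            chunk = np.pad(chunk, ((0, missing), (0, 0)),
                           mode="constant", constant_values=pad_value)
        segments.append(chunk)
    return segments

# Toy recording: 10 time steps, 3 channels, windows of 4 steps
toy = np.arange(30, dtype=np.float32).reshape(10, 3)
toy_segments = segment_and_pad(toy, segment_length=4)
print(len(toy_segments), toy_segments[-1].shape)
```

In the notebook itself, the time stamps of padded steps are set to -1 so that they can be dropped again when the predictions are written out.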
    
**Feature engineering**:

To capture different aspects of the data, we employed rolling-window statistics and an exponentially weighted moving average (EWMA). These features are computed separately for each acceleration channel: AccV, AccML, and AccAP. They summarize the temporal characteristics of the acceleration data. In addition to the temporal features, we also included a spectral image of the data, computed from the AccML channel. Technically, all spectral features are convolutions, so a sufficiently large and well-trained convolutional neural network should be able to learn them without explicit help from feature engineering. However, supplying some spectral features acts like pretraining: it makes it easier for the model to recognize repetitive patterns in particular frequency bands, such as steps and step-related mechanical processes. In this work we included an eight-pixel-wide wavelet image covering frequencies from 0.4 Hz to 3 Hz as premade features.
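
To relate the wavelet scales used in the code to physical frequencies, the conversion can be sketched as follows. Two assumptions are made here and should be checked: that the Morlet wavelet in PyWavelets has a centre frequency of 0.8125 (`pywt.central_frequency('morl')` reports the actual value), and that the `tdcsfog` recordings are sampled at 128 Hz (which is what 9216 points in 72 seconds implies). The helper `scale_to_hz` is a hypothetical name.

```python
import math

# Assumption: centre frequency of pywt's 'morl' wavelet, in cycles per sample
# at scale 1. Verify with pywt.central_frequency('morl').
MORLET_CENTER_FREQ = 0.8125

def scale_to_hz(scale, sampling_rate_hz, center_freq=MORLET_CENTER_FREQ):
    """Approximate frequency (Hz) that a CWT scale responds to."""
    return center_freq * sampling_rate_hz / scale

# Same eight scales as np.exp(np.arange(2, 6, 0.5)) in the dataset class
scales = [math.exp(2 + 0.5 * k) for k in range(8)]

# Assumption: 128 Hz sampling rate (tdcsfog)
freqs_128 = [scale_to_hz(s, 128.0) for s in scales]
for s, f in zip(scales, freqs_128):
    print(f"scale {s:7.2f} -> {f:5.2f} Hz")
```

Under these assumptions the largest scale maps to roughly 0.4 Hz, while the smallest scales map to frequencies above 3 Hz, so the quoted band is best read as approximate.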

In [3]:
class SegmentedDataset(Dataset):
    def __init__(self, folder_path, files, is_train=True, is_defog=False, segment_length=9216):
        self.folder_path = folder_path
        self.is_train = is_train
        self.is_defog = is_defog
        self.segment_length = segment_length
        self.data = []
        self.time = []
        self.ids = []
        self.targets = []
        
        if self.is_defog:
            self.segment_length = 1024
        
        # Read and segment all data at initialization
        for file_name in files:
            file_path = os.path.join(folder_path, file_name)
            data = pd.read_csv(file_path)
            
            acc_cols = ['AccV', 'AccML', 'AccAP']
            window_size = self.segment_length

            
            # Create new features based on accelerometer data columns.
            # Continuous wavelet transform of the AccML channel:
            # eight Morlet scales, one feature column per scale
            wavelet_scales = np.exp(np.arange(2, 6, 0.5))
            wavelet = 'morl'

            coeff, freq = pywt.cwt(data['AccML'], wavelet_scales, wavelet)
            for i, scale in enumerate(wavelet_scales):
                data[f'wave_{scale}'] = coeff[i]
            
            for acc in acc_cols:
                data[f'{acc}_cumsum'] = data[acc].cumsum()
                data[f'{acc}_running_sum'] = data[acc].rolling(window=window_size, min_periods=8).sum()
                data[f'{acc}_rolling_mean'] = data[acc].rolling(window=window_size, min_periods=8).mean()
                data[f'{acc}_rolling_std'] = data[acc].rolling(window=window_size, min_periods=8).std()
                data[f'{acc}_rolling_min'] = data[acc].rolling(window=window_size, min_periods=8).min()
                data[f'{acc}_rolling_max'] = data[acc].rolling(window=window_size, min_periods=8).max()
                data[f'{acc}_rolling_delta'] = data[f'{acc}_rolling_max'] - data[f'{acc}_rolling_min']        
                data[f'{acc}_EWMA_02'] = data[acc].ewm(alpha=0.2).mean()

                
            # Rolling windows leave NaNs in the first rows; fill them from the next valid value
            data.bfill(inplace=True)

            file_id = file_name.replace('.csv', '')

            time = data['Time'].values
            excluded_cols = ['Time', 'StartHesitation', 'Turn', 'Walking', 'Valid', 'Task']
            features_cols = [col for col in data.columns if col not in excluded_cols]
            # NOTE: no column is literally named 'wave' (the wavelet columns are
            # 'wave_<scale>'), so this list equals features_cols and the wavelet
            # columns are min-max scaled as well
            features_cols_nowave = [col for col in data.columns if col not in excluded_cols + ['wave']]

            # Scale every feature to [0, 1] within this file
            scaler = MinMaxScaler()
            data[features_cols_nowave] = scaler.fit_transform(data[features_cols_nowave])
            
            features = data[features_cols].values.astype(np.float32)
            
            if self.is_train:
                targets = data[['StartHesitation', 'Turn', 'Walking']].values
                # add extra target column for "Normal" case
                targets = np.concatenate([targets, (1 - targets.sum(axis=1)).reshape(-1,1)], axis=1)
                self.segment_and_pad_data(file_id, features, time, targets)
            else:
                self.segment_and_pad_data(file_id, features, time)
                
                
    # Segments the data into fixed-length segments and handles padding for the last segment if it's shorter than the desired length
    def segment_and_pad_data(self, file_id, features, time, targets=None):
        num_segments = len(features) // self.segment_length
        remainder = len(features) % self.segment_length
        
            
        # First, handle the full segments
        for i in range(num_segments):
            feature_segment = features[i*self.segment_length:(i+1)*self.segment_length]
            
            self.data.append(feature_segment)
            
            time_segment = time[i*self.segment_length:(i+1)*self.segment_length]
            self.time.append(time_segment)
            
            
            if self.is_train:
                target_segment = targets[i*self.segment_length:(i+1)*self.segment_length]
                self.targets.append(target_segment)
                
            self.ids.append(file_id)

        # Now handle the last segment if it's shorter than segment_length
        if remainder > 0:
            padding_length = self.segment_length - remainder

            feature_segment = np.pad(features[-remainder:], ((0, padding_length), (0, 0)), mode='constant')
            self.data.append(feature_segment)
            
            time_segment = np.pad(time[-remainder:], (0, padding_length), mode='constant', constant_values=-1)

            self.time.append(time_segment)
            
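            if self.is_train:
                # Zero-padded target rows have argmax 0, so __getitem__ labels the
                # padded steps as class 0 (StartHesitation), not as "Normal"
                target_segment = np.pad(targets[-remainder:], ((0, padding_length), (0, 0)), mode='constant')
                self.targets.append(target_segment)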
                
            self.ids.append(file_id)
            


    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        if self.is_train:
            return torch.Tensor(self.data[idx]), torch.Tensor(np.argmax(self.targets[idx],axis=1)).to(torch.int64)
        else:
            return torch.Tensor(self.data[idx]), self.ids[idx], self.time[idx]

## 1D-Convolutional Neural Network (1D-Conv NN)

To classify the samples in each segment, we chose a 1D-Convolutional Neural Network with flexible hyperparameters.

Our model consists of multiple Convolutional Blocks (three turned out to be the best trade-off between training speed and validation score), each containing parallel convolutional layers with different kernel sizes and dilations. These blocks allow the network to capture various temporal patterns and features present in the acceleration data. Within each block we add a skip connection and apply batch normalization and dropout to improve generalization.
The output of the convolutional blocks is fed into a linear layer, which maps the learned features to the four classes at every time step. To handle class imbalance, we use a weighted negative log-likelihood loss. During training the logits pass through a log-softmax; at test time a softmax turns them into class probabilities.
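
The padding rule used in `ConvBlock` keeps the sequence length unchanged for odd kernel sizes, and the number of channels leaving a block follows from the number of parallel convolutions. The sketch below checks both with plain arithmetic; the helper names `same_padding` and `conv1d_output_length` are hypothetical, and the output-length formula is the standard one for a 1D convolution.

```python
def same_padding(kernel_size, dilation):
    # Span covered by a dilated kernel: k + (k - 1) * (d - 1)
    effective = kernel_size + (kernel_size - 1) * (dilation - 1)
    return effective // 2

def conv1d_output_length(length, kernel_size, dilation, padding, stride=1):
    # Standard output-length formula for a 1D convolution
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

kernel_sizes = [3, 5, 7]
dilations = [2, 4, 8]
out_channels = 7

# Every kernel/dilation pair preserves a 9216-step segment
lengths = [
    conv1d_output_length(9216, k, d, same_padding(k, d))
    for k in kernel_sizes
    for d in dilations
]

# Parallel branches are concatenated along the channel axis
block_channels = out_channels * len(kernel_sizes) * len(dilations)
print(set(lengths), block_channels)
```

Because every branch preserves the length, the branch outputs can be concatenated channel-wise and added to the skip connection without cropping.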

During training, we optimize the model parameters using the Adam optimizer with a learning rate of 0.001. We monitor the training and validation losses across epochs, and on the validation set we also compute the mean average precision over the three event classes.
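
The validation score can be illustrated with a small, dependency-free sketch. It is a simplified stand-in for `sklearn.metrics.average_precision_score` that assumes no tied scores; the function names and the toy data are made up for the example. As in the notebook's `validation_step`, a class without positives contributes a score of 0.

```python
def average_precision(y_true, y_score):
    """Mean of the precision values at each positive, ranked by score.

    Simplification: assumes there are no tied scores.
    """
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions)

def mean_event_ap(labels, scores, num_event_classes=3):
    """labels: class index per time step; scores: per-step list of class scores."""
    per_class = []
    for c in range(num_event_classes):
        y_true = [1 if y == c else 0 for y in labels]
        if sum(y_true) == 0:
            per_class.append(0.0)  # fallback when the class never occurs
        else:
            per_class.append(average_precision(y_true, [s[c] for s in scores]))
    return sum(per_class) / num_event_classes

# Toy data: 4 time steps, 4 classes (index 3 is "Normal" and is not scored)
toy_labels = [0, 3, 1, 3]
toy_scores = [
    [0.70, 0.10, 0.1, 0.1],
    [0.20, 0.30, 0.1, 0.4],
    [0.10, 0.20, 0.1, 0.6],
    [0.05, 0.05, 0.1, 0.8],
]
toy_map = mean_event_ap(toy_labels, toy_scores)
print(toy_map)
```

Average precision depends only on the ranking of the scores, so log-probabilities can be passed in place of probabilities without changing the result.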

The model outputs log-softmax scores for 4 classes, `StartHesitation`, `Turn`, `Walking` and `Normal`, with loss weights of 1.0, 1.0, 1.0 and 0.1 respectively.
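
The effect of the class weights can be sketched without PyTorch. The function below follows the behaviour of `torch.nn.NLLLoss` with a `weight` argument and the default mean reduction, where the weighted sum is divided by the sum of the weights of the target classes. The helper names and toy inputs are assumptions made for the example.

```python
import math

# Order: StartHesitation, Turn, Walking, Normal
CLASS_WEIGHTS = [1.0, 1.0, 1.0, 0.1]

def log_softmax(logits):
    # Numerically stable log-softmax over one time step
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_norm for v in logits]

def weighted_nll(logits_per_step, labels, weights=CLASS_WEIGHTS):
    """Weighted negative log-likelihood, averaged by the sum of target weights."""
    total, weight_sum = 0.0, 0.0
    for logits, label in zip(logits_per_step, labels):
        total += -weights[label] * log_softmax(logits)[label]
        weight_sum += weights[label]
    return total / weight_sum

# Two steps with uniform logits: one "Turn" step and one "Normal" step
toy_loss = weighted_nll([[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]], [1, 3])
print(toy_loss)
```

With a weight of 0.1, a misclassified `Normal` step contributes a tenth of what a misclassified event step does, which keeps the abundant normal walking from dominating the loss.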

In [4]:
import itertools

class ConvBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_sizes, dropout_rate, dilations):
        super(ConvBlock, self).__init__()

        self.convs = nn.ModuleList([
            nn.Conv1d(in_channels, out_channels, kernel_size, padding=(kernel_size+(kernel_size -1)*(dilation-1))//2, dilation=dilation)
            for kernel_size, dilation in itertools.product(kernel_sizes, dilations)
        ])

        self.batch_norm = nn.BatchNorm1d(out_channels * len(kernel_sizes)*len(dilations))
        self.dropout = nn.Dropout(dropout_rate)

        self.skip_connection = nn.Conv1d(in_channels, out_channels * len(kernel_sizes)*len(dilations), kernel_size=1) \
            if in_channels != out_channels * len(kernel_sizes)*len(dilations) else nn.Identity()


    def forward(self, x):
        x = x.transpose(1, 2)
        skip = self.skip_connection(x)

        x = torch.cat([conv(x) for conv in self.convs], dim=1)

        x = self.batch_norm(x + skip)
        x = F.relu(x)
        x = self.dropout(x)
        x = x.transpose(1, 2)

        return x
        

class Model(pl.LightningModule):
    def __init__(self, in_channels, out_channels, kernel_sizes, dilations, dropout_rate, num_blocks, lr=0.001):
        super(Model, self).__init__()
        self.train_loss_history = []
        self.val_loss_history = []
        self.val_score_history = []
        self.lr = lr
        self.blocks = nn.Sequential(*[
            ConvBlock(in_channels if i == 0 else out_channels * len(kernel_sizes)*len(dilations), 
                      out_channels, 
                      kernel_sizes, 
                      dropout_rate,
                      dilations)
            for i in range(num_blocks)
        ])

        num_classes = 4
        self.linear = nn.Linear(out_channels * len(kernel_sizes)*len(dilations), num_classes)
        weights = torch.tensor([1.0, 1.0, 1.0, 0.1])
        self.loss = torch.nn.NLLLoss(weight=weights)
        self.logsoftmax = nn.LogSoftmax(dim=2)
        self.output = []

    def forward(self, x):
        x = self.blocks(x)
        x = self.linear(x)
        return x

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        y_hat = self.logsoftmax(y_hat)
        
        # Reshape the outputs and labels
        loss = self.loss(y_hat.view(-1, y_hat.size(-1)), y.view(-1))
        
        self.log('train_loss', loss, on_step=False, on_epoch=True, prog_bar=True, logger=True)
        self.train_loss_history.append(loss.item())

        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        y_hat = self.logsoftmax(y_hat)
        
        # Reshape the outputs and labels
        loss = self.loss(y_hat.view(-1, y_hat.size(-1)), y.view(-1))
        self.log('val_loss', loss, on_step=False, on_epoch=True, prog_bar=True, logger=True)
        
        num_classes = 4
        
        # Create a one-hot encoded array of y
        ohe_y = np.zeros((len(y.view(-1)), num_classes))
        ohe_y[np.arange(len(y.view(-1))), y.view(-1).cpu().numpy()] = 1
        
        # These are log-probabilities; average precision depends only on their ranking
        y_hat_probs = y_hat.view(-1, y_hat.size(-1)).detach().cpu().numpy()

        scores = []
        for i in range(3):
            if np.sum(ohe_y[:, i]) > 0:
                score = average_precision_score(ohe_y[:, i], y_hat_probs[:, i])
            else:
                score = 0.0  # Set default score if no positive class
            scores.append(score)
    
        mean_score = np.mean(scores)  
        self.log('val_score', mean_score, on_step=False, on_epoch=True, prog_bar=True, logger=True)
        self.val_loss_history.append(loss.item())
        self.val_score_history.append(mean_score)
        return loss
    
    
    def test_step(self, batch, batch_idx):
        x, file_id, time = batch
        y_hat = self.forward(x)
        y_hat = F.softmax(y_hat, dim=2) 
        self.output.append((y_hat, file_id, time))
        
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.lr)
        return optimizer

In [5]:
sample_submission = pd.read_csv("/kaggle/input/tlvmc-parkinsons-freezing-gait-prediction/sample_submission.csv")
sample_submission = sample_submission.set_index('Id')

## Ensemble of 1D-Convolutional Neural Network (1D-Conv NN) Models

In addition to the 1D-Convolutional Neural Network architecture, we decided to further improve the model's performance by using an ensemble of models. We employed a K-fold cross-validation strategy with `N_SPLITS` splits to train and evaluate multiple models.

For each fold, we split the list of recording files into training and validation sets using the K-fold indices and created separate datasets and data loaders for both. We then created a new model instance for each fold, using the same hyperparameters throughout.
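
Splitting by file rather than by segment can be sketched in plain Python. This is not scikit-learn's `KFold` and will not reproduce its shuffling; the function `kfold_file_splits` and the file names are hypothetical and only illustrate the idea that every file is used for validation exactly once.

```python
import random

def kfold_file_splits(files, n_splits, seed):
    """Yield (train_files, val_files) pairs; each file is validated exactly once."""
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    # Fold sizes differ by at most one, as in a standard K-fold split
    base, extra = divmod(len(shuffled), n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        val = shuffled[start:start + size]
        train = shuffled[:start] + shuffled[start + size:]
        start += size
        yield train, val

toy_files = [f"rec_{i}.csv" for i in range(12)]  # hypothetical file names
splits = list(kfold_file_splits(toy_files, n_splits=5, seed=404))
for fold, (train, val) in enumerate(splits):
    print(fold, len(train), len(val))
```

Because the split is made on files, all segments cut from one recording end up on the same side of the train/validation boundary.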

Using KFold, we built an ensemble of 5 models per dataset and averaged their validation scores. Each model combines 3 kernel sizes with 3 dilation rates, which gives 9 parallel convolutions per block. The kernel sizes, dropout rate, and dilation rates were chosen by trial and error.

**It's important to note** that we made predictions for the `tdcsfog` and `defog` datasets separately. One ensemble was trained on the `tdcsfog` dataset and evaluated on its validation folds, and another ensemble was trained on the `defog` dataset and evaluated on its validation folds.

Treating the two datasets separately lets each ensemble adapt to the characteristics of its own data, such as the different segment lengths used for `tdcsfog` and `defog`.

In [6]:
N_SPLITS = 5

model_params = {
    'in_channels': 35,
    'out_channels': 7,
    'kernel_sizes': [3, 5, 7],
    'dilations': [2, 4, 8],
    'dropout_rate': 0.1,
    'num_blocks': 3,  
}


### Implementing ensemble of 1D-CNN models for the `tdcsfog` dataset

In [7]:
folder_path = '/kaggle/input/tlvmc-parkinsons-freezing-gait-prediction/train/tdcsfog/'
all_files = [file for file in os.listdir(folder_path)]

test_folder_path = '/kaggle/input/tlvmc-parkinsons-freezing-gait-prediction/test/tdcsfog/'
test_files = [file for file in os.listdir(test_folder_path)]
test_dataset = SegmentedDataset(test_folder_path, test_files, is_train=False)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=256, shuffle=False)

In [8]:
kf = KFold(n_splits=N_SPLITS, random_state=404, shuffle=True)

scores = []

for fold, (train_idx, val_idx) in enumerate(kf.split(all_files)):
    train_files = [all_files[i] for i in train_idx]
    val_files = [all_files[i] for i in val_idx]
    
    train_dataset = SegmentedDataset(folder_path, train_files)
    val_dataset = SegmentedDataset(folder_path, val_files)

    train_loader = DataLoader(train_dataset, batch_size=256, num_workers=2)
    val_loader = DataLoader(val_dataset, batch_size=256, num_workers=2)
    
    model = Model(**model_params)

    checkpoint_callback = ModelCheckpoint(
        dirpath=f'fold_{fold}_checkpoints',
        filename='best_model',
        monitor='val_score',
        mode='max',
        save_top_k=1
    )

    
    trainer = pl.Trainer(max_epochs=10, callbacks=[checkpoint_callback], accelerator="gpu")
    trainer.fit(model, train_loader, val_loader)
    print(f'Model {fold}, score: ',  checkpoint_callback.best_model_score.item())
    scores.append(checkpoint_callback.best_model_score.item())
    
    # Load the best model checkpoint (the path the callback actually wrote,
    # which may carry a version suffix if the directory already existed)
    best_model_path = checkpoint_callback.best_model_path
    best_model = Model.load_from_checkpoint(best_model_path, **model_params)
    
    # Evaluate the best model on the test set
    trainer.test(best_model, test_loader)

    frames = []
    for batch in best_model.output:
        preds = batch[0].cpu().numpy()
        ids = batch[1]
        times = batch[2].cpu().numpy()
        for i, time, pred in zip(ids, times, preds):
            id_time = [f'{i}_{t}' for t in time]
            # Keep the three event classes and drop padded steps (time == -1)
            tmp = pd.DataFrame(pred[:,:3], index=id_time).loc[[t != -1 for t in time]]
            frames.append(tmp)
    result = pd.concat(frames)

    result.columns = ['StartHesitation', 'Turn', 'Walking']

    sample_submission = sample_submission.add(result, fill_value=0)
    
    del train_dataset, val_dataset, val_loader, train_loader, checkpoint_callback, trainer, best_model, result
    torch.cuda.empty_cache()
    gc.collect()
    
print('Mean score: ', sum(scores)/N_SPLITS)

Model 0, score:  0.6553076081872089
Model 1, score:  0.5366350103305342
Model 2, score:  0.5298121051739403
Model 3, score:  0.5991198455435256
Model 4, score:  0.6412328914013887

Mean score:  0.5924214921273195


### Implementing ensemble of 1D-CNN models for the `defog` dataset

In [9]:
folder_path = '/kaggle/input/tlvmc-parkinsons-freezing-gait-prediction/train/defog/'
all_files = [file for file in os.listdir(folder_path)]

test_folder_path = '/kaggle/input/tlvmc-parkinsons-freezing-gait-prediction/test/defog/'
test_files = [file for file in os.listdir(test_folder_path)]
test_dataset = SegmentedDataset(test_folder_path, test_files, is_train=False, is_defog=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=256, shuffle=False)

In [10]:
kf = KFold(n_splits=N_SPLITS, random_state=404, shuffle=True)

scores = []

for fold, (train_idx, val_idx) in enumerate(kf.split(all_files)):
    train_files = [all_files[i] for i in train_idx]
    val_files = [all_files[i] for i in val_idx]
    
    train_dataset = SegmentedDataset(folder_path, train_files, is_defog=True)
    val_dataset = SegmentedDataset(folder_path, val_files, is_defog=True)

    train_loader = DataLoader(train_dataset, batch_size=256, num_workers=2)
    val_loader = DataLoader(val_dataset, batch_size=256, num_workers=2)
    
    model = Model(**model_params)
    
    checkpoint_callback = ModelCheckpoint(
        dirpath=f'fold_{fold}_checkpoints_defog',
        filename='best_model',
        monitor='val_score',
        mode='max',
        save_top_k=1
    )

    
    trainer = pl.Trainer(max_epochs=10, callbacks=[checkpoint_callback], accelerator="gpu")
    trainer.fit(model, train_loader, val_loader)
    print(f'Model {fold}, score: ',  checkpoint_callback.best_model_score.item())
    scores.append(checkpoint_callback.best_model_score.item())
        
    # Load the best model checkpoint (the path the callback actually wrote,
    # which may carry a version suffix if the directory already existed)
    best_model_path = checkpoint_callback.best_model_path
    best_model = Model.load_from_checkpoint(best_model_path, **model_params)
    
    # Evaluate the best model on the test set
    trainer.test(best_model, test_loader)

    frames = []
    for batch in best_model.output:
        preds = batch[0].cpu().numpy()
        ids = batch[1]
        times = batch[2].cpu().numpy()
        for i, time, pred in zip(ids, times, preds):
            id_time = [f'{i}_{t}' for t in time]
            # Keep the three event classes and drop padded steps (time == -1)
            tmp = pd.DataFrame(pred[:,:3], index=id_time).loc[[t != -1 for t in time]]
            frames.append(tmp)
    result = pd.concat(frames)

    result.columns = ['StartHesitation', 'Turn', 'Walking']

    sample_submission = sample_submission.add(result, fill_value=0)
    
    del train_dataset, val_dataset, val_loader, train_loader, checkpoint_callback, trainer, best_model, result
    torch.cuda.empty_cache()
    gc.collect()
    
print('Mean score: ', sum(scores)/N_SPLITS)

Model 0, score:  0.4046239811320574
Model 1, score:  0.43532523683017926
Model 2, score:  0.46310618922753466
Model 3, score:  0.3997664229629237
Model 4, score:  0.4274656690492101

Mean score:  0.42605749984038105

Finally, we write the results to the submission file. Each submitted prediction is the average of the predictions from the individual folds.
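
The accumulate-then-divide pattern used for the submission can be shown on a toy example. The frame names `acc` and `fold_preds`, the row ids, and the probabilities are made up for illustration; only `DataFrame.add(..., fill_value=0)` and `DataFrame.div` mirror what the notebook does.

```python
import pandas as pd

N_FOLDS = 2  # toy value; the notebook uses N_SPLITS = 5
cols = ["StartHesitation", "Turn", "Walking"]

# Zero-initialised accumulator indexed by Id, playing the role of sample_submission
acc = pd.DataFrame(0.0, index=pd.Index(["a_0", "a_1"], name="Id"), columns=cols)

# Hypothetical per-fold predictions for the same two rows
fold_preds = [
    pd.DataFrame([[0.2, 0.4, 0.0], [0.0, 0.6, 0.2]], index=["a_0", "a_1"], columns=cols),
    pd.DataFrame([[0.4, 0.2, 0.0], [0.2, 0.4, 0.2]], index=["a_0", "a_1"], columns=cols),
]

for preds in fold_preds:
    # add() aligns on the row index; fill_value=0 covers rows missing on one side
    acc = acc.add(preds, fill_value=0)

averaged = acc.div(N_FOLDS)
print(averaged)
```

Since each dataset's rows only receive predictions from that dataset's folds, dividing the whole frame by the number of folds averages both the `tdcsfog` and the `defog` rows correctly.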

In [11]:
sample_submission = sample_submission.div(N_SPLITS)
sample_submission = sample_submission.reset_index()
sample_submission.columns = ['Id', 'StartHesitation', 'Turn', 'Walking']
sample_submission.to_csv('submission.csv', index=False)