This notebook trains a new CNN from scratch to predict cellular forces from given displacement fields. Currently, only inputs of shape 104 x 104 x 2 are supported.

## Imports

In [12]:
from scripts.tracNet import TracNet
from scripts.data_preparation import matFiles_to_npArray, extract_fields, reshape
from scripts.training_and_evaluation import initialize_weights, fit

import matplotlib.pyplot as plt
import numpy as np
import torch

from datetime import datetime
from gc import collect
from os import cpu_count
from sklearn.model_selection import train_test_split
from torch.optim.lr_scheduler import StepLR
from torch.utils.data import TensorDataset, DataLoader
from torch.utils.tensorboard import SummaryWriter
from torchinfo import summary

Set seeds for reproducibility.

In [2]:
random_seed = 1
np.random.seed(random_seed)
torch.manual_seed(random_seed)
torch.cuda.manual_seed(random_seed)
torch.backends.cudnn.benchmark = False
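A slightly broader seeding helper might look like the sketch below. `seed_everything` is a hypothetical name, not part of this repository; it additionally seeds Python's `random` module and all CUDA devices, and asks cuDNN for deterministic kernels. `torch` is imported optionally here only so that the snippet also runs where PyTorch is missing.

```python
import random

import numpy as np

try:
    import torch
except ImportError:  # lets the sketch run without PyTorch installed
    torch = None


def seed_everything(seed: int) -> None:
    """Hypothetical helper: seed every RNG this notebook may touch."""
    random.seed(seed)
    np.random.seed(seed)
    if torch is not None:
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False


# Re-seeding reproduces the same NumPy draws.
seed_everything(1)
first = np.random.rand(3)
seed_everything(1)
second = np.random.rand(3)
print(np.array_equal(first, second))
```

Deterministic cuDNN kernels can be slower, so this is a trade-off rather than a default recommendation.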

Use CUDA if available.

In [3]:
collect()
torch.cuda.empty_cache()
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
print(f"Running on device: {device}")

Running on device: cpu


## Data loading and preprocessing

Download the data from https://cmu.app.box.com/s/n34hbfopwa3r6rftvtfn4ckc403hk43d and place it in a `data/` folder in your working directory (the paths in the next cell assume `data/train/trainData104/`). The following commands load the 104 x 104 inputs and targets into ndarrays of dicts.

In [4]:
samples = matFiles_to_npArray('data/train/trainData104/foo_dspl') # each dict has keys ['brdx', 'brdy', 'dspl', 'name']
dspl_radials = matFiles_to_npArray('data/train/trainData104/foo_dsplRadial') # each dict has keys ['dspl', 'name']
targets = matFiles_to_npArray('data/train/trainData104/foo_trac') # each dict has keys ['brdx', 'brdy', 'trac', 'name']
trac_radials = matFiles_to_npArray('data/train/trainData104/foo_tracRadial') # each dict has keys ['trac', 'name']

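Split the data into training and validation sets (95% / 5%). The radial and non-radial samples are split separately and then concatenated, so both sets contain roughly the same proportion of each kind.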

In [5]:
radial_X_train, radial_X_val, radial_y_train, radial_y_val = train_test_split(dspl_radials, trac_radials, test_size=0.05)
X_train, X_val, y_train, y_val = train_test_split(samples, targets, test_size=0.05)
X_train, X_val, y_train, y_val = np.append(radial_X_train, X_train), np.append(radial_X_val, X_val), np.append(radial_y_train, y_train), np.append(radial_y_val, y_val)
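The per-group split above can be pictured with plain NumPy. The sketch below is an illustration under assumptions, not the notebook's code: `split_indices` is a hypothetical helper, and the group sizes (200 radial, 1000 non-radial) are made up. Splitting each group on its own keeps roughly the same radial share in both sets.

```python
import numpy as np


def split_indices(n_items, val_fraction, rng):
    """Hypothetical helper: shuffle indices and cut off a validation part."""
    order = rng.permutation(n_items)
    n_val = int(np.ceil(val_fraction * n_items))
    return order[n_val:], order[:n_val]  # (train indices, validation indices)


rng = np.random.default_rng(1)
n_radial, n_regular = 200, 1000  # made-up group sizes

radial_train, radial_val = split_indices(n_radial, 0.05, rng)
regular_train, regular_val = split_indices(n_regular, 0.05, rng)

# Share of radial samples in each set after concatenating the two groups.
train_share = len(radial_train) / (len(radial_train) + len(regular_train))
val_share = len(radial_val) / (len(radial_val) + len(regular_val))
print(train_share, val_share)
```

With a single split over the pooled data, the radial share of a small validation set could drift noticeably from run to run.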

Extract displacement and traction fields from the data and drop (meta-) data which is not needed for training purposes.

In [6]:
X_train = extract_fields(X_train)
X_val = extract_fields(X_val)
y_train = extract_fields(y_train)
y_val = extract_fields(y_val)
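The internals of `extract_fields` are not shown in this notebook. As a rough mental model only, pulling one field out of each record and stacking the results could look like this; `stack_field` and the toy records are hypothetical.

```python
import numpy as np


def stack_field(records, key):
    """Hypothetical stand-in: stack one field from each record along a new sample axis."""
    return np.stack([record[key] for record in records])


# Toy records mimicking the keys listed above; values are zeros, not real data.
records = [
    {"name": f"sample_{i}", "dspl": np.zeros((104, 104, 2))}
    for i in range(3)
]

fields = stack_field(records, "dspl")
print(fields.shape)  # (3, 104, 104, 2)
```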

The datasets currently have shape (samples, width, height, depth).
Reshape them to (samples, channels, depth, height, width) so that 3D convolutions can be applied during training.

In [7]:
X_train = reshape(X_train)
X_val = reshape(X_val)
y_train = reshape(y_train)
y_val = reshape(y_val)
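One way such a reshape could be written in NumPy is sketched below. This is an assumption about what `reshape` does, based only on the shapes stated above; `to_conv3d_layout` is a hypothetical name, and a single channel is assumed.

```python
import numpy as np


def to_conv3d_layout(fields):
    """Hypothetical: (samples, width, height, depth) -> (samples, 1, depth, height, width)."""
    reordered = np.transpose(fields, (0, 3, 2, 1))  # -> (samples, depth, height, width)
    return reordered[:, np.newaxis]  # insert a singleton channel axis


batch = np.zeros((5, 104, 104, 2))
print(to_conv3d_layout(batch).shape)  # (5, 1, 2, 104, 104)
```

Note that `np.transpose` permutes axes, whereas `np.reshape` alone would keep the memory order and scramble the spatial layout.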

Convert the datasets to PyTorch tensors.

In [8]:
X_train = torch.from_numpy(X_train).double()
X_val = torch.from_numpy(X_val).double()
y_train = torch.from_numpy(y_train).double()
y_val = torch.from_numpy(y_val).double()

Specify batch sizes and number of workers (the proposed choice is based on my experience).

In [9]:
train_set = TensorDataset(X_train, y_train)
val_set = TensorDataset(X_val, y_val)

batch_size = 8

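if device.type == 'cpu':  # compare the device type string, not the device object
    num_workers = cpu_count()  # imported above via `from os import cpu_count`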
else:
    num_workers = 4 * torch.cuda.device_count()

dataloaders = {}
pin_memory = device.type == 'cuda'  # pinned host memory only helps when copying to a GPU
dataloaders['train'] = DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=num_workers, pin_memory=pin_memory)
dataloaders['val'] = DataLoader(val_set, batch_size=10*batch_size, num_workers=num_workers, pin_memory=pin_memory)
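The worker-count rule in the cell above can be isolated into a small function, which makes it easy to check without a GPU. `choose_num_workers` is a hypothetical helper; the factor of 4 workers per GPU is the heuristic used above, not a general rule.

```python
import os


def choose_num_workers(device_type, n_gpus, workers_per_gpu=4):
    """Hypothetical helper mirroring the heuristic: all cores on CPU, 4 per GPU otherwise."""
    if device_type == "cpu":
        return os.cpu_count() or 1  # os.cpu_count() may return None
    return workers_per_gpu * n_gpus


print(choose_num_workers("cuda", n_gpus=2))  # 8
print(choose_num_workers("cpu", n_gpus=0))   # number of logical cores on this machine
```

When starting from a `torch.device`, pass `device.type` (a plain string) rather than the device object itself.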

## Training

Define a custom loss function corresponding to the forward loss function of the MATLAB regression layer for image-to-image networks. Here $t_{p}$ are the targets, $y_{p}$ the predictions, and the sum runs over all $HWC$ (height * width * channels) elements:

$$\text{loss} = \frac{1}{2} \sum_{p=1}^{HWC} (t_{p} - y_{p})^{2}$$

In [10]:
class Custom_Loss(torch.nn.Module):
    def __init__(self):
        super().__init__()
    
    def forward(self, predictions, target):
        loss = 0.5 * torch.sum(torch.pow(target - predictions, 2))
        return loss
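As a quick sanity check of the formula, the same arithmetic can be reproduced on a tiny example. The sketch uses NumPy instead of PyTorch purely to stay dependency-light; `half_sse` is a hypothetical name.

```python
import numpy as np


def half_sse(predictions, targets):
    """Half the sum of squared errors over all elements."""
    diff = np.asarray(targets, dtype=float) - np.asarray(predictions, dtype=float)
    return 0.5 * float(np.sum(diff ** 2))


t = np.array([1.0, 2.0, 3.0])  # targets
y = np.array([1.5, 2.0, 2.0])  # predictions
loss = half_sse(y, t)
print(loss)  # 0.5 * (0.25 + 0.0 + 1.0) = 0.625
```

Because the errors are summed rather than averaged, the value grows with the number of elements passed in, and therefore with the batch size.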

Instantiate the model (including logs for evaluation), the optimizer and start training.

In [11]:
NAME = "TracNet104-{:%Y-%b-%d %H:%M:%S}".format(datetime.now())
writer = SummaryWriter(log_dir='{}'.format(NAME))
model = TracNet(n_channels=1).double()
model.to(device)
model.apply(initialize_weights)

optimizer = torch.optim.Adam(model.parameters(), lr=0.0006, weight_decay=0.0005)
scheduler = StepLR(optimizer, step_size=10, gamma=0.7943, verbose=True)
loss_fn = Custom_Loss()

fit(model, loss_fn, scheduler, dataloaders, optimizer, device, writer, NAME, max_epochs=1, patience=5)

Adjusting learning rate of group 0 to 6.0000e-04.


Epoch 1: 100%|██████████| 2/2 [02:02<00:00, 61.12s/batch]


Adjusting learning rate of group 0 to 6.0000e-04.


Epoch 1: 100%|██████████| 1/1 [00:01<00:00,  1.77s/batch]


Epoch 1/1, train_loss: 239.031, train_rmse: 21.865, val_loss: 334.076, val_rmse: 25.849
best val_rmse: 25.849, epoch: 1, best_epoch: 1, current_patience: 5
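The scheduler configured above multiplies the learning rate by `gamma` every `step_size` epochs. A minimal sketch of that arithmetic, in plain Python rather than via `StepLR` itself (`step_lr` is a hypothetical name):

```python
def step_lr(initial_lr, gamma, step_size, epoch):
    """Learning rate after `epoch` scheduler steps under a step-decay rule."""
    return initial_lr * gamma ** (epoch // step_size)


# Values taken from the optimizer and scheduler settings above.
for epoch in (0, 9, 10, 20, 100):
    print(epoch, step_lr(0.0006, 0.7943, 10, epoch))
```

Since 0.7943 is approximately 10^(-0.1), ten decay steps (100 epochs) shrink the learning rate by roughly a factor of ten. With `max_epochs=1`, as in the run shown here, the rate stays at its initial value.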
