<h1 style="color:rgb(0,120,170)">ITS 3D Camery only challenge</h1>

The task was to predict cars in an image without the availability of LiDAR sensing. The model should output for every pixel if the pixel belongs to a car (1) or not (0).

<h2 style="color:rgb(0,120,170)">Dataset</h2>
Here is an example of an image with the ground truth:

![input and output for a random image in the test dataset](https://i.imgur.com/GD8FcB7.png)

The yellow area belongs to the car (=1).
The model should predict the yellow area, which belongs to the car.
For every pixel the target tensor contains either the value 0(=no car) or 1(=car).

In our case the data isn't that beatiful and we only have boxes given, where the car is included.

![Car Boxes](./images_notebook/example_boxes.png)

Therefore, our Target are basically boxes where the car is included. Here is an example:

![Input1](./images_notebook/example_image_input.png)

<h2 style="color:rgb(0,120,170)">Approach</h2>

First lets import the neceassary packages. We are using PyTorch.

In [1]:
from CNN import CNN
from UNet import UNet
from train import train

import torch
import torch.utils.data
import os
import warnings

from dataloader import get_dataloaders
from ImageDataset import ImageDataset
from evaluate import evaluate_model
from plot import plot

import numpy as np
from torch.utils.tensorboard import SummaryWriter
import tqdm


  from .autonotebook import tqdm as notebook_tqdm


Next we load one of our Models, in this case our UNet

In [2]:
net = UNet(n_channels=1, n_classes=1, bilinear=False)  # classes are 1 and 0 (car, no car) therefore 1 class

Next we define some Hyperparameters:

In [3]:
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

lr = 1e-3

weight_decay = 1e-5

optimizer = torch.optim.Adam(net.parameters(), lr=lr, weight_decay=weight_decay)

batchsize = 16

loss_fn = torch.nn.L1Loss()
loss_fn_new = torch.nn.BCEWithLogitsLoss()

nupdates = 5000

testset_ratio = 1 / 5

validset_ratio = 1 / 5

num_workers = 0

seed = 1234

resultpath = 'results/unet'

datapath = os.path.join(os.getcwd(), 'data/new_data.pkl')

print_stats_at = 100  # print status to tensorboard every x updates
validate_at = 200  # evaluate model on validation set and check for new best model every x updates
plot_images_at = 50 # plot model every 100 updates

np.random.seed(seed=seed)
torch.manual_seed(seed=seed);

Next we load our ImageDataset

In [4]:
image_dataset = ImageDataset(frame_path=datapath)

From the dataset we create our train, test and validation set loader

In [5]:
train_loader, valid_loader, test_loader = get_dataloaders(image_dataset, testset_ratio, validset_ratio, batchsize,
                                                              num_workers)

Next we initialize our tensorboard writer. Tensorboard is a great way to provide visualization and check how our Model is doing. 

In [6]:
writer = SummaryWriter(log_dir=os.path.join(resultpath, 'tensorboard'))



We use early stopping, which means our training stops if the validation loss doesn't increase anymore.

In [7]:
update = 0  # current update counter
best_validation_loss = np.inf  # best validation loss so far
no_update_counter = 0  # counter for how many updates have passed without a new best validation loss

We use the adam optimizer and the BCE with logits loss, which combines BCE with a sigmoid layer

In [8]:
# Move network to device
net.to(device)

# get loss
loss_fn = loss_fn_new

# get adam optimizer
optimizer = torch.optim.Adam(net.parameters(), lr=lr, weight_decay=weight_decay)

# Save initial model as "best" model (will be overwritten later)
torch.save(net, os.path.join(resultpath, 'best_model.pt'))

This is our evaluation function. It returns the average loss for a given dataset. 

In [9]:
def evaluate_model(model: torch.nn.Module, dataloader: torch.utils.data.DataLoader, device: torch.device, loss_fn):
    model.eval()
    loss = 0
    with torch.no_grad():
        for batch in dataloader:
            image_array = batch['image']
            target_array = batch['boxes']

            # move to device
            image_array = image_array.to(device, dtype=torch.float32)
            target_array = target_array.to(device, dtype=torch.float32)

            # get output
            output = model(image_array)

            loss += loss_fn(output, target_array)

    model.train() # setting model back to training mode
    loss /= len(dataloader)
    return loss

Now comes our training loop. The structure is rather basic. Our training ends if the maximum number of updates is reached or the validation loss hasn't increased after 3 checks.
We first load the data and pass it to the model. We calculate the weights update and check if our Model performs better than before with the help of the validation loss. If this is the case, the new model is saved. We also write some statistics to our tensorboard and the console. In addition, we create some plots.

In [10]:
# TRAIN

# initialize progressbar
update_progessbar = tqdm.tqdm(total=nupdates, desc=f"loss: {np.nan:7.5f}", position=0)  # progressbar
while update < nupdates and no_update_counter < 3: # stop training if val loss doesn't increase after 3 times or nupdates is reached
    for batch in train_loader:

        # get data
        image_array = batch['image']
        target_array = batch['boxes']
            
        # move image and target to device
        image_array = image_array.to(device, dtype=torch.float32)
        target_array = target_array.to(device, dtype=torch.float32)

        # Reset gradients
        optimizer.zero_grad()

        # Forward pass
        output = net(image_array)

        # Calculate loss
        loss = loss_fn(output, target_array)

        # Backward pass
        loss.backward()

        # Update weights
        optimizer.step()

        # Update progress bar
        update_progessbar.set_description(f"loss: {loss.item():7.5f}")
        update_progessbar.update(1)

        # Evaluate model on validation set
        if (update + 1) % validate_at == 0 and update > 0:
            val_loss = evaluate_model(net, dataloader=valid_loader, device=device, loss_fn=loss_fn)
            writer.add_scalar(tag="validation/loss", scalar_value=val_loss, global_step=update)
            # Add weights as arrays to tensorboard
            for i, param in enumerate(net.parameters()):
                writer.add_histogram(tag=f'validation/param_{i}', values=param,
                                        global_step=update)
            # Add gradients as arrays to tensorboard
            for i, param in enumerate(net.parameters()):
                writer.add_histogram(tag=f'validation/gradients_{i}',
                                         values=param.grad.cpu(),
                                         global_step=update)
            # Save best model for early stopping
            if best_validation_loss > val_loss:
                no_update_counter = 0
                best_validation_loss = val_loss
                print("\nNew best model found, saving...")
                torch.save(net, os.path.join(resultpath, 'best_model.pt'))
                print("Finished saving")

        # Print current status and score
        if (update + 1) % print_stats_at == 0:
            writer.add_scalar(tag="training/loss",
                                  scalar_value=loss.cpu(),
                                  global_step=update)

        if (update + 1) % plot_images_at == 0:
            prediction = (torch.nn.functional.sigmoid(output) > 0.5).float()
            plot(image_array.detach().cpu().numpy()[0, 0], target_array.detach().cpu().numpy()[0, 0],
                    prediction.detach().cpu().numpy()[0, 0], os.path.join(resultpath, "plots"), update)

        update += 1
        if update >= nupdates or no_update_counter >= 3:
            break

loss: 0.56803:   4%|██▌                                                             | 200/5000 [02:07<41:59,  1.91it/s]


New best model found, saving...
Finished saving


loss: 0.53395:  12%|███████▋                                                        | 600/5000 [08:39<44:02,  1.67it/s]


New best model found, saving...
Finished saving


loss: 0.47235:  20%|████████████▌                                                  | 1000/5000 [13:20<43:24,  1.54it/s]


New best model found, saving...
Finished saving


loss: 0.38343: 100%|█████████████████████████████████████████████████████████████| 5000/5000 [1:00:00<00:00,  1.91it/s]

After the training we evaluate our model on our training, validation and test set

In [11]:
print(f"Computing scores for best model")
net = torch.load(os.path.join(resultpath, 'best_model.pt'))
test_loss = evaluate_model(net, dataloader=test_loader, device=device, loss_fn=loss_fn)
val_loss = evaluate_model(net, dataloader=valid_loader, device=device, loss_fn=loss_fn)
train_loss = evaluate_model(net, dataloader=train_loader, device=device, loss_fn=loss_fn)

print(f"Scores:")
print(f"test loss: {test_loss}")
print(f"validation loss: {val_loss}")
print(f"training loss: {train_loss}")


Computing scores for best model
Scores:
test loss: 0.5455533266067505
validation loss: 0.5114364624023438
training loss: 0.490740031003952


<h2> Plots</h2>

Here we can see some plots after training:

![Ouptut1](./images_notebook/example_output_1.png)

![Ouptut2](./images_notebook/example_output_2.png)

![Ouptut3](./images_notebook/example_output_3.png)

## Interpretation

Like it can be seen in the images above, the model somehow makes a let's say acceptable job, knowing that the target image does not really represent the situation correctly. For the last image, we assume that the sequence of images is rather constant, that's why the prediction on such an input is pretty good. This also kind of shows that the model would make a good job, if there would've been totally right target arrays.