# Train a proprioception-tuned CNN

This notebook trains a sensor processing model that encodes camera images with a CNN and finetunes the encoding using proprioception.

The encoding starts from a pretrained CNN model. Because the feature map produced by the CNN is still large (e.g. 512 * 7 * 7), an MLP reduces it to the final encoding.

We finetune the encoding by training the network to predict the robot's proprioception from the image.

The sensor processing object that uses a network trained this way is implemented in `sensorprocessing/sp_propriotuned_cnn.py`.
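
To make the size reduction concrete, the sketch below counts the parameters of an MLP that maps the flattened 512 * 7 * 7 feature map to a compact encoding. The hidden width of 512 is a hypothetical value chosen for illustration, and the latent size of 128 is an assumption based on the run name `resnet50_128`; the actual layer sizes come from the exp/run configuration.

```python
def mlp_param_count(layer_sizes):
    """Count the weights and biases of a fully connected MLP with the given layer widths."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total


feature_size = 512 * 7 * 7  # flattened CNN feature map from the text above
hidden_size = 512           # hypothetical hidden width, for illustration only
latent_size = 128           # assumed encoding size, inferred from the run name

print(feature_size)                                               # 25088
print(mlp_param_count([feature_size, hidden_size, latent_size]))  # 12911232
```

Almost all of these parameters sit in the first layer, which is why the width of the first hidden layer dominates the cost of the reduction.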

In [1]:
import sys
sys.path.append("..")

from exp_run_config import Config, Experiment
Config.PROJECTNAME = "BerryPicker"

import pathlib
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from tqdm import tqdm

from demonstration.demonstration import Demonstration

import sensorprocessing.sp_helper as sp_helper
from sensorprocessing.sp_propriotuned_cnn import VGG19ProprioTunedRegression, ResNetProprioTunedRegression
from robot.al5d_position_controller import RobotPosition

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cpu


In [2]:
# Dr. Boloni's path
# external_path = pathlib.Path("/Users/lboloni/Documents/Code/_ExternalExp")
print(Config())
external_path = pathlib.Path(Config()["experiment_external"])
# Sahara's path
# external_path = pathlib.Path("/home/sa641631/SaharaBerryPickerData/experiment_data")
assert external_path.exists(), f"External experiment path not found: {external_path}"
Config().set_experiment_path(external_path)
Config().copy_experiment("sensorprocessing_propriotuned_cnn")
Config().copy_experiment("robot_al5d")
Config().copy_experiment("demonstration")

Config().create_exprun_variant("sensorprocessing_propriotuned_cnn","resnet50_128", {"epochs": 17}, new_run_name="boo")

***ExpRun**: Loading pointer config file:
	/Users/lboloni/.config/BerryPicker/mainsettings.yaml
***ExpRun**: Loading machine-specific config file:
	/Users/lboloni/Google Drive/My Drive/LotziStudy/Code/PackageTracking/BerryPicker/settings/settings-szenes.yaml
<exp_run_config.Config object at 0x132bc4590>
***ExpRun**: Experiment config path changed to /Users/lboloni/Documents/Develop/Data/BerryPicker-external
***ExpRun**: Experiment sensorprocessing_propriotuned_cnn copied to /Users/lboloni/Documents/Develop/Data/BerryPicker-external/sensorprocessing_propriotuned_cnn
***ExpRun**: Experiment robot_al5d copied to /Users/lboloni/Documents/Develop/Data/BerryPicker-external/robot_al5d
***ExpRun**: Experiment demonstration copied to /Users/lboloni/Documents/Develop/Data/BerryPicker-external/demonstration
***ExpRun**: Configuration for exp/run: sensorprocessing_propriotuned_cnn/resnet50_128 successfully loaded
***ExpRun**: Exp/run variant sensorprocessing_propriotuned_cnn/boo created in /Users/

In [3]:
# The experiment/run we are going to run: the specified model will be created
experiment = "sensorprocessing_propriotuned_cnn"
# run = "vgg19_128"
run = "resnet50_128"
# run = "vgg19_256"
# run = "resnet50_256"
# run = "boo"
exp = Config().get_experiment(experiment, run)

exp_robot = Config().get_experiment(exp["robot_exp"], exp["robot_run"])

***ExpRun**: Configuration for exp/run: sensorprocessing_propriotuned_cnn/resnet50_128 successfully loaded
***ExpRun**: Configuration for exp/run: robot_al5d/position_controller_00 successfully loaded


### Create regression training data (image to proprioception)
The training data (X, Y) pairs every image from a demonstration (X) with the corresponding normalized proprioception vector (Y).

In [4]:
def load_images_as_proprioception_training(exp: Experiment, exp_robot: Experiment):
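    """Loads the training images specified in the exp/run and converts them into
    two tensors: the inputs (images) and the targets (normalized robot positions)
    for proprioception training.

    The processed tensors are cached in proprio_input.pth and proprio_target.pth
    in the data directory of the exp/run. Remove those files to recalculate.

    Returns a dictionary with the full tensors and a shuffled training/validation split.
    """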
    retval = {}
    proprioception_input_path = pathlib.Path(exp.data_dir(), "proprio_input.pth")
    proprioception_target_path = pathlib.Path(exp.data_dir(), "proprio_target.pth")

    if proprioception_input_path.exists() and proprioception_target_path.exists():
        retval["inputs"] = torch.load(proprioception_input_path, weights_only=True)
        retval["targets"] = torch.load(proprioception_target_path, weights_only=True)
    else:
        inputlist = []
        targetlist = []
        transform = sp_helper.get_transform_to_sp(exp)
        for val in exp["training_data"]:
            # Each entry is a (demonstration run, demonstration name, camera) triple
            run, demo_name, camera = val
            exp_demo = Config().get_experiment("demonstration", run)
            demo = Demonstration(exp_demo, demo_name)
            for i in range(demo.metadata["maxsteps"]):
                sensor_readings, _ = demo.get_image(i, device=device, transform=transform, camera=camera)
                inputlist.append(sensor_readings[0])
                a = demo.get_action(i)
                rp = RobotPosition.from_vector(exp_robot, a)
                anorm = rp.to_normalized_vector(exp_robot)
                targetlist.append(torch.from_numpy(anorm))
        retval["inputs"] = torch.stack(inputlist)
        retval["targets"] = torch.stack(targetlist)
        torch.save(retval["inputs"], proprioception_input_path)
        torch.save(retval["targets"], proprioception_target_path)

    # Separate the training and validation data.
    # Shuffle the individual samples (image/target pairs) before splitting.
    length = retval["inputs"].size(0)
    rows = torch.randperm(length)
    shuffled_inputs = retval["inputs"][rows]
    shuffled_targets = retval["targets"][rows]

    # The first 67% of the shuffled samples are used for training.
    # Slice from index 0 so that the first sample is not dropped.
    training_size = int( length * 0.67 )
    retval["inputs_training"] = shuffled_inputs[:training_size]
    retval["targets_training"] = shuffled_targets[:training_size]

    retval["inputs_validation"] = shuffled_inputs[training_size:]
    retval["targets_validation"] = shuffled_targets[training_size:]

    return retval
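
The shuffle-and-split step at the end of the function can be sketched in isolation as follows. This is a minimal illustration that works on plain index lists rather than tensors; the helper name `split_indices` and the `seed` parameter are hypothetical and not part of the project code. Seeding the shuffle is one way to make the split repeatable between runs, which the function above does not do.

```python
import random


def split_indices(length, training_fraction=0.67, seed=0):
    """Shuffle the indices 0..length-1 and split them into training and validation parts."""
    indices = list(range(length))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    training_size = int(length * training_fraction)
    # Slice from the start so that every index lands in exactly one of the two parts
    return indices[:training_size], indices[training_size:]


train_idx, val_idx = split_indices(100)
print(len(train_idx), len(val_idx))
print(set(train_idx).isdisjoint(val_idx))  # True: no sample is in both parts
```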

In [5]:
exp["training_data"]

[['random-both-cameras-video', '2025_03_08__14_15_53', 'dev2']]

In [6]:
tr = load_images_as_proprioception_training(exp, exp_robot)
inputs_training = tr["inputs_training"]
targets_training = tr["targets_training"]
inputs_validation = tr["inputs_validation"]
targets_validation = tr["targets_validation"]

***ExpRun**: Experiment default config /Users/lboloni/Documents/Develop/Data/BerryPicker-external/demonstration/_defaults_demonstration.yaml was empty, ok.
***ExpRun**: Configuration for exp/run: demonstration/random-both-cameras-video successfully loaded


### Create a model that performs proprioception regression

In [7]:

if exp['model'] == 'VGG19ProprioTunedRegression':
    model = VGG19ProprioTunedRegression(exp, device)
elif exp['model'] == 'ResNetProprioTunedRegression':
    model = ResNetProprioTunedRegression(exp, device)
else:
    raise ValueError(f"Unknown model {exp['model']}")

if exp['loss'] == 'MSELoss':
    criterion = nn.MSELoss()
elif exp['loss'] == 'L1Loss':
    criterion = nn.L1Loss()
else:
    raise ValueError(f"Unknown loss {exp['loss']}")

optimizer = optim.Adam(model.parameters(), lr=exp['learning_rate'])
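
The two loss options differ in how strongly they penalize large errors. The sketch below reimplements both in plain Python on made-up numbers, mirroring the default mean reduction of `nn.MSELoss` and `nn.L1Loss`; the prediction and target values are hypothetical and chosen only to show the effect of a single outlier.

```python
def mse_loss(predictions, targets):
    """Mean squared error, averaged over all elements."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def l1_loss(predictions, targets):
    """Mean absolute error, averaged over all elements."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical values: three small errors and one large outlier
targets = [0.0, 0.0, 0.0, 0.0]
predictions = [0.1, 0.1, 0.1, 2.0]

print(round(mse_loss(predictions, targets), 4))  # 1.0075
print(round(l1_loss(predictions, targets), 4))   # 0.575
```

The outlier contributes almost the entire squared-error loss, while it contributes proportionally to the absolute-error loss, so the L1 variant is less sensitive to occasional large prediction errors.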



In [8]:
# Create DataLoaders for batching
batch_size = exp['batch_size']
train_dataset = TensorDataset(inputs_training, targets_training)
test_dataset = TensorDataset(inputs_validation, targets_validation)

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

In [9]:
def train_and_save_proprioception_model(model, criterion, optimizer, modelfile, device="cpu", epochs=20):
    """Trains the proprioception model, evaluates it on the validation data,
    and saves its weights to modelfile.
    Uses the train_loader and test_loader defined in the cell above.
    FIXME: must have parameters etc to investigate alternative models.
    """
    model = model.to(device)
    criterion = criterion.to(device)
    # Training loop
    num_epochs = epochs
    for epoch in tqdm(range(num_epochs)):
        model.train()
        total_loss = 0
        for batch_X, batch_y in train_loader:
            batch_X = batch_X.to(device)
            batch_y = batch_y.to(device)
            # Forward pass
            predictions = model(batch_X)
            loss = criterion(predictions, batch_y)
            # Backward pass and optimization
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            total_loss += loss.item()

        # Report the mean of the per-batch training losses for this epoch
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {total_loss / len(train_loader):.4f}')

    # Evaluate the model
    model.eval()
    test_loss = 0
    with torch.no_grad():
        for batch_X, batch_y in test_loader:
            batch_X = batch_X.to(device)
            batch_y = batch_y.to(device)
            predictions = model(batch_X)
            loss = criterion(predictions, batch_y)
            test_loss += loss.item()

    test_loss /= len(test_loader)
    print(f'Test Loss: {test_loss:.4f}')
    torch.save(model.state_dict(), modelfile)
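
The reported losses are the sum of the per-batch mean losses divided by the number of batches. With the default `DataLoader` settings the last batch can be smaller than the others, in which case this value differs slightly from the mean over individual samples. The sketch below shows the difference on made-up per-sample losses; the numbers and the helper `batches` are hypothetical.

```python
def batches(items, batch_size):
    """Split a list into consecutive batches; the last one may be smaller."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Hypothetical per-sample losses for 10 samples, batch size 4 -> batches of 4, 4 and 2
per_sample_losses = [1.0] * 4 + [2.0] * 4 + [5.0] * 2

batch_means = [sum(b) / len(b) for b in batches(per_sample_losses, 4)]
mean_of_batch_means = sum(batch_means) / len(batch_means)           # what the loop reports
per_sample_mean = sum(per_sample_losses) / len(per_sample_losses)   # weighted by batch size

print(batch_means)                     # [1.0, 2.0, 5.0]
print(round(mean_of_batch_means, 4))   # 2.6667
print(round(per_sample_mean, 4))       # 2.2
```

The small final batch is over-weighted in the mean of batch means. For comparing epochs of the same run this rarely matters, but it is worth keeping in mind when comparing runs with different batch sizes.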

In [10]:
modelfile = pathlib.Path(
    exp["data_dir"], exp["proprioception_mlp_model_file"])
epochs = exp["epochs"]
if modelfile.exists():
    print("*** Train-Propriotuned-CNN ***: NOT training; model already exists, loading it")
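    # map_location lets a model saved on a GPU machine load on a CPU-only one
    model.load_state_dict(torch.load(modelfile, map_location=device, weights_only=True))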
else:
    train_and_save_proprioception_model(model, criterion, optimizer, modelfile, device=device, epochs=epochs)

Epoch [1/40], Loss: 8.5426
Epoch [2/40], Loss: 6.9249
Epoch [3/40], Loss: 6.1676
Epoch [4/40], Loss: 5.2943
Epoch [5/40], Loss: 4.9973
Epoch [6/40], Loss: 3.5385
Epoch [7/40], Loss: 2.9990
Epoch [8/40], Loss: 2.8541
Epoch [9/40], Loss: 2.8529
Epoch [10/40], Loss: 2.8983
Epoch [11/40], Loss: 3.0879
Epoch [12/40], Loss: 3.9244
Epoch [13/40], Loss: 2.6992
Epoch [14/40], Loss: 2.8960
Epoch [15/40], Loss: 2.2290
Epoch [16/40], Loss: 1.7186
Epoch [17/40], Loss: 1.7373
Epoch [18/40], Loss: 1.5508
Epoch [19/40], Loss: 1.5922
Epoch [20/40], Loss: 1.4989
Epoch [21/40], Loss: 1.6507
Epoch [22/40], Loss: 1.4576
Epoch [23/40], Loss: 1.3481
Epoch [24/40], Loss: 1.1859
Epoch [25/40], Loss: 1.1724
Epoch [26/40], Loss: 1.1450
Epoch [27/40], Loss: 1.3815
Epoch [28/40], Loss: 0.9233
Epoch [29/40], Loss: 1.0483
Epoch [30/40], Loss: 1.0285
Epoch [31/40], Loss: 1.5776
Epoch [32/40], Loss: 1.3643
Epoch [33/40], Loss: 0.9918
Epoch [34/40], Loss: 0.8885
Epoch [35/40], Loss: 0.7675
Epoch [36/40], Loss: 0.8055
Epoch [37/40], Loss: 0.7391
Epoch [38/40], Loss: 0.7553
Epoch [39/40], Loss: 0.9380
Epoch [40/40], Loss: 0.7942
100%|██████████| 40/40 [21:53<00:00, 32.84s/it]

Test Loss: 1.1093
