Autoencoder based onboard image segmentation

This project addresses the problem of semantic segmentation of onboard images acquired by vehicles operating in rural environments.
The dataset consists of still frames extracted from sequences captured with a moving camera, and the goal is to classify each pixel into one of the following 8 classes:

sky
rough trail
smooth trail
traversable grass
high vegetation
non-traversable low vegetation
puddle
obstacle

The proposed approach is based on an autoencoder architecture, trained, validated, and tested using K-Fold cross-validation.
Evaluation is performed with respect to both loss and mean Intersection-over-Union (mIoU), and this repository includes the code, trained models, and supporting documentation.
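
As an illustration of the mIoU metric mentioned above, the following is a minimal sketch that accumulates a confusion matrix and averages the per-class IoU. It assumes integer labels in the range 0 to 7 (one per class listed above) and averages only over classes that occur in the ground truth or the prediction; the function names are hypothetical, and the averaging rule used in the report may differ.

```python
import numpy as np

NUM_CLASSES = 8  # the eight classes listed above (index order is an assumption)


def confusion_matrix(pred, target, num_classes=NUM_CLASSES):
    """Return a (num_classes x num_classes) matrix: rows = ground truth, cols = prediction."""
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)
    # Encode each (target, pred) pair as one index and count occurrences.
    idx = target * num_classes + pred
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)


def mean_iou(conf):
    """Mean IoU over classes present in either the ground truth or the prediction."""
    tp = np.diag(conf).astype(np.float64)
    # union = predicted pixels + ground-truth pixels - intersection
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0
    return float((tp[present] / union[present]).mean())
```

Confusion matrices from several batches can be summed before calling `mean_iou`, which gives a dataset-level score rather than an average of per-batch scores. Pixels with a void or ignore label, if any exist, would need to be masked out before calling `confusion_matrix`.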

📚 Training Dataset

The training dataset is the Yamaha-CMU Off-Road Dataset (YCOR), which consists of 1076 images collected across four different locations in Western Pennsylvania and Ohio, spanning three different seasons.

The dataset was labeled using a polygon-based interface with eight semantic classes (sky, rough trail, smooth trail, traversable grass, high vegetation, non-traversable low vegetation, puddle, obstacle). Labels were further refined using a Dense CRF to densify polygon annotations, followed by manual inspection and corrections to ensure accuracy.

Compared to other benchmarks such as DeepScene, the YCOR dataset is considered more diverse and challenging, with less predictable structure and a higher pixel-wise error rate for baseline classifiers (0.51 vs. 0.30 in DeepScene). While relatively small compared to recent large-scale datasets, YCOR provides a valuable benchmark for evaluating segmentation in complex off-road rural environments.

📁 Directory Structure and Descriptions

.
├── BEST_MODELS/                       # Best models selected based on evaluation metrics (loss or mIoU)
│   ├── best_model.pth                 # Best model selected based on criteria detailed in the report
│   ├── loss/                          # Best models based on lowest loss
│   │   └── EXPERIMENT_NAME_i/
│   │       ├── json/                 # Contains metrics in JSON format for the best model (e.g., loss value, accuracy)
│   │       │   ├── best_model_by_loss_metrics_ki.json
│   │       │   └── experiment_name_i_results_loss.png  # Visualization of fold-wise performance metrics for the best model (selected by validation loss)
│   │       ├── pth/                  # Contains PyTorch weight file of the best model based on loss
│   │       │   └── best_model_by_loss_kn.pth
│   │       └── test.txt              # Numbers of test images to be used to evaluate the models
│   └── miou/                         # Best models based on highest mean Intersection-over-Union (mIoU)
│       └── EXPERIMENT_NAME_i/
│           ├── json/                 # JSON with evaluation metrics of the best mIoU model
│           │   ├── best_model_by_miou_metrics_ki.json
│           │   └── experiment_name_i_results_miou.png  # Visualization of fold-wise performance metrics for the best model (selected by mIoU)
│           ├── pth/                  # PyTorch weight file of the best model based on mIoU
│           │   └── best_model_by_miou_kn.pth
│           └── test.txt              # Numbers of test images to be used to evaluate the models
│
├── EXPERIMENTS/                       # Main experiment directory with K-Fold setups
│   └── EXPERIMENT_NAME_i/
│       ├── k-i/                      # Data and models for each fold i of the K-Fold validation
│       │   ├── best_model_by_loss_ki.pth / best_model_by_miou_ki.pth      # Best model weights for this fold
│       │   ├── best_model_by_loss_metrics_ki.json / best_model_by_miou_metrics_ki.json  # Metrics per fold
│       │   ├── val_accuracies.pth   # Accuracy history during validation
│       │   └── val_losses.pth       # Loss history during validation
│       └── splits/                  # Dataset splits used during training and validation
│           ├── k-0/
│           │   ├── train.txt        # Numbers of images used for training in fold 0
│           │   └── val.txt          # Numbers of images used for validation in fold 0
│           ├── k-i/
│           │   ├── train.txt        # Train set for fold i
│           │   └── val.txt          # Validation set for fold i
│           └── test.txt             # Test set used for final model evaluation (excluded from train/val)
│
├── train_2025_ml_gr51.ipynb          # Google Colab notebook used to train models
├── test_2025_ml_gr51.ipynb           # Google Colab notebook to load and test trained models
└── report.pdf                     # Final report with all project documentation

1. Submission Checklist

▪ Training code (Colab notebook)

✅ train_2025_ml_gr51.ipynb

▪ Training/validation data

✅ Located in:
- EXPERIMENTS/EXPERIMENT_NAME_i/splits/k-*/train.txt
- EXPERIMENTS/EXPERIMENT_NAME_i/splits/k-*/val.txt

▪ Train/validation split protocol

✅ Fully detailed via:
- train.txt and val.txt files in each k-* fold subfolder
- test.txt in splits/ (used as a common test set for all folds)
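
The layout above (a common test.txt plus one train.txt/val.txt pair per fold) could be produced by something like the following sketch. It is not the project's actual splitting code: the function name `make_splits`, the one-image-number-per-line file format, the shuffling seed, and the test fraction are all assumptions made for illustration; the real protocol is described in report.pdf.

```python
import random
from pathlib import Path


def make_splits(image_ids, k, test_fraction, out_dir, seed=0):
    """Write test.txt and k-<fold>/{train,val}.txt under out_dir (hypothetical helper)."""
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)  # deterministic shuffle for reproducibility

    # Hold out a common test set that no fold ever sees during training/validation.
    n_test = int(len(ids) * test_fraction)
    test_ids, pool = ids[:n_test], ids[n_test:]

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "test.txt").write_text("\n".join(map(str, test_ids)) + "\n")

    for fold in range(k):
        val_ids = pool[fold::k]  # every k-th remaining image goes to this fold's validation set
        val_set = set(val_ids)
        train_ids = [i for i in pool if i not in val_set]

        fold_dir = out / f"k-{fold}"
        fold_dir.mkdir(exist_ok=True)
        (fold_dir / "train.txt").write_text("\n".join(map(str, train_ids)) + "\n")
        (fold_dir / "val.txt").write_text("\n".join(map(str, val_ids)) + "\n")

    return test_ids, pool
```

With this scheme each non-test image appears in exactly one validation set, and the test images never appear in any train.txt or val.txt.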

▪ Trained models/weights

✅ Best models based on different metrics and folds:
- BEST_MODELS/loss/.../*.pth and *.json
- BEST_MODELS/miou/.../*.pth and *.json
- EXPERIMENTS/.../k-*/best_model_by_*.pth and *.json
✅ Overall best model across all folds and metrics:
- BEST_MODELS/best_model.pth (selected based on criteria detailed in the report)

▪ Test script (Colab notebook)

✅ test_2025_ml_gr51.ipynb

2. Test Script Requirements

The file test_2025_ml_gr51.ipynb includes:

✅ Code to load the trained model
✅ A function predict(model, X) with the following prototype:

predict(model, X: torch.Tensor) -> torch.Tensor

Where:

- model is a preloaded PyTorch model
- X has shape (batch_size, rows, cols, 3), dtype: uint8
- Return value has shape (batch_size, rows, cols, 1), dtype: uint8
✅ The function:
- Performs all required preprocessing
- Applies the trained model
- Executes postprocessing
- Works with any batch size
- Processes input without requiring the whole test set at once
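
A function with this prototype might look like the sketch below. This is not the notebook's implementation: the normalization constants (ImageNet-style mean and standard deviation) and the assumption that the model outputs per-class logits of shape (batch_size, num_classes, rows, cols) are illustrative guesses, and the actual preprocessing and postprocessing are those documented in report.pdf.

```python
import torch
import torch.nn as nn

# Assumed normalization constants (ImageNet-style); not taken from the project.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


@torch.no_grad()
def predict(model: nn.Module, X: torch.Tensor) -> torch.Tensor:
    """Map uint8 images (B, rows, cols, 3) to uint8 class indices (B, rows, cols, 1)."""
    model.eval()
    device = next(model.parameters()).device

    # Preprocessing: channels-last uint8 -> channels-first float in [0, 1], then normalize.
    x = X.to(device).permute(0, 3, 1, 2).float() / 255.0
    mean = torch.tensor(MEAN, device=device).view(1, 3, 1, 1)
    std = torch.tensor(STD, device=device).view(1, 3, 1, 1)
    x = (x - mean) / std

    # Inference: assumed output shape is (B, num_classes, rows, cols).
    logits = model(x)

    # Postprocessing: most likely class per pixel, back to channels-last uint8.
    labels = logits.argmax(dim=1, keepdim=True)  # (B, 1, rows, cols)
    return labels.permute(0, 2, 3, 1).to(torch.uint8).cpu()
```

Because the function only ever sees the batch it is given, it can be called batch by batch from a data loader, for example with a stand-in model such as `nn.Conv2d(3, 8, kernel_size=1)` when checking shapes and dtypes.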

3. Report and Presentation

✅ report.pdf includes:
1. Rationale behind dataset collection
2. Description of the train/validation splitting strategy
3. Preprocessing pipeline details
4. Selected network architecture
5. Training hyperparameters and loss function
✅ 8-minute presentation in English

SimoneFaraulo/Autoencoder-based-onboard-image-segmentation
