CycleGAN-Assisted Domain Adaptation for UAV Payload Detection (DASC 2025)

This repository contains the code and pre-trained models used in our DASC 2025 paper on CycleGAN-assisted domain adaptation for UAV payload detection, where a classifier trained only on synthetic data is evaluated on real UAV flight imagery after a CycleGAN-based translation step.

The project includes:

  • PyTorch implementations of ResNet34 and EfficientNet-B2 classifiers.
  • Pre-trained weights trained on a simulated UAV payload dataset.
  • Scripts to reproduce the classification results (Table 1) and feature-space t-SNE visualization (Figure 3).


Project overview

The goal of this work is binary classification of a UAV as either:

  • loaded – UAV carrying a payload.
  • unloaded – UAV without payload.

Because collecting and annotating large real UAV datasets is difficult, we:

  1. Generate a synthetic dataset of UAV images in Microsoft AirSim under varied conditions (flight trajectories, lighting, backgrounds, sensor noise, camera settings, etc.).
  2. Train deep classifiers (ResNet34 and EfficientNet-B2) only on synthetic images.
  3. Translate real test images into the synthetic domain using a pre-trained CycleGAN.
  4. Evaluate the classifiers on:
    • real images directly (no adaptation),
    • CycleGAN-translated real images (with adaptation).

This repository provides the code and pre-trained weights for steps (2) and (4), plus a feature-space analysis script that compares simulated, real, and CycleGAN-translated feature distributions.


How to use this repository

1. Environment

Create a Python environment with the following packages (example versions):

  • Python >= 3.9
  • PyTorch and torchvision (CUDA support recommended)
  • NumPy
  • Matplotlib
  • scikit-learn

Example (conda):

conda create -n dasc2025 python=3.9
conda activate dasc2025
pip install torch torchvision numpy matplotlib scikit-learn

Dataset details

Our experiments use two main types of data:

1. Synthetic UAV payload dataset (AirSim)

  • Total of 4,538 images, balanced between:
    • loaded class: 2,269 images
    • unloaded class: 2,269 images
  • Generated in Microsoft AirSim using four different quadrotor models and varied:
    • flight trajectories,
    • backgrounds and terrain,
    • lighting and weather conditions,
    • camera settings and sensor noise.
  • This dataset is used only for training the classifiers (no real data is used during training).

2. Real experimental UAV dataset

  • Real flight experiments with a target UAV (loaded/unloaded) observed by an RGB camera mounted on another UAV.
  • Real images are used only for testing:
    • once directly (no adaptation),
    • once after translation by CycleGAN into the synthetic style.

3. CycleGAN-translated real dataset

  • Real test images are passed through a pre-trained CycleGAN generator that translates them into the synthetic domain before classification.
  • Used to evaluate the effect of domain adaptation on classification performance and feature alignment.

Due to size and sharing constraints, the actual image datasets (DASC2025_datasetK) are not distributed with this repository. To reproduce our results, you should organize your data to match the expected folder structure.


Dataset structure

Each dataset is stored in a folder named:

DASC2025_datasetK/

where K is an integer (e.g., 1, 2, 3, ...). Inside each dataset folder, the expected structure is:

DASC2025_datasetK
├── Training
│   ├── loaded
│   └── unloaded
└── Testing
    ├── loaded
    └── unloaded
  • Training/loaded and Training/unloaded contain the synthetic (or synthetic-style) training images for each class.
  • Testing/loaded and Testing/unloaded contain the test images for each class, which may be:
    • real images (direct evaluation),
    • CycleGAN-translated real images,
    • or other variants depending on the experiment.
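
As a quick sanity check, any folder that follows this layout can be loaded directly with torchvision.datasets.ImageFolder. The sketch below uses DASC2025_dataset1 as a placeholder path:

# Minimal sketch: verify a dataset folder loads with the expected classes.
from torchvision import datasets, transforms

tfms = transforms.Compose([
    transforms.Resize((255, 255)),
    transforms.ToTensor(),
])

train_ds = datasets.ImageFolder("DASC2025_dataset1/Training", transform=tfms)
test_ds = datasets.ImageFolder("DASC2025_dataset1/Testing", transform=tfms)

print(train_ds.classes)            # expected: ['loaded', 'unloaded']
print(len(train_ds), len(test_ds))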

For the experiments reported in the DASC 2025 paper:

  • DASC2025_dataset1/

Contains the synthetic training data and the real test images used for direct (no-adaptation) evaluation of both classifiers.

  • DASC2025_dataset2/

Contains the CycleGAN-translated real test images used to evaluate the adaptation pipeline.

  • DASC2025_dataset3/

Used together with DASC2025_dataset1/ by feature_compare_v02.py to build the three domains (Simulated, Real, and CycleGAN-adapted real) for the t-SNE visualization and domain alignment metrics.

You are free to define additional DASC2025_datasetK folders, as long as they follow the same Training/Testing/loaded/unloaded structure. The classifier scripts and visualization script only require this directory layout and the correct data_dir/folders paths.


Dataset stats

At minimum, the following statistics should hold for the synthetic training dataset:

Split      Domain      Class      # Images
Training   Synthetic   loaded     2,269
Training   Synthetic   unloaded   2,269

In our experiments, we randomly divide the Training subset into:

  • Train: ~85% of images
  • Validation: ~15% of images

using a fixed random seed for reproducibility.
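
A minimal sketch of this split using torch.utils.data.random_split; the seed value 42 is illustrative, not taken from the paper, which only states that a fixed seed is used:

import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Illustrative 85/15 train/validation split with a fixed seed.
tfms = transforms.Compose([transforms.Resize((255, 255)), transforms.ToTensor()])
full_train = datasets.ImageFolder("DASC2025_dataset1/Training", transform=tfms)

val_size = int(0.15 * len(full_train))
train_subset, val_subset = random_split(
    full_train, [len(full_train) - val_size, val_size],
    generator=torch.Generator().manual_seed(42),  # assumed seed value
)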

The real and CycleGAN-translated test sets follow the same loaded/unloaded class structure in their respective Testing folders; exact counts depend on your recorded data.


Models and training

Architectures

We use two ImageNet-pretrained models from torchvision.models:

1. ResNet34

  • Final fully-connected layer replaced by a 2-unit linear layer for binary classification.
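
   In torchvision terms, this replacement might look like the following sketch (the weights argument assumes torchvision >= 0.13; older versions use pretrained=True):

   import torch.nn as nn
   from torchvision import models

   # Sketch of the ResNet34 head replacement; not copied verbatim from the scripts.
   resnet = models.resnet34(weights=models.ResNet34_Weights.IMAGENET1K_V1)
   resnet.fc = nn.Linear(resnet.fc.in_features, 2)  # classes: loaded / unloaded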

2. EfficientNet-B2

  • The classifier head is replaced with a dropout + 2-unit linear layer:
   self.network.classifier = nn.Sequential(
       nn.Dropout(p=0.3, inplace=True),
       nn.Linear(1408, 2),
   )

Both networks are wrapped in a common ImageClassificationBase class with:

  • cross-entropy loss,
  • accuracy computation,
  • convenience methods for training, validation, and logging.
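
A hedged sketch of what such a base class typically looks like; the exact method names in the repository may differ:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative wrapper providing loss, accuracy, and logging hooks.
class ImageClassificationBase(nn.Module):
    def training_step(self, batch):
        images, labels = batch
        return F.cross_entropy(self(images), labels)

    @torch.no_grad()
    def validation_step(self, batch):
        images, labels = batch
        out = self(images)
        loss = F.cross_entropy(out, labels)
        acc = (out.argmax(dim=1) == labels).float().mean()
        return {"val_loss": loss, "val_acc": acc}

    def epoch_end(self, epoch, result):
        # Simple logging hook used during training.
        print(f"Epoch {epoch}: val_loss={result['val_loss']:.4f}, "
              f"val_acc={result['val_acc']:.4f}")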

Input preprocessing

For all experiments, images are resized and converted to tensors:

transforms.Compose([
    transforms.Resize((255, 255)),
    transforms.ToTensor(),
])

applied separately to training and testing datasets.

Training procedure

Common settings:

  • Optimizer: torch.optim.SGD
  • Loss: cross-entropy
  • Batch size: 16
  • Device: GPU if available, else CPU
  • Train/validation split: 85% / 15% of Training subset

ResNet34

  • Epochs: 10
  • Max learning rate: 0.03
  • Weight decay: 1e-4
  • Gradient clipping: 0.1
  • Learning-rate schedule: one-cycle (OneCycleLR).

EfficientNet-B2

  • Epochs: 15
  • Max learning rate: 0.001
  • Weight decay: 1e-4
  • Gradient clipping: 0.1
  • Learning-rate schedule: one-cycle (OneCycleLR).

By default, the backbone is frozen for initial training, and then unfrozen for fine-tuning. The scripts then evaluate the final model on the Testing subset and (optionally) save the trained weights.
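
A condensed sketch of this loop with the ResNet34 settings; value-based gradient clipping and the DataLoader configuration are assumptions, and the actual scripts may differ in detail:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import models

# train_subset is the 85% split from the sketch in "Dataset stats".
device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet34(weights=models.ResNet34_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)
model = model.to(device)

train_loader = DataLoader(train_subset, batch_size=16, shuffle=True)
optimizer = torch.optim.SGD(model.parameters(), lr=0.03, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.03, epochs=10, steps_per_epoch=len(train_loader))

for epoch in range(10):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        loss = nn.functional.cross_entropy(model(images), labels)
        loss.backward()
        # Clipping by value is assumed; the scripts may clip by norm instead.
        nn.utils.clip_grad_value_(model.parameters(), 0.1)
        optimizer.step()
        optimizer.zero_grad()
        scheduler.step()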


Feature-space visualization (Figure 3)

To reproduce the t-SNE visualization and domain alignment metrics (cosine similarity & MMD) shown in Figure 3 of the paper, use:

  • feature_compare_v02.py

This script:

1. Loads a pre-trained ResNet34 from dasc_SimBased.pth.

2. Strips any "network." prefix from the state dict and loads the weights into a vanilla torchvision.models.resnet34.

3. Builds a feature extractor by removing the final classification layer and flattening the output.

4. Iterates over three folders corresponding to:

  • Simulated data (Training from DASC2025_dataset1),
  • Real data (Testing from DASC2025_dataset1),
  • CycleGAN data (Testing from DASC2025_dataset3).

5. Extracts deep features, computes:

  • cosine similarity between domain means,
  • Maximum Mean Discrepancy (MMD) between domain feature distributions,
  • a 2-D t-SNE embedding for all features.

6. Plots a scatter plot of the t-SNE embedding with a different color for each domain.

Before running:

  • Make sure the folders dictionary at the top of feature_compare_v02.py points to your actual dataset locations:
folders = {
    "Simulated": "/path/to/DASC2025_dataset1/Training",
    "Real": "/path/to/DASC2025_dataset1/Testing",
    "CycleGAN": "/path/to/DASC2025_dataset3/Testing"
}

Run:

python feature_compare_v02.py

The script will print cosine similarity and MMD values and display (or save, if you uncomment plt.savefig) the t-SNE figure.
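
For reference, steps 2–3 can be sketched as follows; this assumes dasc_SimBased.pth holds a plain state dict, and the actual script may handle other checkpoint formats:

import torch
import torch.nn as nn
from torchvision import models

# Step 2: load the checkpoint and strip the "network." prefix from each key.
state = torch.load("dasc_SimBased.pth", map_location="cpu")
state = {(k[len("network."):] if k.startswith("network.") else k): v
         for k, v in state.items()}

resnet = models.resnet34()
resnet.fc = nn.Linear(resnet.fc.in_features, 2)  # match the trained 2-class head
resnet.load_state_dict(state)

# Step 3: drop the final fc layer and flatten, giving a 512-d feature per image.
feature_extractor = nn.Sequential(*list(resnet.children())[:-1], nn.Flatten())
feature_extractor.eval()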


Evaluation metrics and main results

Metrics

We use:

  • Classification accuracy on the real test set:

    • Directly (real images only),
    • With CycleGAN translation (CycleGAN → Classifier).
  • Feature-space alignment metrics:

    • Cosine similarity between synthetic and real feature means,
    • Maximum Mean Discrepancy (MMD) between synthetic and real feature distributions.

These metrics are computed using the feature vectors extracted by ResNet34.
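
A minimal NumPy sketch of the two alignment metrics, assuming an RBF-kernel MMD; the script's kernel choice and bandwidth may differ:

import numpy as np

def cosine_similarity_of_means(A, B):
    # Cosine similarity between the mean feature vectors of two domains;
    # A and B are (n_samples, n_features) arrays.
    a, b = A.mean(axis=0), B.mean(axis=0)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mmd_rbf(A, B, gamma=1e-3):
    # Squared MMD with an RBF kernel; gamma is an assumed bandwidth, and the
    # O(n^2) pairwise computation is fine for a few hundred samples.
    def gram(X, Y):
        d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return float(gram(A, A).mean() + gram(B, B).mean() - 2 * gram(A, B).mean())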

Key results (Table 1, experimental data)

On real experimental images, we obtain:

Classifier        Input Type              Accuracy on real data
ResNet34          Real (Direct)           67%
ResNet34          CycleGAN → Classifier   82%
EfficientNet-B2   Real (Direct)           62%
EfficientNet-B2   CycleGAN → Classifier   80%

CycleGAN-based domain translation yields a substantial accuracy gain for both networks (+15 percentage points for ResNet34, +18 for EfficientNet-B2), demonstrating that aligning real images with the synthetic training domain significantly improves generalization.

Feature-space alignment (Figure 3)

Using ResNet34 features for three sets of images (Simulated, Real, CycleGAN-translated), we observed:

  • Cosine similarity (Simulated ↔ Real) ≈ 0.9031
  • Cosine similarity (Simulated ↔ CycleGAN-translated Real) ≈ 0.9898
  • MMD (Simulated ↔ Real) ≈ 0.0060
  • MMD (Simulated ↔ CycleGAN-translated Real) ≈ 0.0021

The t-SNE plot shows that CycleGAN-translated real samples cluster much closer to the simulated data than raw real images, explaining the improved classification accuracy.


Citation

If you use this code or pre-trained models in your research, please cite our DASC 2025 paper:
