
MoExDA: Domain Adaptation for Edge-based Action Recognition


This study uses edge detection to suppress static bias in action recognition. However, introducing edge frames causes a domain shift with respect to the RGB information, degrading recognition performance. To address this, we propose MoExDA (Moment Exchange Domain Adaptation), a lightweight domain adaptation method that performs moment exchange inside a Vision Transformer to bridge the gap between RGB and edge information, mitigating the performance drop.
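As a rough illustration (not the repository's implementation), moment exchange replaces the first and second moments of one stream's features with those of the other stream. The sketch below assumes PONO-style moments over the channel axis and the edge_to_rgb direction:

```python
import numpy as np

def moment_exchange(rgb, edge, eps=1e-5):
    """Normalize the RGB features, then re-scale them with the edge
    stream's mean and std (the edge_to_rgb direction)."""
    # PONO-style moments: computed over the channel axis (axis 1)
    mu_r = rgb.mean(axis=1, keepdims=True)
    sig_r = rgb.std(axis=1, keepdims=True) + eps
    mu_e = edge.mean(axis=1, keepdims=True)
    sig_e = edge.std(axis=1, keepdims=True) + eps
    return (rgb - mu_r) / sig_r * sig_e + mu_e

rng = np.random.default_rng(0)
rgb = rng.normal(2.0, 3.0, size=(2, 8, 16))   # (batch, channels, tokens)
edge = rng.normal(0.0, 1.0, size=(2, 8, 16))
mixed = moment_exchange(rgb, edge)
# mixed now carries the edge stream's per-token moments
```

The shapes, axis choice, and function name here are illustrative; see the model file referenced under Usage for the actual implementation.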

Installation

conda env create -f environment.yml

Dataset Preparation

Action Recognition Datasets

Use make_shards.py (inside the make_shards folder) to create dataset shards. For more details, refer to tamaki-lab/webdataset-video.

UCF101

Training Shards
python3 make_shards/make_shards.py \
  -s ./2024_sugimoto_edge/datasets/UCF_shards_train \
  -d /path/to/your/raw_UCF101_train \
  -p UCF101 \
  -w 32 \
  --max_size_gb 1
Validation Shards
python3 make_shards/make_shards.py \
  -s ./2024_sugimoto_edge/datasets/UCF_shards_val \
  -d /path/to/your/raw_UCF101_val \
  -p UCF101 \
  -w 32 \
  --max_size_gb 1

HMDB51

Training Shards
python3 make_shards/make_shards.py \
  -s ./2024_sugimoto_edge/datasets/HMDB51_shards_train \
  -d /path/to/your/raw_HMDB51_train \
  -p HMDB51 \
  -w 32 \
  --max_size_gb 0.3
Validation Shards
python3 make_shards/make_shards.py \
  -s ./2024_sugimoto_edge/datasets/HMDB51_shards_val \
  -d /path/to/your/raw_HMDB51_val \
  -p HMDB51 \
  -w 32 \
  --max_size_gb 0.1

Kinetics50

We use Kinetics50, a subset of Kinetics-400 restricted to the classes that overlap with Mimetics.

Training Shards
python3 make_shards/make_mimetics_shards.py \
  -s ./2024_sugimoto_edge/datasets/Kinetics50_shards \
  -d /path/to/your/raw_Kinetics400_train \
  -p Kinetics50 \
  -w 32 \
  --max_size_gb 10
Validation Shards
python3 make_shards/make_mimetics_shards.py \
  -s ./2024_sugimoto_edge/datasets/Kinetics50_shards \
  -d /path/to/your/raw_Kinetics400_val \
  -p Kinetics50 \
  -w 32 \
  --max_size_gb 1

After completion, organize the dataset into the following structure:

2024_sugimoto_edge/
├── datasets/
│   ├── UCF101_shards/
│   │   ├── train/
│   │   └── val/
│   ├── HMDB51_shards/
│   │   ├── train/
│   │   └── val/
│   └── Kinetics50_shards/
│       ├── train/
│       └── val/

Usage

Refer to ./model/Moex_Video_Visiontransformer/moex_video_visiontransformer.py for the model implementation.

usage: python main_pl.py [-h]
                        [--use_moex]
                        [--moex_layers ML [ML ...]]
                        [--norm_type {in,pono}]
                        [--position_moex {BeforeMHA,AfterMHA,BeforeMLP,AfterMLP,AfterResidual}]
                        [--exchange_direction {edge_to_rgb,rgb_to_edge,bidirectional}]
                        [--stop_gradient {True,False}]

Options

Moment Calculation Methods

  • PONO (Positional Normalization). To use PONO for moment calculation:

    -norm pono

    See the PONO paper for details.

  • IN (Instance Normalization). To use IN for moment calculation:

    -norm in

    See the IN paper for details.
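The two choices differ only in which axis the moments are computed over. A minimal sketch, assuming ViT features of shape (batch, tokens, channels):

```python
import numpy as np

x = np.random.default_rng(0).normal(size=(2, 197, 768))  # (batch, tokens, channels)

# PONO: one (mean, std) pair per token position, computed across channels
mu_pono = x.mean(axis=-1, keepdims=True)  # shape (2, 197, 1)
sig_pono = x.std(axis=-1, keepdims=True)

# IN: one (mean, std) pair per channel, computed across token positions
mu_in = x.mean(axis=1, keepdims=True)     # shape (2, 1, 768)
sig_in = x.std(axis=1, keepdims=True)
```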

Position of MoExDA Modules

Number of Layers

Select which of the 12 ViT layers to insert MoExDA into:

-ml 0 1 2      # insert into layers 0–2
-ml 0 1 ... 11 # insert into all 12 layers

Position of Layers

Select where to insert within the TransformerBlock:

-pos_moex AfterMHA # after the multi-head attention block
-pos_moex AfterMLP # after the MLP block

Exchange Direction

Select the direction of moment exchange:

-ex_direction edge_to_rgb
-ex_direction rgb_to_edge
-ex_direction bidirectional
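The three directions can be sketched as follows (hypothetical helper, not the repository's code). In each case, the normalized features of one stream are re-scaled with the other stream's moments:

```python
import numpy as np

def _norm(x, eps=1e-5):
    # per-row mean/std over the last axis
    mu = x.mean(axis=-1, keepdims=True)
    sig = x.std(axis=-1, keepdims=True) + eps
    return (x - mu) / sig, mu, sig

def exchange(rgb, edge, direction="edge_to_rgb"):
    rgb_n, mu_r, sig_r = _norm(rgb)
    edge_n, mu_e, sig_e = _norm(edge)
    if direction == "edge_to_rgb":    # RGB features receive the edge moments
        return rgb_n * sig_e + mu_e, edge
    if direction == "rgb_to_edge":    # edge features receive the RGB moments
        return rgb, edge_n * sig_r + mu_r
    if direction == "bidirectional":  # both streams swap moments
        return rgb_n * sig_e + mu_e, edge_n * sig_r + mu_r
    raise ValueError(f"unknown direction: {direction}")

rng = np.random.default_rng(1)
rgb = rng.normal(size=(4, 8))
edge = rng.normal(size=(4, 8)) + 5.0
new_rgb, new_edge = exchange(rgb, edge, "edge_to_rgb")
```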

Stop Gradient

-stop_grad False  # Without stop gradient
-stop_grad True   # With stop gradient

Example Command

Configuration:

  • Moment Calculation Method: PONO

  • Exchange Direction: edge_to_rgb

  • Position of Moment Exchange: AfterMHA

  • Number of Layers: All layers (0–11)

  • Stop Gradient: False

python main_pl.py \
  -d Mimetics_wds \
  --shards_path ./datasets/Kinetics50_shards \
  -w 8 -b 2 -e 10 -lr 3e-4 \
  --optimizer SGD \
  -m Moexlayervit \
  --log_interval_steps 10 \
  --scheduler CosineAnnealingLR \
  --use_moex \
  -norm pono \
  -ex_direction edge_to_rgb \
  -pos_moex AfterMHA \
  -ml 0 1 2 3 4 5 6 7 8 9 10 11 \
  -stop_grad False \
  --use_pretrained

You can download the trained weights for this configuration from:

MoExDA_PONO_AfterMHA_All_layers_False.ckpt

Use the --use_pretrained option to load the downloaded .ckpt file.

Evaluation of Static Bias

We use a static bias evaluation dataset generated with the HAT toolkit. Download the dataset from princetonvisualai/HAT.

Option

To enable static bias evaluation, simply add --use_hat to the example command.

Example Command

python main_pl.py \
  -d Mimetics_wds \
  --shards_path ./datasets/Kinetics50_shards \
  -w 8 -b 2 -e 10 -lr 3e-4 \
  --optimizer SGD \
  -m Moexlayervit \
  --log_interval_steps 10 \
  --scheduler CosineAnnealingLR \
  --use_moex \
  -norm pono \
  -ex_direction edge_to_rgb \
  -pos_moex AfterMHA \
  -ml 0 1 2 3 4 5 6 7 8 9 10 11 \
  -stop_grad False \
  --use_pretrained \
  --use_hat

Citation

@inproceedings{sugimoto2025moexda,
  author       = {Takuya Sugimoto and Ning Ding and Toru Tamaki},
  title        = {MoExDA: Domain Adaptation for Edge-based Action Recognition},
  booktitle    = {Proceedings of the 19th International Conference on Machine Vision Applications (MVA 2025)},
  year         = {2025},
  month        = jul,
  day          = {26--28},
  address      = {Kyoto, Japan},
  note         = {Oral presentation (O2-1-2)},
}
