celicvml/2s-DAS

2s-DAS

Paper link / 2s-DAS: Two-Stream Diffusion with Multi-Modal Fusion for Temporal Action Segmentation

Introduction

This project is the open-source implementation of 2s-DAS: Two-Stream Diffusion with Multi-Modal Fusion for Temporal Action Segmentation. It includes the full training code, inference scripts, and dataset, and aims to provide an efficient and reproducible research framework for temporal action segmentation.

Environment Setup

  • Python == 3.8
  • PyTorch == 1.10
  • CUDA == 11.3
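
To confirm that an existing environment matches the versions listed above, a small check like the following may help. This is a minimal sketch, not part of the repository: the helper names (major_minor, collect_versions, mismatches) are hypothetical, and only the major.minor part of each version is compared.

```python
import sys

# Versions listed in this README; only major.minor is compared.
REQUIRED = {"python": "3.8", "torch": "1.10", "cuda": "11.3"}


def major_minor(version):
    """Return the 'major.minor' prefix of a version string such as '1.10.2+cu113'."""
    parts = version.split("+")[0].split(".")
    return ".".join(parts[:2])


def collect_versions():
    """Gather installed versions; torch and cuda stay None if PyTorch is missing."""
    found = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": None,
        "cuda": None,
    }
    try:
        import torch
    except ImportError:
        return found
    found["torch"] = major_minor(str(torch.__version__))
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    if torch.version.cuda:
        found["cuda"] = major_minor(torch.version.cuda)
    return found


def mismatches(found, required=REQUIRED):
    """Return {name: (found, required)} for every entry that differs."""
    return {
        name: (found.get(name), want)
        for name, want in required.items()
        if found.get(name) != want
    }


if __name__ == "__main__":
    for name, (have, want) in mismatches(collect_versions()).items():
        print(f"{name}: found {have}, README lists {want}")
```

An empty result from mismatches means the environment agrees with the three listed versions at the major.minor level.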

Dataset Download

The dataset is available at the links above.

Raw video files are needed to extract features. Please download the RGB videos of each dataset from its official website (Breakfast / GTEA / 50Salads) and save them under the folder ./data/(name_dataset).
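
The folder layout described above can be prepared and checked with a short script. This is only a sketch: the folder names breakfast, gtea, and 50salads are assumptions (the README specifies only ./data/(name_dataset)), and the helper names are hypothetical.

```python
from pathlib import Path

# Assumed folder names; the README only specifies ./data/(name_dataset).
DATASETS = ("breakfast", "gtea", "50salads")


def dataset_dir(name, root="./data"):
    """Return the folder in which one dataset's raw videos are expected."""
    return Path(root) / name


def prepare_data_dirs(root="./data", names=DATASETS):
    """Create root/<name> for every dataset and return the paths."""
    paths = [dataset_dir(name, root) for name in names]
    for path in paths:
        path.mkdir(parents=True, exist_ok=True)
    return paths


def empty_dataset_dirs(root="./data", names=DATASETS):
    """Return the dataset names whose folder is missing or contains no files."""
    empty = []
    for name in names:
        path = dataset_dir(name, root)
        if not path.is_dir() or not any(path.iterdir()):
            empty.append(name)
    return empty


if __name__ == "__main__":
    prepare_data_dirs()
    for name in empty_dataset_dirs():
        print(f"No videos found yet in {dataset_dir(name)}")
```

Adjust DATASETS if the config files expect different folder names.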

Extract Features

Extract the features for 50Salads, GTEA, and Breakfast with the extractors provided by Br-Prompt and I3D.

Train your own model

You can retrain the model yourself with the following commands:

Generate the config files with python default_configs.py

Train with python main_two_stream.py --config configs/some_config.json --device gpu_id

Trained models and logs are saved in the result folder.

Evaluate with python eval.py

Test a single model with python predict.py --config configs/some_config.json --device gpu_id
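
The steps above can be collected into one ordered list of commands, which is convenient for scripting or logging a run. The sketch below only builds and prints the command lines; it does not execute them. The function name workflow_commands is hypothetical, and the default device value "0" is an assumed stand-in for gpu_id.

```python
import shlex


def workflow_commands(config="configs/some_config.json", device="0"):
    """Return the README's commands, in order, as argument lists."""
    shared = ["--config", config, "--device", str(device)]
    return [
        ["python", "default_configs.py"],            # generate config files
        ["python", "main_two_stream.py", *shared],   # train
        ["python", "eval.py"],                       # evaluate
        ["python", "predict.py", *shared],           # test one model
    ]


if __name__ == "__main__":
    # Dry run: print each command exactly as it would be typed in a shell.
    for argv in workflow_commands():
        print(shlex.join(argv))
```

To actually run a step, each argument list can be passed to subprocess.run(argv, check=True) from the repository root.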

Our model is adapted from DiffAct.

Citation

About

2s-DAS: Two-Stream Diffusion with Multi-Modal Fusion for Temporal Action Segmentation, published in the Journal of Imaging, 2026
