MARS-Sep: Multimodal-Aligned Reinforced Sound Separation

Zihan Zhang1,*, Xize Cheng1,*, Zhennan Jiang2,*, Dongjie Fu1, Jingyuan Chen1
Zhou Zhao1, Tao Jin1,†

1Zhejiang University     2CASIA
†Corresponding author. *Equal contribution

ICLR 2026

Paper     Project Page    

📣 Overview

Method Overview Diagram

📦 Data Preparation

🎵 MUSIC Dataset

Please refer to the script under dataset/music.

🔊 VGGSound Dataset

Please refer to the script under dataset/vggsound.

🚀 Installation

Clone the repository and set up the environment:

git clone https://github.com/mars-sep/ImageBind.git
cd ImageBind
pip install .
cd ..

git clone https://github.com/mars-sep/MARS-Sep.git
cd MARS-Sep/

conda create -n marssep python=3.10
conda activate marssep

pip install -r requirements.txt
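
Before training, it can help to confirm that the active environment actually sees the installed packages. The following is a minimal sketch, not part of the repository: the module names `torch` and `imagebind` are assumptions about what the ImageBind install provides, so adjust the list to match your setup.

```python
import importlib.util

# Assumed module names (not confirmed by this README): "torch" as the deep
# learning backend and "imagebind" as the package installed from ImageBind.
REQUIRED_MODULES = ["torch", "imagebind"]


def missing_modules(names):
    """Return the names that cannot be located in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If a module is reported missing, check that the `pip install` commands were run inside the `marssep` conda environment.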

🏃 Training

python train.py \
    -o exp/vggsound/marssep \
    -c conf/mars.yaml \
    -t data/vggsound/train.csv \
    -v data/vggsound/val.csv \
    --batch_size 128 \
    --workers 20 \
    --emb_dim 1024 \
    --train_mode image text audio \
    --is_feature \
    --feature_mode imagebind
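
For scripted or repeated runs, the same invocation can be assembled as an argument list, which avoids shell line-continuation mistakes. This is a sketch under stated assumptions: the helper `build_train_command` and the `--run` switch of this wrapper are hypothetical and not part of the repository; the flags and values are copied from the command above.

```python
import shlex
import subprocess
import sys


def build_train_command(out_dir, config, train_csv, val_csv,
                        batch_size=128, workers=20, emb_dim=1024):
    """Assemble the train.py call from this README as an argument list.

    Hypothetical helper: only the flags shown in the README are used.
    """
    return [
        sys.executable, "train.py",
        "-o", out_dir,
        "-c", config,
        "-t", train_csv,
        "-v", val_csv,
        "--batch_size", str(batch_size),
        "--workers", str(workers),
        "--emb_dim", str(emb_dim),
        "--train_mode", "image", "text", "audio",
        "--is_feature",
        "--feature_mode", "imagebind",
    ]


if __name__ == "__main__":
    cmd = build_train_command(
        "exp/vggsound/marssep",
        "conf/mars.yaml",
        "data/vggsound/train.csv",
        "data/vggsound/val.csv",
    )
    print(shlex.join(cmd))  # show the command that would be executed
    if "--run" in sys.argv:  # wrapper-only switch, not a train.py flag
        subprocess.run(cmd, check=True)
```

Without `--run` the script only prints the command, so it can be inspected before launching a long training job.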

🔬 Evaluation

Evaluate on MUSIC and VGGSound.

OMP_NUM_THREADS=1 python evaluate.py -o exp/vggsound/marssep/ -c conf/mars.yaml -l exp/vggsound/marssep/eval_MUSIC_VGGS.txt -t data/MUSIC/solo/test.csv -t2 data/vggsound/test-good-no-music.csv --no-pit --prompt_ens --audio_source ./MUSIC-aq.npy

Evaluate on VGGSound-Clean+ and VGGSound.

OMP_NUM_THREADS=1 python evaluate.py -o exp/vggsound/marssep/ -c conf/mars.yaml -l exp/vggsound/marssep/eval_VGGS_VGGSN.txt -t data/vggsound/test-good.csv -t2 data/vggsound/test-no-music.csv --no-pit --prompt_ens --audio_source ./VGGSOUND-aq.npy

🔍 Inference

OMP_NUM_THREADS=1 python infer3.py -o exp/vggsound/marssep/  -i "demo/audio/hvCj8Dk0Su4.wav" --text_query "playing bagpipes" -f "exp/vggsound/marssep/hvCj8Dk0Su4/playing bagpipes.wav"
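
To separate several sounds from the same clip, the command above can be wrapped in a small loop over text queries. The sketch below is illustrative only: `build_infer_command` and the `--run` switch are hypothetical, the output layout mirrors the example path above, and whether `infer3.py` creates the output directory itself is not stated here, so the sketch creates it beforehand as a precaution.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_infer_command(exp_dir, audio_path, text_query):
    """Return (argument list, output path) for one infer3.py call.

    Hypothetical helper: the output goes to <exp_dir>/<audio stem>/<query>.wav,
    following the layout of the README example.
    """
    out_path = Path(exp_dir) / Path(audio_path).stem / f"{text_query}.wav"
    cmd = [
        sys.executable, "infer3.py",
        "-o", exp_dir,
        "-i", audio_path,
        "--text_query", text_query,
        "-f", str(out_path),
    ]
    return cmd, out_path


if __name__ == "__main__":
    queries = ["playing bagpipes"]  # extend with further text queries
    env = dict(os.environ, OMP_NUM_THREADS="1")  # same setting as the command above
    for query in queries:
        cmd, out_path = build_infer_command(
            "exp/vggsound/marssep/", "demo/audio/hvCj8Dk0Su4.wav", query
        )
        print(out_path)
        if "--run" in sys.argv:  # wrapper-only switch, not an infer3.py flag
            out_path.parent.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, env=env, check=True)
```

Passing arguments as a list also sidesteps quoting issues with queries and file names that contain spaces.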

📜 Citation

If you find our work useful for your research, please feel free to cite our paper:

@misc{zhang2025marssepmultimodalalignedreinforcedsound,
      title={MARS-Sep: Multimodal-Aligned Reinforced Sound Separation}, 
      author={Zihan Zhang and Xize Cheng and Zhennan Jiang and Dongjie Fu and Jingyuan Chen and Zhou Zhao and Tao Jin},
      year={2025},
      eprint={2510.10509},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2510.10509}, 
}
