Project page for 'The Methodology for Stereo Image Learning Representation'. (Machine Listening 2020 Fall)

changwoonchoi/ml2020

Stereo Image Learning Representation

Code for the paper 'The Methodology for Stereo Image Learning Representation' [pdf] by Changwoon Choi, Seongrae Kim, and Sangwoo Han.

Dataset gt_1 gt_2 gt_3
GANSynth [1] gs_1 gs_2 gs_3
Ours our_1 our_2 our_3

Audio samples corresponding to the stereo images above are available at the following Google Drive link.
(Sample Audios)

Installation

To run the codebase, you need Anaconda. Once Anaconda is installed, run the following commands to create and activate a conda environment and install the dependencies.

conda create -n ml2020 python=3.8
conda activate ml2020
pip3 install -r requirements.txt

Dataset

You can download our dataset at the following Google Drive link. [Dataset Link]
The compressed file contains the raw wav files, plus train.txt and test.txt, which define the train/test split.
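
The split files can be loaded with a few lines of Python. This is a minimal sketch that assumes each of train.txt and test.txt lists one entry per line; the README does not document the actual file layout, and read_split is a hypothetical helper that is not part of this repository.

```python
def read_split(split_file):
    """Return the non-empty lines of a split file, stripped of whitespace.

    Assumption: the file lists one entry (e.g. a wav file name) per line.
    """
    with open(split_file, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

For example, read_split("train.txt") would return the list of training entries, which you can then join with the directory that holds the extracted wav files.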

Train

1. Data preprocessing

To generate the preprocessed mel-spec and IF chunks, run prepare_data.py in the model/ directory.

python prepare_data.py
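
For readers unfamiliar with the IF feature, the sketch below shows one common way to compute it. It assumes that IF here means instantaneous frequency in the sense of GANSynth [1], that is, the time difference of the unwrapped STFT phase scaled by 1/pi. It is an illustration only and is not taken from prepare_data.py; the function name and the choice to keep the initial phase in the first frame are assumptions.

```python
import numpy as np


def instantaneous_frequency(phase, axis=-1):
    """Convert STFT phase angles (radians) to instantaneous frequency.

    `axis` is the time axis. The first output frame holds the initial
    (unwrapped) phase and the remaining frames hold frame-to-frame phase
    differences, all divided by pi.
    """
    # Remove the 2*pi jumps so the phase evolves smoothly over time.
    unwrapped = np.unwrap(phase, axis=axis)
    # Finite difference along time gives the phase advance per frame.
    dphase = np.diff(unwrapped, axis=axis)
    # Keep the first frame so the output has the same shape as the input.
    first = np.take(unwrapped, [0], axis=axis)
    return np.concatenate([first, dphase], axis=axis) / np.pi
```

Because the first frame and all later differences are kept, a cumulative sum along the time axis (after multiplying by pi) recovers the unwrapped phase.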

2. Train Network

You can train the network by running train-MS.py.
(You need to modify the root directories and data directories in train-MS.py first.)

python train-MS.py
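
The script name train-MS.py and the --type MS option of infer.py suggest a mid/side stereo representation, although the README does not state this. Under that assumption, a minimal sketch of the conversion between left/right and mid/side channels looks like the following; ms_encode and ms_decode are hypothetical names, and the 1/2 scaling is one common convention rather than necessarily the one used in this repository.

```python
def ms_encode(left, right):
    """Convert left/right channels to mid/side.

    Works elementwise on floats or NumPy arrays of equal shape.
    """
    mid = (left + right) / 2   # content shared by both channels
    side = (left - right) / 2  # content that differs between channels
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode, returning the left and right channels."""
    return mid + side, mid - side
```

With this scaling, decoding is a plain sum and difference, and a mono signal (left equal to right) has a side channel of zero.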

Test

1. Sample from trained model

Run the following command from the terminal in the ml2020 folder:

python infer.py --type MS --model MODEL_PATH --sample_num NUM_SAMPLES --save_dir PATH_TO_SAVE

2. Evaluate

To evaluate your outputs, run the following command:

python test_metrics.py --gen_dir PATH_TO_YOUR_OUTPUTS

License

References

[1] Jesse Engel et al., “GANSynth: Adversarial neural audio synthesis,” in International Conference on Learning Representations, 2019.
