WaveUNet

A PyTorch implementation of Wave-U-Net.

Results

Model	SDR
Wave-U-Net	11.83
My improved version	12.51
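
For readers unfamiliar with the metric, here is a minimal sketch of one common simplified definition of SDR (signal-to-distortion ratio): 10 times the base-10 logarithm of the reference energy over the error energy. This is an assumption for illustration only. The numbers above may have been computed differently (for example by the repository's sdr.py, whose exact definition is not described here), and the function name sdr_db is hypothetical.

```python
import math


def sdr_db(reference, estimate):
    """Simplified SDR in dB: 10 * log10(sum(ref^2) / sum((ref - est)^2)).

    Assumes both sequences have the same length and the reference is not silent.
    """
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0:
        raise ValueError("reference is silent, SDR is undefined")
    if error_energy == 0:
        return math.inf  # perfect reconstruction
    return 10 * math.log10(signal_energy / error_energy)


# An estimate that is 10% too quiet everywhere scores about 20 dB.
print(sdr_db([1.0, 0.0, -1.0, 0.0], [0.9, 0.0, -0.9, 0.0]))
```

Under this definition a higher value means the estimate is closer to the reference.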

Train network

If you just want to train the model, run commandTrain.py:

python commandTrain.py --dataset both       # train on both ccmixter and musdb18
python commandTrain.py --dataset ccmixter   # train on ccmixter only
python commandTrain.py --dataset musdb18    # train on musdb18 only
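
The three --dataset choices correspond to the song index ranges given in the Dataset section (ccmixter is 0 to 49, musdb18 is 50 to 199). The sketch below shows one way such a flag could be parsed and mapped to indices. It is not the actual code of commandTrain.py: SONG_RANGES, parse_args and song_indices are hypothetical names, and the default value is an assumption.

```python
import argparse

# Index ranges taken from the Dataset section of this README.
SONG_RANGES = {
    "ccmixter": range(0, 50),
    "musdb18": range(50, 200),
    "both": range(0, 200),
}


def parse_args(argv=None):
    """Parse a --dataset flag restricted to the three documented choices."""
    parser = argparse.ArgumentParser(description="Hypothetical training front end")
    parser.add_argument("--dataset", choices=sorted(SONG_RANGES), default="both")
    return parser.parse_args(argv)


def song_indices(dataset):
    """Return the list of song indices expected on disk for a dataset choice."""
    return list(SONG_RANGES[dataset])


args = parse_args(["--dataset", "ccmixter"])
indices = song_indices(args.dataset)
print(f"{args.dataset}: {len(indices)} songs, {indices[0]}.wav to {indices[-1]}.wav")
```

Restricting the flag with choices makes argparse reject a misspelled dataset name before any training starts.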

To change the neural network model, choose one of these two imports:
from modelStruct.pyramidnet import Unet  # [1]
from modelStruct.unet import Unet  # [2]
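
If you would rather switch models by name than edit the import line, a small selector like the following could work. This is a sketch under assumptions: the keys "pyramid" and "unet" and the function load_unet_class are hypothetical, and it must be run from the repository root so that the modelStruct package is importable.

```python
import importlib

# Hypothetical short names mapped to the two modules named above.
MODEL_MODULES = {
    "pyramid": "modelStruct.pyramidnet",
    "unet": "modelStruct.unet",
}


def load_unet_class(name):
    """Import the chosen module lazily and return its Unet class."""
    if name not in MODEL_MODULES:
        raise ValueError(f"unknown model {name!r}, expected one of {sorted(MODEL_MODULES)}")
    module = importlib.import_module(MODEL_MODULES[name])
    return module.Unet
```

For example, Unet = load_unet_class("pyramid") would have the same effect as the first import above.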

Inference

If you have one or more mix songs, use inference.py with a saved model to predict their accompaniments:

python inference.py --checkpoint pyramid --test_number 1

--checkpoint is the checkpoint name and --test_number is the number of test songs.
Name the test songs 0.wav, 1.wav, 2.wav, etc., and put them in the folder ccmixter2/x.
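
The naming rule implies which files are read for a given --test_number. The sketch below illustrates this, assuming that N test songs means the files 0.wav to (N-1).wav. It is not the actual code of inference.py, and parse_args and mix_song_paths are hypothetical names.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the two flags shown in the inference command."""
    parser = argparse.ArgumentParser(description="Hypothetical inference front end")
    parser.add_argument("--checkpoint", required=True, help="name of the saved checkpoint")
    parser.add_argument("--test_number", type=int, required=True, help="number of test songs")
    return parser.parse_args(argv)


def mix_song_paths(test_number, folder=os.path.join("ccmixter2", "x")):
    """List the mix files 0.wav .. (test_number - 1).wav in the input folder."""
    return [os.path.join(folder, f"{i}.wav") for i in range(test_number)]


args = parse_args(["--checkpoint", "pyramid", "--test_number", "2"])
for path in mix_song_paths(args.test_number):
    print(path)
```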

Dataset

Put mix songs in the folder ccmixter2/x.
Put accompaniments in the folder ccmixter2/y.
Put vocal tracks in the folder ccmixter2/z.
The first 50 songs are from ccmixter and the last 150 songs are from musdb18.
Name the songs 0.wav, 1.wav, 2.wav, etc. in each of ccmixter2/x, ccmixter2/y and ccmixter2/z.
If you only use ccmixter, you should have 0.wav, 1.wav, 2.wav, up to 49.wav.
If you only use musdb18, you should have 50.wav, 51.wav, up to 199.wav.
If you use both ccmixter and musdb18, you should have 0.wav, 1.wav, 2.wav, 3.wav, up to 199.wav.
All audio is read as mono at a sample rate of 16000 Hz.
I use all ccmixter and musdb18 songs, 200 songs in total.
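
Before training, it can help to verify that the folder layout and audio format match the description above. The following is a minimal sketch using only the Python standard library. It assumes the files are PCM WAV (the standard wave module cannot open other encodings, whereas the repository itself reads audio with soundfile), and the helper names are hypothetical.

```python
import os
import wave


def expected_paths(root, indices):
    """Yield the mix (x), accompaniment (y) and vocal (z) path for each song index."""
    for i in indices:
        for sub in ("x", "y", "z"):
            yield os.path.join(root, sub, f"{i}.wav")


def missing_files(root, indices):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_paths(root, indices) if not os.path.isfile(p)]


def check_wav(path, rate=16000, channels=1):
    """Return True if the WAV header reports the expected sample rate and channel count."""
    with wave.open(path, "rb") as f:
        return f.getframerate() == rate and f.getnchannels() == channels
```

For example, missing_files("ccmixter2", range(200)) should return an empty list when both datasets are in place.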
The splits are defined as follows:
training_set = Dataset(np.arange(150), 'ccmixter2/')
test_set = Testset(np.arange(140,160), 'ccmixter2/')
validation_set = Valtset(np.arange(150,200), 'ccmixter2/')
As shown here, I use the first 150 songs as the training set and the last 50 songs as the validation set (to visualize the loss).
I also write the network's outputs for songs 140 to 159, which span both the training set and the validation set, to the folder vsCorpus.
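
The overlap between the three index ranges can be checked directly. This sketch uses plain Python ranges, which cover the same indices as the np.arange calls above. The variable names are illustrative.

```python
# Same index ranges as np.arange(150), np.arange(140, 160) and np.arange(150, 200).
train_idx = set(range(150))
test_idx = set(range(140, 160))
val_idx = set(range(150, 200))

# Songs written to vsCorpus that the network has seen during training.
seen_in_training = sorted(train_idx & test_idx)
# Songs written to vsCorpus that come from the validation set.
held_out = sorted(val_idx & test_idx)

print("seen in training:", seen_in_training[0], "to", seen_in_training[-1])
print("from validation:", held_out[0], "to", held_out[-1])
print("train/validation overlap:", sorted(train_idx & val_idx))
```

So half of the 20 songs written to vsCorpus (140 to 149) were seen during training, and the other half (150 to 159) were not.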

Installation

PyTorch 0.4
tensorboardX (for using TensorBoard with PyTorch; if you do not want to use TensorBoard, set USEBOARD to False)
soundfile
h5py
numpy
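
A switch like USEBOARD is typically honored by importing tensorboardX only when it is enabled, so the package stays optional. The sketch below shows that pattern. It is an assumption about how the flag could be used, not the repository's code: make_writer and the "runs" log directory are hypothetical, while SummaryWriter is the logging class provided by tensorboardX.

```python
USEBOARD = False  # set to True to log with TensorBoard


def make_writer(use_board, log_dir="runs"):
    """Return a tensorboardX SummaryWriter, or None when logging is disabled."""
    if not use_board:
        return None
    # Imported lazily so tensorboardX is only required when logging is enabled.
    from tensorboardX import SummaryWriter
    return SummaryWriter(log_dir)


writer = make_writer(USEBOARD)
if writer is not None:
    writer.add_scalar("loss/train", 0.0, 0)  # placeholder value for illustration
    writer.close()
```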

File descriptions

Different start points

trainForRandomGen.py (trains the model on ccmixter and musdb)
trainchinese.py (trains the model on Chinese songs)
trainclassify.py (uses classification instead of regression; classification generalizes as well as regression but produces much noisier output)
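
Treating waveform prediction as classification requires mapping each sample to a discrete class. One common scheme, used by WaveNet, is mu-law quantization into 256 classes. The sketch below illustrates that scheme only as an assumption: this README does not say which quantization trainclassify.py uses, and the function names are hypothetical.

```python
import math

MU = 255  # 256 classes as in WaveNet; an assumption, not taken from this repository


def mulaw_encode(x, mu=MU):
    """Map a sample in [-1, 1] to an integer class in [0, mu]."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range samples
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    return int((compressed + 1) / 2 * mu + 0.5)


def mulaw_decode(q, mu=MU):
    """Map an integer class in [0, mu] back to an approximate sample in [-1, 1]."""
    y = 2 * (q / mu) - 1
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


sample = 0.5
cls = mulaw_encode(sample)
print(cls, mulaw_decode(cls))  # the decoded value is close to, but not exactly, 0.5
```

The round trip is lossy, and that quantization error is one possible source of the extra noise mentioned for the classification variant.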

Tools

transformData.py (serves as the utils file)

Read Dataset

readccmu.py (reads ccmixter and musdb18)
readchinese.py (reads 20000 songs)
readpiano.py (reads piano songs downloaded from YouTube to train WaveNet; no longer used)

Model structure (all in folder modelStruct)

pyramidnet.py (uses filters with different dilation rates in the middle of the network to extract features, an idea borrowed from the DeepLab series)
quanunet.py (uses a softmax loss)
randomunet.py (my experiment; uses random dilation rates, inspired by [3])
unet.py (the classic Wave-U-Net [2])
unetd.py (Wave-U-Net with dilated filters)
resunet.py (an attempt to combine U-Net and ResNet)
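
To make the effect of dilation concrete: a 1-D convolution with kernel size k and dilation d covers d * (k - 1) + 1 input samples, so filters with different dilation rates see context at different scales. The sketch below computes this span. The kernel size and dilation rates are hypothetical values chosen for illustration, not those used in pyramidnet.py or unetd.py.

```python
def dilated_span(kernel_size, dilation):
    """Number of input samples covered by one dilated 1-D convolution filter."""
    return dilation * (kernel_size - 1) + 1


KERNEL_SIZE = 3          # hypothetical
DILATIONS = (1, 2, 4, 8)  # hypothetical

# Span of each parallel filter, keyed by its dilation rate.
spans = {d: dilated_span(KERNEL_SIZE, d) for d in DILATIONS}
for d, span in spans.items():
    print(f"dilation {d}: covers {span} samples")
```

With these values the widest filter covers 17 samples while using the same three weights as the narrowest one.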

ShichengChen/WaveUNet
