PSIML6 Voice Style Transfer

The objective of this project is neural style transfer on audio files. The project implements the StarGAN neural network architecture to achieve many-to-many style transfer, applied to sound spectrograms. The network consists of a mostly convolutional generator and discriminator, together with a pretrained, mainly convolutional classifier.
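To make the spectrogram-based, many-to-many setup more concrete, here is a minimal sketch of how an audio signal could be turned into a magnitude spectrogram and conditioned on a target style. It assumes the conditioning used in the original StarGAN formulation, where the target domain label is broadcast into constant channels and concatenated with the generator input. Whether this repository does exactly that is an assumption. The helper names (`magnitude_spectrogram`, `condition_on_domain`), the STFT parameters, and the synthetic 440 Hz test tone are all hypothetical and chosen only for illustration.

```python
import numpy as np


def magnitude_spectrogram(signal, n_fft=256, hop=128):
    """Magnitude STFT of a 1-D signal, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop: i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Real FFT per frame, then transpose to (frequency, time).
    return np.abs(np.fft.rfft(frames, axis=1)).T


def condition_on_domain(spec, target, n_domains):
    """Stack the spectrogram with one constant channel per domain.

    The channel belonging to the target domain is all ones, the others are
    all zeros (a spatially broadcast one-hot label).
    """
    freq, time = spec.shape
    label_maps = np.zeros((n_domains, freq, time), dtype=spec.dtype)
    label_maps[target] = 1.0
    return np.concatenate([spec[None], label_maps], axis=0)


# Illustrative input: one second of a 440 Hz sine sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440.0 * t)

spec = magnitude_spectrogram(signal)
gen_input = condition_on_domain(spec, target=2, n_domains=3)
print(spec.shape, gen_input.shape)
```

A convolutional generator would then take `gen_input` (one spectrogram channel plus one channel per style) and produce a spectrogram in the target style, so a single generator can serve every source/target pair.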

Reference paper: https://arxiv.org/pdf/1806.02169.pdf

Reference implementation (git): https://github.com/liusongxiang/StarGAN-Voice-Conversion.git

Dataset: https://www.kaggle.com/andradaolteanu/gtzan-dataset-music-genre-classification


Emilija2000/PSIML6_Voice_style_transfer
