Skip to content

Releases: albertaparicio/tfg-voice-conversion

State at which this project was presented for evaluation

03 Aug 16:46
Compare
Choose a tag to compare

This is the state at which this project was presented for evaluation of my bachelor's thesis.

At this point, the DNN-LSTM-GRU model outputs intelligible speech, although not natural. The seq2seq model does not work. Neither the TensorFlow implementation nor the PyTorch one give speech output.

We believe the culprit of this bad performance is the attention model (see section 4.2.4 of the thesis).

Future work includes training the PyTorch and TensorFlow models as autoencoders (similar to the experiment described in section 3.3.2.3 of the thesis), as well as redefining the attention model as an RNN instead of a DNN