Project from the Deep Learning for Speech and Language Seminar (UPC - ETSETB - 2017)
Simulation_results
README.md
compare_results.py
compare_results_several.py
evaluate.sh
makefile
pho_rnn.png
pho_rnn.py
tasas
tasas.c
wcmudict.test.aligned
wcmudict.test.dict
wcmudict.train.aligned
wcmudict.train.dict
weights_Embedding.npy
weights_RNN1.npy


Text to phoneme project

Deep Learning for Speech and Language

The goal is to design, implement and train a neural network that predicts the phonemes of a word given its written form.

A presentation of the project can be found here (may require additional permissions): https://docs.google.com/presentation/d/1gp6NUnFIWxY4sGYGnY6zQMN-T3wniQhM5MK3-oAuwWI/edit#slide=id.g1bb33be1e7_0_20

  • pho_rnn.py is the main program; it trains and tests the network
  • compare_results.py and compare_results_several.py plot the results of different simulations so they can be compared
  • tasas.c counts the Goals (correctly predicted phonemes), Substitutions, Erasures (deletions) and Insertions made by the network, given the predicted and the correct phonemes
  • wcmudict... are the datasets used to train and test the network. "Aligned" means that the letter-to-phoneme correspondence is one-to-one
  • Simulation_results/ contains the simulation results
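
The kind of scoring described for tasas.c can be sketched in Python as follows. This is only an illustration, not the code in tasas.c: it assumes the counts come from a minimum-edit-distance (Levenshtein) alignment between the correct and the predicted phoneme sequences, and the function name `count_edit_operations` and the example phoneme symbols are hypothetical.

```python
def count_edit_operations(reference, predicted):
    """Return (hits, substitutions, deletions, insertions).

    Hypothetical helper: aligns the two phoneme sequences with a
    minimum-edit-distance table and counts each operation on the
    backtrace. Not taken from tasas.c.
    """
    n, m = len(reference), len(predicted)

    # cost[i][j] = minimum number of edits turning reference[:i] into predicted[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i  # delete every reference phoneme
    for j in range(1, m + 1):
        cost[0][j] = j  # insert every predicted phoneme
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = int(reference[i - 1] != predicted[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # hit or substitution
                cost[i - 1][j] + 1,             # deletion
                cost[i][j - 1] + 1,             # insertion
            )

    # Walk back from the bottom-right corner, counting each operation.
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = int(reference[i - 1] != predicted[j - 1])
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                if mismatch:
                    subs += 1
                else:
                    hits += 1
                i -= 1
                j -= 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return hits, subs, dels, ins


# Illustrative phoneme sequences (assumed, not taken from the wcmudict files).
correct = ["HH", "AH", "L", "OW"]
guess = ["HH", "EH", "L", "OW", "Z"]
hits, subs, dels, ins = count_edit_operations(correct, guess)
print(hits, subs, dels, ins)
```

When several alignments have the same cost, the backtrace prefers the diagonal step, so the individual counts can differ from those of another tie-breaking rule even though the total number of errors is the same.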