ForwardTacotron and HiFi-GAN support for NVDA Screen reader

Note: This add-on as well as the documentation is still under construction. Your contributions are welcome!

introduction

ForwardTacotron is a speech synthesis model implemented in PyTorch that uses a duration predictor to align text with the generated mel spectrograms. Its advantages include robustness, speed, efficiency, and control over pitch and energy.

This add-on is an attempt to bring ForwardTacotron support to the open-source NVDA screen reader through a client/server design, because the libraries it depends on, such as torch, cannot be bundled into NVDA directly.
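
To make the client/server split more concrete, here is a minimal sketch of what the client side could look like. It is only an illustration: the address, the `/synthesize` route, and the JSON payload are assumptions, not the add-on's real protocol, and it uses the standard library's `urllib.request` rather than the httplib2 client the add-on currently relies on.

```python
import json
import urllib.request

# Hypothetical address: the real server's host, port, and routes may differ.
SERVER_URL = "http://127.0.0.1:5000"


def build_synthesis_request(text, base_url=SERVER_URL):
    """Build a POST request asking the (assumed) /synthesize route to speak `text`."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/synthesize",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, base_url=SERVER_URL, timeout=10.0, opener=urllib.request.urlopen):
    """Send `text` to the server and return the raw bytes of the response body.

    `opener` is injectable so the function can be exercised without a running server.
    """
    request = build_synthesis_request(text, base_url)
    with opener(request, timeout=timeout) as response:
        return response.read()
```

With a design like this, the heavy dependencies (torch and the models) live only in the server process, and the NVDA side needs nothing beyond an HTTP client.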

This is a work in progress and therefore there is still a lot to do.

In the meantime, you can listen to the progress that has been made so far.

audio samples

Language	Voice	Sample
English	LJSpeech (with Griffin-Lim vocoder)
English	LJSpeech (with HiFi-GAN vocoder)
Spanish	Ald Dataset (with HiFi-GAN vocoder)
Spanish	Odal (with HiFi-GAN vocoder, universal model)

to do:

- Find a way to compile the server and integrate it into the add-on.
  - Once that is done, start the server when the synth is loaded. After the server has finished loading, the add-on can call a check to confirm that the speech synthesizer is ready for use.
  - Two versions of the add-on could be built, one with CPU support and one with GPU support, since synthesis apparently runs in real time on a GPU. Until then, slowdowns may be noticeable on the CPU.
- Support changing voice and energy from the synth settings ring.
- At the moment the add-on uses httplib2 to communicate with the server, but I could look for other methods and, if necessary, rewrite part of the server.
- Add support for loading different voices detected in a "voice_models" folder.
  - With this in place, support for downloading trained models could be added. We have one LJSpeech model in English, one in German, and two in Spanish.
- For newer multi-speaker models, the add-on could read the model settings to check whether the model is multi-speaker and, if so, look up the speaker names in the model and offer them as voices in the synth settings ring.
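
As a rough illustration of the last two ideas, voice detection and multi-speaker lookup could be sketched as below. The layout is an assumption made only for this example: each voice lives in its own subfolder of "voice_models" with a hypothetical `config.json`, and a multi-speaker model lists its speakers under a hypothetical `speaker_names` key. The real models may store this information differently.

```python
import json
from pathlib import Path


def discover_voices(models_dir="voice_models"):
    """Return {voice_name: [speaker names]} for every voice folder found.

    Assumed (hypothetical) layout: voice_models/<voice_name>/config.json,
    where config.json may hold a "speaker_names" list for multi-speaker models.
    Single-speaker voices map to an empty list.
    """
    voices = {}
    root = Path(models_dir)
    if not root.is_dir():
        return voices  # No folder yet, so there are no voices to offer.
    for config_path in sorted(root.glob("*/config.json")):
        try:
            config = json.loads(config_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # Skip unreadable or malformed voice folders.
        if not isinstance(config, dict):
            continue  # A config that is not a JSON object cannot be used here.
        speakers = config.get("speaker_names") or []
        voices[config_path.parent.name] = list(speakers)
    return voices
```

A synth driver could then list the dictionary keys as available voices and, for entries with a non-empty speaker list, expose those names as an additional setting.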

rmcpantoja/ForwardTacotron-NVDA