Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add C++ inference example and code #191

Merged
merged 16 commits into from Aug 20, 2020

Conversation

ZDisket
Copy link
Collaborator

@ZDisket ZDisket commented Aug 7, 2020

This is complete C++ code (from text processing to saving audio) for inference with TensorflowTTS/FastSpeech2 (phonetic MFA-aligned from my fork) and Multi-Band MelGAN using the Tensorflow C API. Can compile and run for Windows 64-bit out of the box(solution and project), but the code is cross-platform assuming one provides the required libraries. The project builds a simple command line program where one inputs sentences and they are generated and saved as WAVs.

There's a link for compiled binaries, libraries, and a sample model required to compile for Win64 in the README.

It will allow deploying TensorflowTTS models in a portable way into desktop environments.

@dathudeptrai
Copy link
Collaborator

@candlewill @machineko @azraelkuan can you guys take a look :))).

@dathudeptrai
Copy link
Collaborator

@ZDisket can we run it on ubuntu ? :))

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 7, 2020

@dathudeptrai Out of the box, you can try WINE, it works for simple console apps in my experience. Since the program has no platform dependent code, to compile, you just need libs and includes for Linux versions of Phonetisaurus, OpenFST and Tensorflow C API (that one can be downloaded, the rest you'll have to compile from source).

@machineko
Copy link
Contributor

machineko commented Aug 7, 2020

@dathudeptrai Out of the box, you can try WINE, it works for simple console apps in my experience. Since the program has no platform dependent code, to compile, you just need libs and includes for Linux versions of Phonetisaurus, OpenFST and Tensorflow C API (that one can be downloaded, the rest you'll have to compile from source).

Can you add it in readme or make some bash file to download dependencies based on platform?

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 7, 2020

@machineko I'll add to the README what has to be downloaded. The users will have to compile those from source.

@dathudeptrai dathudeptrai self-requested a review August 7, 2020 17:10
@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 7, 2020

@machineko @dathudeptrai The backend's mostly complete (going to start work on TensorVox now); look good? Any success running the demo in Ubuntu with WINE? Also, do any of these people that were requested know C++?

@machineko
Copy link
Contributor

@dathudeptrai Didnt have time yet to compile everything (im on arch not ubuntu but still shouldnt be a problem) after weekend I will try it :P

@dathudeptrai
Copy link
Collaborator

dathudeptrai commented Aug 7, 2020

Also, do any of these people that were requested know C++?

@candlewill is a master in C++ :)), also in TTS. @azraelkuan not sure. @machineko not sure :)). About me: almost zero =)))))))))))))))))))). So i can give you 1 approve rightnow, haha :)).

dathudeptrai
dathudeptrai previously approved these changes Aug 7, 2020
@@ -0,0 +1,99 @@
#include "Voice.h"
#include "ext/ZCharScanner.h"
const std::vector<std::string> Phonemes = { "AA","AA0","AA1","AA2","AE","AE0","AE1","AE2","AH","AH0","AH1","AH2","AO","AO0","AO1",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we consider making it more versatile and just load phonemes and phonemes id from files?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same about every other constants that can be transfered to file based config used in the project

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also considered that before deciding against it; this is just an example and if someone wants to adapt it to their dataset then they should take it then change it. It's not designed to be a one size fits all solution, but something that people can use as a base for their project.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah :)) it's just example :D, we do not need make it general :D. Flexible is more important :D

Copy link
Contributor

@machineko machineko Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe but every other example right now can be used in very easy way just changing few things in scripting part
I think we should move in this direction. (In future prepro there will be saving all needed mappers so you can load phonemes and other mappers from json files just like configs right now :P)

Still im not following any C++ standards so this is a question for @candlewill

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:))) for this case, i just to make sure it can be run then i will approve merge =)))))))))))))))). want to learn c++ :'(

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future prepro there will be saving all needed mappers so you can load phonemes and other mappers from json files just like configs right now

Why not leave it like this right now then later when those preprocessing steps are implemented I make it so that the program can load exported dictionaries? Because another issue is to agree on a format for storing the characters and IDs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just wait for @candlewill to check it and whole pr :P

@machineko
Copy link
Contributor

I have almost zero exp in c++ but a lot in pure C mostly for optimization :P

@machineko
Copy link
Contributor

Also did u do some benchmarks it is faster than python based version :D?

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 7, 2020

@machineko For a rough estimate of performance, see #179 (comment) . I never ran the Python version in my local PC, so I can't compare.

@machineko
Copy link
Contributor

I'll compare later in next week on i9-9900k :P

Copy link
Collaborator

@candlewill candlewill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ZDisket for providing these useful c++ code for inference on Windows OS.

To compile the cpp code, Microsoft visual studio is required and I don't have a working visual studio environment. Could you also provide a CMakeList file to make it possible to compile on linux based system?

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 10, 2020

@candlewill The only experience developing on Linux that I have is with Qt Creator, I can make a Qt project for compiling on Ubuntu 18.04.

@candlewill
Copy link
Collaborator

@ZDisket Well, I would base on your code to support CMake compilation.

@machineko
Copy link
Contributor

I also can't compile on gcc 7.5 on arch will upgrade gcc later

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 12, 2020

There's now Linux support, a precompiled demo can be downloaded (tested in Kubuntu 18.04) here, and build instructions (w/qmake) are included in the README.
Peek 2020-08-12 18-06

@machineko
Copy link
Contributor

machineko commented Aug 12, 2020

It's working for me on 7.5 gcc, on 9.3 I got (probably tensorflow stuff)

*** stack smashing detected ***: <unknown> terminated
Aborted.

I'll benchmark it later

@dathudeptrai
Copy link
Collaborator

dathudeptrai commented Aug 13, 2020

@ZDisket @machineko it run so fast in my laptop CPU :))

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 13, 2020

@dathudeptrai So, you like it?

@dathudeptrai
Copy link
Collaborator

@dathudeptrai So, you like it?

:))) the question is so serious :))). Just wait @candlewill review again :v

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 13, 2020

@dathudeptrai Don't worry, I was just surprised it was running fast, not in a hurry right now especially since I'm busy with the tool.

@machineko
Copy link
Contributor

machineko commented Aug 13, 2020

It works fine on ubuntu but I still can't compile it on arch (i've tried gcc 7.5, 8, 10.3 and diff versions of dependencies dunno if its cause of tensorflow but i won't be trying any more right now :P).
Also can u compile second version as CLI or some rest API to test it?

@orikama
Copy link

orikama commented Aug 15, 2020

Since I was getting this error: "was created with an older compiler than other objects; rebuild old objects and libraries" for Phonetisaurus and OpenFST libs while building this project in Visual Studio 2019, I wrote normal, not auto generated CMake, that builds Phonetisaurus, OpenFST and this project from source. Also tested it on Manjaro(arch) with gcc 10.
And changed Phonetisaurus sources a little to fix linker errors, because I couldn't find gcc analog for /FORCE :)

@orikama
Copy link

orikama commented Aug 15, 2020

CMakeLists.txt
Phonetisaurus.zip

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 15, 2020

@orikama My fork of Phonetisaurus that I made for Linux compiling already has the required change to not require /FORCE; soon I'll make that one Windows compatible too.
And while I appreciate those CMake lists you made, I won't add them here because I can't maintain them. This C++ implementation will be updated in the future to keep up with the repo's developments.

@orikama
Copy link

orikama commented Aug 15, 2020

@ZDisket why you just can't get me to help? Plus, you depend on QT in Phonetisaurus which is unnecessary.
And what will be you solution to people who might want to compile this in VS 2019 ?
Although, I don't mind. I'll just fork it then, want to play with it for a little.

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 15, 2020

@orikama

And what will be you solution to people who might want to compile this in VS 2019 ?

Tell them to compile Phonetisaurus and OpenFST from source in VS2019. I provide prebuilt for VS2017 for convenience, but there's nothing stopping them from compiling it themselves. I'll update my Phonetisaurus fork to support Windows in a bit.

Plus, you depend on QT in Phonetisaurus which is unnecessary.

Good point, I'll modify the Phonetisaurus README to use qmake instead of the full IDE.

why you just can't get me to help?

I assumed your contribution was just a one-time thing only. Let me think about it.

@orikama
Copy link

orikama commented Aug 15, 2020

@ZDisket btw AudioFile.hpp also gave me error, when I was trying to build it on manjaro with gcc 10.
I tried make CXXFLAGS="-fpermissive", but it didn't worked for me. I'm not familiar with gcc and make, so may be I did it wrong.
So I changed AudioFile also to make it work.

Tell them to compile Phonetisaurus and OpenFST from source in VS2019

I was lazy to figure out how to build Phonetisaurus with 'make' on windows, so I almost gave up :) Especially since you don't need to build it all and only need to include a couple files from it.

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 15, 2020

@orikama Weird that -fpermissive didn't work, but it did for me.

I was lazy to figure out how to build Phonetisaurus with 'make' on windows, so I almost gave up :) Especially since you don't need to build it all and only need to include a couple files from it.

That's why I'm providing my fork to allow people to compile it, because this doesn't use Phonetisaurus but rather libPhonetisaurus, my modification with all the executables gone and only libraries.

@orikama
Copy link

orikama commented Aug 15, 2020

@ZDisket If I'm not mistaken you still use more files from Phonetisaurus than you need to. You should check what files I left in my Phonetisaurus.zip

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 15, 2020

@orikama Interesting, I'll check it out.

@dathudeptrai
Copy link
Collaborator

@ZDisket Finish ?

@ZDisket
Copy link
Collaborator Author

ZDisket commented Aug 20, 2020

@dathudeptrai Should be, I don't have anything else to do except clean up the libPhonetisaurus repo, but of course that's somewhere else.

@dathudeptrai dathudeptrai removed the request for review from azraelkuan August 20, 2020 05:17
@dathudeptrai dathudeptrai merged commit b38187c into TensorSpeech:master Aug 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 🚀 New feature or request Feature Request 🤗 Feature support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants