New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TTS: Text to speech support #1808
Conversation
There are 2 things, I would like to know the opinion of the community about.
|
@@ -81,7 +81,7 @@ bool confirmWindowsVersion(int majorVersion, int minorVersion) { | |||
return VerifyVersionInfoFunc(&versionInfo, VER_MAJORVERSION | VER_MINORVERSION, conditionMask); | |||
} | |||
|
|||
wchar_t *ansiToUnicode(const char *s, uint codePage) { | |||
wchar_t *ansiToUnicode(const char *s, unsigned int codePage) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint IS an assigned int: it’s defined as such in scummsys.h. So what’s the purpose of this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It didn't know the type when compiling with Visual Studio. I guess the scummsys.h wasn't included before this. But the fix seemed pretty easy (uint and unsigned int should be the same), so I changed it and didn't worry about it.
Is this something, I should change back?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please try and use the ScummVM specific types, they are defined appropriately for each platform we support
pthread_mutex_unlock(params->mutex); | ||
return NULL; | ||
} | ||
if(spd_say(_connection, SPD_MESSAGE, params->speechQueue->front().c_str()) == -1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing
|
||
// init voice | ||
hr = CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL, IID_ISpVoice, (void **)&_voice); | ||
if (!SUCCEEDED(hr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You use FAILED() above... if these two functions are mutually exclusive, you should use one or the other
|
||
_voice->SetOutput(_audio, FALSE); | ||
|
||
if (_ttsState->_availableVoices.size() > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can’t you use empty() here?
I posted this in Discord/IRC last week as a teaser, but not everybody may have seen it. So here it is in action (on macOS): |
Overall, pretty good work :) Well done! @criezy: which speech engine is used in the videos? |
In the video it is using the macOS speech engine (the one that comes with the system). |
@bluegr Thank you for the review, I think I addressed all of the issues you pointed out. As for the recordings on my youtube channel, even though they are older, everything on the videos should still be working the same. |
@bluegr I added more documentation to the TTSVoice and a comment to the malloc. I also changed the unsigned int back to uint in the win32_wrapper. I hope everything is allright now. |
Many thanks! This looks OK now, well done, and nice work overall! +1 from me |
@bluegr I suggested to rename it as this specific implementation uses the speech dispatcher library, but we could envisage other implementations on Linux. This is a bit similar to the |
@criezy: thanks for the explanation :) |
This is now in a good state to be merged to master, IMHO. Is there any other pending issue? If not, I would like to have this merged by next weekend (Sunday, 1st of September) |
It has merge conflicts because of the Common::Encoding PR #1809. And since the Common::Encoding is merged and the TTS needs to convert between encodings, I would like to use that. After that, I don't know of any issue. I will resolve the conflicts and reimplement the encoding conversions using Common::Encoding tomorrow, since today I am quite busy. |
Probably works only in the builtin theme right now.
Unfortunatedly the encoding used by ScummVM breaks the speech-dispatcher, so after trying to say non-ascii character the connection has to be restarted. So for now I am restricting the GUI TTS to english only.
Conversion happens only for languages, that might needed (not for english)
Similarly as on linux, there isn't enough control of the speech queue to properly implement INTERRUPT_NO_REPEAT. So since this commit we use our own queuing and use SAPI to speak each speech. This is done outside the main thread.
On linux the resume() behaves slightly differently than on other platforms.
Refactoring as suggested by bluegr on github.
Use defined(USE_TTS) && defined(PLATFORM) instead
Add a new define for the SpeechDispatcherManager
2e0839b
to
6fce403
Compare
I rebased the branch on top of the current master. I replaced the character encoding conversion code with the Common::Encoding::convert(), which should support more encodings. And because this pull request adds speech synthesis to the mortevielle engine, which uses CP850 character encoding and Common::Encoding didn't support this encoding when compiling without iconv and using SDL, which is also compiled without iconv, I added a CP850 to unicode (and back) conversion table to the Common::Encoding. |
CP850 is used by the mortevielle engine (and apparently by other engines too). Anytime an engine using CP850 encoding wants to use the TTS, the encoding has to be converted, so this is pretty important encoding conversion to support. Unfortunately SDL (when compiled without iconv) doesn't support this encoding (which means, there might not be a way to convert this encoding on some platforms), so I added a conversion table for this.
Apart from codepage 850, there’s also 437. Perhaps this should be supported natively as well? |
is it planed to offer several voices for different "actors" in the game engines? (if most engines allow such distinction) |
Yes, but it is not guaranteed, there might not be any voice available in some situations. This is why in the mortevielle engine we use as many voices as are available, but we also change the pitch a little bit for each character (so the voice sounds at least a little bit different, if there are not enough voices for each character). We also change the pitch a lot more, if we have to use a voice with a different gender, than the character. So if for example there are only male voices available, but we have a female character, we use a male voice, but we set the pitch really high. |
Great work! This can be merged now |
@vyzigold : Thanks for the work on TTS. However, this has broken the OSX Intel build on the buildbot. I think the issue is that the SDK we use is for OSX 10.4 which is the earliest version we support and thus the TTS API may have changed or be not present in that version. Any ideas, apart from just dropping support for earlier OSX versions which is not optimal... |
Indeed, some of the API I used to implement TTS on macOS were introduced in OSX 10.5. Since this is the minimal version supported for the official releases, I did not think buildbot was using an older SDK (the PPC official release does support OS X 10.4 though, but buildbot doesn't have a OS X PPC build). I can look at trying to fix this so that we have TTS support on OS X 10.4. But my suggestion would be instead to:
|
@digitall I noticed, that the OSX build is broken by this pull request, but because I don't have anything with OSX and I have never seen any Objective-C++ code before this GSoC, I wasn't able to do anything about it. I hope criezy manages to fix the code, so it works with the 10.4 SDK, otherwise it will probably have to be disabled by the configure script for older versions. |
@criezy: I think it would be best to bring the buildbot up to date to 10.5 SDK as per your release builds and add a 64-bit OSX Intel build since we are already getting bug reports from the OSX 10.15 dropping of 32-bit build support: https://bugs.scummvm.org/ticket/11136 |
TTS libraries can now be used in default Mingw-w64 environments. - Removes reference to sapiddk.h which isn't used and isn't in Mingw-w64 - Defines guids whose symbols are missing from Mingw-w64 - Restores TTS detection to configure script
This pull request adds text to speech support on Linux, Windows and macOS (author of the macOS text to speech manager is @criezy ). It also uses this to add text to speech to the mortevielle engine, to the GUI (after allowing it in options) and it adds text to speech tests to the Testbed engine. This is a GSoC task described here.
To run the text to speech on Linux you need:
Install and configure speech-dispatcher, either through your package manager, or from github
Install one (or more) text to speech engine, supported by speech-dispatcher are: eSpeak, Festival, Flite, Pico. I tested it (end it worked) with eSpeak-ng and Festival.
Once you are successfuly able to use the speech-dispatcher to speak from the terminal, everything should be set up correctly to work with ScummVM. For more info on configuring speech-dispatcher, look at speech-dispatcher documentation
To run the text to speech on Windows you need:
(Do all of these steps even for MinGW-w64, which has it's own SAPI. Their implementation unfortunately doesn't have all the needed features.)
Install SAPI, for example from here
From the Program Files*\Microsoft Speech SDK 5.*\Include copy the sapi.h, sapiddk.h and sperror.h to your include directory, so the compiler finds them.
From the Program Files*\Microsoft Speech SDK 5.*\Lib*\ copy the sapi.lib to your lib directory, so the compiler finds them. (with MinGW-w64 delete or rename libsapi.a in the MinGW-w64's lib folder)
You have to explicitly enable the TTS when running configure script with the --enable-tts option. Or if you are using Visual Studio, run the create_project with --enable-tts option.
The text to speech is disabled by default on Windows, because the compilation currently ends with a warning: "Warning: corrupt .drectve at end of def file", which is caused by using a msvc compiled library (sapi.lib) with MinGW. Even with the warning, everything seems to work correctly.