Baked AI voiceover #26

Open
lofcz opened this issue Oct 7, 2023 · 4 comments

Comments

@lofcz

lofcz commented Oct 7, 2023

Hey,
I'm not yet familiar with the codebase, but an idea struck me: we could extract all dialogue in plaintext from the localization files, build a dictionary <Actor, VoiceLine>, and generate the VO via ElevenLabs. The current patches would need to be modified to look up the correct VO. If we assign a GUID to every line of every dialogue, that lookup is just a dictionary lookup.
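To make the idea concrete, here is a minimal sketch in Python, purely for illustration. It assumes the localization data can be exported as JSON that maps a string key to an actor and a text; that layout, the `VoiceLine` type, the namespace value and the `.ogg` naming are all hypothetical. Deriving the GUID from the string key with `uuid.uuid5` means nobody has to assign IDs by hand, and the same key always yields the same GUID.

```python
import json
import uuid
from dataclasses import dataclass

# Hypothetical fixed namespace; any UUID works as long as it never changes.
VO_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


@dataclass(frozen=True)
class VoiceLine:
    actor: str
    text: str


def line_guid(string_key: str) -> uuid.UUID:
    """Derive a stable GUID from a localization key (same key, same GUID)."""
    return uuid.uuid5(VO_NAMESPACE, string_key)


def build_index(localization_json: str) -> dict[uuid.UUID, VoiceLine]:
    """Build the GUID -> VoiceLine dictionary from the assumed JSON layout."""
    entries = json.loads(localization_json)
    return {
        line_guid(key): VoiceLine(entry["actor"], entry["text"])
        for key, entry in entries.items()
    }


def vo_filename(guid: uuid.UUID) -> str:
    """Name of the pre-baked audio file for a line (naming scheme is assumed)."""
    return f"{guid}.ogg"
```

At playback time the patch would only need the string key of the line being shown: `vo_filename(line_guid(key))` gives the file to play.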

As someone knowledgeable in the topic at hand, does this sound feasible to you? Are there any major issues with this high-level plan? At the very least I could fund the ElevenLabs API usage, and should I have more time, I'd be interested in implementing the entire thing myself.

@Osmodium
Owner

Osmodium commented Oct 10, 2023

Hi!
Yes, this would be possible for everything that is in the translation files.

The drawbacks that I see with this are:

  • It would take a lot of time to generate.
  • It would take up an enormous amount of space, which people would have to download from Nexus (I'm not even sure they are able to host as much as it would take up).
  • It is very static: if they change any text in the game, the affected lines would need to be located and generated again.
  • It is limited to the voices that are chosen at generation time.
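On the point about changed text, one possible way to locate the affected lines is to store a hash of each line's text at the time its audio was generated and compare it against the current text. This is only a sketch of that idea; the manifest format and the function names are assumptions, not something the mod has today.

```python
import hashlib


def text_hash(text: str) -> str:
    """SHA-256 of the line text, used to notice when the text has changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def lines_to_regenerate(current: dict[str, str], manifest: dict[str, str]) -> list[str]:
    """Return IDs of lines whose audio is missing or was baked from older text.

    current:  line ID -> text as it is in the game now
    manifest: line ID -> hash of the text the existing audio was generated from
    """
    return sorted(
        line_id
        for line_id, text in current.items()
        if manifest.get(line_id) != text_hash(text)
    )
```

Only the IDs returned by `lines_to_regenerate` would need new audio after a game update; it does not help with the generation time or the download size.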

The positives:

  • More natural sounding voices.

Personally, I'm not a fan of introducing any of these drawbacks into the mod for the benefit of more natural-sounding voices.

@lofcz
Author

lofcz commented Oct 10, 2023

Thanks for the reply. I asked here because you have the required know-how, not with the intention of pushing this feature into the mod; I agree it is out of scope. I gave it some more thought and toyed with feeding a few sample lines from each actor to gpt3.5-instruct to get the actor's characteristics back in a structured format (approximate age, moral alignment, male/female, etc.).
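Roughly, the structured output could look like the sketch below (Python, for illustration). It only builds the prompt and parses a JSON reply; the actual model call is left out on purpose, and the field names, the `ActorProfile` type and the prompt wording are assumptions.

```python
import json
from dataclasses import dataclass


@dataclass
class ActorProfile:
    approximate_age: int
    moral_alignment: str
    gender: str


def build_prompt(actor: str, sample_lines: list[str]) -> str:
    """Assemble a prompt asking the model to describe an actor as JSON."""
    quoted = "\n".join(f"- {line}" for line in sample_lines)
    return (
        f"Here are some lines spoken by the character {actor}:\n{quoted}\n"
        'Reply with JSON only, using the keys "approximate_age" (integer), '
        '"moral_alignment" (string) and "gender" (string).'
    )


def parse_profile(reply: str) -> ActorProfile:
    """Parse the model's JSON reply; raises if a key is missing or malformed."""
    data = json.loads(reply)
    return ActorProfile(
        approximate_age=int(data["approximate_age"]),
        moral_alignment=str(data["moral_alignment"]),
        gender=str(data["gender"]),
    )
```

The resulting profile could then be used to pick a matching voice for each actor before generating any lines.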

Would you be interested in making a small tech demo that could play one pre-baked VO line somewhere at the start of the game? I'm still unfamiliar with the patches used in the codebase, so this would be a great head start for me.

Of course, only if this is not too much trouble for you!

@Iheuzio

Iheuzio commented Nov 23, 2023

ElevenLabs charges by character count. Given the amount of text in the game, it would cost too much to voice every line unless you happen to have 5k to spend. Even then, you would have to manually assign an ID to every speech line, so it is not very feasible.
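For anyone who wants to put a number on this, a rough estimate only needs the total character count and a price per 1,000 characters. The sketch below is Python for illustration; the price is a parameter you would have to fill in from the current ElevenLabs pricing, and no real rate is assumed here.

```python
from typing import Iterable


def total_characters(lines: Iterable[str]) -> int:
    """Sum the character counts of all lines that would be voiced."""
    return sum(len(text) for text in lines)


def estimate_cost(char_count: int, price_per_1000_chars: float) -> float:
    """Estimated cost for char_count characters at the given (assumed) rate."""
    return char_count / 1000 * price_per_1000_chars
```

Running `total_characters` over the extracted localization text would show whether the figure is in the range mentioned above.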

@curtwagner1984

@Iheuzio What if we crowdsource it?

Suppose that, on the software side, we have a working module that does what @lofcz suggested, and that it is released as a mod where users plug in their own (paid or free) ElevenLabs API key to generate new lines of speech.

The mod has a database of lines and audio files somewhere online. When a line needs to be spoken in the game, the mod first looks in that online database. If the audio file exists there, it is played. If it does not, the mod asks ElevenLabs to generate the audio, plays it, and then uploads it to the database. This lets people playing with the mod benefit from the lines generated by others and contribute to the audio database themselves.
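The flow described above is a cache-first lookup, which could be sketched like this (Python, for illustration). The shared online database is stood in for by a plain dictionary and the text-to-speech call is passed in as a function, so nothing here reflects a real ElevenLabs or hosting API; `SharedVoiceCache` and its method names are hypothetical.

```python
from typing import Callable


class SharedVoiceCache:
    """Look up audio in a shared store first, generate and upload on a miss."""

    def __init__(self, remote: dict[str, bytes], generate: Callable[[str, str], bytes]):
        self.remote = remote      # stand-in for the shared online database
        self.generate = generate  # stand-in for the text-to-speech request

    def get_audio(self, line_id: str, actor: str, text: str) -> bytes:
        audio = self.remote.get(line_id)
        if audio is None:
            # Miss: generate with the user's own account, then share the result.
            audio = self.generate(actor, text)
            self.remote[line_id] = audio
        return audio
```

With this shape, each line is generated at most once across all players, as long as every player derives the same `line_id` for the same line.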
