Spacemen TTS #68106

tralezab · 2022-06-30T07:01:51Z

About The Pull Request

Just a little funny experiment, or really, how bone rattling it is to API call from dm. This makes Spacemen speak with TTS, which they pick in character preferences.

spessmentts.mp4

fix prefix related things with silicons
it stack traces a lot on bootup
add a signal for post say so i can tts something without other signal hooks ruining things
Let people truly pick any seed they want (prefs didn't have a box for entering strings, as far as I know)

Shoutout to NovelAI for their incredibly impressive TTS model!

Why It's Good For The Game

Really mostly for luls, there are several reasons why this can't be permanently in the game.

Changelog

🆑 NovelAI's TTSv2, Armhulen, Special credits to Iamgoofball for initial exploration of the idea, code to reference, etc
add: Spacemen now speak with TTS.
/:cl:

carshalash · 2022-06-30T07:02:31Z

Oh god it's ready.

SkeletalElite · 2022-06-30T08:10:52Z

Speedmerge

Miczu555PL · 2022-06-30T08:32:21Z

Add this permanently as disabled feature by default.

I wonder how spells would sound..

SkeletalElite · 2022-06-30T08:41:46Z

I unironically like this a lot

mc-oofert · 2022-06-30T08:44:26Z

Separate languages such as felinid are now pointless because TTS reads them in English regardless of whether you understand them or not

optimumtact · 2022-06-30T09:00:01Z

Separate languages such as felinid are now pointless

what do you mean by now?

MMMiracles · 2022-06-30T09:06:13Z

Why is that TTS so good wtf

mc-oofert · 2022-06-30T09:09:58Z

Separate languages such as felinid are now pointless

what do you mean by now?

If you don't know the felinid language, for example, and someone speaks felinid, you won't understand the text, but the TTS voice will say it in English, so you can understand the felinid language without actually knowing it.

necromanceranne · 2022-06-30T09:11:54Z

Is one of those voices Keanu Reeves?

Jakksergal · 2022-06-30T11:29:11Z

I really want this to be added but I got a few concerns.

Does the API generate the AI text at the client, the server, or on NovelAI's server? If it's created on the NovelAI server, that presents a massive bandwidth usage for a server with 60+ users speaking unless there is an auxiliary script that prevents data being sent to players that couldn't hear it anyway.
Does the API support generating voice seeds on the fly? Does it support a variable configurable system to make new voices? If so, will players be able to use those features for their character?

Bluedino1025 · 2022-06-30T16:25:34Z

The tongue-tied perk gets very confusing. Also, would people talking on comms be heard?

tralezab · 2022-06-30T16:49:39Z

Separate languages such as felinid are now pointless because TTS reads them in English regardless of whether you understand them or not

I'll probably make it only work for GalCom next test.

I really want this to be added but I got a few concerns.

Like I said, can't /really/ be added.

Does the API generate the AI text at the client, the server, or on NovelAI's server? If it's created on the NovelAI server, that presents a massive bandwidth usage for a server with 60+ users speaking unless there is an auxiliary script that prevents data being sent to players that couldn't hear it anyway.

NovelAI's server, and yes, that's why I've been targeting lowpop. The solution is the same as in the 15.ai PR, aka a tgui window playing webroot CDN stuff. So it's fixable, but one sad part would be that you no longer hear the voices spoken in the environment, which I really like. Since I'm not trying to add it permanently, I'm sticking with environmental sounds because I love them.

Does the API support generating voice seeds on the fly? Does it support a variable configurable system to make new voices? If so, will players be able to use those features for their character?

Indeed it does. For the next test I'll probably remove the list of voices to pick from and just let people roll for new ones until they find one they like.
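
Editor's sketch of the two concerns raised above (bandwidth and language leakage): audio is only sent to listeners who are in range and who understand the language being spoken. This is a minimal illustration, not code from the PR. The `Listener` and `SpeechEvent` types, the `HEARING_RANGE` value, and the function name are all hypothetical, and it filters by "listener understands the language" rather than the GalCom-only approach mentioned in the reply.

```typescript
// Hypothetical types: none of these names come from the PR.
interface Listener {
  id: string;
  x: number;
  y: number;
  languages: string[]; // languages this listener understands
}

interface SpeechEvent {
  speakerX: number;
  speakerY: number;
  language: string;
  text: string;
}

// Assumed hearing radius in tiles; the real value is not stated in the thread.
const HEARING_RANGE = 7;

// Return only the listeners who should receive the generated audio:
// those within range of the speaker who also understand the spoken language.
function audioRecipients(event: SpeechEvent, listeners: Listener[]): Listener[] {
  return listeners.filter((l) => {
    const dx = Math.abs(l.x - event.speakerX);
    const dy = Math.abs(l.y - event.speakerY);
    // Chebyshev distance, the usual choice on a tile grid (an assumption here).
    const inRange = Math.max(dx, dy) <= HEARING_RANGE;
    const understands = l.languages.indexOf(event.language) !== -1;
    return inRange && understands;
  });
}
```

With a filter like this, a felinid sentence would produce audio only for nearby felinid speakers, and nobody outside earshot would be sent a sound file at all.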

AffectedArc07 · 2022-06-30T17:24:22Z

Oh god its 2013 para all over again

interface/skin.dmf

Iamgoofball · 2022-06-30T18:17:20Z

code/modules/surgery/organs/tongue.dm

@@ -50,8 +58,7 @@
 	..()
 	if(say_mod && tongue_owner.dna && tongue_owner.dna.species)
 		tongue_owner.dna.species.say_mod = say_mod
-	if (modifies_speech)
-		RegisterSignal(tongue_owner, COMSIG_MOB_SAY, .proc/handle_speech)
+	RegisterSignal(tongue_owner, COMSIG_MOB_SAY, .proc/handle_speech, override = TRUE)
 	tongue_owner.UnregisterSignal(tongue_owner, COMSIG_MOB_SAY)


registers signal
immediately unregisters it

not your fault cuz the code was already like this but lol, lmao

ah i see it's some weird signal magic nvm
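
Editor's sketch of one plausible reading of the "signal magic": if a registration is keyed by the listening object as well as the target, then the tongue registering on its owner and the owner unregistering its own handler touch different entries. The `SignalBus` class below is hypothetical and is not the tgstation implementation; in particular, throwing on a duplicate registration without `override` is a simplification chosen for illustration.

```typescript
type Handler = (...args: unknown[]) => void;

interface Registration {
  listener: string; // who is listening
  target: string;   // whose signals are being listened to
  signal: string;
  handler: Handler;
}

// Minimal signal registry keyed by (listener, target, signal).
class SignalBus {
  private registrations: Registration[] = [];

  register(listener: string, target: string, signal: string, handler: Handler, override = false): void {
    const exists = this.registrations.some(
      (r) => r.listener === listener && r.target === target && r.signal === signal
    );
    if (exists && !override) {
      throw new Error(listener + " already registered " + signal + " on " + target);
    }
    // Replace any previous registration for the same key.
    this.unregister(listener, target, signal);
    this.registrations.push({ listener, target, signal, handler });
  }

  unregister(listener: string, target: string, signal: string): void {
    this.registrations = this.registrations.filter(
      (r) => !(r.listener === listener && r.target === target && r.signal === signal)
    );
  }

  // Invoke every handler registered for this target and signal; return how many ran.
  send(target: string, signal: string, ...args: unknown[]): number {
    const matching = this.registrations.filter((r) => r.target === target && r.signal === signal);
    matching.forEach((r) => r.handler(...args));
    return matching.length;
  }
}
```

Under that model, `register("tongue", "mob", "say", ...)` followed by `unregister("mob", "mob", "say")` leaves the tongue's handler in place, which would match the "nvm" above.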

Iamgoofball · 2022-06-30T18:20:39Z

code/modules/client/preferences/tts_seed.dm

+/datum/preference/choiced/tts_seed
+	category = PREFERENCE_CATEGORY_SECONDARY_FEATURES
+	savefile_identifier = PREFERENCE_CHARACTER
+	savefile_key = "tts_seed"
+	priority = PREFERENCE_PRIORITY_SPECIES + 1
+
+/datum/preference/choiced/tts_seed/deserialize(input, datum/preferences/preferences)
+	//if you figure out how to enter whatever you want than honestly take it idc seeds support that
+	return ..(input, preferences)
+
+/datum/preference/choiced/tts_seed/init_possible_values()
+	return GLOB.tts_seeds_prefs
+
+/datum/preference/choiced/tts_seed/apply_to_human(mob/living/carbon/human/target, value)
+	var/obj/item/organ/internal/tongue/tts_speaker = target.getorganslot(ORGAN_SLOT_TONGUE)
+	if(!tts_speaker)
+		log_admin("didn't apply tts seed to tongue")
+		return
+	tts_speaker.tts_seed = value
+
+/datum/preference/choiced/tts_seed/create_default_value()
+	return pick(GLOB.tts_seeds_prefs)
+
+/datum/preference/choiced/tts_seed/is_accessible(datum/preferences/preferences)
+	if (!..(preferences))
+		return FALSE
+	return TRUE


you can implement a text box preference this way:

/datum/preference/text
	abstract_type = /datum/preference/text
	var/max_length = 1024

/datum/preference/text/deserialize(input, datum/preferences/preferences)
	return STRIP_HTML_SIMPLE(input, max_length)

/datum/preference/text/create_default_value()
	return ""

/datum/preference/text/is_valid(value)
	return istext(value)

then over in base.tsx for preferences:

export const FeatureTextInput = (
  props: FeatureValueProps<string>
) => {
  return (
    <TextArea
      height="100px"
      value={props.value}
      onChange={(_, value) => props.handleSetValue(value)}
    />
  );
};

export const FeatureShortTextInput = (
  props: FeatureValueProps<string>
) => {
  return (
    <Input
      width="100%"
      value={props.value}
      onChange={(_, value) => props.handleSetValue(value)}
    />
  );
};

Farquaar · 2022-07-01T04:06:28Z

felinids are now pointless

Fixed that for ya @mc-oofert

On a more serious note, I looked up NovelAI and it appears to be a paid subscription service. Is there a free version you're using for this PR or something?

tralezab · 2022-07-01T05:43:31Z

On a more serious note, I looked up NovelAI and it appears to be a paid subscription service. Is there a free version you're using for this PR or something?

Someone needs a subscription for this to work, yes. Tests I've run the past few days are at my own expense, haha.

Sylphily · 2022-07-01T05:46:26Z

yikes, how much?

tralezab · 2022-07-01T06:05:18Z

yikes, how much?

don't worry, it's not a yikes. it's not pay per generation or I would have lost everything I owned from those like 15 people just spamming "A" in the bar

Farquaar · 2022-07-01T06:17:28Z

yikes, how much?

Looks like $10/month if I'm reading their prices right.

Not crazy, but $120 a year adds up. If it gets merged I imagine TTS might make for a decent Patreon monthly donation goal.

tralezab · 2022-07-01T06:36:28Z

There is also a rate limit, that is reached after a few rounds of nonstop tee tee ess. And half of that package would see no use, aka the "use language models" part. So if this were to see any unirony it needs some special discussions with NovelAI itself
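
Editor's sketch: the thread does not say how NovelAI's rate limit is enforced, but a game server could budget its own requests so that a bar full of people spamming "A" does not burn through the allowance. The token bucket below is a generic technique, not something from the PR; `TokenBucket`, `capacity`, and `refillPerSecond` are hypothetical names, and the numbers in any real deployment would have to come from the provider.

```typescript
// Token bucket: up to `capacity` requests may burst, refilled at `refillPerSecond`.
class TokenBucket {
  private capacity: number;
  private refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Returns true if a TTS request may be sent at time `nowMs` (milliseconds).
  tryConsume(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false; // over budget: fall back to text-only speech
  }
}
```

A message that is refused would simply appear as normal chat text with no audio.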

tralezab · 2022-07-01T06:36:41Z

But I'm glad people enjoyed it while it was run

davethwave · 2022-07-01T18:53:03Z

It is fun, though a few bugs could be spotted and some QoL changes will likely be wanted. Currently the borgs' :b chat can be overheard; this might also be the case for drones. I'm unsure whether someone speaking another language can be overheard via TTS by those who can't understand it, but that is something to check. There should also be a way to mute TTS for people who want to play old school, or however you would put it.

Bluedino1025 · 2022-07-01T20:13:41Z

One thing is mutes can speak with hands full but i like the idea of saying the gloves have tts

cacogen · 2022-07-03T03:44:39Z

This is really cool, shame it can't be merged

GeneriedJenelle · 2022-07-03T03:46:22Z

There is also a rate limit, that is reached after a few rounds of nonstop tee tee ess. And half of that package would see no use, aka the "use language models" part. So if this were to see any unirony it needs some special discussions with NovelAI itself

Is there any chance there's a free open source tts engine with a similar quality that can be downloaded and run server-end to integrate into SS13? That's the main thing I'm thinking of that could possibly save this.

TheSmallBlue · 2022-07-03T04:54:02Z

I didn't get to play with this on personally but I've seen clips of it and im OBSESSED dude I love this SO MUCH, I HAVE TO SEE THIS IN PERSON please please PLEASE for the love god make a way, ANY way for it to be in the game. Make it disabled by default so that admins can turn it on (they'd have to have a NovelAi subscription of course), or make it entirely client based, like each user who wants to use it has to pay a NovelAi subscription then type in their key and seed in the options menu and it works only for them, hell make it a Patreon goal and I'll fucking fund it 100% by myself I want this THAT bad. It makes the game 1000% better in every way

Mothblocks · 2022-07-03T07:00:18Z

We don't get the patreon money :P

TheSmallBlue · 2022-07-03T07:47:31Z

Meant it in a more "make MSO to pay for it with patreon money if the tier is met" way but I'm willing to pay a monthly tax to the head coder (you, mothblocks) if it means we get this thing of beauty
Hell I'd pay it myself and make the key or whatever public, I'd suck dick even! Just make this a thing PLEASE.

MMMiracles · 2022-07-03T13:09:26Z

i'll give an extra dollar if it means i can talk about my minecraft lets play as keanu reeves

tralezab · 2022-07-03T16:11:43Z

Is there any chance there's a free open source tts engine

absolutely

with a similar quality

no, sadly as far as i'm aware

Iamgoofball · 2022-07-03T19:31:50Z

Is there any chance there's a free open source tts engine with a similar quality

If we wanted to do this, we'd need to train our own TTS model from scratch with our own training data and our own server to run it on.

It's doable, but only if MSO agrees to host a server for it to sit on and take requests from the game servers.

That's also discounting that we'd need to spend money on GPU time to train said model.

Mothblocks · 2022-07-03T19:33:08Z

@TheSmallBlue Yeah fair enough

I've wanted TTS for years but 99.99% this is going to need to be local. Services like this go offline pretty often from my experience (it having the possibility to go offline at all is a pretty sad loss), and the delay is noticeably huge, even ignoring the pricing (which is in itself subject to the whims of the service). Imagine if, for instance, one day the TTS service goes into a total maintenance mode, and is totally offline for weeks. Or if they change their pricing to be based on per-message rather than whatever it is now, or just raises their prices. We'd have to get rid of it, and people would be significantly more upset by that.

TheSmallBlue · 2022-07-04T01:17:56Z

Ok so, to sum up, here are the possible ways of having this or something like it in a permanent way:

Option 1:
We agree to add the cost of the NovelAI service onto the server upkeep monthly patreon goal. API requests are done server side.
This would entail somehow convincing MSO to pay more for upkeep, and of course, a higher patreon goal.

Pros: TTS is free for every single user. Yay!
Cons: We'd be implementing an online service outside of our control into the game. It can go down any second, for any reason. The price for the service is also outside of our control; it could increase or decrease, making the server upkeep costs unstable. The delay between sending messages and them being read would be pretty big. And the problem that makes it clearly unviable: the rate limit. It won't be able to be used in constant highpop servers like Sybil, so its use would be limited to Manuel, and even then it'd only last a few rounds before we're rate limited.

A good example of how this would feel to use is the testmerges that happened on manuel already, it'd be like that. Maybe even as often.

Option 2:
We train our own text to speech model and host it on our own servers.
For this to happen, we'd need someone with a very good GPU to train a decent TTS model, then send that model over to MSO, who will need to set up a server with a GPU in it to run said model.

Pros: TTS is still free for every single user! There wouldn't be as much delay as option 1, and we'd have complete control over training and generation. We can choose our own voices, our own modifications to said voices, our own systems to modify said voices, hell maybe a way to implement it with BYOND that doesn't include APIs at all! No need to rely on someone else's service, no need to rely on someone else's pricing demands, no rate limits!
Cons: Training an AI is very resource intensive. In terms of training we'd not only need someone with a very high end GPU to train the ai, but someone with the ai knowledge to choose the correct dataset and somehow recreate the level of customization NovelAI has, which is no easy task, and all of it for basically free (or maybe a code bounty?).
We'd also need to convince MSO to buy a good GPU, which aren't cheap, to put it in a server, and to set up said server to work for on the fly ai synthesis. While AI synthesis isn't as intensive as training, its still no easy task and in the context of a server it has to be done SUPER quick, which means it needs high end parts. This all means more money, possibly over 1000+ bucks worth of parts, and that means higher upkeep costs, and I mean WAY higher. (I'm not an Ai expert or anything like that, I might be wrong on all this, if any of you want to correct me feel free to do so.)

Option 3:
We implement the API requests and integrate NovelAi, but locally instead of in the server side. Users would have to pay for their own NovelAi subscriptions
For the record I don't know how exactly our API system works, I've only heard it's not that good, so I don't know if we even have the option of sending an API call locally instead of from the server, but if we do...

Pros: Low delay, the less steps in the middle the better. The rate limit would possibly not be reached, or it wouldn't be reached as often. One player sees way less people talk than 30 players do. This would make the investment worth it for some (me, i am some). Unlike the other two options, the upkeep costs would stay the same.
Cons: Users would have to individually pay for their NovelAi subscriptions, and the servers could go down at any minute.

Option 1 is the most similar to what's currently implemented, except instead of MSO paying for a subscription it's a maintainer. It clearly isn't stable, and it's not a good idea to keep using it.

Option 2 is the utopian option. If we manage to do it, it'd be awesome to have, but the actual road to getting there is way too complicated for it to be feasible any time soon.

Option 3 is the compromise, it's the one I see most likely to happen if anything does actually happen. Though again, I don't know enough about how BYOND works to know if it even makes sense.

Would be cool if a design document of sorts about this could be written, I think TTS improves the game to a whole new level and It should be implemented in some way.

Also do note that I am speaking mostly out of my ass so if anything I said is stupid and wrong please say so :)

optimumtact · 2022-07-04T01:27:03Z

it's not going to happen

TheSmallBlue · 2022-07-04T01:32:39Z

Oranges if you keep destroying my hopes and dreams I WILL break into tears

optimumtact · 2022-07-04T01:33:10Z

good

Iamgoofball · 2022-07-04T01:34:32Z

For this to happen, we'd need someone with a very good GPU to train a decent TTS model, then send that model over to MSO, who will need to set up a server with a GPU in it to run said model.

incorrect, we just pay amazon or google for a GPU on the cloud for training, then we run a model in cpu inference mode(after it's been benchmarked to actually handle our 300+ requests a minute on 3 servers)
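
Editor's sketch of why the benchmark matters: by Little's law, the average number of requests in flight equals the arrival rate times the average time each request takes, so the per-request synthesis time on CPU decides how many parallel workers are needed. This assumes the 300 requests a minute is the combined total across the three servers, which the comment does not make explicit; the latency figures are hypothetical placeholders that a benchmark would replace.

```typescript
// Little's law: average requests in flight = arrival rate x average latency.
// Rounds up, since a fractional worker cannot be provisioned.
function requiredWorkers(requestsPerMinute: number, secondsPerRequest: number): number {
  const requestsPerSecond = requestsPerMinute / 60;
  return Math.ceil(requestsPerSecond * secondsPerRequest);
}

// Hypothetical latencies: 300 requests a minute is 5 per second.
const workersAtTwoSeconds = requiredWorkers(300, 2);    // 5 * 2   = 10 workers
const workersAtHalfSecond = requiredWorkers(300, 0.5);  // 5 * 0.5 = 2.5, rounded up to 3
```

This is an average-load estimate only; bursts of speech would need headroom on top of it.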

TheSmallBlue · 2022-07-04T02:08:13Z

incorrect, we just pay amazon or google for a GPU on the cloud for training, then we run a model in cpu inference mode(after it's been benchmarked to actually handle our 300+ requests a minute on 3 servers)

Huh, good point. The main issue then would be to find someone willing to pay for training. I also have no idea what CPU inference mode is nor why it'd need to be benchmarked, but it feels like this might actually be possible? It'd be so sick for us to have our own dynamic text to speech synthesizer

MrStonedOne · 2022-07-04T04:14:31Z

I have a cluster of Single board computers in a kubernetes swarm in my garage. two of them are even atomic pis and pods can access the gpus.

I have like 5 computers with decent gpus too.

SplinterGP · 2022-07-04T10:38:59Z

honestly, if we could somehow implement a more basic TTS that is doable would be also okay, goofy ass TTS's are funny asf like moonbase alpha one.

Also we have the option of waiting some 5 or 10 years so technology evolves and it becomes easier to do it and byond may be possibly less shit, or we are all dead in these 5 or 10 years

AffectedArc07 · 2022-07-04T11:47:08Z

I have a cluster of Single board computers in a kubernetes swarm in my garage. two of them are even atomic pis and pods can access the gpus.

I have like 5 computers with decent gpus too.

RS also lets you throw a quadro RTX 4000 into the server

NovelAI TTS experiment

5f8c699

tralezab requested review from Ryll-Ryll, stylemistake and Mothblocks as code owners June 30, 2022 07:01

tgstation-server added the Config Update, Feature, and UI labels Jun 30, 2022

github-actions bot requested a review from MrStonedOne June 30, 2022 07:03

tralezab added 2 commits June 30, 2022 00:16

Moves TTS

f144889

please be quieter

f302530

Mothblocks added the Do Not Merge label Jun 30, 2022

Iamgoofball reviewed Jun 30, 2022

interface/skin.dmf (outdated, resolved)

Iamgoofball reviewed Jun 30, 2022

Fixes, Improvements. PAI voice.

06e8545

tralezab requested a review from ninjanomnom as a code owner June 30, 2022 20:08

no more skin.dmf

739703a

optimumtact closed this Jul 6, 2022

skylord-a52 mentioned this pull request Aug 2, 2022

Feature Request: Talking noises lizardqueenlexi/orbstation#155

Open

Bizzonium mentioned this pull request Sep 24, 2022

Feat: add TTS and integrate Silero via REST API ss220-space/Paradise#1364

Merged

38 tasks
