A Discord Speech-to-Text module made with Vosk. Yes, it works!
In Discord's "Voice & Video" settings, if you use Voice Activity mode instead of Push-to-Talk, you must disable the "Automatically determine input sensitivity" option and set the sensitivity to its lowest value.
- Download a model. You can click here to find models.
- Extract the zip file to a folder.
- Rename the folder to "model" and put it in the root of your project or at any path you choose.
- If you put the model folder at a custom path, set that path in the code (the modelPath option).
- Install ffmpeg (on Ubuntu: apt install ffmpeg).
- Install this npm module with: npm i assisky
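The setup steps above can be verified before starting the bot. The following is a minimal sketch of such a pre-flight check; `checkSetup` is a hypothetical helper and not part of assisky, and it assumes that ffmpeg is expected on the PATH and that the model lives in a plain directory.

```javascript
// Hypothetical pre-flight check; NOT part of the assisky API.
const fs = require("node:fs");
const path = require("node:path");
const { spawnSync } = require("node:child_process");

function checkSetup(modelPath = "./model") {
  const resolved = path.resolve(modelPath);

  // The extracted model must be a directory at the given path.
  const modelFound =
    fs.existsSync(resolved) && fs.statSync(resolved).isDirectory();

  // spawnSync sets `error` (for example ENOENT) when the binary cannot be started.
  const probe = spawnSync("ffmpeg", ["-version"], { stdio: "ignore" });
  const ffmpegFound = !probe.error && probe.status === 0;

  return { modelPath: resolved, modelFound, ffmpegFound };
}

console.log(checkSetup());
```

If `modelFound` or `ffmpegFound` comes back `false`, revisit the corresponding step in the list before running the bot.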
options:
{
"voskLogLevel": int, // -1 to disable logging of Vosk
"modelPath": "Path to downloaded model folder", // Default: ./model
}
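As an illustration of the options above, the sketch below builds an options object with the documented default for modelPath. `buildOptions` is a hypothetical helper introduced only for this example; how assisky itself validates or merges options is not stated here, so treat the integer check as an assumption.

```javascript
// Hypothetical helper; assisky's own option handling may differ.
function buildOptions(overrides = {}) {
  // "./model" is the default documented for modelPath.
  const defaults = { modelPath: "./model" };
  const options = { ...defaults, ...overrides };

  // Assumption: voskLogLevel, when given, should be an integer (-1 disables Vosk logging).
  if ("voskLogLevel" in options && !Number.isInteger(options.voskLogLevel)) {
    throw new TypeError("voskLogLevel must be an integer");
  }
  return options;
}

// Silence Vosk logging and keep the default model location.
const exampleOptions = buildOptions({ voskLogLevel: -1 });
console.log(exampleOptions);
```

The resulting object has the same shape as the options described above and could be passed wherever the module expects them.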
Returns an event for sending recognition results. See example for details.
userId: The Discord ID of a user who has joined a voice channel.
discordVoiceConnection: The bot's voice connection. (The bot should be in the same voice channel as the user.) See this for more details.
userId: The Discord ID of a user who has joined a voice channel and for whom recognition has already started and is in progress.
channelId: A Discord channel ID. The bot will stop listening to everyone in that channel. Useful to call before leaving a voice channel.
Returns an Object containing the streams, connection, and user ID for the users currently being listened to: {discordAudio, PCMToMP3, wavReader, rec, connection, userId}
The config object received from the Assisky.setup() function.
Note: The example below uses the Turkish language model, but you can use any other model you like.
"bu bir denemedir" means "this is a test"
"merhaba dünya" means "hello world"