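About this repository

Introduction

This program runs a bot that transcribes conversations in a Discord voice channel into a text channel, speaker by speaker.

The original code was developed by Ilya Nevolin, and this repository is based on a version modified by Nicklas Vedsted. Thanks to both of them and to everyone else who contributed to its development.

This fork adds changes such as modifications for running the bot on Render's free plan.

The original README is kept as-is below this section.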

Setup

Prepare the bot
- Create a bot in the Discord Developer Portal
- Add it to the server where you want to use it
Prepare the program that runs the bot
- Create a GitHub account
- Fork this repository
Enable Google Speech-to-Text
- Create a Google Cloud account
- Enable the Google Speech-to-Text API
- Create a new service account and save its key
Deploy to Render
- Connect your GitHub account
- Save settings.json and gspeech_key.json as Secret Files
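
The last step can be sketched in code. This is a minimal sketch, assuming that Render exposes Secret Files under /etc/secrets (check Render's documentation for your service type) and that a local checkout keeps the same files in the working directory. The helper name findConfigFile is hypothetical and is not part of this repository.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: return the first existing location of a config file.
// Assumption: on Render, Secret Files are mounted under /etc/secrets;
// for local runs the file is expected in the current working directory.
function findConfigFile(filename, searchDirs = ["/etc/secrets", process.cwd()]) {
  for (const dir of searchDirs) {
    const candidate = path.join(dir, filename);
    if (fs.existsSync(candidate)) return candidate;
  }
  return null; // the caller decides how to handle a missing file
}
```

For example, findConfigFile("gspeech_key.json") returns a path that could be handed to whatever code loads the Google credentials, or null if the file is in neither place.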

DiscordEarsBot

A speech-to-text bot for Discord written in NodeJS. Can be useful for hearing impaired and deaf people.

Getting Started:

Installation Tutorial

YouTube: https://www.youtube.com/watch?v=IKIlnaCDZcI

Try the bot for yourself on our Discord server: https://discord.gg/ApdTMG9

Developers

Heroku

If you don't have a Linux server or machine, you can use Heroku to host your bot 24/7 for free.

Fork this GitHub repository
Create Discord Bot, Invite it to your server and get the API Token
Create new Heroku app, use the GitHub method and Deploy DiscordEarsBot
Under "resources" disable "web" and enable "worker" dyno instead.
Provide the DISCORD_TOK Config Var under "settings"

Manual Installation

You need Node.js version 12.x or 14.x with npm on your machine. Use node -v to check your version, then execute the following commands:

git clone https://github.com/healzer/DiscordEarsBot.git
cd DiscordEarsBot
npm install
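
The version requirement above can also be checked programmatically before the bot starts. The following is only an illustrative sketch; isSupportedNodeVersion is a hypothetical helper, not something the repository provides.

```javascript
// Hypothetical helper: true when the version string is Node.js 12.x or 14.x.
// Accepts strings with or without the leading "v" that process.version uses.
function isSupportedNodeVersion(version) {
  const major = parseInt(String(version).replace(/^v/, "").split(".")[0], 10);
  return major === 12 || major === 14;
}
```

A startup script could call isSupportedNodeVersion(process.version) and print a warning when it returns false.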

Provide the Discord API Token using the DISCORD_TOK environment variable or in settings.json.
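
One way to combine the two token sources is to prefer the environment variable and fall back to the settings file. This is a sketch under assumptions: resolveDiscordToken is a hypothetical helper, and the settings key discord_token is a guess, so check settings-sample.json for the real key name.

```javascript
// Hypothetical helper: pick the Discord token from the environment first,
// then fall back to an already-parsed settings.json object.
// Assumption: the settings key is "discord_token" (verify in settings-sample.json).
function resolveDiscordToken(env, settings) {
  const fromEnv = env.DISCORD_TOK;
  if (fromEnv && fromEnv.trim() !== "") return fromEnv.trim();

  const fromFile = settings && settings.discord_token;
  if (fromFile && String(fromFile).trim() !== "") return String(fromFile).trim();

  throw new Error("No Discord token: set DISCORD_TOK or fill in settings.json");
}
```

In practice this would be called as resolveDiscordToken(process.env, parsedSettings), where parsedSettings is the result of parsing settings.json.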

Finally run node index.js. You can also use pm2 or nodemon to keep the bot running 24/7.

Usage

By now you have a Discord server, and DiscordEarsBot is running and has joined it. Make sure your server has a text channel and a voice channel.

Enter one of your voice channels.
In one of your text channels type: *join, the bot will join the voice channel.
Everything said within that channel will be transcribed into text (as long as the bot is within the voice channel).
Type *leave to make the bot leave the voice channel.
Type *help for a list of commands.
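
The commands above share the * prefix. As an illustration of how such messages could be recognized, here is a small parser sketch; parseCommand and the command list are hypothetical and do not reflect how index.js actually handles messages.

```javascript
const PREFIX = "*";
// Commands named in this README; the bot itself may support more.
const KNOWN_COMMANDS = ["join", "leave", "help", "lang"];

// Hypothetical helper: turn a chat message into { command, args },
// or return null when the message is not a known bot command.
function parseCommand(text) {
  const trimmed = text.trim();
  if (!trimmed.startsWith(PREFIX)) return null;

  const [name, ...args] = trimmed.slice(PREFIX.length).split(/\s+/);
  const command = name.toLowerCase();
  if (!KNOWN_COMMANDS.includes(command)) return null;

  return { command, args };
}
```

Ordinary chat messages and unknown commands both yield null, so a message handler can ignore them with a single check.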

Notes:

When the bot is inside a voice channel it listens to all speech and transcribes audio into text.
Each user is a separate audio channel, the bot hears everyone separately.
Only when your user picture turns green in the voice channel will the bot receive your audio.
A long pause interrupts the audio input.
For Google Speech & WitAI: The duration of a single audio input is limited to 20 seconds; longer audio is not transcribed.
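
The 20-second limit can be checked from the size of a raw audio buffer. The sketch below assumes the audio is 48 kHz, stereo, 16-bit PCM, a common format for decoded Discord voice; if the bot's buffers use a different format, the constants need to change. The function names are hypothetical.

```javascript
// Assumed PCM format (not confirmed by this README): 48 kHz, stereo, 16-bit.
const SAMPLE_RATE_HZ = 48000;
const CHANNELS = 2;
const BYTES_PER_SAMPLE = 2;
const MAX_SECONDS = 20; // limit stated in the notes above

// Duration in seconds of a raw PCM buffer of the given byte length.
function pcmDurationSeconds(byteLength) {
  return byteLength / (SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE);
}

// True when the audio is non-empty and within the 20-second limit.
function isTranscribable(byteLength) {
  const seconds = pcmDurationSeconds(byteLength);
  return seconds > 0 && seconds <= MAX_SECONDS;
}
```

Under these assumptions one second of audio is 192,000 bytes, so any buffer larger than 3,840,000 bytes would be skipped.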

Speech Services

YouTube comparison and tutorial for developers on choosing the right Speech API: https://www.youtube.com/watch?v=fQcEZIgw_LA

Vosk API

This is our default Speech-to-Text method. The Vosk API is a free & open-source solution that runs locally (offline). By default only English is enabled. Developers can change or include more language models from here: https://alphacephei.com/vosk/models

WitAI

Installation:

set SPEECH_METHOD to witai
use your Server Access Token for WITAI_TOK

WitAI supports over 120 languages (https://wit.ai/faq), however only one language can be used at a time. If you're not speaking English on Discord, then change your default language on WitAI under "settings" for your app.

You can also change the language using the following bot command: *lang <code>. Here <code> should be an ISO 639-1 language code (two letters): https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
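
Before passing the argument of *lang on to a speech service, it can be worth checking that it at least has the shape of an ISO 639-1 code. This sketch only validates the format (two ASCII letters) and does not check that the code is actually registered; normalizeLanguageCode is a hypothetical helper.

```javascript
// Hypothetical helper: return the lower-cased code when the input looks like
// an ISO 639-1 code (exactly two ASCII letters), otherwise null.
// This is a shape check only; "zz" would pass even though it is not assigned.
function normalizeLanguageCode(input) {
  const code = String(input).trim().toLowerCase();
  return /^[a-z]{2}$/.test(code) ? code : null;
}
```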

Google Speech API

You can use Google's Speech-to-Text API as follows:

set SPEECH_METHOD to google
For non-English transcriptions: open index.js, inside the function transcribe_gspeech change the value of languageCode.
Enable Google Speech API here: https://console.cloud.google.com/apis/library/speech.googleapis.com
Create a new Service Account (or use your existing one): https://console.cloud.google.com/apis/credentials
Create a new Service Account Key (or use existing) and download the json file.
Put the json file inside your bot directory and rename it to gspeech_key.json.
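
To show where languageCode fits, here is a sketch of a request object in the shape the Speech-to-Text recognize call expects. It is not the code inside transcribe_gspeech. It assumes the audio has already been converted to mono 16-bit PCM at 48 kHz, and buildRecognizeRequest is a hypothetical helper.

```javascript
// Hypothetical helper: build a Speech-to-Text recognize request.
// Assumptions: pcmBuffer holds mono, 16-bit little-endian PCM at 48 kHz;
// languageCode is a BCP-47 tag such as "en-US" or "ja-JP".
function buildRecognizeRequest(pcmBuffer, languageCode = "en-US") {
  return {
    config: {
      encoding: "LINEAR16",
      sampleRateHertz: 48000,
      languageCode,
    },
    // The API accepts audio content as a base64-encoded string.
    audio: { content: pcmBuffer.toString("base64") },
  };
}
```

Changing the transcription language then comes down to passing a different languageCode, for example buildRecognizeRequest(buffer, "ja-JP").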

Mozilla DeepSpeech (experimental)

Using Mozilla DeepSpeech for speech recognition, tutorial.

Contact

For enquiries or issues get in touch with me:

Name: Ilya Nevolin

Email: ilja.nevolin@gmail.com

Discord: https://discord.gg/ApdTMG9
