Convert Google Translate Phrasebook to audio tracks

ptytb/ritmom

RITMOM - Repetition is the mother of memory.

What this app does

It's for learning languages. It creates audio tracks from:

  • Your Google Translate favorites list, which supplies the word and phrase pairs.
  • A plain text file with a list of foreign words. Offline dictionaries (WordNet, GoldenDict, Lingoes) are used for translation.
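As a rough illustration of the first source, the sketch below reads word pairs from a CSV export. It assumes a header-less, four-column layout (source language, target language, source text, translated text). That layout and the names PhrasePair and read_pairs are assumptions made for this example, not taken from this project's code.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class PhrasePair(NamedTuple):
    """One phrasebook row (hypothetical structure for this sketch)."""
    source_lang: str
    target_lang: str
    source_text: str
    target_text: str


def read_pairs(stream: TextIO) -> Iterator[PhrasePair]:
    """Yield one PhrasePair per well-formed row, skipping short or empty rows."""
    for row in csv.reader(stream):
        if len(row) < 4:
            continue  # blank or malformed line
        src_lang, dst_lang, src_text, dst_text = (cell.strip() for cell in row[:4])
        if src_text and dst_text:
            yield PhrasePair(src_lang, dst_lang, src_text, dst_text)


# In-memory sample standing in for a file under phrases/ (assumed layout).
sample = io.StringIO(
    'English,Russian,memory,память\n'
    'English,Russian,"to repeat, again",повторять\n'
)
pairs = list(read_pairs(sample))
```

For a real file, open it with open(path, newline="", encoding="utf-8") so the csv module handles quoted line breaks correctly.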

The idea is to use a sequence

  • foreign female slow
  • native female normal
  • example usage phrase female
  • foreign male slow
  • native female normal
  • example usage phrase male

for better memorization.
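The sequence above can be sketched as a small function that expands one word pair into ordered playback steps. This is only an illustration: Step and build_sequence are hypothetical names, and the language and speed of the example phrase (foreign, normal) are assumptions, since the list does not state them. The native step uses the female voice in both passes, exactly as the list is written.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    """One utterance in the track (hypothetical structure for this sketch)."""
    text: str
    language: str  # "foreign" or "native"
    voice: str     # "female" or "male"
    rate: str      # "slow" or "normal"


def build_sequence(foreign: str, native: str,
                   example: Optional[str] = None) -> List[Step]:
    """Expand one word pair into the female pass followed by the male pass."""
    steps: List[Step] = []
    for voice in ("female", "male"):
        steps.append(Step(foreign, "foreign", voice, "slow"))
        # The list above uses the female native voice in both passes.
        steps.append(Step(native, "native", "female", "normal"))
        if example:
            # Assumption: the example phrase is foreign text at normal speed.
            steps.append(Step(example, "foreign", voice, "normal"))
    return steps


steps = build_sequence("Gedächtnis", "memory", "Er hat ein gutes Gedächtnis.")
```

Each Step would then be handed to a TTS voice matching its language and voice fields.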

What it does not do

Text-to-speech out of the box. It uses third-party TTS engines via Windows COM. You need a proprietary TTS engine to generate good-quality narration.

Usage

  1. Install TTS engines.
  2. Make sure the ffmpeg executable is in your PATH; it is needed to convert your tracks to MP3.
  3. Save phrases to the phrases/ dir; both csv and xls files work.
  4. pip install -r requirements.txt
  5. Edit the config.json file.

The foreign and native parameters must each contain a substring of a voice description. The available descriptions can be listed with this command: python main.py -l
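To show what substring matching against a description could look like, here is a minimal sketch. pick_voice and list_sapi_voices are hypothetical helpers, not this project's functions. The case-insensitive comparison is an assumption, and the project may enumerate voices differently from the SAPI calls used here.

```python
from typing import List, Optional


def pick_voice(descriptions: List[str], wanted: str) -> Optional[str]:
    """Return the first description containing `wanted`, ignoring case."""
    needle = wanted.casefold()
    for description in descriptions:
        if needle in description.casefold():
            return description
    return None


def list_sapi_voices() -> List[str]:
    """Windows only: descriptions of installed SAPI voices (needs pywin32)."""
    import win32com.client  # imported lazily so this module loads elsewhere

    voices = win32com.client.Dispatch("SAPI.SpVoice").GetVoices()
    return [voices.Item(i).GetDescription() for i in range(voices.Count)]


# Sample descriptions standing in for the output of list_sapi_voices().
installed = [
    "Microsoft Zira Desktop - English (United States)",
    "Microsoft Haruka Desktop - Japanese",
]
chosen = pick_voice(installed, "japanese")
```

A short, distinctive substring such as a voice name or language is usually enough to select one voice unambiguously.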

Download the text corpora you want to use for generating context usage phrases. The corpora are listed in the phraseExamples configuration option:

import nltk; nltk.download()

  6. Generate audio: python main.py
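The ffmpeg requirement from step 2 can be checked before a long run. The sketch below builds a typical WAV-to-MP3 command and verifies that ffmpeg is reachable. The helper names, file names, and encoder settings are assumptions for illustration, and the project's actual ffmpeg invocation may differ.

```python
import shutil
import subprocess
from typing import List


def mp3_command(wav_path: str, mp3_path: str, quality: int = 2) -> List[str]:
    """Build an ffmpeg argument list for a VBR MP3 encode (assumed settings)."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", wav_path,            # input track
        "-codec:a", "libmp3lame",  # requires an ffmpeg build with LAME
        "-q:a", str(quality),      # VBR quality, lower is better
        mp3_path,
    ]


def convert_to_mp3(wav_path: str, mp3_path: str) -> None:
    """Run the conversion, failing early if ffmpeg is not in PATH."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found in PATH")
    subprocess.run(mp3_command(wav_path, mp3_path), check=True)


command = mp3_command("track01.wav", "track01.mp3")
```

Calling convert_to_mp3("track01.wav", "track01.mp3") would perform the conversion; it raises a RuntimeError if ffmpeg is missing.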

Japanese support

For furigana support you'll need the MeCab executable accessible in your PATH.

You'll also need to install the Japanese requirements: pip install -r requirements-jp.txt

Offline dictionaries

Supported dictionary formats:

TODO

  • Fix TidyUpText filter
  • Download and cache required resources (dictionaries, pre-trained models, etc)
  • Use TextChunk instead of plain text
  • Extract examples from offline dictionaries
  • Add subtitle (.srt) corpus reader and examples
  • Try fuzzy matching if no translation is found
  • Text: write approximate audio timestamp
  • Arbitrary Text Source (book, subtitles, news)
  • Generate text output too
  • Make the sequence configurable from the config
  • Fix WordNet for languages other than English
  • Search for collocations
  • Download missing NLTK packages and corpora automatically
  • Fix index before running tasks
  • Add synonyms and antonyms
  • Use more threads for TTS
  • Pickle indices for corpus for quicker start
  • Add wordnet: definition (thesaurus, examples, excerpts)
  • Update requirements.txt
