
iRun

A web app to help with the pronunciation of Turkish words and phrases

Website: irun.fyi

Scripts

  • Install dependencies: yarn install
  • Lint source code: yarn lint
  • Preprocess data: yarn preprocess
  • Start development server: yarn dev
  • Build and generate the app page in the /out directory: yarn export
  • Serve the generated page from the /out directory: yarn serve

How does it work?

Preprocessing steps - /preprocessing

  1. Words that do not appear in a standard English dictionary are filtered out of CMUdict (generate-filtered-dict.js).
  2. From the filtered CMUdict entries, a reverse mapping from each pronunciation to one or more words is generated (generate-reverse-multimap.js).
  3. The raw English word-frequency data file is parsed (generate-frequency-map.js).
  4. Where several words share a pronunciation, only the most frequently used word is kept in the reverse mapping (generate-reverse-map.js).
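
Steps 2 and 4 above can be sketched roughly as follows. This is a minimal illustration, not the code in /preprocessing: the function names (buildReverseMultimap, buildReverseMap), the data shapes, and the tiny inputs are all assumptions, and the frequency numbers are made up.

```javascript
// Hypothetical miniature inputs; the real dictionary and frequency files are far larger.
const filteredDict = {
  to: ["T", "UW1"],
  too: ["T", "UW1"],
  two: ["T", "UW1"],
  odd: ["AA1", "D"],
};
const frequency = { to: 1000, too: 50, two: 200, odd: 5 }; // invented counts

// Step 2 (sketch): map each pronunciation to every word that shares it.
function buildReverseMultimap(dict) {
  const multimap = new Map();
  for (const [word, phonemes] of Object.entries(dict)) {
    const key = phonemes.join(" ");
    if (!multimap.has(key)) multimap.set(key, []);
    multimap.get(key).push(word);
  }
  return multimap;
}

// Step 4 (sketch): keep only the most frequent word per pronunciation.
// Words missing from the frequency data are treated as frequency 0.
function buildReverseMap(multimap, freq) {
  const reverseMap = new Map();
  for (const [key, words] of multimap) {
    let best = words[0];
    for (const word of words) {
      if ((freq[word] ?? 0) > (freq[best] ?? 0)) best = word;
    }
    reverseMap.set(key, best);
  }
  return reverseMap;
}

const reverseMap = buildReverseMap(buildReverseMultimap(filteredDict), frequency);
console.log(reverseMap.get("T UW1"));
```

With these invented counts, the three homophones collapse to a single entry for the pronunciation "T UW1", which keeps the reverse mapping small enough to ship to the browser.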

Pronunciation algorithm - /pronunciation

  1. All possible ways of splitting the input Turkish word into syllables are generated (hyphenate-all.js).
  2. The letters of each syllable are transcribed using their alternatives in the CMUdict phonetic alphabet (phonetic-map.json).
  3. Each transcription is looked up in the reverse mapping file (reverse-map.json).
  4. If no match is found for a syllable, a simple per-letter translation is applied instead (letter-pronunciation-map.json).
  5. The results are sorted prioritizing:
  6. The first 10 of the best results are returned.
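
One way to picture step 1 is the sketch below. It assumes that "all possible" means every split in which each syllable contains exactly one vowel, so the consonants between two vowels may attach to either side. The function name hyphenateAll and this rule are assumptions for illustration; the actual hyphenate-all.js may work differently and may return results in a different order.

```javascript
// The eight Turkish vowels.
const VOWELS = new Set([..."aeıioöuü"]);

// Sketch: return every split of `word` where each part holds exactly one vowel.
function hyphenateAll(word) {
  const letters = [...word.normalize("NFC")];
  const vowelIdx = [];
  letters.forEach((ch, i) => {
    if (VOWELS.has(ch)) vowelIdx.push(i);
  });
  if (vowelIdx.length === 0) return [[word]];

  // Between two consecutive vowels, the cut may fall anywhere after the
  // first vowel and no later than the second vowel.
  let cutSets = [[]];
  for (let k = 0; k + 1 < vowelIdx.length; k++) {
    const next = [];
    for (const cuts of cutSets) {
      for (let c = vowelIdx[k] + 1; c <= vowelIdx[k + 1]; c++) {
        next.push([...cuts, c]);
      }
    }
    cutSets = next;
  }

  // Turn each list of cut positions into the corresponding syllables.
  return cutSets.map((cuts) => {
    const bounds = [0, ...cuts, letters.length];
    return bounds
      .slice(0, -1)
      .map((start, i) => letters.slice(start, bounds[i + 1]).join(""));
  });
}

console.log(hyphenateAll("bahadır"));
```

For bahadır, this produces four splits, because each of the two single consonants between vowels (h and d) can close the preceding syllable or open the following one.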

Example input: bahadır

  • (1). ['bah', 'ad', 'ır'], ['ba', 'had', 'ır'], ['bah', 'a', 'dır'], ['ba', 'ha', 'dır']
  • (2). [[['B', 'AA', 'HH'], ['AA', 'D'], ['AH0', 'R']], ... (all combinations) ... ]
  • (3, 4). ['baah-odd-er', 'bah-hud-er', 'baah-uh-derr', 'bah-huh-derr']
  • (5). ['bah-hud-er', 'bah-huh-derr', 'baah-odd-er', 'baah-uh-derr']
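
Steps 2 to 4 for a single syllable might look like the sketch below. The three small objects are illustrative stand-ins for phonetic-map.json, reverse-map.json and letter-pronunciation-map.json; their contents and shapes are assumptions loosely based on the example above, not the real files. The function names are hypothetical as well.

```javascript
// Illustrative stand-ins for the real JSON files (contents are assumed).
const phoneticMap = { b: ["B"], a: ["AA", "AH0"], h: ["HH"], d: ["D"], "ı": ["AH0"], r: ["R"] };
const reverseMap = { "AA D": "odd", "HH AH0 D": "hud", "AH0 R": "er" };
const letterMap = { b: "b", a: "aa", h: "h", d: "d", "ı": "uh", r: "r" };

// Step 2 (sketch): every combination of phoneme alternatives for a syllable.
function phonemeCombinations(syllable) {
  return [...syllable].reduce(
    (acc, letter) =>
      acc.flatMap((prefix) => (phoneticMap[letter] ?? []).map((p) => [...prefix, p])),
    [[]]
  );
}

// Steps 3 and 4 (sketch): look each combination up in the reverse map;
// if nothing matches, fall back to spelling the syllable letter by letter.
function pronounceSyllable(syllable) {
  const matches = phonemeCombinations(syllable)
    .map((phonemes) => reverseMap[phonemes.join(" ")])
    .filter((word) => word !== undefined);
  if (matches.length > 0) return matches;
  return [[...syllable].map((letter) => letterMap[letter] ?? letter).join("")];
}

console.log(pronounceSyllable("ad"));
console.log(pronounceSyllable("bah"));
```

With this toy data, "ad" matches an English word through the reverse map, while "bah" has no match and is spelled out through the per-letter fallback.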

User interface - /app

  1. The app consists of a single statically generated Next.js page with no back end.
  2. The reverse mapping file is loaded into the client app, so the algorithm runs in the browser.
