
Team 4, 2021 Data Youth Campus (데이터청년캠퍼스), Korea University


길라잡이 (Gillajab-i): A Korean-Wave-Content-Based Pronunciation Evaluation Service for Foreigners

Team Members

What is Gillajab-i, and why did we make it?

Gillajab-i is a program that helps non-native Korean learners perfect their Korean pronunciation. Thanks to the steep rise in popularity of K-Pop and all things Korean, the number of non-native Korean learners is at a record high. So, to help those learners, kids from none other than Korea University made this app, which lets learners perfect their Korean through CAPT (Computer-Aided Pronunciation Training).

What does it do?

Gillajab-i first lets the user (that's you! 😊) choose from a number of videos featuring their favorite K-content. Then Gillajab-i:

Automatically recognizes the video audio

Converts the video audio into Korean text

Converts the Korean text into romanization (for the user to read)

Converts the Korean text into English (for the user to understand)

So that the user can really understand what their favorite idol is saying. After reading the Korean out loud (via the romanization) and practicing their pronunciation, the user can click the Start Gillajab-i! button to record their Korean.
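To make that flow concrete, here is a minimal Python sketch of the content pipeline above, with stub functions standing in for the real ASR, romanization, and translation components. None of these function names come from the actual codebase.

```python
# Hypothetical sketch of the Gillajab-i content pipeline.
# The three helpers below are stubs standing in for the real ASR,
# romanization, and translation components.

def transcribe_audio(audio_path: str) -> str:
    """Stub ASR step: would return the Korean transcript of the clip."""
    return "안녕하세요"  # placeholder transcript

def romanize(korean: str) -> str:
    """Stub romanization step: would convert Hangul into Latin letters."""
    return "annyeonghaseyo"  # placeholder romanization

def translate_to_english(korean: str) -> str:
    """Stub translation step: would return an English translation."""
    return "Hello"  # placeholder translation

def prepare_content(audio_path: str) -> dict:
    korean = transcribe_audio(audio_path)
    return {
        "korean": korean,                         # what was actually said
        "romanized": romanize(korean),            # for the user to read
        "english": translate_to_english(korean),  # for the user to understand
    }

print(prepare_content("clip.wav"))
```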

After the recording, Gillajab-i automatically detects the user's Korean, then:

Splits it into phonemes,

Calculates the user's pronunciation score against the original pronunciation (from the video) using CER (Character Error Rate),

And highlights where the user's pronunciation was off using Levenshtein distance, as sketched below. Neat, huh? 😎
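As a rough illustration, here is how CER and the underlying Levenshtein distance can be computed in plain Python. This is a generic sketch of the standard algorithm, not the project's actual scoring code, and the phoneme strings in the example are made up.

```python
# Generic Levenshtein / CER sketch (not the project's actual code).

def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to turn `hyp` into `ref` (classic dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (0 if characters match)
            ))
        prev = curr
    return prev[-1]

def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance normalized by reference length."""
    return levenshtein(ref, hyp) / len(ref)

# Made-up phoneme sequences: the original line vs. the user's attempt.
reference = "saranghae"
attempt = "salanghe"
print(cer(reference, attempt))  # 0.222... -> roughly 22% character error rate
```

Highlighting follows from the same computation: tracing back through the DP table reveals which positions were substituted, inserted, or deleted, and those are the spots marked for the user.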

How does it do it?

First of all, we trained a brand-new ASR model that specializes in recognizing the phonemes of spoken Korean.

We used ESPnet, an end-to-end speech recognition toolkit, to train our model.

First, we collected 1,000 hours of raw Korean speech data and used the short-time Fourier transform (STFT) to convert the audio into mel spectrograms. The mel spectrograms were then fed into a CNN, allowing the model to actually "read" the audio!
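For reference, a typical version of that front end looks like the snippet below, using the librosa audio library. The window, hop, and mel-bin settings are common ASR defaults, not necessarily the values used to train this model.

```python
# Illustrative mel-spectrogram front end using librosa.
# The parameter values are typical ASR defaults, shown as assumptions.
import librosa

y, sr = librosa.load("speech.wav", sr=16000)  # mono audio at 16 kHz

# STFT -> power spectrogram -> mel filter bank, in one call:
mel = librosa.feature.melspectrogram(
    y=y, sr=sr,
    n_fft=512,       # STFT window size (32 ms at 16 kHz)
    hop_length=160,  # 10 ms frame shift
    n_mels=80,       # 80 mel frequency bins
)
log_mel = librosa.power_to_db(mel)  # log scale, as typically fed to the CNN
print(log_mel.shape)  # (80, num_frames)
```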

Then, the CNN's output vectors are fed through 12 encoder layers and 6 decoder layers, yielding a model that can detect the phoneme sequence of spoken Korean!
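In PyTorch terms, that stack corresponds roughly to the sketch below. Only the 12/6 layer counts come from the description above; the hidden size, head count, and phoneme vocabulary size are illustrative assumptions (and ESPnet builds its own Transformer modules rather than using torch.nn.Transformer directly).

```python
# Rough PyTorch equivalent of the described 12-encoder / 6-decoder stack.
# Only the layer counts come from the README; d_model, nhead, and the
# phoneme vocabulary size are illustrative assumptions.
import torch
import torch.nn as nn

NUM_PHONEMES = 70  # assumed size of the Korean phoneme vocabulary

transformer = nn.Transformer(
    d_model=256,            # feature size of the CNN output vectors (assumed)
    nhead=4,                # attention heads per layer (assumed)
    num_encoder_layers=12,  # 12 layers of encoding
    num_decoder_layers=6,   # 6 layers of decoding
    batch_first=True,
)
to_phonemes = nn.Linear(256, NUM_PHONEMES)  # project to phoneme logits

# Dummy shapes: (batch, frames, d_model) from the CNN, (batch, steps, d_model) targets.
src = torch.randn(1, 100, 256)
tgt = torch.randn(1, 20, 256)
logits = to_phonemes(transformer(src, tgt))
print(logits.shape)  # torch.Size([1, 20, 70])
```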
