yunhak0/transliteration

Korean Character Romanization

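Romanizes Korean (Hangul) text into the Latin alphabet.

The output does not always match actual Korean pronunciation, because phonological processes (sound changes across syllable boundaries) are not applied. For example, 감사합니다 is romanized syllable by syllable as gam-sa-hap-ni-da, although it is pronounced closer to "gamsahamnida".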

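Characteristics of Korean Characters

Each Hangul syllable block is composed of an onset (initial consonant), a nucleus (vowel), and an optional coda (final consonant). Each part is romanized separately using the tables below.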

Onset

KOREAN (Onset)   ROMAN
ㄱ               g
ㄲ               kk
ㄴ               n
ㄷ               d
ㄸ               tt
ㄹ               r
ㅁ               m
ㅂ               b
ㅃ               pp
ㅅ               s
ㅆ               ss
ㅈ               j
ㅉ               jj
ㅊ               ch
ㅋ               k
ㅌ               t
ㅍ               p
ㅎ               h

Nucleus

KOREAN (Nucleus)   ROMAN
ㅏ                 a
ㅐ                 ae
ㅑ                 ya
ㅒ                 yae
ㅓ                 eo
ㅔ                 e
ㅕ                 yeo
ㅖ                 ye
ㅗ                 o
ㅘ                 wa
ㅙ                 wae
ㅚ                 oe
ㅛ                 yo
ㅜ                 u
ㅝ                 wo
ㅞ                 we
ㅟ                 wi
ㅠ                 yu
ㅡ                 eu
ㅢ                 ui
ㅣ                 i

Coda

KOREAN(Coda) ROMAN
k
k
k
n
n
n
t
l
k
m
p
t
t
p
l
m
p
p
t
t
ng
t
t
k
t
p
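
To illustrate how a syllable block can be split into the three parts covered by the tables above, here is a minimal sketch based on the standard Unicode layout of precomposed Hangul syllables (U+AC00 to U+D7A3). It is not the code of `kor2eng`; the names `decompose` and `romanize` are hypothetical, the silent onset ㅇ is assumed to map to an empty string (consistent with 안 becoming "an" in the usage output), and only a small subset of codas is mapped, so other codas raise a `KeyError`.

```python
# Jamo in Unicode order. Precomposed syllables are laid out as
#   code = 0xAC00 + (onset * 21 + nucleus) * 28 + coda
# where coda index 0 means "no coda".
ONSETS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
NUCLEI = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
CODAS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

# Romanization tables. The silent onset "ㅇ" maps to "" (assumption).
ROMAN_ONSET = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
               "ss", "", "j", "jj", "ch", "k", "t", "p", "h"]
ROMAN_NUCLEUS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o",
                 "wa", "wae", "oe", "yo", "u", "wo", "we", "wi", "yu",
                 "eu", "ui", "i"]
# Subset of codas only (assumption); extend as needed.
ROMAN_CODA = {"": "", "ㄱ": "k", "ㄴ": "n", "ㄹ": "l",
              "ㅁ": "m", "ㅂ": "p", "ㅇ": "ng"}


def decompose(syllable):
    """Split one precomposed Hangul syllable into (onset, nucleus, coda)."""
    code = ord(syllable) - 0xAC00
    if not 0 <= code <= 0xD7A3 - 0xAC00:
        raise ValueError(f"not a Hangul syllable: {syllable!r}")
    onset, rest = divmod(code, 21 * 28)
    nucleus, coda = divmod(rest, 28)
    return ONSETS[onset], NUCLEI[nucleus], CODAS[coda]


def romanize(word, sep="-"):
    """Romanize syllable by syllable, without any phonological rules."""
    parts = []
    for syllable in word:
        onset, nucleus, coda = decompose(syllable)
        parts.append(ROMAN_ONSET[ONSETS.index(onset)]
                     + ROMAN_NUCLEUS[NUCLEI.index(nucleus)]
                     + ROMAN_CODA[coda])
    return sep.join(parts)


print(decompose("안"))       # ('ㅇ', 'ㅏ', 'ㄴ')
print(romanize("안녕하세요"))  # an-nyeong-ha-se-yo
```

Unlike the output shown under Usage, this sketch does not append a trailing hyphen, because it joins the syllables with a separator instead of adding one after each syllable.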

Usage

from multiprocessing import cpu_count

import pandas as pd
from dask import dataframe as dd

from kor2eng import kor_romanizied

# Build a sample data set ------------------------------------------
korean_words = pd.DataFrame({"KOREAN": ["안녕하세요",
                                        "감사합니다",
                                        "반갑습니다"],
                             "MEANING": ["Hi",
                                         "Thank you",
                                         "Nice to meet you"]})

# Romanize each row in parallel, one partition per CPU core --------
n_cores = cpu_count()
romanized_string = (
    dd.from_pandas(korean_words, npartitions=n_cores)
    .map_partitions(
        lambda df: df.apply(
            lambda x: kor_romanizied.split_kor(x.KOREAN), axis=1))
    .compute(scheduler='processes')
)

# Attach the result as a new column --------------------------------
korean_words["ROMANIZED"] = romanized_string

print(korean_words)
#   KOREAN           MEANING            ROMANIZED
# 0  안녕하세요                Hi  an-nyeong-ha-se-yo-
# 1  감사합니다         Thank you    gam-sa-hap-ni-da-
# 2  반갑습니다  Nice to meet you  ban-gap-seup-ni-da-
