Replace Zenkaku-Space to Zenkaku-Underscore
Remove disallowed UTF-8 whitespace character
yamachu committed Oct 11, 2017
1 parent 5fbdc66 commit e8102c2
Showing 1 changed file with 5 additions and 0 deletions.
5 changes: 5 additions & 0 deletions egs/csj/s5/local/csj_make_trans/csj2kaldi4m.pl
@@ -245,6 +245,11 @@
$morph =~ s/\/\//\//g;
$morph =~ s/\/\//\//g;
$morph =~ s/\/$//g;
# Replace Zenkaku (full-width) space with Zenkaku (full-width) underscore
# Input: っしゃっ+動詞/ラ行五段/連用形/促音便　省略 r a q sh a q
# ->
# Output: っしゃっ+動詞/ラ行五段/連用形/促音便＿省略 r a q sh a q
$morph =~ s/　/＿/g;

unless( $word =~ /skip_word/){
if ($word && $pos){
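
The substitution added in this hunk can be tried in isolation. The following is a minimal sketch, not the script's actual code path. It assumes the input has already been decoded to Perl character strings, and it writes the two characters as explicit code points (U+3000 IDEOGRAPHIC SPACE and U+FF3F FULLWIDTH LOW LINE) so they stay visible even if the source file is re-encoded. The helper name `replace_zenkaku_space` is hypothetical and does not appear in `csj2kaldi4m.pl`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;    # the Japanese literals in this file are UTF-8

# Hypothetical helper (not part of csj2kaldi4m.pl): replace every
# full-width space (U+3000) with a full-width underscore (U+FF3F).
# ASCII spaces are left untouched.
sub replace_zenkaku_space {
    my ($morph) = @_;
    $morph =~ s/\x{3000}/\x{FF3F}/g;
    return $morph;
}

binmode(STDOUT, ':encoding(UTF-8)');

# The morphological tag contains a full-width space; the phoneme
# sequence after it is separated by ordinary ASCII spaces.
my $entry = "っしゃっ+動詞/ラ行五段/連用形/促音便\x{3000}省略 r a q sh a q";
print replace_zenkaku_space($entry), "\n";
```

Only the full-width space inside the tag is rewritten, so the ASCII-space-separated phoneme sequence still splits into the same fields downstream.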