Commit

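1. schema: revise several example characters in the IPA version and, together with the tone version, add abbreviated input using the first two letters;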
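2. dict: add vocabulary sourced from 開放粵語詞典 (Open Cantonese Dictionary) and CC-Canto, removing words already present in 八股文. Many homophone substitute characters in 開放粵語詞典 have also been changed back to their original characters (本字). The CC-Canto data was written by internet users and may contain rare or substitute characters; these have been fixed as far as possible.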
leimaau committed Oct 25, 2019
1 parent 533b82e commit bd8349b
Showing 3 changed files with 20,743 additions and 7 deletions.

6 comments on commit bd8349b

@laubonghaudoi
Member

@laubonghaudoi laubonghaudoi commented on bd8349b Oct 25, 2019

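The twenty-thousand-odd entries newly added from 開放粵語詞典 this time are all placed at the end of the dictionary table. Is that because they have not been fully proofread yet? Once proofreading is done, it would be best to merge them with the existing phrases and re-sort everything, which would make the table easier to manage.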

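Also, I noticed that many of the words from 開放粵語詞典 look rather odd, for example 丫挺 aa1 ting5, which I can't identify. One more thing: are these phrases the same as the phrases without pronunciation annotations at the very end of the current rime-jyutping toneless dictionary table?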

@leimaau
Collaborator Author

@leimaau leimaau commented on bd8349b Oct 25, 2019

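There are too many entries to proofread perfectly. I can only proofread as much as possible, fixing whatever I notice is wrong, and the proofreading may never be finished. Merging them in would also be a lot of trouble, so I put them at the end of the table.
From 開放粵語詞典 I only took the more useful and clearer words, so this is a curated selection of its main part plus the vocabulary from CC-Canto. I did not consult the words at the end of the rime-jyutping toneless table, so the two sets may contain both shared and differing words.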

@laubonghaudoi
Member

OK. When you added these words, did you deduplicate them against the earlier entries? There could be a lot of duplicates.

@leimaau
Collaborator Author

@leimaau leimaau commented on bd8349b Oct 25, 2019

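I already deduplicated them using a database, so there should be no duplicates, unless an extra entry slipped in earlier when I was adding words by hand to the LSHK word-list portion in the first half. That is less likely, though.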
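
For readers who want to reproduce this kind of check, here is a minimal sketch of database-backed deduplication. It is not the script used for this commit: the thread does not say which database or schema was involved, so the in-memory SQLite table, the function name `dedupe_new_entries`, and the choice of a (word, code) pair as the uniqueness key are all assumptions made for illustration.

```python
import sqlite3


def dedupe_new_entries(existing, new):
    """Return the (word, code) pairs from `new` that are not in `existing`.

    Duplicates inside `new` itself are dropped as well; the first
    occurrence is kept and input order is preserved.
    """
    conn = sqlite3.connect(":memory:")
    # Assumed schema: one row per (word, code) pair, enforced as unique.
    conn.execute(
        "CREATE TABLE entries (word TEXT, code TEXT, PRIMARY KEY (word, code))"
    )
    conn.executemany("INSERT OR IGNORE INTO entries VALUES (?, ?)", existing)

    kept = []
    for word, code in new:
        cur = conn.execute(
            "INSERT OR IGNORE INTO entries VALUES (?, ?)", (word, code)
        )
        # rowcount is 1 when the row was inserted, 0 when it was ignored.
        if cur.rowcount == 1:
            kept.append((word, code))
    conn.close()
    return kept


# Hypothetical sample data, not taken from the actual dictionary files.
existing = [("丫挺", "aa1 ting5")]
new = [("丫挺", "aa1 ting5"), ("粵語", "jyut6 jyu5"), ("粵語", "jyut6 jyu5")]
print(dedupe_new_entries(existing, new))
```

Keying on the (word, code) pair rather than the word alone keeps words with more than one reading; whether that matches the rule applied in this commit is not stated in the thread.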

@chaaklau
Collaborator

for example 丫挺 aa1 ting5, which I can't identify.

"丫挺: Beijing dialect, vulgar. A contraction of 丫头养的 (literally 'born of a servant girl')."

@laubonghaudoi
Member

for example 丫挺 aa1 ting5, which I can't identify.

"丫挺: Beijing dialect, vulgar. A contraction of 丫头养的 (literally 'born of a servant girl')."

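Actually, I have been wondering whether it is necessary to separate the Northern Chinese vocabulary in there from the Cantonese vocabulary, though that seems like too much trouble.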
