Expand the dictionary #10
Comments
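(The file name is probably a typo?) Emoji are very useful for promoting the input method, and in the long run they should be expanded substantially :) If there is capacity to spare, each emoji should get eight to ten related words.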
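@laubonghaudoi Splitting into two dictionaries is fine as long as they can be installed automatically on the user's side; if users still have to install both dictionaries and OpenCC by hand, that is a hassle. @chaaklau Emoji can be added through OpenCC_Emoji, in the same way as Traditional/Simplified conversion; this can be added in the next update.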
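Emoji support can also be installed separately, using this repository: https://github.com/rime/rime-emoji. Specifically, run the command below and then redeploy, and you will be able to type emoji.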
I previously found this repository: https://github.com/ziloeng/rime-jyut6ping3
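It contains a rich collection of Cantonese vocabulary, and I have obtained the author's permission to add its data to our code table. My plan for the next step is therefore to integrate this vocabulary. The repository contains 5 dictionary files:
jyut6ping3.dict.yaml
Single-character pronunciation table, for example from Unihan's kCantonese field. We have already dealt with this part, so it can be ignored.
jyut6ping3.dict.yaml
A small emoji table; it can be ignored.
jyut6ping3.vocabulary.dict.yaml
A large amount of Cantonese vocabulary: about 11,000 entries are annotated with Jyutping, and the remaining 90,000-odd entries are bare phrases with no Jyutping. This is the part we need to focus on.
jyut6ping3.vocabulary.emoji.dict.yaml
Some 3,000-odd Cantonese vocabulary entries without Jyutping, which could be added as well (I do not know how they differ from the file above, or why they were split out).
So I plan to start by adding this vocabulary. One further issue is that the vocabulary is too large to check by hand in one pass, so I suggest first putting it in a separate file, jyut6ping3.vocabulary.dict.yaml, together with the 20,000 entries @leimaau added earlier in commit bd8349b, and then revising and maintaining that single file as feedback comes in. Does that sound good? A final point: once all of this vocabulary has been added, we may be able to drop the setting that uses the bundled 八股文 (essay) dictionary, because the vocabulary here is already large enough, and dropping it would avoid producing some Mandarin words. Of course, we will only know once we have tried it.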
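As a starting point for the integration, here is a minimal Python sketch that splits a vocabulary file into entries that already carry Jyutping and bare phrases that still need it. It is not taken from either repository. It assumes the usual Rime dictionary layout: a YAML header that ends with a line containing only `...`, followed by tab-separated rows in the default column order (phrase, code, optional weight). The function names `parse_dict_body` and `split_by_code` and the sample rows are hypothetical.

```python
from typing import Iterable, List, Tuple

Entry = Tuple[str, str]  # (phrase, Jyutping code; empty string if absent)


def parse_dict_body(lines: Iterable[str]) -> List[Entry]:
    """Parse the tab-separated body of a Rime-style *.dict.yaml file.

    Assumption: the YAML header ends with a line containing only "...",
    and each row is "phrase<TAB>code[<TAB>weight]" or just "phrase".
    """
    entries: List[Entry] = []
    in_body = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not in_body:
            # Skip the YAML header; the body starts after the "..." marker.
            if line.strip() == "...":
                in_body = True
            continue
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split("\t")
        phrase = fields[0].strip()
        code = fields[1].strip() if len(fields) > 1 else ""
        entries.append((phrase, code))
    return entries


def split_by_code(entries: List[Entry]) -> Tuple[List[Entry], List[str]]:
    """Separate entries that have a code from phrases that lack one."""
    with_code = [e for e in entries if e[1]]
    without_code = [e[0] for e in entries if not e[1]]
    return with_code, without_code


# Made-up sample standing in for a real vocabulary file.
sample = [
    "---\n",
    "name: example.vocabulary\n",
    "...\n",
    "\n",
    "# comment line\n",
    "廣東話\tgwong2 dung1 waa2\n",
    "輸入法\n",
]
with_code, without_code = split_by_code(parse_dict_body(sample))
print(len(with_code), len(without_code))  # prints: 1 1
```

Run against the real file, the two counts give a quick sanity check against the figures quoted above before anything is merged. The list of bare phrases can then be reviewed or annotated separately.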