This repository has been archived by the owner on May 14, 2024. It is now read-only.

Merge pull request #13 from i3thuan5/有的無的
有的無的
Wenli Tsai committed Jun 4, 2019
2 parents a8787c1 + fd28989 commit c89773c
Showing 4 changed files with 24 additions and 3 deletions.
18 changes: 17 additions & 1 deletion README.md
@@ -15,10 +15,26 @@

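The Ministry of Education orthography requires the neutral-tone marker to be written joined to the preceding word. The word-frequency orthography depends on context: words that do not form a fixed unit are separated by a space, which makes word frequencies easier to count.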
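
To illustrate the difference between the two conventions, here is a minimal Python sketch. The function name `to_frequency_form` and its `lexicalized` parameter are hypothetical and not part of this package. It only shows the idea of detaching a neutral-tone syllable (marked with `--`) from the preceding word unless the whole form is treated as one unit.

```python
def to_frequency_form(word, lexicalized=frozenset()):
    """Detach a neutral-tone syllable from the preceding word with a space.

    `word` is a romanized word in which `--` marks a neutral-tone syllable.
    Forms listed in `lexicalized` are kept joined. Both names are
    hypothetical and exist only for this illustration.
    """
    if word in lexicalized or '--' not in word:
        return word
    # Split at the first neutral-tone marker and re-join with a space.
    head, _, tail = word.partition('--')
    return f'{head} --{tail}'


print(to_frequency_form('khuànn-māi--eh'))
print(to_frequency_form('tī--eh', lexicalized={'tī--eh'}))
```

The first call detaches the suffix, while the second leaves the word untouched because it is listed as a fixed unit.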


## Development

### Tests

```
python -m unittest
```

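### Neutral-tone word data

This records which case each neutral-tone word falls under:
- Written separately (分寫)
  - Detached from the preceding word
- Written joined (連寫)
  - Attached to the preceding word
- Not processed (不處理)
  - Verb complements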

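Verb complements behave in more complicated ways, so for now they are simply skipped and left unprocessed. Take 「leh」 as an example:
- 拭拭咧 tshit-tshit--eh: joined; reduplicated word + 咧
- 看覓咧 khuànn-māi --eh: separate; verb + 咧
- 佇咧 tī--eh: joined; lexicalized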
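
The three cases could be modelled roughly as follows. This is only a sketch: `Handling`, `EXAMPLES`, and `handling_for` are hypothetical names, and this is not the package's actual data format.

```python
from enum import Enum


class Handling(Enum):
    SEPARATE = '分寫'       # detached from the preceding word
    JOINED = '連寫'         # attached to the preceding word
    UNPROCESSED = '不處理'  # e.g. verb complements, left as written


# Hypothetical table built from the 「leh」 examples in this README.
EXAMPLES = {
    'tshit-tshit--eh': Handling.JOINED,
    'khuànn-māi --eh': Handling.SEPARATE,
    'tī--eh': Handling.JOINED,
}


def handling_for(word):
    """Look up a word, defaulting to UNPROCESSED when it is not recorded."""
    return EXAMPLES.get(word, Handling.UNPROCESSED)


print(handling_for('tī--eh'))
print(handling_for('something-else'))
```

Defaulting to `UNPROCESSED` mirrors the README's stance that cases which are not yet understood are skipped rather than guessed.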
2 changes: 1 addition & 1 deletion setup.py
@@ -1,6 +1,6 @@
 from setuptools import setup, find_packages

-版本 = '1.0.2'
+版本 = '1.0.3'

 setup(
     name='khin1siann1_hun1sik4',
5 changes: 5 additions & 0 deletions 試驗/分析/test輕聲分析器整合試驗.py
@@ -76,3 +76,8 @@ def test例外有的無的(self):
         self.漢字 = '有的無的'
         self.原臺羅 = 'ū--ê-bô--ê'
         self.按算臺羅 = 'ū--ê-bô--ê'
+
+    def test例外有的無的大寫(self):
+        self.漢字 = '有的無的'
+        self.原臺羅 = 'Ū--ê-bô--ê'
+        self.按算臺羅 = 'Ū--ê-bô--ê'
2 changes: 1 addition & 1 deletion 輕聲分析/分析.py
@@ -33,7 +33,7 @@ def 輕聲分析(self, 句物件):
     def 拆開輕聲(self, 句物件):
         規句的詞陣列 = []
         for 詞物件 in 句物件.網出詞物件():
-            if 詞物件.看分詞() == '有-的-無-的|ū-ê-bô-ê':
+            if 詞物件.看型() == '有的無的':
                 新的詞陣列 = [詞物件.篩出字物件()]
             else:
                 # Split whatever has not been split yet
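
The change above appears to make the exception match on the Han-character form alone, so that capitalized romanization such as `Ū-ê-bô-ê` is also recognized. The sketch below illustrates that reasoning with a hypothetical stand-in class. `Word`, `segmented`, `old_check`, and `new_check` are invented for this example and are not the API of the library used in the diff.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical stand-in for the word object used in the diff."""
    han: str     # Han-character form, e.g. '有的無的'
    lomaji: str  # romanized form, e.g. 'ū-ê-bô-ê'

    def segmented(self):
        # Mimics the shape of the old comparison string: 'han|lomaji',
        # with the Han characters joined by hyphens.
        return '-'.join(self.han) + '|' + self.lomaji


def old_check(word):
    # Compares Han characters and romanization together (case-sensitive).
    return word.segmented() == '有-的-無-的|ū-ê-bô-ê'


def new_check(word):
    # Compares only the Han-character form.
    return word.han == '有的無的'


lower = Word('有的無的', 'ū-ê-bô-ê')
upper = Word('有的無的', 'Ū-ê-bô-ê')
print(old_check(lower), new_check(lower))
print(old_check(upper), new_check(upper))
```

In this sketch the old comparison rejects the capitalized form, while the new one accepts both, which is the behaviour the added test `test例外有的無的大寫` checks for.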
