Update NounMatchTokenizer usage in LRNounExtractor (fix #105)
lovit committed Sep 12, 2020
1 parent 02e6441 commit 5bce5d7
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions soynlp/noun/lr.py
@@ -335,6 +335,13 @@ def get_noun_tokenizer(self):
>>> noun_tokenizer.tokenize(sentence, concat_compound=False)
$ ['네이버', '뉴스', '기사', '이용', '학습', '모델', '예시']
>>> noun_tokenizer.tokenize(sentence, flatten=False)
$ [[Token(네이버, score=1.0, offset=(0, 3))],
   [Token(뉴스기사, score=0.972972972972973, offset=(5, 9))],
   [Token(이용, score=0.9323344610923151, offset=(11, 13))],
   [Token(학습, score=0.9253731343283582, offset=(16, 18))],
   [Token(모델예시, score=1.0, offset=(20, 24))]]
"""
if not self.is_trained:
raise RuntimeError('Train LRNounExtractor first. LRNounExtractor().extract(train-data)')
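
For readers who want to work with the nested result shown in the docstring above, here is a minimal sketch of turning the `flatten=False` output back into a flat word list. It does not import soynlp: the `Token` namedtuple below is a hypothetical stand-in, and its field names (`word`, `score`, `offset`) are assumptions inferred from the printed repr, not confirmed against the library. `flatten_words` is likewise an illustrative helper, not part of soynlp.

```python
from collections import namedtuple

# Hypothetical stand-in for soynlp's Token; field names are assumed from the repr
# in the docstring and may not match the library's actual attributes.
Token = namedtuple("Token", ["word", "score", "offset"])

# Nested result copied from the docstring example (flatten=False):
# a list of groups, each group holding one or more tokens.
nested = [
    [Token("네이버", 1.0, (0, 3))],
    [Token("뉴스기사", 0.972972972972973, (5, 9))],
    [Token("이용", 0.9323344610923151, (11, 13))],
    [Token("학습", 0.9253731343283582, (16, 18))],
    [Token("모델예시", 1.0, (20, 24))],
]


def flatten_words(nested_tokens):
    """Collect the surface form of every token, preserving group order."""
    return [token.word for group in nested_tokens for token in group]


words = flatten_words(nested)
print(words)
```

The same comprehension can keep `score` or `offset` instead of `word` when the character spans are needed, for example to highlight nouns in the original sentence.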