Reflect WordExtractor refactoring results (#115)
lovit committed Mar 28, 2021
1 parent 1f7717c commit 47c5b6e
Showing 1 changed file with 2 additions and 3 deletions.
5 changes: 2 additions & 3 deletions tests/test_tokenizers.py
@@ -149,9 +149,8 @@ def test_maxscore_tokenizer_usage():
  sents = [sent.strip() for doc in f for sent in doc.split(" ")]
  sents = [sent for sent in sents if sent][:10000]
  word_extractor = WordExtractor()
- word_extractor.train(sents)
- cohesion_scores = word_extractor.all_cohesion_scores()
- cohesion_scores = {l: cohesion for l, (cohesion, _) in cohesion_scores.items()}
+ cohesion_scores = word_extractor.extract(sents, extract_cohesion_only=True)["cohesion"]
+ cohesion_scores = {l: score.leftside for l, score in cohesion_scores.items()}
  tokenizer = MaxScoreTokenizer(cohesion_scores)

  for i, sentence in enumerate(
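A note on the shape change in this hunk: the old code unpacked each score as a 2-tuple and kept the first element, while the new code reads a `leftside` attribute. The sketch below shows that both comprehensions produce the same flat mapping. It does not call the library. `CohesionScore` and its `rightside` field are hypothetical stand-ins (only `.leftside` appears in the diff), and the score values are arbitrary illustrations, not real output.

```python
from collections import namedtuple

# Hypothetical stand-in for the per-substring score object; only the
# `leftside` attribute is confirmed by the diff, `rightside` is assumed.
CohesionScore = namedtuple("CohesionScore", ["leftside", "rightside"])


def to_leftside_scores(scores):
    """Keep only the left-side cohesion value for each substring."""
    return {word: score.leftside for word, score in scores.items()}


# Arbitrary illustrative values, not produced by any real corpus.
old_style = {"파스": (0.3, 0.0), "파스타": (0.7, 0.1)}
new_style = {word: CohesionScore(*pair) for word, pair in old_style.items()}

# Old comprehension from the removed lines: unpack the tuple.
old_flat = {l: cohesion for l, (cohesion, _) in old_style.items()}
# New comprehension from the added lines: read the attribute.
new_flat = to_leftside_scores(new_style)
```

Either way, the tokenizer receives a plain `{substring: float}` dictionary, so the change to the test is confined to how the scores are obtained and flattened.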
