Add FilterCounter #30

Open
e9t opened this issue Oct 10, 2014 · 1 comment

Comments

@e9t
Member

e9t commented Oct 10, 2014

Inherit from collections.Counter, and apply filters.

Some filter examples:

  • Minimum length of terms (minsyl)
  • (General) Stopwords
  • Include/exclude specific (POS or other kind of) tags

Usage example:

from konlpy.corpus import kolaw
from konlpy.tag import Hannanum
from konlpy.utils import FilterCounter

doc = kolaw.open('constitution.txt').read()
pos = Hannanum().pos(doc)
cnt = FilterCounter(pos, minsyl=2, stopwords=True, postags="N|V|EF|^XPV|^SF")
cnt.most_common(20)
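
For illustration only, here is a rough sketch of how such a Counter subclass might look. It is not KoNLPy code. The parameters `include_tags` and `exclude_tags` are hypothetical stand-ins for the `postags` string above (whose exact syntax is not specified in this issue), `stopwords` is taken as an explicit collection rather than a boolean, tags are matched by prefix, and one character is assumed to equal one syllable. The sample pairs are made up so that the sketch runs without KoNLPy installed.

```python
from collections import Counter
from collections.abc import Mapping


class FilterCounter(Counter):
    """Hypothetical sketch of the proposed class, not part of KoNLPy."""

    def __init__(self, pairs=(), minsyl=1, stopwords=(),
                 include_tags=None, exclude_tags=()):
        super().__init__()
        if isinstance(pairs, Mapping):
            # Already-counted input (e.g. from Counter.copy()): keep counts as-is.
            self.update(pairs)
            return
        stop = set(stopwords)
        include = tuple(include_tags) if include_tags is not None else None
        exclude = tuple(exclude_tags)
        for term, tag in pairs:
            if len(term) < minsyl:
                continue  # assumption: one character counts as one syllable
            if term in stop:
                continue
            if include is not None and not tag.startswith(include):
                continue  # assumption: tags are matched by prefix
            if tag.startswith(exclude):
                continue
            self[(term, tag)] += 1


# Made-up (term, tag) pairs in the shape returned by a tagger's pos() method.
sample = [("대한민국", "N"), ("은", "J"), ("민주공화국", "N"), ("이", "J"),
          ("다", "E"), ("대한민국", "N"), ("국민", "N")]
cnt = FilterCounter(sample, minsyl=2, stopwords={"국민"}, include_tags=("N",))
print(cnt.most_common(2))
```

Because filtering happens inside the constructor, the original token order is lost once the counts are built.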
@e9t
Member Author

e9t commented Nov 14, 2014

There was a suggestion that the filter and the counter should be kept separate.
(There may be cases where one wants to preserve the sequence.)
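
A minimal sketch of that separation, under the same assumptions as any illustration here: `filter_pos` and its parameters are hypothetical names, tags are matched by prefix, one character is treated as one syllable, and the sample pairs are made up. The filter yields pairs in their original order, and counting is left to a plain `collections.Counter`.

```python
from collections import Counter


def filter_pos(pairs, minsyl=1, stopwords=(), include_tags=None, exclude_tags=()):
    """Yield the (term, tag) pairs that pass every filter, in input order.

    Hypothetical helper, not part of KoNLPy.
    """
    stop = set(stopwords)
    include = tuple(include_tags) if include_tags is not None else None
    exclude = tuple(exclude_tags)
    for term, tag in pairs:
        if len(term) < minsyl:
            continue  # assumption: one character counts as one syllable
        if term in stop:
            continue
        if include is not None and not tag.startswith(include):
            continue  # assumption: tags are matched by prefix
        if tag.startswith(exclude):
            continue
        yield term, tag


# Made-up (term, tag) pairs in the shape returned by a tagger's pos() method.
sample = [("대한민국", "N"), ("은", "J"), ("민주공화국", "N"), ("이", "J"),
          ("다", "E"), ("대한민국", "N"), ("국민", "N")]

filtered = list(filter_pos(sample, minsyl=2, include_tags=("N",)))  # order preserved
counts = Counter(filtered)  # counting is a separate, optional step
print(filtered)
print(counts.most_common(1))
```

With this split, callers who need the filtered sequence can stop after `filter_pos`, and those who only need frequencies can pass the result to `Counter`.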
