LRNounExtractor_v2 postprocessing functions #13

Open
lovit opened this issue Jun 27, 2018 · 1 comment

lovit commented Jun 27, 2018

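Examples of words filtered out during postprocessing

  • Clear-cut case: '이상으로' ('이상 + 으로')
  • Context-dependent cases: '최고가', '만족도'

'최고가' and '만족도' may or may not be nouns depending on context: '최고가' is either the noun "highest price" or '최고 + 가', and '만족도' is either the noun "degree of satisfaction" or '만족 + 도'. The base postprocessor cannot tell these readings apart.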

lovit commented Jun 29, 2018

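When checking whether a noun candidate is really N = N + J (a noun followed by a josa, i.e. a particle), '지금은' is filtered out because it is '지금 + 은'. However, '이력서' and '고양이' are filtered out as well, as '이력 + 서' and '고양 + 이'.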
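The over-filtering described here can be reproduced with a minimal sketch of an N = N + J check. This is not the LRNounExtractor_v2 implementation. The function name `split_noun_josa` and the small `KNOWN_NOUNS` and `JOSA` sets are hypothetical, chosen only to cover the three words discussed.

```python
# Hypothetical, tiny inventories for illustration only (not soynlp's data).
KNOWN_NOUNS = {'지금', '이력', '고양'}
JOSA = {'은', '는', '이', '가', '을', '를', '서', '도'}


def split_noun_josa(word, nouns=KNOWN_NOUNS, josa=JOSA):
    """Return (noun, josa) if `word` splits into a known noun plus a
    particle at some position, otherwise None."""
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in nouns and right in josa:
            return left, right
    return None


# A purely string-based check cannot tell a real noun + particle
# ('지금은') from a noun that merely ends in a particle-like syllable.
for w in ['지금은', '이력서', '고양이']:
    print(w, split_noun_josa(w))
```

All three words find a split, so a postprocessor that drops every candidate with a valid split would discard '이력서' and '고양이' along with '지금은'.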

lrgraph_origin.get_r('점심은')

[('', 9581),
 ('?', 1714),
 ('요?', 324),
 ('요', 65),
 ('용?', 61),
 ('여?', 39),
 ('여', 15),
 ('용', 13),
 ('유?', 10),
 ('유', 5)]

lrgraph_origin.get_r('이력서')

[('', 1469),
 ('에', 88),
 ('는', 81),
 ('를', 69),
 ('도', 62),
 ('랑', 52),
 ('가', 30),
 ('만', 28),
 ('?', 23),
 ('나', 16)]

lrgraph_origin.get_r('고양이')

[('', 2814),
 ('가', 665),
 ('는', 406),
 ('랑', 210),
 ('도', 187),
 ('들', 139),
 ('를', 106),
 ('?', 93),
 ('야', 74),
 ('한테', 64)]
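One possible way to read the three outputs above: '점심은' is followed almost only by nothing or by sentence endings, while '이력서' and '고양이' are frequently followed by particles, as a real noun would be. The sketch below measures that difference. It is an assumption about how the R distributions could be used, not a method stated in this issue. The `JOSA` set, the `josa_share` function, and the 0.1 threshold are hypothetical, and the counts are copied from the listings above, which may show only the most frequent R parts.

```python
# Hypothetical particle inventory for illustration only.
JOSA = {'은', '는', '이', '가', '을', '를', '에', '도', '랑', '만', '한테'}

# R-part counts copied from the lrgraph_origin.get_r(...) outputs above.
R_COUNTS = {
    '점심은': [('', 9581), ('?', 1714), ('요?', 324), ('요', 65), ('용?', 61),
            ('여?', 39), ('여', 15), ('용', 13), ('유?', 10), ('유', 5)],
    '이력서': [('', 1469), ('에', 88), ('는', 81), ('를', 69), ('도', 62),
            ('랑', 52), ('가', 30), ('만', 28), ('?', 23), ('나', 16)],
    '고양이': [('', 2814), ('가', 665), ('는', 406), ('랑', 210), ('도', 187),
            ('들', 139), ('를', 106), ('?', 93), ('야', 74), ('한테', 64)],
}


def josa_share(r_counts, josa=JOSA):
    """Fraction of the listed R occurrences whose R part is a particle."""
    total = sum(count for _, count in r_counts)
    if total == 0:
        return 0.0
    hits = sum(count for r, count in r_counts if r in josa)
    return hits / total


THRESHOLD = 0.1  # hypothetical cut-off, not tuned on any data

for word, counts in R_COUNTS.items():
    share = josa_share(counts)
    print(word, round(share, 3), 'keep as noun' if share >= THRESHOLD else 'N + J')
```

With these counts, '점심은' has a particle share of zero, while '이력서' and '고양이' are both well above the threshold, so such a test would keep them even though a string-level N + J split exists.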
