[week 8] Do the character n-grams proposed in this paper give us information about prefixes and suffixes? #31

HanNayeoniee · 2022-05-05T12:27:18Z

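When the character n-grams are built, < is attached to the first subword and > is attached to the last subword.
I assumed these symbols only mark the start and end of the subword set. Do < and > actually give us information about the word's prefixes and suffixes?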

Each word w is represented as a bag of character n-gram. We add special boundary symbols < and > at the beginning and end of words, allowing to distinguish prefixes and suffixes from other character sequences.

Figure: character n-grams (n=3) of the word where
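To make the quoted passage concrete, here is a minimal sketch of how the n-grams for where could be extracted. The helper name `char_ngrams` is hypothetical (it is not taken from the paper or from any library), and the sketch leaves out the whole-word sequence <where> that the paper also adds to the set.

```python
def char_ngrams(word, n=3):
    """Return the character n-grams of `word` after wrapping it in < and >.

    Hypothetical helper for illustration only.
    """
    wrapped = f"<{word}>"  # add the boundary symbols first
    # Slide a window of width n over the wrapped word.
    return [wrapped[i:i + n] for i in range(len(wrapped) - n + 1)]


print(char_ngrams("where"))
# ['<wh', 'whe', 'her', 'ere', 're>']
```

Only the first n-gram contains < and only the last contains >, so those two carry the word-boundary information.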

xuio-0528 · 2022-05-05T16:10:56Z

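An n-gram that starts with < is treated as a prefix, and an n-gram that ends with > is treated as a suffix.
As a result, even when a prefix-like string shows up in the middle of a word, the two n-grams (e.g. <im and im) are learned as different units, so I think prefixes and suffixes can be learned more accurately.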
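A small sketch of that distinction, assuming a hypothetical helper `char_ngrams` and two example words chosen only for illustration (impossible, where im is a prefix, and time, where im sits inside the word):

```python
def char_ngrams(word, n):
    """Return the set of character n-grams of `word` wrapped in < and >.

    Hypothetical helper for illustration only.
    """
    wrapped = f"<{word}>"
    return {wrapped[i:i + n] for i in range(len(wrapped) - n + 1)}


# With n=2, the plain bigram "im" occurs in both words.
print("im" in char_ngrams("impossible", 2))  # True
print("im" in char_ngrams("time", 2))        # True

# With n=3, the boundary-marked "<im" occurs only where "im" starts the word.
print("<im" in char_ngrams("impossible", 3))  # True
print("<im" in char_ngrams("time", 3))        # False
```

Because <im and im are separate entries in the n-gram vocabulary, each gets its own vector, which is what lets the model tell a word-initial im apart from one in the middle of a word.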

HanNayeoniee closed this as completed Jun 20, 2022
