zhixiaochuan12/ChineseNounPhraseExtraction

Chinese noun phrase extraction

The code is adapted from the phrasemachine project.

The method follows Justeson and Katz (1995).

Key idea: extract noun phrases by matching the part-of-speech pattern ((A|N)+|(A|N)*(NP)?(A|N)*)N, where A is an adjective, N is a noun, and P is a preposition.

phrasemachine supports only English; this project extracts Chinese noun phrases by using jieba.
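As a rough illustration of the key idea, the pattern can be applied as a regular expression over a string of one-letter tags. This is a minimal sketch, not the project's actual code: the names `coarse_tag` and `extract_spans` are hypothetical, the mapping from part-of-speech tags to A/N/P is an assumption (tags beginning with `n` are treated as nouns, tags beginning with `a` as adjectives, `p` as a preposition), and the input is assumed to be a list of tags that a tagger such as jieba has already produced.

```python
import re

# Pattern from the README over one-letter coarse tags:
# A = adjective, N = noun, P = preposition.
NP_PATTERN = re.compile(r"((A|N)+|(A|N)*(NP)?(A|N)*)N")


def coarse_tag(pos):
    """Map a POS tag to A, N, P, or O (other).

    Hypothetical mapping for illustration only; the project's real
    mapping may differ.
    """
    if pos.startswith("n"):
        return "N"
    if pos.startswith("a"):
        return "A"
    if pos == "p":
        return "P"
    return "O"


def extract_spans(pos_tags):
    """Return every (start, end) token span whose tags fully match the pattern.

    Spans are half-open: the tokens covered are pos_tags[start:end].
    """
    tags = "".join(coarse_tag(p) for p in pos_tags)
    spans = []
    # Try every contiguous sub-sequence so that nested phrases are reported too.
    for start in range(len(tags)):
        for end in range(start + 1, len(tags) + 1):
            if NP_PATTERN.fullmatch(tags[start:end]):
                spans.append((start, end))
    return spans
```

Checking every sub-sequence is quadratic in the sentence length, which is acceptable for a sketch. It also reports shorter phrases nested inside longer ones, in the same way the example output in this README lists overlapping spans.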

Example

Input (Chinese string)

"中华人民共和国位于亚洲东部,太平洋西岸,是工人阶级领导的、以工农联盟为基础的人民民主专政的社会主义国家。"

Output (token index spans of the extracted noun phrases)

[(0, 1), (2, 3), (5, 6), (9, 10), (9, 11), (10, 11), (16, 17), (18, 19), (18, 20), (19, 20), (21, 22), (21, 23), (22, 23)]
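To read the output, each pair can be mapped back to the text it covers. The sketch below assumes that each pair is a half-open (start, end) span over the segmented tokens. That reading is consistent with groups such as (9, 10), (9, 11), (10, 11) but is not stated in the README. The helper `spans_to_phrases` and the token list are hypothetical; the real segmentation produced by jieba may differ.

```python
def spans_to_phrases(tokens, spans):
    """Join the tokens covered by each half-open (start, end) span."""
    return ["".join(tokens[start:end]) for start, end in spans]


# Hypothetical segmentation of the first clause, for illustration only.
tokens = ["中华人民共和国", "位于", "亚洲", "东部"]
phrases = spans_to_phrases(tokens, [(0, 1), (2, 3)])
print(phrases)
```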

About

Extract noun phrases from Chinese corpora using part-of-speech templates
