How do I load a custom dictionary? #31

Closed
jiyzhang opened this issue Jun 17, 2017 · 2 comments

jiyzhang commented Jun 17, 2017

Loading a custom dictionary in ansj

I edited library.properties and added my custom dictionaries:

#path of userLibrary this is default library
dic=library/default.dic

#user defined dictionary

dic_name=library/name.dic
dic_company=library/company.dic
dic_term=library/term.dic

#redress dic file path
ambiguityLibrary=library/ambiguity.dic
stop_dic1=library/stop.dic
synonymsLibrary=library/synonyms.dic
#set real name
isRealName=true

#isNameRecognition default true
isNameRecognition=true

#isNumRecognition default true
isNumRecognition=true

#digital quantifier merge default true
isQuantifierRecognition=true

When the test program starts, it does find the corresponding dictionary files:

Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init dic_term to env value is : library/term.dic
Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init stop_dic1 to env value is : library/stop.dic
Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init dic_name to env value is : library/name.dic
Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init ambiguityLibrary to env value is : library/ambiguity.dic
Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init isQuantifierRecognition to env value is : true
Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init dic_company to env value is : library/company.dic
Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init isRealName to env value is : true
Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init synonymsLibrary to env value is : library/synonyms.dic
Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init isNumRecognition to env value is : true
Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init isNameRecognition to env value is : true
Jun 17, 2017 2:49:30 PM org.ansj.util.MyStaticValue info
INFO: init dic to env value is : library/default.dic
Jun 17, 2017 2:49:30 PM org.ansj.dic.impl.File2Stream info
INFO: path to stream library/ambiguity.dic
Jun 17, 2017 2:49:30 PM org.ansj.library.AmbiguityLibrary info
INFO: load dic use time:1 path is : library/ambiguity.dic
Jun 17, 2017 2:49:30 PM org.ansj.dic.impl.File2Stream info
INFO: path to stream library/default.dic
Jun 17, 2017 2:49:31 PM org.ansj.library.DicLibrary info
INFO: load dic use time:1249 path is : library/default.dic
Jun 17, 2017 2:49:32 PM org.ansj.library.DATDictionary info
INFO: init core library ok use time : 736
Jun 17, 2017 2:49:32 PM org.ansj.library.NgramLibrary info
INFO: init ngram ok use time :510

Segmentation is done with DicAnalysis:
Result terms = DicAnalysis.parse(sent1);

However, the segmentation result never changes, and I really cannot find the cause. Any help would be much appreciated.

Environment:
OS: macOS 10.12.5
JDK: 1.8.0_65
ansj-seg: 5.1.2
nlp-lang: 1.7.2

BTW, adding new words with DicLibrary.insert works without any problem.

@shi-yuan
Member

The dictionaries are configured but never used, so they have no effect. Try:

DicAnalysis.parse(sent1, DicLibrary.gets("dic_name", "dic_company", "dic_term"))
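
The keys passed to DicLibrary.gets are the property names from library.properties. As a minimal sketch, assuming every user dictionary key starts with `dic_` as in the configuration above, the keys could be collected from the properties text instead of being hard-coded. The class `DicKeys` and the method `userDicKeys` are hypothetical names used only for illustration and are not part of ansj; the ansj call itself is left as a comment so the sketch runs without the library on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

final class DicKeys {
    /** Returns, in sorted order, the property keys that start with "dic_". */
    static List<String> userDicKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> keys = new ArrayList<>();
        for (String name : props.stringPropertyNames()) {
            // Assumption: user dictionaries use the "dic_" prefix; the plain
            // "dic" key (default dictionary) and "stop_dic1" are skipped.
            if (name.startsWith("dic_")) {
                keys.add(name);
            }
        }
        Collections.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        String config = String.join("\n",
                "dic=library/default.dic",
                "dic_name=library/name.dic",
                "dic_company=library/company.dic",
                "dic_term=library/term.dic",
                "stop_dic1=library/stop.dic");
        List<String> keys = userDicKeys(config);
        System.out.println(keys); // prints [dic_company, dic_name, dic_term]
        // With ansj_seg on the classpath, the keys would be used as above:
        // DicAnalysis.parse(sent1, DicLibrary.gets(keys.toArray(new String[0])));
    }
}
```

Collecting the keys this way only keeps the call in sync with the configuration file; the dictionaries still have to be passed to DicAnalysis.parse explicitly to take effect.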

@jiyzhang
Author

Thank you very much. With the statement above, the custom dictionaries now take effect.
