Why is word_sort.data empty? #10

Closed
wwtBlog opened this issue May 16, 2017 · 11 comments


wwtBlog commented May 16, 2017

After unzipping, I ran the command in the bin folder. Some of the generated files are empty and some are not.

Owner

sing1ee commented May 16, 2017

Are there any error messages?

Author

wwtBlog commented May 16, 2017

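No errors were reported during the run, and it finished very quickly. The corpus is a Journey to the West txt file downloaded from the internet, encoded as UTF-8.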
gen sorting...
2017-05-16 14:03:32 [main] INFO dict.build.FastBuilder -start to extract words
sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream@735b478
2017-05-16 14:03:32 [main] INFO dict.build.FastBuilder -build freq TST done!
2017-05-16 14:03:32 [main] INFO dict.build.FastBuilder -start to sort extracted words
2017-05-16 14:03:32 [main] INFO dict.build.FastBuilder -all done

Owner

sing1ee commented May 16, 2017

Which code are you using? The prebuilt package, or something else? Judging from the log, this is not the latest code.

Author

wwtBlog commented May 16, 2017

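I got the package directly onto my local Mac and ran it after unzipping. I have also downloaded the latest code, and wordsort.data and word.data are still empty.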
2017-05-16 14:25:58 [main] INFO dict.build.FastBuilder -start to extract words
sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream@735b478
2017-05-16 14:25:58 [main] INFO dict.build.FastBuilder -build freq TST done!
2017-05-16 14:25:58 [main] INFO dict.build.FastBuilder -start to sort extracted words
2017-05-16 14:25:58 [main] INFO dict.build.FastBuilder -all done
This is the log.

Owner

sing1ee commented May 16, 2017

Does the freqFile file have any content?

Owner

sing1ee commented May 16, 2017

Where did you download the file? I'll try it.

Author

wwtBlog commented May 16, 2017

I just used the download button at the top right. freq_ngram.data has content; the others do not.

Owner

sing1ee commented May 16, 2017

I mean your test data file. Where did you download it? I'll try it.

Author

wwtBlog commented May 16, 2017

Owner

sing1ee commented May 16, 2017

The file is not encoded as UTF-8. Download this one: Journey to the West UTF8

Top results

师父	1545	7.247927513443586	3.527650290469952	0.1371395690812608
大圣	1047	6.426264754702098	2.5456136550742494	0.13128460061010055
唐僧	971	6.832890014164742	3.930083100341155	0.43277723258096173
我们	685	5.044394119358453	2.7293239374235605	0.5335176226675881
菩萨	607	9.364134655008051	3.4676913766943906	0.15910495734948696
妖精	604	6.894817763307944	2.955584582292545	0.13134411600669268
见那	438	3.1699250014423126	2.688761729223144	0.11444652908067542
徒弟	407	7.774787059601174	2.3951170564332958	0.15553809897879026
小妖	348	5.584962500721157	2.939566615637408	0.3392018779342723
一声	307	4.08746284125034	2.269805525984046	0.2674922938432581
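
For anyone hitting the same symptom (the run finishes almost instantly and the word files come out empty), one way to check a corpus before running the tool is to try decoding it as strict UTF-8. The sketch below is not part of this project. The function names are hypothetical, and the GB18030 fallback is an assumption: it is a common encoding for Chinese text files, but the thread does not say which encoding the original file actually used.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, fallback_encoding: str = "gb18030") -> bytes:
    """Return the text as UTF-8 bytes.

    Valid UTF-8 input is returned unchanged. Anything else is decoded
    with fallback_encoding (an assumed source encoding) and re-encoded.
    Raises UnicodeDecodeError if the fallback does not fit either.
    """
    if is_valid_utf8(data):
        return data
    return data.decode(fallback_encoding).encode("utf-8")


def convert_file(src: str, dst: str, fallback_encoding: str = "gb18030") -> None:
    """Write a UTF-8 copy of src to dst (both paths are placeholders)."""
    Path(dst).write_bytes(to_utf8(Path(src).read_bytes(), fallback_encoding))
```

A call such as `convert_file("xiyouji.txt", "xiyouji_utf8.txt")` (placeholder file names) would then produce a copy to feed to the tool. The strict decode is only a heuristic: a very short non-UTF-8 file can pass by accident, though this is unlikely for a full novel.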

Author

wwtBlog commented May 16, 2017

It works now. It was a problem with the file. Thank you so much.

sing1ee closed this as completed May 17, 2017