How do I get the vector for each doc? #4

Open
tangbogreat opened this issue Aug 12, 2016 · 1 comment

Comments

@tangbogreat
Contributor

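Hello, I have a few questions:
After running make, I ran the train program, and it simply finished without giving any feedback. Where are the vectors for each doc stored? And if I want to find the words most similar to the word "苹果" (apple), how should I write the test code? Do I have to add it to train.cpp myself? That does seem a bit...

How are Chinese documents handled? They are displayed as garbled text for me.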

@hiyijian
Owner

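The Doc2Vec class has a method that returns the top-k most similar words for a given word:
bool word_knn_words(const char * search, knn_item_t * knns, int k);
To get the vector for a specific document, you can call this Doc2Vec method:
void infer_doc(TaggedDocument * doc, real * vector, int skip = -1);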
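
For anyone wondering what a top-k similar-word lookup like word_knn_words computes, here is a minimal, self-contained sketch of the general technique: rank every other word by cosine similarity to the query word and keep the best k. The thread does not show how the library implements it, so this is not the library's code. The names Embedding, Neighbor, cosine and top_k_similar are hypothetical and made up for illustration, and the vectors would be toy data you supply yourself.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy types for illustration; NOT the library's own structures.
struct Embedding {
    std::string word;
    std::vector<float> vec;
};

struct Neighbor {
    std::string word;
    float similarity;
};

// Cosine similarity between two vectors of equal length.
// Returns 0 if either vector has zero norm.
float cosine(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float dot = 0.0f, na = 0.0f, nb = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    if (na == 0.0f || nb == 0.0f) return 0.0f;
    return dot / (std::sqrt(na) * std::sqrt(nb));
}

// Return up to k words most similar to `query`, best match first.
// An empty result means the query word is not in the table.
std::vector<Neighbor> top_k_similar(const std::vector<Embedding>& table,
                                    const std::string& query,
                                    std::size_t k) {
    // Linear search for the query word (assumption: a small toy table).
    const Embedding* q = nullptr;
    for (const auto& e : table) {
        if (e.word == query) { q = &e; break; }
    }
    std::vector<Neighbor> out;
    if (q == nullptr) return out;

    // Score every other word against the query vector.
    for (const auto& e : table) {
        if (&e == q) continue;
        out.push_back({e.word, cosine(q->vec, e.vec)});
    }

    // Keep only the k best, sorted by descending similarity.
    k = std::min(k, out.size());
    std::partial_sort(out.begin(), out.begin() + k, out.end(),
                      [](const Neighbor& x, const Neighbor& y) {
                          return x.similarity > y.similarity;
                      });
    out.erase(out.begin() + k, out.end());
    return out;
}
```

With the real library you would presumably pass a caller-allocated array of k knn_item_t entries to word_knn_words and check the bool return value, but that is an assumption based only on the signature quoted above.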
