word2vec/glove/swivel binary file on chinese corpus
README.md Update README.md Oct 25, 2016


chinese-word2vec

Pretrained word2vec, GloVe, and Swivel word-vector files (binary) trained on a Chinese corpus.

word2vec: https://code.google.com/p/word2vec/

glove: http://nlp.stanford.edu/projects/glove/

swivel: https://github.com/tensorflow/models/tree/master/swivel

        http://arxiv.org/abs/1602.02215

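Training corpus: encyclopedia web pages plus several categorized news corpora (after full-width to half-width character conversion)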
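The full-width to half-width conversion mentioned for the training corpus can be sketched as below. This is only an illustration of the usual mapping (ideographic space U+3000 to an ASCII space, and U+FF01..U+FF5E shifted down by 0xFEE0). The function name `to_halfwidth` is hypothetical, and the exact rules applied to this corpus are not documented here, so query text may need a different normalization to match the vocabulary.

```python
def to_halfwidth(text):
    """Map full-width ASCII variants to their half-width forms.

    Assumption: this is the common mapping, not necessarily the one
    used to prepare the corpus behind these vectors.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:                # ideographic (full-width) space
            code = 0x20
        elif 0xFF01 <= code <= 0xFF5E:    # full-width '!' through '~'
            code -= 0xFEE0
        out.append(chr(code))
    return "".join(out)
```

Applying the same conversion to query words before lookup should reduce out-of-vocabulary misses caused by full-width digits, letters, or punctuation.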

binary files: (to read the word2vec binary format, refer to the word2vec source code)

    word2vec CBOW: utf8  2.18G

        http://pan.baidu.com/s/1qX334vE

    word2vec SKIP: utf8  2.17G

        http://pan.baidu.com/s/1bogTzfp

    word2vec CBOW: gb18030 (reduced corpus)  2.31G

        http://pan.baidu.com/s/1jHumqjW

    word2vec SKIP: gb18030 (reduced corpus)  1.47G

        http://pan.baidu.com/s/1ntVJBYD

    swivel: utf8  

        to be uploaded

    glove: utf8   9.19G

        http://pan.baidu.com/s/1i3XowWP
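As a rough guide to reading the word2vec binary files listed above, here is a minimal sketch of a reader. It assumes the layout written by the original word2vec C tool: a text header line `vocab_size dim`, then for each word the encoded word bytes, a single space, `dim` little-endian 32-bit floats, and usually a trailing newline. The function name `read_word2vec_binary` and its `limit` parameter are hypothetical. Whether the Swivel and GloVe files use the same layout is not stated here, so this applies only to the word2vec entries.

```python
import struct


def read_word2vec_binary(path, encoding="utf-8", limit=None):
    """Read a word2vec .bin file into a dict of word -> tuple of floats.

    Assumptions: little-endian float32 values, and `encoding` matches the
    file (use "gb18030" for the gb18030 downloads).
    """
    vectors = {}
    with open(path, "rb") as f:
        header = f.readline().split()
        vocab_size, dim = int(header[0]), int(header[1])
        count = vocab_size if limit is None else min(limit, vocab_size)
        fmt = "<%df" % dim
        nbytes = 4 * dim
        for _ in range(count):
            raw = bytearray()
            while True:
                ch = f.read(1)
                if ch == b"":
                    raise EOFError("file ended before all words were read")
                if ch == b" ":
                    break
                if ch != b"\n":  # skip the newline left after the previous vector
                    raw.extend(ch)
            word = raw.decode(encoding, errors="replace")
            vectors[word] = struct.unpack(fmt, f.read(nbytes))
    return dim, vectors
```

These files are around 2 GB each, so loading everything into Python tuples will use considerably more memory than the file size; passing a small `limit` (for example `read_word2vec_binary(path, encoding="gb18030", limit=1000)`) is a cheap way to check that the encoding and layout assumptions hold before committing to a full load.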

Other:

    Word vectors trained on 125 million UK tweets:

        Reference: https://figshare.com/articles/UK_Twitter_word_embeddings/4052331

        Download: https://ndownloader.figshare.com/articles/4052331/versions/1