to-shimo/chinese-word2vec

chinese-word2vec

Pretrained word2vec/glove/swivel binary files trained on a Chinese corpus

word2vec: https://code.google.com/p/word2vec/

glove: http://nlp.stanford.edu/projects/glove/

swivel: https://github.com/tensorflow/models/tree/master/swivel

        http://arxiv.org/abs/1602.02215

Training corpus: encyclopedia web pages + news corpora from multiple categories (after full-width/half-width conversion)
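The full-width/half-width conversion mentioned for the training corpus can be sketched as below. The README does not say which mapping was applied, so this is only an assumption: it folds the full-width ASCII variants (U+FF01 to U+FF5E) and the ideographic space (U+3000) to their half-width equivalents, and leaves everything else alone. The function name `to_halfwidth` is hypothetical.

```python
def to_halfwidth(text: str) -> str:
    """Fold full-width ASCII variants to half-width (assumed normalization)."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:
            # Ideographic (full-width) space -> ordinary space.
            code = 0x20
        elif 0xFF01 <= code <= 0xFF5E:
            # Full-width forms sit at a fixed offset from printable ASCII.
            code -= 0xFEE0
        out.append(chr(code))
    return "".join(out)
```

Applying the same normalization to query words before lookup is probably worthwhile, since a vocabulary built from converted text would not contain the full-width forms.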

binary file: (for how to read the word2vec binary format, see the word2vec source code)

    word2vec CBOW: utf8  2.18G

        http://pan.baidu.com/s/1qX334vE

    word2vec SKIP: utf8  2.17G

        http://pan.baidu.com/s/1bogTzfp

    word2vec CBOW: gb18030 (reduced corpus)  2.31G

        http://pan.baidu.com/s/1jHumqjW

    word2vec SKIP: gb18030 (reduced corpus)  1.47G

        http://pan.baidu.com/s/1ntVJBYD

    swivel: utf8  

        to be uploaded

    glove: utf8   9.19G

        http://pan.baidu.com/s/1i3XowWP
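For the word2vec files listed above, a minimal reader might look like the sketch below. It assumes the layout written by the original word2vec tool in binary mode: a text header line `vocab_size dim`, then for each word the encoded word bytes, a single space, `dim` little-endian 32-bit floats, and a trailing newline. The function name `read_word2vec_binary` and its `limit` parameter are hypothetical, and nothing here is confirmed for the glove or swivel files, which may use a different layout.

```python
import struct
from typing import BinaryIO, Dict, List, Optional


def read_word2vec_binary(
    f: BinaryIO, encoding: str = "utf-8", limit: Optional[int] = None
) -> Dict[str, List[float]]:
    """Read vectors from a word2vec-style binary stream (assumed layout)."""
    header = f.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    count = vocab_size if limit is None else min(limit, vocab_size)

    vectors: Dict[str, List[float]] = {}
    for _ in range(count):
        # The word is terminated by a space; the newline that closes the
        # previous record shows up before the next word and is skipped.
        word_bytes = bytearray()
        while True:
            b = f.read(1)
            if not b:
                raise EOFError("unexpected end of file while reading a word")
            if b == b" ":
                break
            if b != b"\n":
                word_bytes.extend(b)
        word = word_bytes.decode(encoding, errors="replace")

        raw = f.read(4 * dim)
        if len(raw) != 4 * dim:
            raise EOFError("unexpected end of file while reading a vector")
        vectors[word] = list(struct.unpack(f"<{dim}f", raw))
    return vectors
```

Open the downloaded file with `open(path, "rb")` and pass `encoding="gb18030"` for the gb18030 variants. Because the files are around 2G each, passing a small `limit` first is a cheap way to check that the encoding and dimensions look right before loading everything.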

Other:

    Word vectors trained on 125 million UK tweets:

        Reference: https://figshare.com/articles/UK_Twitter_word_embeddings/4052331

        Download: https://ndownloader.figshare.com/articles/4052331/versions/1
