boppreh/word2vec_bin_parser

word2vec_bin_parser

Tiny Python script for parsing Word2Vec .bin embeddings.

Interested in running analyses on Word2Vec embeddings? Maybe you searched for a pre-trained embedding and found the 3.5GB GoogleNews-vectors-negative300.bin from https://code.google.com/archive/p/word2vec/? Stuck with a binary file and don't want to download a large machine learning library just to look at the vectors?

Your problems are over! Introducing word2vec_bin_parser, a tiny, tiny Python file for reading those monstrosities. You can use it as a library (bin2stream(path)) or as a converter (word2vec_bin_parser file.bin). Because it's Python, it's very slow, but feel free to look up the tiny source code and adapt it into your program.
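If you'd rather adapt the idea than import the file, here is a minimal sketch of how such a parser can work. It is not the source of this repository: the function name iter_word2vec_bin is hypothetical, and it assumes the layout written by the original word2vec C tool (an ASCII header line "vocab_size dim", then for each word the word itself, a space, dim 32-bit floats, and an optional trailing newline). Little-endian floats are also an assumption, since the tool writes them in the byte order of the machine that produced the file.

```python
import io
import struct


def iter_word2vec_bin(stream):
    """Yield (word, vector) pairs from a binary file object.

    Hypothetical helper, not this repository's bin2stream. Assumes the
    word2vec .bin layout described above and little-endian float32 values.
    """
    # Header: two ASCII integers, vocabulary size and vector dimension.
    vocab_size, dim = (int(x) for x in stream.readline().split())
    fmt = "<%df" % dim                  # dim little-endian 32-bit floats
    vector_bytes = struct.calcsize(fmt)

    for _ in range(vocab_size):
        # Read the word one byte at a time until the separating space.
        chars = bytearray()
        while True:
            ch = stream.read(1)
            if not ch:
                raise EOFError("file ended inside a word")
            if ch == b" ":
                break
            if ch != b"\n":             # skip the newline left by the previous entry
                chars.extend(ch)

        raw = stream.read(vector_bytes)
        if len(raw) != vector_bytes:
            raise EOFError("file ended inside a vector")
        yield chars.decode("utf-8", errors="replace"), struct.unpack(fmt, raw)


# Tiny in-memory file with two 3-dimensional vectors, to try the parser
# without downloading anything.
sample = (
    b"2 3\n"
    + b"hello " + struct.pack("<3f", 1.0, 2.0, 3.0) + b"\n"
    + b"world " + struct.pack("<3f", 0.5, -1.0, 0.25) + b"\n"
)
entries = list(iter_word2vec_bin(io.BytesIO(sample)))
```

For a real file, open it in binary mode (open(path, "rb")) and iterate lazily instead of building a list, since the GoogleNews file will not fit comfortably in memory as Python tuples.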
