Skip to content

domluna/Glove.jl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

55 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Glove

Build Status Coverage Status

Implements Global Word Vectors.

using Pkg
Pkg.add("https://github.com/domluna/Glove.jl.git")

See benchmark/perf.jl for a usage example.

Here's the rough idea:

  1. Take text and make a LookupTable. This is a dictionary that has a map from words -> ids and vice-versa. Preprocessing steps should be taken prior to this.

  2. Use weightedsums to get the weighted co-occurence sum totals. This returns a CooccurenceDict.

  3. Convert the CooccurenceDict to a CooccurenceVector. The reasoning for this is faster indexing when we train the model.

  4. Initialize a Model and train the model with the CooccurenceVector using the agagrad! method.

It's pretty fast at this point. On a single core it's roughly 3x slower than the optimized C version.

TODO

  • [ ] More docs.

  • [ ] See if precompile(args...) does anything

  • [ ] Notebook example ( has to have emojis )

  • [ ] Multi-threading

About

Implements Global Word Vectors.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published