An interactive map of English words, where words with similar meaning appear closer together.

anthonygarvan/wordgalaxy

WordGalaxy: An Interactive Map of English

This project visualizes a large number of words (e.g., 20,000) in two dimensions, arranged so that words with similar meanings appear close to one another and dissimilar words appear farther apart.

It yields an interactive, statically served site where users can explore the space by zooming & panning.

This project cherry-picks the latest and greatest technologies in natural language processing and web development. To generate the vectors, I rely on Google's Word2Vec tool. Specifically, I used preprocessed vectors trained on 200-300 million words of Wikipedia, produced with code that dhammock wrote a few years ago (see here). The vectors have 250 dimensions, so to visualize them they need to be collapsed into 2 or 3 dimensions. For this, I use this implementation of Barnes-Hut t-SNE.
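The dimensionality-reduction step can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: it assumes scikit-learn's `TSNE` (with `method="barnes_hut"`) in place of the Barnes-Hut t-SNE implementation referenced above, and it uses random stand-in vectors and made-up word labels rather than the real Word2Vec output. The helper name `project_to_2d` and the perplexity value are hypothetical.

```python
import numpy as np
from sklearn.manifold import TSNE


def project_to_2d(vectors, perplexity=30.0, seed=0):
    """Collapse (n_words, n_dims) vectors to (n_words, 2) coordinates."""
    vectors = np.asarray(vectors, dtype=np.float64)
    tsne = TSNE(
        n_components=2,        # target dimensionality for the map
        method="barnes_hut",   # approximate gradient, practical for many points
        perplexity=perplexity, # must be smaller than the number of words
        init="random",
        random_state=seed,     # fixed seed so the layout is reproducible
    )
    return tsne.fit_transform(vectors)


# Stand-in data (assumption): 60 random 250-dimensional vectors.
# In the real project these would be the Word2Vec vectors.
rng = np.random.default_rng(0)
words = [f"word{i}" for i in range(60)]
vectors = rng.normal(size=(len(words), 250))

coords = project_to_2d(vectors, perplexity=10.0)

# Map each word to its (x, y) position, ready to be written out for plotting.
layout = {w: (float(x), float(y)) for w, (x, y) in zip(words, coords)}
```

With real word vectors, nearby entries in `layout` would tend to be words used in similar contexts. With the random stand-in data here, the positions carry no meaning and only show the shape of the computation.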

For the presentation, I rely on PIXI.js, a 2D rendering engine which uses WebGL when available and falls back gracefully to canvas rendering. To get a head start, I built on this implementation, which is a library for plotting graphs that already had zoom & pan built in.
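For readers curious about what zoom & pan involve, the underlying coordinate math can be sketched independently of any rendering library. The example below is an illustrative assumption, not code from this project, PIXI.js, or the graph-plotting library: the `Viewport` class and its method names are hypothetical. It maps word positions ("world" coordinates) to screen pixels and zooms around a fixed screen point, such as the mouse cursor.

```python
from dataclasses import dataclass


@dataclass
class Viewport:
    """Hypothetical 2D view: screen = world * scale + offset."""
    scale: float = 1.0
    offset_x: float = 0.0
    offset_y: float = 0.0

    def to_screen(self, wx, wy):
        # Convert a word's map position to pixel coordinates.
        return (wx * self.scale + self.offset_x,
                wy * self.scale + self.offset_y)

    def to_world(self, sx, sy):
        # Inverse of to_screen: which map position is under this pixel?
        return ((sx - self.offset_x) / self.scale,
                (sy - self.offset_y) / self.scale)

    def pan(self, dx, dy):
        # Dragging shifts the whole map by a pixel delta.
        self.offset_x += dx
        self.offset_y += dy

    def zoom_at(self, sx, sy, factor):
        # Keep the map point under (sx, sy) fixed while changing the scale.
        wx, wy = self.to_world(sx, sy)
        self.scale *= factor
        self.offset_x = sx - wx * self.scale
        self.offset_y = sy - wy * self.scale


view = Viewport()
view.pan(100, 50)             # drag the map right and down
view.zoom_at(400, 300, 2.0)   # zoom in 2x around the pixel (400, 300)
anchor = view.to_world(400, 300)
```

Recomputing the offset in `zoom_at` is what keeps the word under the cursor in place while everything around it scales.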

If you have questions, feature requests, or feedback of any kind, please don't hesitate to message me.

References

Barnes-Hut SNE

Efficient Estimation of Word Representations in Vector Space
