GunjanDhanuka/word2vec_vis

word2vec_vis

Check out the app live on Streamlit Cloud: Streamlit link

The web app is based on the paper "Efficient Estimation of Word Representations in Vector Space" (Mikolov et al., 2013). Read it here.

Screencast

screencast.webm

Features:

  1. Upload your own text corpus or a CSV dataset.
  2. Train a Word2Vec model on the fly using custom parameters.
  3. Choose either PCA or t-SNE as the dimensionality reduction technique.
  4. Visualize the words in either 2-D or 3-D space.
  5. Get similar words for each input word, with similarity scores.
  6. Tune the number of similar words shown for each input.
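
To illustrate features 5 and 6, here is a minimal sketch of how similar words with similarity scores can be ranked from a set of word vectors. It is not taken from the app's source: the function names (`cosine_similarity`, `most_similar`), the `top_n` parameter, and the toy vectors are all hypothetical, and cosine similarity is assumed as the scoring measure because it is the usual choice for Word2Vec embeddings.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings, top_n=5):
    """Return up to top_n (word, score) pairs ranked by similarity to `word`.

    `embeddings` is a hypothetical dict mapping each word to its vector.
    """
    if word not in embeddings:
        raise KeyError(f"{word!r} is not in the vocabulary")
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word  # do not report the query word itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy 3-dimensional vectors, invented purely for illustration.
toy_embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.1, 0.2, 0.9],
    "banana": [0.12, 0.18, 0.95],
}

for neighbour, score in most_similar("king", toy_embeddings, top_n=2):
    print(f"{neighbour}: {score:.3f}")
```

Changing `top_n` plays the role of the "number of words" setting: a larger value returns more neighbours, each with its own score.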

Steps to install locally:

  1. Set up a virtual environment using Conda or any other method you prefer.
  2. Install the dependencies from requirements.txt.
  3. Run the following commands in the terminal to install spaCy and its English model:
    pip install -U pip setuptools wheel
    pip install -U spacy
    python -m spacy download en_core_web_sm
    
  4. Launch the web app by running streamlit run app.py in the terminal.
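
Before launching the app, it can help to confirm that the installation steps succeeded. The following is a small sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the module list is an assumption drawn only from the packages named in the steps above (requirements.txt may list others).

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Assumed from the install steps above: Streamlit, spaCy, and the
# en_core_web_sm model, which installs as an importable package.
required = ["streamlit", "spacy", "en_core_web_sm"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

If anything is reported as missing, rerun the corresponding install command inside the same virtual environment.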

About

Semantic Word Embeddings Visualizer that has the option to train on your own data. Get similar words from a large text corpus and get cool 2D and 3D plots!