
gensim2tensorboard:

Train word embeddings with gensim and visualize them with TensorBoard.


Requirements:

pip install regex gensim tensorflow

Installation:

git clone https://github.com/ArdalanM/gensim2tensorboard.git

Example:

  • Train from text file:
python3 -m src.train --file "data/SMSSpamCollection.txt" \
                     --input_type 'txt' \
                     --folder "models/SMSSpamCollection" \
                     --size 50 \
                     --alpha 0.025 \
                     --window 5 \
                     --min_count 5 \
                     --sample 1e-3 \
                     --seed 1 \
                     --workers 4 \
                     --min_alpha 0.0001 \
                     --sg 0 \
                     --hs 0 \
                     --negative 10 \
                     --cbow_mean 1 \
                     --iter 5 \
                     --null_word 0
  • Train from csv file:
python3 -m src.train --file "data/movie_reviews.csv" \
                     --input_type "csv" \
                     --separator "," \
                     --folder "models/movie_reviews" \
                     --columns_to_select "Phrase" \
                     --size 50 \
                     --alpha 0.025 \
                     --window 5 \
                     --min_count 5 \
                     --max_vocab_size 100000 \
                     --sample 1e-3 \
                     --seed 1 \
                     --workers 4 \
                     --min_alpha 0.0001 \
                     --sg 0 \
                     --hs 0 \
                     --negative 10 \
                     --cbow_mean 1 \
                     --iter 5 \
                     --null_word 0
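Most of the flags above share their names with parameters of gensim's Word2Vec class. The sketch below is not the repository's src/train.py; it is a hypothetical, stdlib-only illustration of how such flags could be parsed and turned into keyword arguments. The helper names build_parser and to_word2vec_kwargs are assumptions. The one fact it relies on is that gensim 4.x renamed size to vector_size and iter to epochs.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the hyperparameter flags used in the
    # commands above; the real src/train.py may define them differently.
    p = argparse.ArgumentParser(description="Word2Vec hyperparameter flags (sketch)")
    p.add_argument("--size", type=int, default=50)
    p.add_argument("--alpha", type=float, default=0.025)
    p.add_argument("--window", type=int, default=5)
    p.add_argument("--min_count", type=int, default=5)
    p.add_argument("--max_vocab_size", type=int, default=None)
    p.add_argument("--sample", type=float, default=1e-3)
    p.add_argument("--seed", type=int, default=1)
    p.add_argument("--workers", type=int, default=4)
    p.add_argument("--min_alpha", type=float, default=0.0001)
    p.add_argument("--sg", type=int, default=0)
    p.add_argument("--hs", type=int, default=0)
    p.add_argument("--negative", type=int, default=10)
    p.add_argument("--cbow_mean", type=int, default=1)
    p.add_argument("--iter", type=int, default=5)
    p.add_argument("--null_word", type=int, default=0)
    return p


def to_word2vec_kwargs(args, gensim_major=4):
    # Copy the parsed flags into a plain dict of keyword arguments.
    kwargs = dict(vars(args))
    if gensim_major >= 4:
        # gensim 4.x renamed these two Word2Vec parameters.
        kwargs["vector_size"] = kwargs.pop("size")
        kwargs["epochs"] = kwargs.pop("iter")
    return kwargs


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(to_word2vec_kwargs(args))
```

With gensim installed, the resulting dict could be passed on as Word2Vec(sentences, **kwargs), where sentences is an iterable of token lists built from the input file.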

Finally, visualize the embeddings with TensorBoard by running it from the project root folder:

tensorboard --logdir=models/ --reload_interval 1
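To inspect vectors without the TensorBoard log directory, they can also be exported as two tab-separated files, which the standalone Embedding Projector accepts through its "Load" dialog. This is a minimal sketch of that alternative and not how this repository writes its output. The function name export_for_projector and the file names are assumptions.

```python
import csv
import os


def export_for_projector(vectors, out_dir):
    """Write word vectors as TSV files for the Embedding Projector.

    vectors: dict mapping each word to a sequence of floats.
    Returns the paths of the vectors file and the metadata file.
    """
    os.makedirs(out_dir, exist_ok=True)
    vec_path = os.path.join(out_dir, "vectors.tsv")
    meta_path = os.path.join(out_dir, "metadata.tsv")
    with open(vec_path, "w", newline="", encoding="utf-8") as vf, \
            open(meta_path, "w", newline="", encoding="utf-8") as mf:
        writer = csv.writer(vf, delimiter="\t", lineterminator="\n")
        for word, vec in vectors.items():
            # One row of tab-separated components per word.
            writer.writerow([f"{x:.6f}" for x in vec])
            # Single-column metadata has no header row: one label per line,
            # in the same order as the vectors.
            mf.write(word + "\n")
    return vec_path, meta_path
```

Assuming a model trained with gensim 4.x, the input dict could be built as {w: model.wv[w] for w in model.wv.index_to_key}.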
