slavastar/word2vec

Word2Vec

A PyTorch implementation of the Word2Vec paper, "Efficient Estimation of Word Representations in Vector Space".

Most of the code was taken from this repository.

Description

The Word2Vec model was trained on the Wiki2 dataset. The model's hyperparameters are set in the config.yaml file.

Project structure

  • src/
    • model/
      • cbow.py - implements the CBOW model.
      • skip_gram.py - implements the Skip-Gram model.
      • utils.py - contains common functions used by the models.
    • dataloader.py - contains functions for text preprocessing and collecting a dataset.
    • train.py - contains a full pipeline for training and saving the model.
    • training.py - contains a wrapper for training a model.
    • utils.py - contains common functions used by other modules.
  • config.yaml - config file with the main parameters of the model.
  • demo.ipynb - demo notebook with several examples of model inference with visualisation.
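
The two model files differ mainly in how training examples are formed from a context window: CBOW predicts the centre word from its surrounding words, while Skip-Gram predicts each surrounding word from the centre word. The sketch below illustrates that difference in plain Python. The function names and the window size are illustrative assumptions and are not taken from dataloader.py.

```python
from typing import List, Tuple


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Build (context words, centre word) examples, as CBOW would consume them."""
    examples = []
    for i, centre in enumerate(tokens):
        # Up to `window` words on each side of position i, excluding the centre.
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip sequences too short to have any context
            examples.append((context, centre))
    return examples


def skip_gram_examples(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Build (centre word, context word) pairs, as Skip-Gram would consume them."""
    pairs = []
    for context, centre in cbow_examples(tokens, window):
        # Each context word becomes its own prediction target.
        pairs.extend((centre, word) for word in context)
    return pairs


if __name__ == "__main__":
    sentence = "the quick brown fox".split()
    print(cbow_examples(sentence, window=1))
    print(skip_gram_examples(sentence, window=1))
```

In a real pipeline the words would typically be mapped to integer vocabulary indices before being passed to the embedding layer. That step is omitted here.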

Usage

python train.py --config config.yaml
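
For orientation, here is one way a script could accept the `--config` flag shown above and read the YAML file. This is a hypothetical sketch, not the contents of train.py: the helper names `parse_args` and `load_config` are assumptions, and reading the file relies on the third-party PyYAML package.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the command line; `argv=None` falls back to sys.argv."""
    parser = argparse.ArgumentParser(description="Train a Word2Vec model.")
    parser.add_argument("--config", type=str, required=True,
                        help="Path to the YAML config file.")
    return parser.parse_args(argv)


def load_config(path: str) -> dict:
    """Read a YAML file into a dictionary (assumes PyYAML is installed)."""
    import yaml  # imported lazily so argument parsing works without PyYAML

    with open(path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f)
```

Calling `load_config(parse_args().config)` at the start of a training script would then produce a dictionary of whatever keys the config file defines.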
