sindre0830/Word-Vectors


Word-Vectors

Overview

This repository implements several architectures for training word embeddings: Continuous Bag-of-Words (CBOW), skip-gram, and Global Vectors for Word Representation (GloVe). Wikipedia articles are used as training data, while the Google Analogy dataset and the WordSim353 dataset are used to validate the word embeddings.
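
To make the difference between the architectures and the two validation tasks concrete, here is a minimal, standard-library-only sketch. It is not taken from this repository's source: the function names (`training_pairs`, `skipgram_examples`, `cbow_examples`, `cosine`, `solve_analogy`), the window size, and the toy vectors are all assumptions chosen for illustration. It shows how a context window is commonly turned into skip-gram and CBOW training examples, and how an analogy question of the kind found in the Google Analogy dataset is typically answered with cosine similarity (the same similarity measure that is usually compared against human ratings such as those in WordSim353).

```python
import math


def training_pairs(tokens, window=2):
    """Yield (center, context_words) for every position in a token list."""
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:
            yield center, context


def skipgram_examples(tokens, window=2):
    # Skip-gram: one (input=center, target=context word) example per context word.
    return [(center, ctx)
            for center, context in training_pairs(tokens, window)
            for ctx in context]


def cbow_examples(tokens, window=2):
    # CBOW: one (input=whole context window, target=center) example per position.
    return [(tuple(context), center)
            for center, context in training_pairs(tokens, window)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the word nearest to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))
```

For the sentence `["the", "cat", "sat"]` with `window=1`, skip-gram produces one example per (center, context word) pair, while CBOW produces one example per position with the whole context as input. GloVe is not covered by this sketch, since it is trained on co-occurrence counts rather than on individual window examples.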

Setup

  1. Install the required Python version, 3.11.
  2. Install the required packages: `pip install -r source/requirements.txt`. We recommend using a virtual environment instead; in that case, follow the guide under Virtual Environment Setup below and skip this step.
  3. Run the program: `python source/main.py`
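
Before running the program, it can help to confirm that the active interpreter matches the version named in step 1. The following is a minimal sketch, not part of this repository: the helper name `is_supported` and the strict major.minor match are assumptions made for illustration.

```python
import sys

REQUIRED = (3, 11)  # version named in the setup steps (exact match is an assumption)


def is_supported(version_info=None, required=REQUIRED):
    """Return True when the interpreter's major.minor equals the required version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == required
```

Calling `is_supported()` with no arguments checks the interpreter that is currently running; passing a tuple such as `(3, 10, 0)` checks an arbitrary version.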

Virtual Environment Setup

Windows

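  1. Get the package: `pip install virtualenv` (optional, since the following steps use Python's built-in venv module)
  2. Create a new, empty Python environment: `py -3.11 -m venv ./.venv`
  3. Activate the environment: `source .venv/Scripts/activate` in Git Bash, or `.venv\Scripts\activate` in PowerShell or cmd
  4. Install the packages required by this project: `pip install -r source/requirements.txt`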

Linux

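  1. Get the package: `pip install virtualenv` (optional, since the following steps use Python's built-in venv module)
  2. Create a new, empty Python environment: `python3.11 -m venv ./.venv` (or `python -m venv ./.venv` if `python` already points to 3.11)
  3. Activate the environment: `source .venv/bin/activate`
  4. Install the packages required by this project: `pip install -r source/requirements.txt`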
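
The Windows and Linux steps above differ mainly in where the activation script lives. As a rough sketch (not part of this repository), the same environment can be created from Python with the standard-library `venv` module. The helper names `create_env` and `activation_script` are hypothetical; `venv.create` is the real standard-library call, and `with_pip=True` is passed because, unlike the `python -m venv` command, the function does not install pip by default.

```python
import os
import venv
from pathlib import PurePosixPath, PureWindowsPath


def create_env(env_dir=".venv"):
    # Roughly what `python -m venv ./.venv` does; with_pip=True bootstraps pip.
    venv.create(env_dir, with_pip=True)


def activation_script(env_dir=".venv", os_name=os.name):
    """Return the activation script path for an OS name ('nt' or 'posix')."""
    if os_name == "nt":
        # Windows environments place their scripts under Scripts/.
        return str(PureWindowsPath(env_dir) / "Scripts" / "activate")
    # Linux and other POSIX systems place them under bin/.
    return str(PurePosixPath(env_dir) / "bin" / "activate")
```

Activation still has to happen in the shell itself, because a Python process cannot change the environment of the shell that launched it; `activation_script()` only reports which file to source or run.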
