WAKU: Hoyer-Square regularisation for sparse word embeddings. Final Project for NLP, UCL MSc Machine Learning 2020

apappu97/WAKU


Code for our UCL NLP Final Project, using Hoyer-Square regularisation for learning sparse embeddings.

George Lamb, Kush Madlani, Udeepa Meepegama, Aneesh Pappu
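For readers unfamiliar with the regulariser named above, the following is a minimal sketch of the Hoyer-Square measure as it is usually defined: the squared L1 norm divided by the squared L2 norm. It is written in plain Python for clarity. The function names, the `eps` guard, and the `strength` parameter are hypothetical and are not taken from the "waku" code; how the project actually applies the penalty during training is not described here.

```python
def hoyer_square(weights, eps=1e-12):
    """Hoyer-Square measure: (sum |w_i|)^2 / (sum w_i^2).

    The value lies between 1 (a single non-zero entry) and len(weights)
    (all entries equal in magnitude), and it does not change when the
    vector is rescaled. Minimising it therefore favours sparse vectors.
    """
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    # eps is an assumed guard against division by zero for an all-zero vector.
    return (l1 * l1) / (l2_squared + eps)


def regularised_loss(task_loss, embedding, strength=0.01):
    """Add the sparsity penalty to a task loss (strength is a hypothetical weight)."""
    return task_loss + strength * hoyer_square(embedding)


if __name__ == "__main__":
    sparse = [0.0, 0.0, 3.0, 0.0]
    dense = [1.0, 1.0, 1.0, 1.0]
    # The sparse vector receives the smaller penalty.
    print(hoyer_square(sparse), hoyer_square(dense))
```

A real training pipeline would compute the same quantity with a tensor library so that gradients flow through the penalty; the arithmetic is unchanged.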

Directory Structure

Code

The main directory, titled "waku", holds the code for training the WAKU pipeline, downstream task evaluation, etc.

Data

All input data should be placed in the folder titled "raw_data". Trained embeddings should be written to "embeddings".

The "questions-words" dataset for the word analogy task can be found here
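As a rough guide to working with that dataset, the sketch below parses analogy questions, assuming the file follows the layout of the standard word2vec "questions-words" file: lines beginning with ":" name a category, and every other line holds four whitespace-separated words. The function name is hypothetical, and the sample lines are illustrative only; the exact file name and its location under "raw_data" are not specified here.

```python
def parse_analogy_questions(lines):
    """Return a list of (category, a, b, c, d) tuples from analogy-file lines."""
    questions = []
    category = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(":"):
            # A line such as ": family" starts a new category.
            category = line[1:].strip()
            continue
        words = line.split()
        if len(words) == 4:
            questions.append((category, *words))
    return questions


if __name__ == "__main__":
    sample = [
        ": capital-common-countries",
        "Athens Greece Baghdad Iraq",
        ": family",
        "boy girl brother sister",
    ]
    for question in parse_analogy_questions(sample):
        print(question)
```

Because the function accepts any iterable of lines, an open file handle can be passed directly in place of the sample list.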

Downstream Evaluation

The folder titled "notebooks" contains a notebook that loads trained embeddings, evaluates them on the sentiment analysis, word analogy, word similarity, and word intrusion tasks, and produces t-SNE visualisations.
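To illustrate what the word analogy task measures, here is a minimal sketch of the common vector-offset method: for a question "a is to b as c is to ?", pick the vocabulary word whose vector is closest in cosine similarity to b - a + c, excluding the three question words. Whether the notebook scores analogies in exactly this way is an assumption. All function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def solve_analogy(embeddings, a, b, c):
    """Return the word closest to b - a + c, excluding a, b and c themselves."""
    target = [vb - va + vc
              for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = (w for w in embeddings if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


def analogy_accuracy(embeddings, questions):
    """Fraction of in-vocabulary (a, b, c, d) questions answered correctly."""
    answerable = [q for q in questions if all(w in embeddings for w in q)]
    if not answerable:
        return 0.0
    correct = sum(solve_analogy(embeddings, a, b, c) == d
                  for a, b, c, d in answerable)
    return correct / len(answerable)


if __name__ == "__main__":
    # Toy 2-dimensional vectors, purely illustrative.
    toy = {
        "man": [1.0, 0.0],
        "woman": [1.0, 1.0],
        "king": [3.0, 0.0],
        "queen": [3.0, 1.0],
        "apple": [0.0, 3.0],
    }
    print(analogy_accuracy(toy, [("man", "woman", "king", "queen")]))
```

Questions containing out-of-vocabulary words are skipped rather than counted as errors; that is one convention among several, and reported accuracy depends on which one is used.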
