This repository was created for a term paper. It contains four directories:
-
Theory
Contains theoretical papers related to the topic.
-
Models
Contains the estimation of various word vector representation models on a test subcorpus.
-
Similarity count
Contains an evaluation of two types of context similarity: the maximum pairwise similarity between the right context and the left context, and the similarity between the right and left mean context vectors.
-
Main
Contains code for preprocessing the main training corpus for the final CBOW model, and a discussion of various classification algorithms.
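
As an illustration of the two measures described under Similarity count, here is a minimal sketch. It assumes cosine similarity as the similarity measure and NumPy arrays with one word vector per row; neither is confirmed by this README, and the function names are hypothetical rather than taken from the repository code.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors (assumed measure)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def max_pairwise_similarity(left, right):
    """Largest cosine similarity over all (left word, right word) pairs.

    left, right: 2-D arrays, one context word vector per row.
    """
    # Normalise rows so that a dot product equals the cosine similarity.
    left_unit = left / np.linalg.norm(left, axis=1, keepdims=True)
    right_unit = right / np.linalg.norm(right, axis=1, keepdims=True)
    return float(np.max(left_unit @ right_unit.T))


def mean_vector_similarity(left, right):
    """Cosine similarity between the mean left and mean right context vectors."""
    return cosine(left.mean(axis=0), right.mean(axis=0))


# Toy 2-dimensional vectors, purely for illustration (not real embeddings).
left_context = np.array([[1.0, 0.0], [0.0, 1.0]])
right_context = np.array([[1.0, 1.0], [0.0, 2.0]])

max_sim = max_pairwise_similarity(left_context, right_context)
mean_sim = mean_vector_similarity(left_context, right_context)
print(max_sim, mean_sim)
```

The first measure is sensitive to a single strongly matching word pair, while the second compares the contexts as a whole, so the two can diverge noticeably on the same input.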