Word2vec papers & tutorials
<Title - Link>
- Distributed Representations of Words and Phrases and their Compositionality - http://web2.cs.columbia.edu/~blei/seminar/2016_discrete_data/readings/MikolovSutskeverChenCorradoDean2013.pdf
- Efficient Estimation of Word Representations in Vector Space - https://arxiv.org/abs/1301.3781
- Distributed Representations of Sentences and Documents - https://cs.stanford.edu/~quocle/paragraph_vector.pdf
- word2vec Parameter Learning Explained - https://arxiv.org/abs/1411.2738
- Hierarchical output layer (Hugo Larochelle NLP lecture) - https://www.youtube.com/watch?v=B95LTf2rVWM
- Word2Vec Tutorial (The Skip-Gram Model) - http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/
- Hierarchical Probabilistic Neural Network Language Model - http://www.iro.umontreal.ca/~lisa/pointeurs/hierarchical-nnlm-aistats05.pdf
- A Scalable Hierarchical Distributed Language Model - https://pdfs.semanticscholar.org/1005/645c05585c2042e3410daeed638b55e2474d.pdf
- Bag-of-words and n-gram models - https://en.wikipedia.org/wiki/Bag-of-words_model
- Understanding Word2Vec and Paragraph2Vec - http://piyushbhardwaj.github.io/documents/w2v_p2vupdates.pdf
- How does doc2vec represent the feature vector of a document? Can anyone explain mathematically how the process is done? (Quora) - https://www.quora.com/How-does-doc2vec-represent-feature-vector-of-a-document-Can-anyone-explain-mathematically-how-the-process-is-done
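The skip-gram tutorial above walks through how word2vec turns raw text into (center word, context word) training pairs with a sliding window. A minimal sketch of that step, as a companion to the reading list (window size and example sentence are illustrative, not taken from the sources):

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) training pairs for a token sequence,
    pairing each word with its neighbors within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        # Context = tokens within `window` positions of i, excluding i itself.
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = "the quick brown fox jumps".split()
print(skipgram_pairs(sentence, window=1))
```

These pairs are what the skip-gram objective is trained on: predict the context word given the center word, with tricks such as the hierarchical softmax and negative sampling covered in the papers above making that prediction tractable over large vocabularies.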