
Code to find the distance/similarity between the 2 documents using several embeddings - 1. TF-IDF, 2. word2vec, 3. ELMO, 4. Universal Sentence Encoder, 5. Flair embeddings, 6. Spacy embeddings, 7. WMD (Word Movers Distance)


swapnilg915/cosine_similarity_using_embeddings


This directory contains ready-to-use Python scripts for finding the similarity between any two given documents.

Many pre-trained word embeddings are available for representing text data in vector form. Here I have written code to find the distance/similarity between two documents using several of them:

  1. TF-IDF
  2. word2vec
  3. ELMO
  4. Universal Sentence Encoder
  5. Flair embeddings
  6. Spacy embeddings
  7. WMD (Word Movers Distance)
  8. Sentence transformers

Python libraries used: TensorFlow, Gensim, scikit-learn, Flair, sentence_transformers, NumPy
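
To give a feel for the first method in the list, here is a minimal sketch of TF-IDF cosine similarity between two documents. It is not the code from this repository: it uses only the Python standard library so it runs without installing anything, and the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`, `document_similarity`), the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real pipelines usually clean more.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using a smoothed TF-IDF."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an illustrative choice): ln((1 + N) / (1 + df)) + 1
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


def document_similarity(doc1, doc2):
    # The two documents form the whole corpus for the IDF statistics.
    v1, v2 = tfidf_vectors([doc1, doc2])
    return cosine_similarity(v1, v2)


if __name__ == "__main__":
    print(document_similarity("the cat sat on the mat", "the dog sat on the log"))
```

The score lies between 0 and 1 for these non-negative vectors: identical documents score 1, and documents with no shared terms score 0. The embedding-based methods in the list follow the same pattern, with the TF-IDF vectors swapped for dense vectors produced by a pre-trained model.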
