This directory contains ready-to-use Python scripts for finding the similarity between any two given documents.
Many pre-trained word embeddings are available for representing text data in vector form. Here I have written code to find the distance/similarity between two documents using several embeddings and related methods:
- TF-IDF
- word2vec
- ELMo
- Universal Sentence Encoder
- Flair embeddings
- spaCy embeddings
- WMD (Word Mover's Distance)
- Sentence transformers

Python libraries used: TensorFlow, Gensim, scikit-learn, Flair, sentence_transformers, NumPy
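
To make the general idea concrete, here is a minimal, dependency-free sketch of the TF-IDF approach: turn each document into a weighted term vector, then compare the vectors with cosine similarity. This is an illustration only and not the code in this directory, which relies on the libraries listed above. The function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`, `document_similarity`), the regex tokenizer, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            # Smoothed IDF (an assumption for this sketch), so that terms
            # shared by every document still receive a non-zero weight.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def document_similarity(doc_a, doc_b):
    """TF-IDF cosine similarity between two documents, in the range [0, 1]."""
    vec_a, vec_b = tfidf_vectors([doc_a, doc_b])
    return cosine_similarity(vec_a, vec_b)


if __name__ == "__main__":
    print(document_similarity("The cat sat on the mat.", "A cat lay on a mat."))
```

The embedding-based methods in the list follow the same two-step pattern, swapping the TF-IDF vector for a learned document vector. WMD is the exception: it measures a distance between the two documents' sets of word vectors rather than comparing one vector per document.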