This is a small project for finding similar terms in a corpus of documents.

For this project I have used a set of tags based on news articles. These tags were extracted using various news aggregation methods. You can easily create a custom dataset using the

How to Use

  • Clone the Repository:

    git clone

  • Install the dependencies by simply executing:

    pip3 install -r requirements.txt

  • Run the Term Similarity:

    python3 <word_to_search_for_similar_words>


# Suppose you have to find the similar terms for the word 'machine learning'
# Then run the following command
$ python3 'machine learning'

# Output would be

   distance                     name
0  0.000000         Machine Learning
1  0.000000         machine learning
2  1.213289                 software
3  1.213289                 Software
4  1.216590  Artificial Intelligence
5  1.216590  artificial intelligence
6  1.219796     predictive analytics
7  1.224047         data & analytics
8  1.224047           data analytics
9  1.241769       big data analytics

# As the output above shows, 'machine learning' is closely related to
# terms such as 'artificial intelligence' and 'big data analytics'
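
How the distances in the table are computed is not described above, so here is only a rough sketch of one plausible approach, not the project's actual implementation. It assumes each document is a list of tags, represents every term by the documents it appears in, and ranks terms by Euclidean distance between L2-normalised vectors. The names `term_vectors`, `similar_terms` and `toy_documents` are made up for illustration, and unlike the output above the sketch folds differently cased spellings into a single entry.

```python
import math
from collections import defaultdict


def term_vectors(documents):
    """Map each lowercased term to an L2-normalised vector of per-document counts."""
    raw = defaultdict(lambda: defaultdict(int))
    for doc_id, tags in enumerate(documents):
        for tag in tags:
            raw[tag.lower()][doc_id] += 1

    vectors = {}
    for term, counts in raw.items():
        norm = math.sqrt(sum(c * c for c in counts.values()))
        vectors[term] = {doc_id: c / norm for doc_id, c in counts.items()}
    return vectors


def euclidean(a, b):
    """Euclidean distance between two sparse vectors stored as dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def similar_terms(query, documents, top_k=10):
    """Return up to top_k (distance, term) pairs, closest first."""
    vectors = term_vectors(documents)
    query_vec = vectors.get(query.lower())
    if query_vec is None:
        # The query term never occurs in the corpus, so nothing can be ranked.
        return []
    ranked = sorted((euclidean(query_vec, vec), term) for term, vec in vectors.items())
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical toy corpus: one list of tags per news article.
    toy_documents = [
        ["Machine Learning", "Artificial Intelligence", "software"],
        ["machine learning", "artificial intelligence"],
        ["software", "cloud computing"],
        ["football", "transfer news"],
    ]
    for distance, term in similar_terms("machine learning", toy_documents, top_k=4):
        print(f"{distance:.6f}  {term}")
```

In this toy corpus, terms that occur in exactly the same articles as the query end up at distance 0, and terms that share only some articles land further away. This mirrors the shape of the table above without claiming to reproduce its numbers.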

Built with ♥ by Omkar Pathak


If you have found my software to be of any use to you, do consider helping me pay my internet bills. This would encourage me to create more software like this :)

Donate via PayPal!
Donate via Instamojo (₹ INR)