DataBase.py
ID-NumberPairs.txt
README.md
clustering_and_other_things.py
main.py
newEditedSwords.txt
normalizing.py


Web Document Clustering System

Introduction

Document clustering is an unsupervised machine learning method that is widely used in natural language processing, for example in information retrieval and automatic multi-document summarization.
This project implements the k-means algorithm for clustering documents.
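
To illustrate the idea, here is a minimal sketch of k-means document clustering with scikit-learn. It is not taken from this project's source: the function name cluster_documents, the sample texts, and the use of TF-IDF features are assumptions made for the example. The project's own pipeline (which normalizes text with hazm and reads documents from MongoDB) may differ.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_documents(texts, k):
    """Return one cluster label per text (hypothetical helper, not project code)."""
    # Turn each document into a TF-IDF weighted bag-of-words vector.
    vectors = TfidfVectorizer().fit_transform(texts)
    # Fixed random_state so repeated runs give the same partition.
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    return model.fit_predict(vectors).tolist()


# Made-up sample documents: two about sport, two about finance.
sample_texts = [
    "football match goal team",
    "team wins football league",
    "stock market shares fall",
    "market investors buy shares",
]
labels = cluster_documents(sample_texts, k=2)
print(labels)
```

The numeric label assigned to each cluster is arbitrary. What matters is which documents share a label.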

Download

This program is written in Python, so Python must be installed on your machine. If you have not installed it yet, you can download it from this link: https://www.python.org/downloads/

The program uses several libraries that are not part of the default Python installation, so they must be installed separately. The easiest way to do this is with "pip". If pip is not available on your system, see this link: https://www.makeuseof.com/tag/install-pip-for-python/

Open a command line and use "pip install" to install the following libraries before running the program:

  • pymongo
  • hazm
  • xlsxwriter
  • scikit-learn
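
As a quick check before running the program, the following sketch reports which of the libraries listed above are not yet importable. The helper missing_packages and the REQUIRED mapping are hypothetical names introduced for this example. Note that scikit-learn is imported under the module name sklearn.

```python
from importlib.util import find_spec

# pip package name -> module name used in "import ..."
REQUIRED = {
    "pymongo": "pymongo",
    "hazm": "hazm",
    "xlsxwriter": "xlsxwriter",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [name for name, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Install with: pip install " + " ".join(missing))
else:
    print("All required libraries are installed.")
```
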
Note that this program works with MongoDB, so MongoDB must also be installed on your machine. The installer is available from this link: https://www.mongodb.com/download-center/community

Documentation

After installing MongoDB, go to the MongoDB installation directory and run the "mongod" executable. The MongoDB server then waits for connections from the program. Its default port is 27017.
Open a command line in the folder that contains "main.py" (the folder where you extracted the project zip file), type "python main.py" and press Enter. Finally, open a browser and go to 127.0.0.1:5000.
The program is now running and ready to use.
Note: the program is linked to a default database named "News". To use a different one, change "News" to the name of your database in "DataBase.py" (docs = client.<your database name>.Documents).
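
If the program cannot connect, it can help to confirm that "mongod" is actually listening. The sketch below only tests whether a TCP connection to the port succeeds, using the Python standard library. The function name mongod_is_listening is hypothetical, and the check assumes the server runs on the local machine with the default port 27017.

```python
import socket


def mongod_is_listening(host="127.0.0.1", port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if the port is closed or unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if mongod_is_listening():
    print("A server is listening on port 27017.")
else:
    print("Nothing is listening on port 27017. Is mongod running?")
```

A successful connection shows only that some process is using the port, not that it is MongoDB.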
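
As an illustration of the database setting, here is a small sketch of how the collection lookup could be made configurable. The helper get_documents_collection is hypothetical and is not part of "DataBase.py". It relies on the fact that pymongo allows dictionary-style access (client["News"]) as an equivalent of attribute-style access (client.News).

```python
def get_documents_collection(client, db_name="News"):
    """Return the "Documents" collection of the chosen database.

    client is expected to be a pymongo.MongoClient, but any object that
    supports client[db_name]["Documents"] works.
    """
    return client[db_name]["Documents"]


# Example use with pymongo (assumes mongod runs locally on the default port):
#
#   from pymongo import MongoClient
#   client = MongoClient("localhost", 27017)
#   docs = get_documents_collection(client)              # default "News" database
#   docs = get_documents_collection(client, "MyCorpus")  # hypothetical other database
```

With a helper like this, switching databases means changing one argument instead of editing the attribute name in the source.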