ianleaman/PageRankKeywordClassifier
This project explores the use of Google PageRank to generate keywords from a set of similar documents. It is a work in progress and was built to learn about PageRank, not to create a better keyword classifier.

Currently the algorithm performs poorly compared to conventional keyword generation methods such as TF-IDF, generally ranking stop words such as "the" or "it" highest.

Some improvements:
- Expand to n-grams or part-of-speech tags
- Add the ability to give command-line input

To run on sample data:
- Install NumPy
- In a terminal, type "python run.py"
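For readers who want a feel for the general idea, here is a minimal sketch of PageRank-based keyword ranking. It is not the code in `run.py`. The graph construction (linking words that appear within a small window of each other), the function names `build_cooccurrence`, `pagerank`, and `top_keywords`, and the parameters `window` and `damping` are all assumptions made for illustration.

```python
import re

import numpy as np


def build_cooccurrence(docs, window=2):
    """Build an undirected word graph (hypothetical construction).

    Two words are linked when they appear within `window` tokens of
    each other in the same document.
    """
    tokens_per_doc = [re.findall(r"[a-z']+", doc.lower()) for doc in docs]
    vocab = sorted({tok for toks in tokens_per_doc for tok in toks})
    index = {word: i for i, word in enumerate(vocab)}
    adj = np.zeros((len(vocab), len(vocab)))
    for toks in tokens_per_doc:
        for i, word in enumerate(toks):
            for j in range(i + 1, min(i + 1 + window, len(toks))):
                if toks[j] != word:
                    a, b = index[word], index[toks[j]]
                    adj[a, b] += 1.0
                    adj[b, a] += 1.0
    return vocab, adj


def pagerank(adj, damping=0.85, tol=1e-8, max_iter=200):
    """Power iteration on a column-stochastic transition matrix."""
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    safe_sums = np.where(col_sums > 0, col_sums, 1.0)
    # Columns with no links (dangling nodes) spread their rank uniformly.
    transition = np.where(col_sums > 0, adj / safe_sums, 1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = (1.0 - damping) / n + damping * (transition @ rank)
        converged = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if converged:
            break
    return rank


def top_keywords(docs, k=5, window=2):
    """Return the k highest-ranked words with their scores."""
    vocab, adj = build_cooccurrence(docs, window=window)
    rank = pagerank(adj)
    order = np.argsort(rank)[::-1][:k]
    return [(vocab[i], float(rank[i])) for i in order]


if __name__ == "__main__":
    sample_docs = [
        "the cat sat on the mat",
        "the dog sat on the log",
        "the cat and the dog",
    ]
    for word, score in top_keywords(sample_docs, k=3):
        print(f"{word}\t{score:.4f}")
```

On small inputs like the sample above, the most connected word is a stop word ("the"), which matches the weakness described earlier: in a co-occurrence graph, frequent function words link to almost everything and so collect the most rank. Filtering stop words before building the graph would be one possible mitigation, though that is a suggestion and not something the project is known to do.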
About
Exploring the use of PageRank to extract key terms from a collection of documents.