This text summarization experiment is a small weekend project that grew out of an improvement I was attempting for my KMS project: automatically generating tags for each document, which would hopefully improve search results.
It works by finding the most frequently used keywords in a text, removing the trash, and scoring each sentence in the text using these keywords and the distance between them.
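To make the idea concrete, here is a minimal Python sketch of that pipeline. It is not the code from this project: the stop-word list, the function names (`top_keywords`, `score_sentence`, `summarize`), and the scoring formula are all assumptions chosen for illustration. In particular, "distance" is interpreted here as the span between the first and last keyword in a sentence, so keywords that sit close together score higher (roughly in the spirit of Luhn-style scoring).

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption: the real "trash" filter is not shown).
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "on",
    "for", "with", "that", "this", "as", "are", "was", "be", "by", "at",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_keywords(text, limit=5):
    """Return the most frequent words that are not stop words or very short."""
    counts = Counter(
        w for w in tokenize(text) if w not in STOP_WORDS and len(w) > 2
    )
    return [word for word, _ in counts.most_common(limit)]


def score_sentence(sentence, keywords):
    """Score a sentence by how many keywords it has and how tightly they cluster.

    Hypothetical formula: (number of keyword hits)^2 / span, where span is the
    number of words from the first keyword hit to the last one, inclusive.
    """
    keyword_set = set(keywords)
    positions = [i for i, w in enumerate(tokenize(sentence)) if w in keyword_set]
    if not positions:
        return 0.0
    span = positions[-1] - positions[0] + 1
    return len(positions) ** 2 / span


def summarize(text, sentence_count=2, keyword_limit=5):
    """Pick the highest-scoring sentences and return them in original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    keywords = top_keywords(text, keyword_limit)
    scores = [score_sentence(s, keywords) for s in sentences]
    # Sort by score (highest first); ties keep the earlier sentence first.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen = sorted(ranked[:sentence_count])
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    sample = "Cats sleep all day. Cats chase mice at night. The weather was nice."
    print(summarize(sample, sentence_count=2, keyword_limit=1))
```

The regex-based sentence splitter is deliberately naive and will trip over abbreviations such as "e.g."; a real tokenizer would do better, but it keeps the sketch dependency-free.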
Note: Although I believe it will be clear from what you are about to read, I will mention it regardless: I have no experience in NLP, linguistics, math, or algorithms, and quite limited experience in programming as a whole. Nevertheless, coding is fun and, a fortiori, so is experimenting with silly things such as the above.
Note 2: While the results I've been getting are somewhat accurate (based on my own testing), this code and its results probably shouldn't be used in anything mission-critical. I am not responsible for bad grades, an AI uprising, or any other mishaps following the usage of the code included hereunder.
There are tools and libraries available online that will do a MUCH better job, but using them is just not as fun, is it?
Full writeup available on my blog, here.