
Create summaries ("tldr") of long articles or passages using the Natural Language Toolkit (NLTK)


dli85/tldr-bot


tldr bot

Overview

This program takes a block of text and attempts to summarize it. The bot first builds a dictionary of word frequencies from the non-stopword words in the text. Stopwords are words such as "the" and "is" that add little meaning to a sentence. Each sentence is then assigned a score based on the frequencies of the words it contains. Because longer sentences would otherwise have an inherent advantage, each sentence's score is divided by its length. Finally, the highest-scoring sentences make up the summary: sentences whose scores fall in the bottom 78% are excluded.
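The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code in tldr.py: it swaps NLTK's tokenizers and stopword corpus for a regex splitter and a tiny hard-coded stopword set so that it runs without any downloads. The function names, the `drop_fraction` parameter, the use of a summed frequency as the sentence score, and the choice to measure sentence length in words are all assumptions.

```python
import re
from collections import Counter

# Tiny stand-in stopword set (assumption); the real program presumably
# relies on NLTK's much larger stopword list.
STOPWORDS = {"the", "is", "a", "an", "and", "of", "to", "in", "it", "that", "on", "are", "was"}


def split_sentences(text):
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(sentence):
    # Lowercased alphabetic tokens.
    return re.findall(r"[a-z']+", sentence.lower())


def summarize(text, drop_fraction=0.78):
    sentences = split_sentences(text)

    # Step 1: frequency of every non-stopword in the whole text.
    freq = Counter(w for s in sentences for w in words(s) if w not in STOPWORDS)

    # Step 2: score each sentence by the summed frequency of its
    # non-stopwords, divided by its length in words.
    scores = []
    for s in sentences:
        tokens = words(s)
        total = sum(freq[w] for w in tokens if w not in STOPWORDS)
        scores.append(total / len(tokens) if tokens else 0.0)

    # Step 3: drop the lowest-scoring fraction, always keeping at least one
    # sentence, and emit the survivors in their original order.
    keep = max(1, len(sentences) - int(len(sentences) * drop_fraction))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
    return " ".join(sentences[i] for i in sorted(top))
```

Calling `summarize(text)` on a string returns the retained sentences joined by spaces. Keeping the survivors in document order, rather than score order, is a design choice of this sketch that tends to make the summary easier to read.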

Installation

The only requirement is nltk. Install it using pip:

pip install nltk

Usage

Run tldr.py to create a new tldr from an existing text source. The program will prompt you for the name of a text file to read from. The output is also a text file, containing the shortened text.

Run wikipediaCrawl.py to gather text from a Wikipedia article. You will need to provide the program with an article link and, optionally, an output name. The output name is the name of the final text file and must end in .txt. If one is not provided, the article title is used as the name.
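The output-name rule can be illustrated with a small helper. This is a hypothetical sketch, not the logic in wikipediaCrawl.py: the function `output_name` is invented for illustration, and it assumes the article title can be taken from the last path segment of the link, whereas the actual script may read the title from the page itself.

```python
from urllib.parse import urlparse, unquote


def output_name(article_url, name=None):
    """Hypothetical helper: pick the output file name for a crawled article."""
    if name:
        # An explicit output name must end in .txt.
        if not name.endswith(".txt"):
            raise ValueError("output name must end in .txt")
        return name
    # Assumption: the title is the last path segment of the article link.
    slug = urlparse(article_url).path.rstrip("/").rsplit("/", 1)[-1]
    return unquote(slug) + ".txt"
```

For example, `output_name("https://en.wikipedia.org/wiki/Natural_Language_Toolkit")` falls back to a name built from the link, while passing `name="nltk.txt"` returns that name unchanged.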
