greengangsta/NewsSummarization

NewsSummarization

A machine learning project that takes news articles as input and produces their summaries as output.
Approach 1: Score sentences by word frequency (newssum.py).
Approach 2: Score sentences by the named-entity (NER) words they contain (newssumNER.py).
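
As a rough illustration of Approach 1, the sketch below scores each sentence by the normalized frequency of its words and keeps the top n. It is not the code in newssum.py: the function name `summarize`, the tiny `STOP_WORDS` set, and the regex-based sentence splitting are all assumptions made to keep the example free of third-party packages. The actual script may tokenize and filter differently, for instance with nltk.

```python
import heapq
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list used only for this sketch.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and", "on", "it"}


def summarize(text, n=5):
    """Return the n highest-scoring sentences, kept in their original order."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Count every non-stop word in the whole article.
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    freq = Counter(words)
    if not freq:
        return []

    # Normalize counts by the most frequent word so weights fall in (0, 1].
    top = max(freq.values())

    # A sentence's score is the sum of the weights of its non-stop words.
    scores = {}
    for i, sentence in enumerate(sentences):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        scores[i] = sum(freq[w] / top for w in tokens if w in freq)

    # Pick the n best sentence indices, then restore document order.
    best = heapq.nlargest(n, scores, key=scores.get)
    return [sentences[i] for i in sorted(best)]


if __name__ == "__main__":
    sample = "Cats chase mice. Cats and more cats sleep. Dogs bark."
    print(summarize(sample, n=2))
```

Summing word weights favors longer sentences. Dividing each score by the sentence length is a common variation if that bias is unwanted.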

Dependencies

  • Python
  • numpy
  • pandas
  • heapq
  • spacy
  • nltk
  • matplotlib
  • Beautiful Soup
  • re
  • urllib.request
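Note: heapq, re, and urllib.request are part of the Python standard library and need no separate installation. Some of the other dependencies may already be installed if you are using an Anaconda environment.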

Instructions for running

Both approaches are extractive methods of text summarization, so no model training is required. Install the dependencies listed above before running either file. To summarize your own article, change the path in the section of the script that loads the article. You can also scrape an article from a web page by providing its link, but make sure the article text is contained in HTML paragraph (<p>) tags. The dataset provided in the repository is not needed for these two approaches. To get a summary of a specific length (number of sentences), change the value of n in the heapq.nlargest and heapq.nsmallest calls, which is currently set to 5.
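
To show why the paragraph-tag requirement matters, here is a minimal sketch that keeps only the text found inside `<p>` elements of an HTML string. It uses the standard-library `html.parser` module as a stand-in for Beautiful Soup so that it runs without extra packages. The class name `ParagraphText` and the helper `article_text` are hypothetical and do not come from the repository's scripts.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text of every <p> element, ignoring everything else."""

    def __init__(self):
        super().__init__()
        self._depth = 0        # > 0 while the parser is inside a <p> element
        self.paragraphs = []   # one string per <p> element encountered

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p" and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        # Text inside inline tags such as <b> nested in a <p> is kept too.
        if self._depth:
            self.paragraphs[-1] += data


def article_text(html):
    """Join the text of all non-empty paragraphs into one string."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return " ".join(p.strip() for p in parser.paragraphs if p.strip())


if __name__ == "__main__":
    page = "<h1>Title</h1><p>First <b>bold</b> part.</p><div>skip</div><p>Second.</p>"
    print(article_text(page))
```

Text in headings, `<div>` blocks, or other containers is dropped here, so an article whose body is not wrapped in paragraph tags would produce an empty or incomplete input for the summarizer.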
