News scraper app made with JavaScript and Python, with NLP

This project consists mainly of a scraper and a dashboard.
The scraper is built with Selenium and BeautifulSoup and was tested on the news website https://thestar.com.my.
The dashboard is built with React.js and styled with Material UI.
NLP methods:

Sentiment Analysis - Made with the NLTK library
Summarizer - Made with Sumy's LsaSummarizer
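
The two NLP steps could be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the README names NLTK and Sumy's LsaSummarizer but not the exact calls, so the use of NLTK's VADER analyzer, the 0.05 thresholds, and the function names (label_from_compound, analyze_sentiment, summarize) are all assumptions made for illustration.

```python
# Sketch only. Assumption: sentiment analysis uses NLTK's VADER analyzer;
# the README only says "nltk", so the real project may differ.


def label_from_compound(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to a coarse label."""
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"


def analyze_sentiment(text: str) -> str:
    # Imported lazily so the helper above works without NLTK installed.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    # Downloads the lexicon on first use; later calls reuse the cached copy.
    nltk.download("vader_lexicon", quiet=True)
    scores = SentimentIntensityAnalyzer().polarity_scores(text)
    return label_from_compound(scores["compound"])


def summarize(text: str, sentences_count: int = 3) -> str:
    import nltk
    from sumy.nlp.tokenizers import Tokenizer
    from sumy.parsers.plaintext import PlaintextParser
    from sumy.summarizers.lsa import LsaSummarizer

    # Sentence tokenizer data used by Sumy; newer NLTK releases may
    # need "punkt_tab" instead of "punkt".
    nltk.download("punkt", quiet=True)
    parser = PlaintextParser.from_string(text, Tokenizer("english"))
    sentences = LsaSummarizer()(parser.document, sentences_count)
    return " ".join(str(sentence) for sentence in sentences)
```

The data downloads in this sketch are one plausible reason the first start can take longer than later ones.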

Dashboard overview

A table with the fields (No, Title, Date, Tag/Category, Content) is displayed on the left side of the screen after data is fetched or scraped.
The following functions are available through buttons in the middle of the screen:

Fetch news = Fetch previously scraped news from the database (MongoDB) (~1-6 seconds)
Scrape news = Create an instance of the scraper to scrape news from the source (~20-30 seconds)
Reset news = Clear data from the database (~1-3 seconds)
Sentiment analysis = Perform sentiment analysis on the selected news article (~1-3 seconds)
Summarize = Summarize the selected news article (~1-3 seconds)
Fetch news (With Tags) = Fetch previously scraped news from the database according to tags
Scrape news (With Tags) = Scrape news from the source according to tags
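
As an illustration of the tag-based fetch described above, the sketch below builds a MongoDB filter from a list of tags and applies the same rule to in-memory records. The Article fields mirror the table columns listed earlier; the document field name "tag", the class name, and both function names are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Article:
    # Fields mirror the dashboard table: No, Title, Date, Tag/Category, Content.
    no: int
    title: str
    date: str
    tag: str
    content: str


def clean_tags(tags: Iterable[str]) -> List[str]:
    """Strip whitespace and drop empty entries."""
    return [t.strip() for t in tags if t.strip()]


def build_tag_query(tags: Iterable[str]) -> dict:
    """Return a MongoDB filter; an empty filter matches every document."""
    wanted = clean_tags(tags)
    # Assumption: each stored document keeps its category in a "tag" field.
    return {"tag": {"$in": wanted}} if wanted else {}


def filter_by_tags(articles: Iterable[Article], tags: Iterable[str]) -> List[Article]:
    """Apply the same rule to records already loaded in memory."""
    wanted = set(clean_tags(tags))
    if not wanted:
        return list(articles)
    return [a for a in articles if a.tag in wanted]
```

If the backend used pymongo, the filter could be passed as collection.find(build_tag_query(tags)); whether the project does so is not stated here.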

A details column on the right side of the screen displays the following information when a news article is selected:
- Title
- URL
- Sentiment Analysis Result
- Summary
- Content

A limited selection of themes is available through the sidebar.

To start:

npm install
npm start (the first run may take longer because it downloads NLTK data files)

