A framework to scrape, analyze and visualize trends and insights from news sources
This project started as a university course project.
You need Python 3.8.5 or later.
You can download the latest version from here
You need to install the requirements:
pip install -r requirements.txt
Running scraper.py
scrapes news from the armenpress.com website and stores the results in data/armenpress/
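
The scrape-and-store step can be illustrated with a minimal sketch. This is not the project's actual scraper.py: the `LinkTitleParser`, `extract_items`, and `store_items` names, the choice of collecting every `<a>` tag, and the one-JSON-file-per-page layout are all assumptions made for illustration, and the HTML is passed in as a string so the example does not depend on the live site.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class LinkTitleParser(HTMLParser):
    """Collect the href and visible text of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while we are inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            if title:
                self.items.append({"url": self._href, "title": title})
            self._href = None


def extract_items(html):
    """Return a list of {"url", "title"} dicts found in an HTML string."""
    parser = LinkTitleParser()
    parser.feed(html)
    return parser.items


def store_items(items, out_dir, name):
    """Write items to <out_dir>/<name>.json (assumed layout) and return the path."""
    out_path = Path(out_dir) / f"{name}.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(items, ensure_ascii=False, indent=2), encoding="utf-8")
    return out_path
```

A real scraper would first download each page (for example with `urllib.request.urlopen`) and would likely select only the news links rather than every anchor on the page.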
Running processor.py
processes the scraped data from data/armenpress/
and creates a CSV
file in the root folder
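
As a rough sketch of this processing step, the function below merges scraped files into a single CSV. The real processor.py may work differently: the JSON input format, the `date`, `title`, and `url` column names, and the `build_csv` function itself are assumptions made for illustration.

```python
import csv
import json
from pathlib import Path

FIELDS = ["date", "title", "url"]  # hypothetical column names


def build_csv(data_dir, csv_path):
    """Merge every JSON file in data_dir into one CSV and return the row count."""
    rows = []
    for json_file in sorted(Path(data_dir).glob("*.json")):
        records = json.loads(json_file.read_text(encoding="utf-8"))
        for record in records:
            # Missing fields become empty strings so every row has all columns.
            rows.append({field: record.get(field, "") for field in FIELDS})
    # ISO dates (YYYY-MM-DD, an assumption) sort correctly as plain strings.
    rows.sort(key=lambda row: row["date"])
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Under these assumptions, `build_csv("data/armenpress", "updated-data.csv")` would write the combined file into the root folder.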
main.py
reads updated-data.csv
to generate graphs and insights.
python main.py
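
One kind of insight that can be derived from the CSV is how often a keyword is mentioned per month. The sketch below is illustrative only and is not taken from main.py: the `monthly_mentions` function and the `date` (assumed to be `YYYY-MM-DD`) and `title` columns are assumptions.

```python
import csv
from collections import Counter


def monthly_mentions(csv_path, keyword):
    """Count titles containing keyword (case-insensitive), keyed by 'YYYY-MM'."""
    counts = Counter()
    needle = keyword.lower()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            if needle in row["title"].lower():
                # The first seven characters of an ISO date are the year and month.
                counts[row["date"][:7]] += 1
    return dict(sorted(counts.items()))
```

The returned mapping of month to count can then be handed to a plotting library to draw a bar or line chart.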
generate_wordclouds.py
generates a word cloud for each month of a given year.
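
A word cloud is drawn from word frequencies, so the per-month grouping can be sketched as below. This is an assumption-laden illustration rather than the script's actual logic: `monthly_word_frequencies`, the tiny `STOPWORDS` set, the `(date, title)` input pairs, and the `YYYY-MM-DD` date format are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list; a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "for", "on"}


def monthly_word_frequencies(records, year):
    """Map month number to a Counter of title words for the given year.

    records is an iterable of (date, title) pairs with ISO dates (assumed).
    """
    frequencies = defaultdict(Counter)
    for date, title in records:
        if not date.startswith(f"{year}-"):
            continue
        month = int(date[5:7])
        words = re.findall(r"[a-z']+", title.lower())
        frequencies[month].update(w for w in words if w not in STOPWORDS)
    return dict(frequencies)
```

If the `wordcloud` package is used for rendering, each month's counter could be passed to `WordCloud().generate_from_frequencies(...)`; whether this project renders its clouds that way is not stated here.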
You can access the scraped data here.
You can download the final processed version (data up to 2021/04/23) here.
Example outputs: Sentiment, Mentions, Titles, WordCloud.
- More news sources
- More historical data
- Interactive visualization
- A more object-oriented structure
- Better sentiment analysis
- Other Natural Language Processing models
- Deploying as a separate website