A simple web scraper to scrape the Hacker News(HN) website for news at https://news.ycombinator.com
pages: Number of pages one wants the HackerNews for, this creates one file for each page, and a maximum of only 20 pages can be fetched for now.
verbose: Enable or disable verbose output by Y/N, if Y, then progress is printed to the terminal when each page is fetched, else, the program runs silently.
First, please install the dependencies for this scraper by using the
pip install -r requirements.txt
To use this for your daily share of HackerNews headlines, please clone and use the
git clone https://github.com/Bharat123rox/HackerNews-Scraper.git
- Add support to extract a small snippet/preview of text from each article
- Add Multiprocess support in future, making it as an optional argument
- Add support to fetch more pages
Any contributions to this Project are always Welcome!!