A simple web scraper that collects news from the Hacker News (HN) website at https://news.ycombinator.com
Python
dependabot and Bharat123rox Bump requests from 2.18.4 to 2.20.0 (#1)
Latest commit ea60cef Nov 1, 2019

.gitignore
HackerNews.py
LICENSE
README.md
requirements.txt

HackerNews-Scraper

A simple web scraper that collects news from the Hacker News (HN) website at https://news.ycombinator.com

Parameters:

pages: Number of Hacker News pages to fetch. One file is created for each page, and at most 20 pages can be fetched for now.

verbose: Enables or disables verbose output (Y/N). If Y, progress is printed to the terminal as each page is fetched; otherwise, the program runs silently.
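As a rough illustration of how these two parameters could drive a scraper, here is a minimal sketch. It is not the code in HackerNews.py: the function names, the output file name, the choice to cap (rather than reject) requests above 20 pages, and the injectable `fetch` argument are all assumptions made for this example. The `news?p=N` URL form is how Hacker News paginates its front page.

```python
from pathlib import Path

BASE_URL = "https://news.ycombinator.com/news"
MAX_PAGES = 20  # limit stated in this README


def parse_verbose(answer):
    """Map a Y/N answer to a boolean; anything other than Y/y counts as no."""
    return answer.strip().lower() == "y"


def page_numbers(pages):
    """Return the 1-based page numbers to fetch, capped at MAX_PAGES (assumed behavior)."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return list(range(1, min(pages, MAX_PAGES) + 1))


def fetch_html(page):
    """Download one listing page (needs the requests dependency)."""
    import requests  # imported lazily so the helpers above work without it

    response = requests.get(BASE_URL, params={"p": page}, timeout=10)
    response.raise_for_status()
    return response.text


def scrape(pages, verbose, out_dir=".", fetch=fetch_html):
    """Fetch each page and write it to its own file; return the written paths."""
    written = []
    for page in page_numbers(pages):
        # Hypothetical file name; the real script may name its files differently.
        path = Path(out_dir) / f"hackernews_page_{page}.html"
        path.write_text(fetch(page), encoding="utf-8")
        if verbose:
            print(f"Fetched page {page} -> {path}")
        written.append(path)
    return written
```

Calling `scrape(3, parse_verbose("Y"))` would, under these assumptions, write three files to the current directory and print one progress line per page.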

First, install the dependencies for this scraper using the requirements.txt file:

pip install -r requirements.txt

To get your daily share of Hacker News headlines, clone the repository and run the HackerNews.py file:

git clone https://github.com/Bharat123rox/HackerNews-Scraper.git

Future Scope:

  • Add support for extracting a short snippet/preview of text from each article
  • Add multiprocessing support as an optional argument
  • Add support for fetching more than 20 pages
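One possible shape for the optional parallel fetching mentioned above is sketched here. Note the swap: it uses a thread pool from `concurrent.futures` rather than separate processes, since downloading pages is I/O bound. The `fetch_all` name and the `workers` argument are hypothetical and do not exist in the project.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(page_numbers, fetch, workers=1):
    """Fetch every page with `fetch`, using a thread pool when workers > 1.

    Results come back in the same order as page_numbers.
    """
    pages = list(page_numbers)
    if workers <= 1:
        # Default path: plain sequential fetching, no pool involved.
        return [fetch(page) for page in pages]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order even if pages finish out of order.
        return list(pool.map(fetch, pages))
```

Keeping `workers=1` as the default leaves the sequential behavior unchanged, so parallel fetching stays opt-in.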

Contributions to this project are always welcome!
