HackerNews / Algolia Python Library
This is a simple library to interface with the HN Search API (provided by Algolia).
👉 Note: As an example, I used this library to download ALL Hacker News posts and made them available as a public dataset on Kaggle.
$ pip install python-hn
Get hands on
Check out Interactive Docs to try the library without installing it.
    from hn import search_by_date

    # Search everything (stories, comments, etc) containing the keyword 'python'
    search_by_date('python')

    # Search everything (stories, comments, etc) from author 'pg' and keyword 'lisp'
    search_by_date('lisp', author='pg', created_at__lt='2018-01-01')

    # Search only stories
    search_by_date('lisp', author='pg', stories=True, created_at__lt='2018-01-01')

    # Search stories *or* comments
    search_by_date(q='lisp', author='pg', stories=True, comments=True, created_at__lt='2018-01-01')
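To give a feel for what a search like the ones above roughly corresponds to at the HTTP level, here is a minimal sketch that only builds a request URL for the public `search_by_date` endpoint and does not send it. The helper `build_search_url` is hypothetical and is not part of python-hn; how the library actually assembles its requests may differ.

```python
from urllib.parse import urlencode

# Public HN Search API endpoint that returns results sorted by date.
BASE_URL = "https://hn.algolia.com/api/v1/search_by_date"


def build_search_url(query, tags=None):
    """Build (but do not send) a request URL.

    Hypothetical helper for illustration only; not part of python-hn.
    """
    params = {"query": query}
    if tags:
        params["tags"] = tags
    # Keep commas unescaped so the tags expression stays readable.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


url = build_search_url("lisp", tags="story,author_pg")
print(url)
```

The resulting URL can be pasted into a browser or passed to any HTTP client to inspect the raw JSON response.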
Tags are part of the HN Search API provided by Algolia. You can read more in their docs. Tags can be combined to form complex queries, for example:
    # All the comments in the story `6902129`
    tags = PostType('comment') & StoryID('6902129')
The available tags are:
- PostType: with options such as 'story' and 'comment'.
- Author: receives the username as param.
- StoryID: receives the story id, e.g. StoryID('6902129').
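The tag composition described above can be sketched as follows. In the HN Search API, comma-separated tags are ANDed and tags inside parentheses are ORed, with `author_<username>` and `story_<id>` as the author and story tags. The classes and helpers below (`SketchTag`, `AllOf`, `AnyOf`, `post_type`, `author`, `story_id`) are illustrative stand-ins under that assumption, not the library's own implementation.

```python
class SketchTag:
    """A single tag value such as 'comment' or 'story_6902129' (illustrative only)."""

    def __init__(self, value):
        self.value = value

    def __and__(self, other):
        return AllOf(self, other)

    def __or__(self, other):
        return AnyOf(self, other)

    def render(self):
        return self.value


class _Group(SketchTag):
    def __init__(self, *parts):
        # Flatten nested groups of the same kind: (a & b) & c -> a, b, c
        self.parts = []
        for part in parts:
            if type(part) is type(self):
                self.parts.extend(part.parts)
            else:
                self.parts.append(part)


class AllOf(_Group):
    def render(self):
        # Comma-separated tags are ANDed by the API.
        return ",".join(p.render() for p in self.parts)


class AnyOf(_Group):
    def render(self):
        if any(isinstance(p, AllOf) for p in self.parts):
            raise ValueError("AND inside OR is not handled by this sketch")
        # Tags between parentheses are ORed by the API.
        return "(" + ",".join(p.render() for p in self.parts) + ")"


# Hypothetical constructors mirroring PostType, Author and StoryID.
def post_type(name):
    return SketchTag(name)


def author(username):
    return SketchTag(f"author_{username}")


def story_id(identifier):
    return SketchTag(f"story_{identifier}")


comments_in_story = post_type("comment") & story_id("6902129")
print(comments_in_story.render())
```

Overloading `&` and `|` keeps query construction close to how the conditions read aloud, which is presumably why the library exposes tags as composable objects.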
Filters can be applied to restrict the search by:
- Creation Date: created_at
- Points: points
- Number of comments: num_comments
They accept the >, <, >=, <= operators with a syntax similar to Django's, where the operator name is appended to the field name after a double underscore:
- lt (<): Lower than. Example: created_at__lt='2018-01-01'.
- lte (<=): Lower than or equal to.
- gt (>): Greater than. Example: created_at__gt='2018' (created after 2018-01-01).
- gte (>=): Greater than or equal to.
Examples (See Algolia docs for more info):
    # Created after October 1st, 2018
    search_by_date(created_at__gt='2018-10')

    # Created after October 1st, 2017 and before January 1st 2018
    search_by_date(created_at__gt='2017-10', created_at__lt='2018')

    # Stories with *exactly* 1000 points
    search_by_date(tags=PostType('story'), points=1000)

    # Comments with more than 50 points
    search_by_date(tags=PostType('comment'), points__gt=50)

    # Stories with more than 100 comments
    search_by_date(tags=PostType('story'), num_comments__gt=100)
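As a rough illustration of how Django-style keyword filters like the ones above could map onto the API's `numericFilters` parameter, here is a minimal sketch. It assumes dates are sent as Unix timestamps on the `created_at_i` attribute and that partial dates such as '2018' or '2018-10' mean the first instant of that period in UTC. The helpers `parse_partial_date` and `to_numeric_filters` are hypothetical, and the library's real parsing rules may differ.

```python
from datetime import datetime, timezone

# Django-style suffixes and the comparison symbols they stand for.
OPERATORS = {"lt": "<", "lte": "<=", "gt": ">", "gte": ">="}


def parse_partial_date(text):
    """Parse 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' as midnight UTC.

    Missing month or day defaults to 1 (an assumption of this sketch).
    """
    parts = [int(p) for p in text.split("-")]
    year, month, day = (parts + [1, 1])[:3]
    return datetime(year, month, day, tzinfo=timezone.utc)


def to_numeric_filters(**filters):
    """Turn kwargs such as points__gt=50 into a numericFilters string."""
    clauses = []
    for key, value in filters.items():
        field, _, suffix = key.partition("__")
        # No suffix means an exact match.
        symbol = OPERATORS[suffix] if suffix else "="
        if field == "created_at":
            # Dates are compared as Unix seconds on `created_at_i`.
            field = "created_at_i"
            value = int(parse_partial_date(value).timestamp())
        clauses.append(f"{field}{symbol}{value}")
    return ",".join(clauses)


print(to_numeric_filters(points__gt=50))  # points>50
print(to_numeric_filters(created_at__gt="2017-10", created_at__lt="2018"))
```

Keeping the suffix-to-symbol mapping in a single dictionary means an unsupported suffix fails loudly with a `KeyError` instead of silently producing a malformed filter.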
Current milestone: https://github.com/santiagobasulto/python-hacker-news/milestone/2
- V0.0.4: Other endpoints: /search, /users, /items (CURRENT)
- V0.0.3: Post type aliases, improved API
- V0.0.2: Functioning API
- V0.0.1: Initial Version