Woid


Running Locally

First, clone the repository to your local machine:

git clone https://github.com/vitorfs/woid.git

Install the requirements:

pip install -r requirements/dev.txt

Apply the migrations:

python manage.py migrate

Load the initial data:

python manage.py loaddata services.json

Finally, run the development server:

python manage.py runserver

The site will be available at 127.0.0.1:8000.

Supported Services

Currently, Woid crawls the following services to collect top stories:

  • Hacker News (hn)
  • Reddit (reddit)
  • GitHub (github)
  • The New York Times (nytimes)
  • Product Hunt (producthunt)

Crawlers

You can run the crawlers manually to collect the top stories using the following command:

python manage.py crawl reddit

You can pass multiple services at once:

python manage.py crawl reddit hn nytimes

Valid values: hn, reddit, github, nytimes, producthunt.
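
If you trigger the crawlers from a script rather than typing the command by hand, it can help to check the service names first. The following is a minimal sketch and not part of Woid itself: `VALID_SERVICES` and `build_crawl_command` are hypothetical names, the slugs are copied from the valid values above, and requiring at least one service is an assumption of this sketch rather than documented behavior of the crawl command.

```python
import sys

# Service slugs copied from the README's list of valid values.
VALID_SERVICES = ("hn", "reddit", "github", "nytimes", "producthunt")


def build_crawl_command(services, manage_py="manage.py"):
    """Return the argv list for `python manage.py crawl <services...>`.

    Hypothetical helper: raises ValueError for an empty list or an
    unknown service name instead of letting the command fail later.
    """
    services = list(services)
    if not services:
        # Assumption of this sketch: a crawl needs at least one service.
        raise ValueError("at least one service is required")
    unknown = [name for name in services if name not in VALID_SERVICES]
    if unknown:
        raise ValueError("unknown service(s): " + ", ".join(unknown))
    # sys.executable is the interpreter running this script, so the
    # command stays inside the active virtual environment.
    return [sys.executable, manage_py, "crawl", *services]
```

Passing the returned list to `subprocess.run(..., check=True)` from the project root should then be equivalent to running the crawl command by hand.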

The New York Times

To crawl The New York Times you will need an API key.

You can register an application at developer.nytimes.com.

Product Hunt

Product Hunt requires an API key to consume its API.

You can register an application at api.producthunt.com/v1/docs.

Cron Jobs

You can set up cron jobs to execute the crawlers periodically. Here is what my crontab looks like:

*/5 * * * * /home/woid/venv/bin/python /home/woid/woid/manage.py crawl reddit hn producthunt >> /home/woid/logs/cron.log 2>&1
*/30 * * * * /home/woid/venv/bin/python /home/woid/woid/manage.py crawl nytimes github >> /home/woid/logs/cron.log 2>&1
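
In these entries, `*/5` and `*/30` in the minute field run the job every 5 and every 30 minutes, and `>> ... 2>&1` appends both standard output and standard error to the log file. If you would rather generate such lines than write them by hand, here is a small sketch; `SCHEDULE` and `crontab_line` are hypothetical names, and the paths are simply the ones from the crontab above, so adjust them to your own deployment.

```python
# Paths copied from the crontab above; they are specific to that server.
PYTHON = "/home/woid/venv/bin/python"
MANAGE_PY = "/home/woid/woid/manage.py"
LOG_FILE = "/home/woid/logs/cron.log"

# Minutes between runs -> services crawled on that interval.
SCHEDULE = {
    5: ["reddit", "hn", "producthunt"],
    30: ["nytimes", "github"],
}


def crontab_line(every_minutes, services):
    """Build one crontab entry that crawls `services` every `every_minutes` minutes."""
    command = f"{PYTHON} {MANAGE_PY} crawl {' '.join(services)}"
    # `>> file 2>&1` appends stdout and stderr to the same log file.
    return f"*/{every_minutes} * * * * {command} >> {LOG_FILE} 2>&1"


for minutes, services in SCHEDULE.items():
    print(crontab_line(minutes, services))
```

The `*/N` step form only gives evenly spaced runs when N divides 60, so this sketch is best kept to intervals such as 5, 10, 15, or 30 minutes.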

License

The source code is released under the Apache 2.0 license.