Scrapes articles from Reddit news subreddits. June 2015.
.env.example
.gitignore
LICENSE.md
README.md
articlebot.py
blacklist.txt.example
comments.txt.example
links.txt.example
requirements.txt


articlebot

An article-scraping Reddit bot, derived from xiaoxu193's bitofnewsbot. The significant change is that it scrapes entire articles instead of condensing them.

You can find an example of articlebot in use by /u/justice_article_bot.

How do I run it?

Set up cron to run it every minute.

Instructions for Ubuntu:

  • Install dependencies
  • Rename .env.example to .env and edit accordingly
  • Rename blacklist.txt.example, comments.txt.example and links.txt.example to drop the .example suffix (blacklist.txt, comments.txt, links.txt)
  • Install cron: sudo apt-get install cron
  • Open the crontab for editing: sudo crontab -e
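  • Tell it to run every minute: * * * * * cd /path/to/articlebot && /usr/bin/python articlebot.py (replace /path/to/articlebot with wherever you cloned the repository; cron does not start jobs in the repository directory, so a bare articlebot.py would not be found)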
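
The renaming steps above can also be scripted. The following is a minimal sketch and not part of articlebot: the function name activate_examples is hypothetical, the file names are taken from the repository listing, and it copies rather than renames so the .example templates stay in place. You still need to edit .env by hand afterwards.

```python
import shutil
from pathlib import Path

# File names from the repository listing; each ships with an ".example" suffix.
EXAMPLE_FILES = [
    ".env.example",
    "blacklist.txt.example",
    "comments.txt.example",
    "links.txt.example",
]


def activate_examples(repo_dir="."):
    """Copy each *.example file to its real name, skipping any that already exist.

    Hypothetical helper: returns the names of the files it created.
    """
    created = []
    for name in EXAMPLE_FILES:
        src = Path(repo_dir) / name
        # Strip the trailing ".example" to get the target file name.
        dst = src.with_name(name[: -len(".example")])
        if src.exists() and not dst.exists():
            shutil.copy(src, dst)
            created.append(dst.name)
    return created


if __name__ == "__main__":
    for created_name in activate_examples("."):
        print("created", created_name)
```

Running it a second time creates nothing, so an already edited .env is not overwritten.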

done.txt and comments.txt prevent duplicate comments. blacklist.txt lists sites the bot should ignore.
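
To illustrate how files like these can prevent duplicates and filter sites, here is a minimal sketch. It is not the code in articlebot.py: the one-entry-per-line file format, the helper names (load_lines, should_comment, mark_done) and the rule that a blacklist entry also matches subdomains are all assumptions made for the example.

```python
from pathlib import Path
from urllib.parse import urlparse


def load_lines(path):
    """Return the non-empty, stripped lines of a text file as a set.

    Assumption: one entry per line. A missing file yields an empty set.
    """
    p = Path(path)
    if not p.exists():
        return set()
    return {line.strip() for line in p.read_text().splitlines() if line.strip()}


def should_comment(submission_id, url, done_ids, blacklist):
    """True if the submission is new and its host is not blacklisted."""
    if submission_id in done_ids:
        return False  # already handled, so commenting again would duplicate
    host = (urlparse(url).hostname or "").lower()
    # Assumption: an entry matches the host itself or any subdomain of it.
    return not any(
        host == site.lower() or host.endswith("." + site.lower())
        for site in blacklist
    )


def mark_done(path, submission_id):
    """Append a handled submission ID so later runs skip it."""
    with open(path, "a") as f:
        f.write(submission_id + "\n")
```

Because cron starts a fresh process every minute, the record of handled submissions has to live on disk rather than in memory, which is why a sketch like this reads the files at startup and appends to them after each comment.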

Dependencies: