Elastic Beat to index new Reddit Submissions of one or multiple Subreddits
Getting Started with Redditbeat


Ensure that this folder is at the following location: ${GOPATH}/src/github.com/voigt

Clone Project

git clone https://github.com/voigt/redditbeat.git $GOPATH/src/github.com/voigt/redditbeat
cd $GOPATH/src/github.com/voigt/redditbeat

Init Project

To get running with Redditbeat and also install the dependencies, run the following command:

make scaffold


To build the binary for Redditbeat, run the command below. This will generate a binary named redditbeat in the same directory.

make

To run Redditbeat with debugging output enabled, run:

./redditbeat -c redditbeat.yml -p data/redditmap.json -e -d "*"

Hint: To reindex Subreddits that have already been indexed, reset data/redditmap.json with:

make clear-cache


Configure which Subreddits to index in redditbeat.yml:

  # Defines how often an event is sent to the output
  period: 60s                       # how often to check for new Submissions

    username: "username"
    password: "password"
    useragent: "Redditbeat v0.1"
    subs: ["kitten", "news"]        # a list of Subreddits to index
    limit: 10                       # number of Submissions to fetch; the current maximum is 100


  • index new Submissions of one or multiple given Subreddits
  • add persistence, so that already indexed Submissions are not indexed again
  • add dockerfile make package
  • index new Submissions of one or multiple Users

Known issues

  • Redditbeat misses some new Submissions
    Redditbeat uses geddit. Unfortunately, geddit stores the creation timestamp of a submission as a float32, which loses up to 99 seconds of precision. As a result, Redditbeat does not recognise new Submissions whose creation times are less than 99 seconds apart. The geddit maintainers have already been informed.
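
The precision loss behind this issue can be reproduced without geddit. The sketch below is an illustration only: the roundTrip helper and the sample timestamps are made up for this example. A float32 has a 24-bit significand, so for Unix timestamps between 2^30 and 2^31 neighbouring representable values are 128 seconds apart, and timestamps that fall between them are rounded to the nearest one.

```go
package main

import "fmt"

// roundTrip stores a Unix timestamp in a float32 and reads it back,
// which is enough to show the rounding that a float32 field causes.
// This helper is hypothetical and is not part of geddit or Redditbeat.
func roundTrip(ts int64) int64 {
	return int64(float32(ts))
}

func main() {
	first := int64(1500000000) // a multiple of 128, so exactly representable
	second := first + 30       // a submission created 30 seconds later
	third := first + 100       // a submission created 100 seconds later

	// The first two timestamps collapse to the same float32 value,
	// so they can no longer be told apart by creation time.
	fmt.Println(roundTrip(first), roundTrip(second)) // 1500000000 1500000000

	// The third is rounded up to the next representable value.
	fmt.Println(roundTrip(third)) // 1500000128
}
```

Storing the timestamp in a float64 or an integer type would keep full one-second resolution.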

