Tracking StackOverflow answering rate


Live at stackpro.herokuapp.

The Assignment

As dictated by the Hackerati:

Build a system to collect data that is generated on an interval (once a minute/hour/day/etc.). Store in a database; record time and data value.

Build a web app that displays a graph of the collected data with a choice of intervals (per min, per hour, per day, etc.). Add a table report of the data with column headings. The table should be placed below the graph.

The Intel

Once an hour, a rake task collects data from the Stack Overflow API: the total number of questions asked that hour and the number of those that went unanswered. A question is considered unanswered if it has no answers with at least one upvote. Each hourly sample is stored as an object in a MongoDB database hosted at MongoLab, with the following fields:

    "asked": [int] number of questions asked
    "unanswered": [int] number of questions unanswered
    "percentage": [float] unanswered/asked * 100
    "unix": [long] the unix timestamp of when the data was pulled
    "timestamp": [string] a nicely formatted string: "hh:mm mm/dd"
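
As a rough illustration of how one hourly record with these fields could be assembled, here is a minimal Ruby sketch. It is not the project's actual rake task: `unanswered?` and `build_record` are hypothetical helper names, and each question is assumed to be a hash with an "answers" array whose items carry a "score" (used here as a stand-in for "has at least one upvote").

```ruby
# Sketch only: helper names and the shape of `question` are assumptions.

# A question counts as unanswered when no answer has a score of at least 1.
def unanswered?(question)
  question.fetch("answers", []).none? { |answer| answer.fetch("score", 0) >= 1 }
end

# Build one hourly record from the questions asked during that hour.
def build_record(questions, pulled_at = Time.now.utc)
  asked = questions.size
  unanswered = questions.count { |q| unanswered?(q) }
  {
    "asked"      => asked,
    "unanswered" => unanswered,
    # Avoid dividing by zero for an hour with no questions.
    "percentage" => asked.zero? ? 0.0 : (unanswered.to_f / asked * 100),
    "unix"       => pulled_at.to_i,
    # Formats as "hh:mm mm/dd" (hour:minute month/day).
    "timestamp"  => pulled_at.strftime("%H:%M %m/%d")
  }
end
```

The zero check is there only so that a quiet hour produces 0.0 rather than NaN; how the real task handles that case is not stated.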

When the site is loaded, Rails queries the database for the 48 most recent entries and builds an array for each of the fields. Because the chart takes arrays, it's easier to do this iteration server-side. On the front end, Chart.js and tablesorter provide beautiful representations of the data. The interval buttons slice and average the data live as necessary.
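
The two transformations described above (records to per-field arrays, then slicing and averaging by interval) can be sketched as follows. This is an illustration under assumptions: `to_columns` and `average_by_interval` are hypothetical names, and the averaging is written in Ruby only to keep the examples in one language, whereas the site does that step in the browser.

```ruby
# Sketch only: both function names are hypothetical.

# Turn an array of record hashes into one array per field, which is the
# shape a charting library usually wants (labels plus data series).
def to_columns(records, fields)
  fields.map { |field| [field, records.map { |r| r[field] }] }.to_h
end

# Group hourly values into chunks of `hours` and average each chunk.
# A trailing partial chunk is averaged over the values it actually has.
def average_by_interval(values, hours)
  values.each_slice(hours).map { |chunk| chunk.sum.to_f / chunk.size }
end
```

For example, `average_by_interval(cols["percentage"], 6)` would reduce 48 hourly points to 8 six-hour averages.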

The whole thing is hosted on Heroku. Every 24 hours, all data older than 72 hours is erased to keep the database small (the site only displays the latest 48 entries, but 72 hours are kept just in case I expand at some point in the future).
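
The 72-hour retention rule comes down to a cutoff on the "unix" field. Below is a minimal sketch of that arithmetic, assuming hypothetical helper names (`cutoff_unix`, `stale_filter`, `prune`); it works on plain hashes and does not talk to MongoDB. `$lt` is MongoDB's standard "less than" query operator, but whether the real cleanup job builds its filter this way is an assumption.

```ruby
# Sketch only: helper names are hypothetical; no database is contacted.
RETENTION_SECONDS = 72 * 60 * 60

# Unix timestamp before which records are considered stale.
def cutoff_unix(now = Time.now)
  now.to_i - RETENTION_SECONDS
end

# A MongoDB-style filter document matching records older than the cutoff.
def stale_filter(now = Time.now)
  { "unix" => { "$lt" => cutoff_unix(now) } }
end

# In-memory equivalent: keep only records at or after the cutoff.
def prune(records, now = Time.now)
  cutoff = cutoff_unix(now)
  records.reject { |r| r["unix"] < cutoff }
end
```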

The Future

Software (and the mission) is never done, and there's always more to add. Here's an outline of extra features I may one day implement.

  • Flexible (or user-set) intervals. The front end can basically handle this now; it can slice and average the data it's given over any interval. There just needs to be another database call to get older data. This would probably lead to the back end getting a face lift, with the db call split out so the front end can call it as often as it likes.

  • Historical records. The only real limitation here is database size: keeping hourly records for years would get out of hand. One option is to roll each 24 hours up into a single daily record, which could be neat and would show trends through the week.

  • Other Stack Exchange sites. They all use the same API, so it would be simple enough to have the existing code query any (user-supplied) site. The issue would be setting up recurring API calls for every possible site. But, with a short loading bar, it would be easy enough to make 24 calls covering the past 24 hours of the chosen site and give a live result. However, if a lot of users do this, it'll burn through the API call quota very quickly.
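
For the last idea, the 24 calls would each need an hour-wide time window. Here is a small sketch of how those windows could be computed, assuming a hypothetical `hourly_windows` helper. The key names "fromdate" and "todate" follow the Stack Exchange API's date-range parameters as I understand them; treat them, and the choice to align windows to the top of the hour, as assumptions. No network requests are made here.

```ruby
# Sketch only: `hourly_windows` is a hypothetical helper; it just does
# the timestamp arithmetic and performs no API calls.
SECONDS_PER_HOUR = 3600

# Return `hours` consecutive one-hour windows ending at the top of the
# current hour, ordered oldest first.
def hourly_windows(now = Time.now, hours = 24)
  # Align to the top of the current hour so windows share clean boundaries.
  top = now.to_i - (now.to_i % SECONDS_PER_HOUR)
  (1..hours).map do |i|
    {
      "fromdate" => top - i * SECONDS_PER_HOUR,
      "todate"   => top - (i - 1) * SECONDS_PER_HOUR
    }
  end.reverse
end
```

Each window would cost one API call per site per visitor, which is exactly why the quota concern above matters; caching results per site and hour would be one way to soften it.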