A scraper and classifier for snarky SO comments.

Snark by the hour

Inspired by Davy M.'s post on Stack Overflow Meta, written in reaction to SO's June Update, I've decided to make the Snark by the Hour feed happen. It seems like a fun thing to build, and I really couldn't agree more with Davy's words:

My idea is a Snark by the Hour live feed so I can enjoy the sarcasm and snark of my favorite Stack Overflow users in real time, but that's just me.

If you happen to have a snarkiness classifier (with a training set) lying around, feel free to let me know in an issue or, better yet, throw me a pull request. (Yes, kind people of SO, I am hinting at you here.)

To-do:

  • Gather some training data
    • Code data collector
      • Hopefully improve this if this API issue ever gets fixed, or find a workaround, which should not be too hard for comments that have been added since the collector's last run.
    • Extract possible features
      • Determine sarcasm. (Waiting for my pull request to AniSkywalker/SarcasmDetection to be merged.)
    • Clean up code and make separate modules for collection and feature extraction
    • Grab data from API
    • Manually rate a metric shit ton of comments.
  • Come up with a snarkiness classifier
    • Visualize data.
    • Select relevant features
    • Determine appropriate classifier
  • Make some web interface to show this potentially hilarious data
  • Write a classifier for questions that are likely to yield snarky comments
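
The workaround mentioned under "Code data collector" (only picking up comments added since the collector's last run) could look roughly like the sketch below. This is not the project's actual collector. The function name `new_comments_since` is hypothetical, and it assumes each comment is a dict carrying the `comment_id` and `creation_date` (Unix epoch seconds) fields as returned by the Stack Exchange API; fetching the comments is left out on purpose.

```python
def new_comments_since(comments, last_run):
    """Return (fresh, watermark) for comments created after `last_run`.

    `comments` is an iterable of dicts with a `creation_date` key holding
    Unix epoch seconds (assumed field name, see the note above).
    `fresh` is sorted oldest first; `watermark` is the timestamp to store
    for the next run, unchanged when nothing new came in.
    """
    fresh = sorted(
        (c for c in comments if c["creation_date"] > last_run),
        key=lambda c: c["creation_date"],
    )
    watermark = fresh[-1]["creation_date"] if fresh else last_run
    return fresh, watermark


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    sample = [
        {"comment_id": 1, "creation_date": 100, "body": "What have you tried?"},
        {"comment_id": 2, "creation_date": 300, "body": "Did you read the docs?"},
        {"comment_id": 3, "creation_date": 200, "body": "Works on my machine."},
    ]
    fresh, watermark = new_comments_since(sample, last_run=150)
    print([c["comment_id"] for c in fresh], watermark)
```

Storing the returned watermark between runs keeps the collector from rating the same comment twice, at the cost of missing comments that share the exact timestamp of the last one seen.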