Hacking the good old [Hangman](http://en.wikipedia.org/wiki/Hangman_(game\)) game, solving the eternal question of which letter to start with.
- setup.bash requires bash
- curl and gunzip to download data files from IMDb
- Python and PyMongo
- MongoDB
- curl -s http://raw.github.com/ninadsp/hangman-hacking/master/setup.bash | bash -s --
- Copy dbconfig_sample.py to dbconfig.py with the appropriate values
- Execute import_dump.py - expected run time of 5-*0 minutes and RAM usage of ~500 MB
- Clean the data being imported
- Write a script to read through each document, and create a count of each character in the title, update it in the document
- Dumb map/reduce (or script) over each document and find the count for each character throughout
- Add some fancy functionality to the import_dump.py script with getopt
- Make the map/reduce intelligent by adding ratings/language into the picture
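
The character-counting items in the list above could be sketched roughly as follows. This is only an illustration in plain Python, not the project's actual code: the function names `title_char_counts` and `total_char_counts` are made up for this example, the sample titles are placeholders, and nothing here talks to MongoDB. It assumes counting should ignore case and skip anything that is not a letter.

```python
from collections import Counter


def title_char_counts(title):
    """Count each letter in one title (hypothetical helper).

    Assumption: case is ignored and non-letters (spaces, digits,
    punctuation) are skipped. str.isalpha() also accepts non-ASCII letters.
    """
    return Counter(ch for ch in title.lower() if ch.isalpha())


def total_char_counts(titles):
    """Sum the per-title counts over all titles (the 'dumb' aggregate)."""
    total = Counter()
    for title in titles:
        total.update(title_char_counts(title))
    return total


# Placeholder titles, standing in for documents read from the collection.
titles = ["Jaws", "Alien", "Up"]
totals = total_char_counts(titles)

# The most frequent letters are the natural first guesses in Hangman.
print(totals.most_common(3))
```

In the real pipeline the per-title counter would presumably be stored back on each document, and the aggregate would come from a map/reduce job or a script looping over the collection.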
This project is still a work in progress. As of now, all it does is read the list of movies along with their language and rating, and insert them into a MongoDB collection. It has not been fully tested, as my disk runs out of space :s