A simple library that fetches pages over HTTP, parses them, and saves the results to MongoDB.


A small library that downloads web pages, parses them, and saves the needed data in MongoDB.

Uses Hpricot for HTML parsing.

Right now it parses IMDb movies using simple multithreading; it's a quick trial of a few functions. The IMDb part was inspired by an article that used the same technique, and I added threading and MongoDB from there.
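The fetch, parse, and save flow with simple multithreading could look roughly like the sketch below. It is not this library's actual code: `extract_title` and `crawl` are hypothetical names, a regex stands in for Hpricot, and `store` is any callable standing in for a MongoDB insert, so the example runs with the Ruby standard library alone.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper, not part of this library: pull the <title> text out of
# an HTML string. A regex stands in for Hpricot to keep the sketch stdlib-only.
def extract_title(html)
  match = html.match(%r{<title[^>]*>(.*?)</title>}mi)
  match && match[1].strip
end

# Hypothetical helper: fetch every URL with a small pool of worker threads,
# parse each page, and hand the resulting hash to `store`.
# `fetcher` and `store` are callables so they can be swapped (for example
# for a MongoDB collection insert) or stubbed out without network access.
def crawl(urls, store:, workers: 4, fetcher: ->(url) { Net::HTTP.get(URI(url)) })
  queue = Queue.new
  urls.each { |url| queue << url }
  lock = Mutex.new # serializes calls to `store`

  threads = Array.new(workers) do
    Thread.new do
      loop do
        url = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break # queue drained, this worker is done
        end
        doc = { 'url' => url, 'title' => extract_title(fetcher.call(url)) }
        lock.synchronize { store.call(doc) }
      end
    end
  end
  threads.each(&:join)
end
```

To try it without touching the network, pass a stub such as `fetcher: ->(url) { pages[url] }` over a hash of canned HTML and collect results with `store: ->(doc) { saved << doc }`. With a real MongoDB collection, the `store` callable would wrap the driver's insert call instead.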