Hi, I was implementing some storage backends and saw the use of Marshal in Page#to_hash and Page#from_hash ...
I'm looking to implement the backends for: Riak, RiakSearch, Solr, and Amazon's S3.
I'd also like to look into putting anemone onto resque / redis, for work distribution. I've recently become unhappy with Nutch+Hadoop for crawling, so I'd like to see how far I can scale anemone.
Removed Marshal. It makes pages unusable elsewhere.
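As a rough sketch of the idea (not the actual diff in this pull request), a Marshal-free `to_hash` / `from_hash` pair could serialize structured fields as JSON text instead. `SketchPage` and its three fields are hypothetical stand-ins for `Anemone::Page`, and the use of the `json` library is an assumption (it ships with ruby-19 but would need the json gem on ruby-18).

```ruby
require 'json'

# Hypothetical stand-in for Anemone::Page with only three fields.
SketchPage = Struct.new(:url, :headers, :body) do
  # Store headers as JSON text rather than Marshal.dump output, so that
  # non-Ruby readers of the storage backend can still parse them.
  def to_hash
    { 'url'     => url.to_s,
      'headers' => JSON.generate(headers),
      'body'    => body }
  end

  # Rebuild a page from the stored hash, parsing the JSON-encoded headers.
  def self.from_hash(hash)
    new(hash['url'], JSON.parse(hash['headers']), hash['body'])
  end
end

page = SketchPage.new('http://example.com/',
                      { 'content-type' => ['text/html'] },
                      '<html></html>')
copy = SketchPage.from_hash(page.to_hash)
```

The stored hash then contains only strings, which is what makes it readable from other languages and tools.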
I need to make some other changes for something like this to work, so if you decide to drop the use of Marshal-ing, I can adjust the other libraries and add to this pull request. Lots of specs fail for me on the original library (i.e., without my changes). I don't even know why I'm getting the below issue, which (to me) just reinforces "don't use Marshal".
TypeError in 'Anemone::Storage::Redis should implement merge!, and return self'
incompatible marshal file format (can't be read)
format version 4.8 required; 99.111 given
/Users/sgonyea/Sites/emone/lib/anemone/storage/redis.rb:41:in `block in each'
spec/storage_spec.rb:162:in `block (2 levels) in <module:Storage>'
In both ruby-18 and ruby-19, running the specs gives:
129 examples, 32 failures
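One possible reading of the TypeError above, offered only as a guess: `Marshal.load` treats the first two bytes of its input as the format version, and 99 and 111 are the byte values of "c" and "o". That would be consistent with `Marshal.load` being handed a plain string that was never marshalled. The string `'code'` below is an arbitrary example chosen to begin with those two bytes, not a value taken from the failing spec.

```ruby
# Genuine Marshal output starts with the format version bytes 4 and 8.
dumped  = Marshal.dump('code')
version = dumped.bytes.first(2)

# A plain string passed straight to Marshal.load: its first two bytes,
# "c" (99) and "o" (111), are interpreted as the format version.
begin
  Marshal.load('code')
rescue TypeError => e
  message = e.message
end

puts version.inspect
puts message
```

If that is what is happening, the failure would point at a mismatch between what gets written to the store and what gets loaded back, rather than at corrupted data.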