GitHub - aeden/feed-processor: A multi-step feed parser using JRuby, MRI, beanstalk and MongoDB

Feed Processor is a multi-stage feed processor built with JRuby, MRI, beanstalk and MongoDB.

There are two steps to the feed processing:

Step 1: Download feed content using non-blocking IO and insert the raw data into MongoDB. A message is sent via Beanstalk notifying the parser stage that the feed data is ready for a specific feed.
Step 2: A multi-processor feed parser pulls the raw data from MongoDB, parses it, and inserts the parsed record back into MongoDB.
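
The hand-off between the two steps can be sketched in plain Ruby. This is only an illustration, not the project's actual code: a Hash stands in for MongoDB, a Queue stands in for the Beanstalk tube, and a regular expression stands in for a real feed parser such as feedzirra. The names fetch_stage, parse_stage, RAW_STORE, PARSED_STORE and READY_QUEUE are hypothetical.

```ruby
# In-memory stand-ins (assumptions, not the project's code):
# Hashes play the role of MongoDB collections, a Queue plays the Beanstalk tube.
RAW_STORE    = {}
PARSED_STORE = {}
READY_QUEUE  = Queue.new

# Step 1: store the raw feed body, then notify the parser stage
# that data is ready for this feed.
def fetch_stage(url, body)
  RAW_STORE[url] = body
  READY_QUEUE << url
end

# Step 2: take one notification, load the raw data, parse it,
# and store the parsed record.
def parse_stage
  url = READY_QUEUE.pop
  raw = RAW_STORE.fetch(url)
  # Crude title extraction; a real parser library would go here.
  titles = raw.scan(%r{<title>(.*?)</title>}m).flatten
  PARSED_STORE[url] = { "titles" => titles }
end

feed_url = "http://example.com/feed.xml"
feed_xml = "<rss><channel><title>Example</title>" \
           "<item><title>First post</title></item></channel></rss>"

fetch_stage(feed_url, feed_xml)
parse_stage
p PARSED_STORE[feed_url]
```

Because the stages only share the store and the queue, the fetcher and the parser can run in different Ruby runtimes, which is presumably why the fetch stage can use JRuby while the parse stage uses MRI.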

Dependencies

MongoDB
beanstalkd
JRuby
MRI

Gems (for JRuby):

jruby-http-reactor
threadify
beanstalk-client
mongo_mapper

Gems (for MRI):

beanstalk-client
mongo_mapper
feedzirra

Executing

Run each of the following commands in a separate console, or run it as a background process.

Start MongoDB and Beanstalk:

mongod
beanstalkd

Run the fetch processor:

jruby -rubygems -Ilib bin/fetch urls.txt

Run the parse processor:

ruby -rubygems -Ilib bin/parse
