Description: Simple background loops framework for ruby
Homepage: http://www.scribd.com
Clone URL: git://github.com/kovyrin/loops.git

Simple background loops framework for Rails and Merb

loops is a small, lightweight framework for Ruby on Rails, Merb and other Ruby frameworks. It supports simple background loops in your application, which are typically used for background data processing on your servers (queue workers, batch task processors, etc).

What tasks could you use it for?

Originally, the loops plugin was created to make our own loops code more organized. We used to have dozens of different modules with methods that were called with script/runner and then kept running with nohup and other not-so-convenient backgrounding techniques. When you have that many loops/workers running in the background, managing them on a regular basis (restarts, code upgrades, status/health checking, etc) becomes a nightmare.

After a short time of writing our loops in a more organized way, we were able to generalize most of the loops code, so now our loops look like classes with a single mandatory public method called run. Everything else (spawning many workers, managing them, logging, backgrounding, pid-file management, etc) is handled by the plugin itself.

But there are dozens of libraries like this! Why do we need one more?

The main idea behind this small project was to create a dead simple yet robust framework for running tasks in the background without having to think about spawning many workers, restarting them when they die, etc. So, if you need to run one or many copies of your worker, do not want to think about re-spawning dead workers, and do not want to spend megabytes of RAM on separate copies of the Ruby interpreter (as happens when you run each copy of your loop as a separate process controlled by monit/god/etc), then I’d recommend you try this framework. You’ll like it.

How to install?

To install this plugin, follow these steps:

  • Install the plugin with the following command:
      ./script/plugin install git://github.com/kovyrin/loops.git
    

This will install the whole package in your vendor/plugins directory. For Merb applications, just check out the code and place it in the vendor/plugins directory.

  • Generate the executable script and configuration files by running:
      ./script/generate loops
    

This will create the following files:

  • ./script/loops - executable script used to manage your loops
  • ./config/loops.yml - example configuration file
  • ./app/loops/simple.rb - REALLY simple loop example
  • ./app/loops/queue_loop.rb - simple ActiveMQ queue worker

How to use?

Here is a simple loop scaffold for you to start from (put this file in app/loops/hello_world_loop.rb):

  class HelloWorldLoop < Loops::Base
    def run
      with_period_of(1) do # period is in seconds
        debug("Hello, debug log!")
        sleep(config['sleep_period']) # Do something "useful" and make it configurable
        debug("Hello, debug log (yes, once again)!")
      end
    end
  end

When your loop is ready to use, add the following lines to your (possibly still empty) config/loops.yml file:

  hello_world:
    type: simple
    sleep_period: 10
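
To make the link between the YAML file and the config['sleep_period'] call in the loop explicit, here is a minimal sketch that parses the same fragment with Ruby's standard YAML library. It only illustrates how the keys map to a Hash; how the plugin itself loads the file and hands the settings to each loop is not shown here, and the variable names are our own.

```ruby
require 'yaml'

# The same YAML fragment as in config/loops.yml above.
LOOPS_YAML = <<~YAML
  hello_world:
    type: simple
    sleep_period: 10
YAML

# Parse the file contents into a Hash keyed by loop name.
loops_config = YAML.load(LOOPS_YAML)

# Settings for a single loop. In the loop class above, values like these
# are read through `config`, e.g. config['sleep_period'].
hello_config = loops_config['hello_world']

puts hello_config['type']          # simple
puts hello_config['sleep_period']  # 10
```

Note that sleep_period comes back as an Integer, so it can be passed to sleep directly.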

That’s it! Use the following commands to list and run your loops:

  • To list all configured loops:
      $ ./script/loops -L
    
  • To run all enabled (actually non-disabled) loops in the foreground:
      $ ./script/loops -a
    
  • To run all enabled loops in the background:
      $ ./script/loops -d -a
    
  • To run a specific loop in the background:
      $ ./script/loops -d -l hello_world
    
  • To see all possible options:
      $ ./script/loops -h
    

How to run more than one worker?

If you want to have more than one copy of your worker running, that is as simple as adding one option to your loop configuration:

  hello_world:
    type: simple
    sleep_period: 10
    workers_number: 1

The workers_number option tells the loops manager how many copies of your loop to spawn and run in parallel; set it to a value greater than 1 to run several copies. The only thing you need to take care of is how your loops behave when they work concurrently. For example, if you have a database table with elements you need to process, you can create a simple database-based locking system or use memcache-based locks.
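
The locking concern can be illustrated with a small sketch. It uses threads and an in-memory Mutex purely as a stand-in for a database- or memcache-based lock, so that it runs on its own. ItemClaimer and its claim method are hypothetical names, not part of the plugin.

```ruby
require 'set'

# Hypothetical stand-in for a shared lock store (a database table or memcache
# in a real deployment). The Mutex makes the check-and-claim step atomic.
class ItemClaimer
  def initialize
    @mutex = Mutex.new
    @claimed = Set.new
  end

  # Returns true only for the first worker that claims the given item id.
  def claim(item_id)
    @mutex.synchronize do
      next false if @claimed.include?(item_id)
      @claimed << item_id
      true
    end
  end
end

claimer = ItemClaimer.new
items = (1..20).to_a
processed = Queue.new

# Three "workers" walk the same item list; a worker processes an item
# only if it managed to claim it first.
workers = 3.times.map do |worker_id|
  Thread.new do
    items.each do |item_id|
      processed << [worker_id, item_id] if claimer.claim(item_id)
    end
  end
end
workers.each(&:join)

puts processed.size # number of items processed across all workers
```

With real worker processes the same check-and-claim step would have to be atomic in the shared store, since a Mutex only protects threads inside one process.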

How to run more than one loop using the same class?

You can run the same loop class with different configuration parameters by explicitly identifying the loop class to execute:

  hello:
    type: simple
    loop_name: some_module/my_worker
    language: English

  salut:
    type: simple
    loop_name: some_module/my_worker
    language: French

Now two independent sets of loops use the same class, SomeModule::MyWorkerLoop, each customized by its own language parameter.
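
The naming convention implied here (some_module/my_worker becomes SomeModule::MyWorkerLoop) can be sketched as follows. loop_class_name is a hypothetical helper written for illustration; the plugin's own lookup code may differ.

```ruby
# Hypothetical helper: turns a loop_name such as "some_module/my_worker"
# into a class name such as "SomeModule::MyWorkerLoop".
def loop_class_name(loop_name)
  # Each path segment becomes one CamelCase namespace component.
  camelized = loop_name.split('/').map { |part|
    part.split('_').map(&:capitalize).join
  }.join('::')
  "#{camelized}Loop"
end

puts loop_class_name('some_module/my_worker') # SomeModule::MyWorkerLoop
puts loop_class_name('hello_world')           # HelloWorldLoop
```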

ActiveMQ-based workers? What’s that?

In some of our worker loops we poll an ActiveMQ queue and process its items to perform asynchronous operations. To make it simpler for us to create such workers, we’ve created a really simple loops class extension that wraps your code with basic queue polling/acknowledging code. As a result, you can create loops like this:

    class MyQueueLoop < Loops::Queue
      def process_message(message)
        debug "Received a message: #{message.body}"
        debug "sleeping..."
        sleep(0.5 + rand(10) / 10.0) # do something "useful" with the message :-)
        debug "done..."
      end
    end

With configs like this:

    # An example of a STOMP queue-based loop
    my_queue:
      type: queue
      host: 127.0.0.1
      port: 61613
      queue_name: blah

This solution scales by adding workers: to make your queue processing faster, you just need to run more of them (by adding the workers_number: N option).

Warning: This type of loop requires the stomp library to be installed on your system.
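
To show what "wrapping your code with polling/acknowledging" means, here is a minimal sketch that runs without a broker. FakeBroker, SketchQueueLoop and PrintingLoop are hypothetical names; the fake broker only mimics a receive/acknowledge cycle and is neither the stomp library's API nor the plugin's actual Loops::Queue implementation.

```ruby
# Hypothetical in-memory stand-in for a queue connection.
class FakeBroker
  Message = Struct.new(:id, :body)

  attr_reader :acknowledged

  def initialize(bodies)
    @pending = bodies.each_with_index.map { |body, i| Message.new(i, body) }
    @acknowledged = []
  end

  # Returns the next message, or nil when the queue is drained.
  def receive
    @pending.shift
  end

  # Records that the message was handled.
  def ack(message)
    @acknowledged << message.id
  end
end

# Hypothetical base class: owns the polling loop and acknowledges each
# message once process_message has returned without raising.
class SketchQueueLoop
  def initialize(broker)
    @broker = broker
  end

  def run
    while (message = @broker.receive)
      process_message(message)
      @broker.ack(message)
    end
  end
end

# The subclass only supplies the per-message work.
class PrintingLoop < SketchQueueLoop
  def process_message(message)
    puts "Received a message: #{message.body}"
  end
end

broker = FakeBroker.new(%w[first second])
PrintingLoop.new(broker).run
puts broker.acknowledged.inspect # [0, 1]
```

The point of the design is that the subclass never touches the connection: it only implements process_message, just like MyQueueLoop above.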

There is a workers_engine option in the config file. What is it used for?

There are two so-called "workers engines" in this plugin: fork and thread. They control how the process manager spawns new loop workers. With the fork engine, we load all loop classes and then fork the Ruby interpreter once for each worker we need. With the thread engine, we call Thread.new instead of forking. The thread engine can be useful if you are sure your loop won’t lock the Ruby interpreter (it does not make native calls, etc) or if you use an interpreter that does not support forks (like JRuby).

The default engine is fork.
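
The difference between the two engines can be sketched with plain Ruby. spawn_workers is a hypothetical helper written for this illustration, not the plugin's process manager; it only contrasts Process.fork with Thread.new.

```ruby
# Hypothetical helper contrasting the two spawning strategies.
# Returns an array of handles: PIDs for :fork, Thread objects for :thread.
def spawn_workers(engine, count, &work)
  case engine
  when :fork
    # Each worker is a child process with its own copy of the interpreter.
    # exit! leaves the child right after the work, skipping at_exit handlers.
    count.times.map { |i| Process.fork { work.call(i); exit!(0) } }
  when :thread
    # Each worker is a thread inside the current process.
    count.times.map { |i| Thread.new { work.call(i) } }
  else
    raise ArgumentError, "unknown engine: #{engine}"
  end
end

# Thread engine: workers share memory with the parent, so they can
# push results into the same in-process queue.
results = Queue.new
threads = spawn_workers(:thread, 3) { |i| results << i }
threads.each(&:join)
puts results.size # 3

# Fork engine: only available where the platform supports fork
# (not on JRuby, for example), so check before using it.
if Process.respond_to?(:fork)
  pids = spawn_workers(:fork, 2) { |i| i * 2 } # work happens in the child
  statuses = pids.map { |pid| Process.wait2(pid).last }
  puts statuses.all?(&:success?) # true when every child exited cleanly
end
```

Forked workers do not share memory with the parent, which is why the fork branch reports only exit statuses rather than collecting results in a queue.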

Which Ruby implementations does it work with?

We’ve tested and used the plugin on MRI 1.8.6 and on JRuby 1.1.5. At this point we do not support daemonization in JRuby and have never tested the code on Ruby 1.9. Because of JVM limitations you won’t be able to use the fork workers engine in JRuby, but threaded workers do pretty well.

Who are the authors?

This plugin was created at Scribd.com for our internal use, and the sources were then opened for other people to use. All the code in this package was developed by Alexey Kovyrin, Dmytro Shteflyuk and Alexey Verkhovsky for Scribd.com and is released under the MIT license. For more details, see the LICENSE file.