Skip to content

criteo/lobster

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

39 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Lobster

Simple job service that was originally written to run hadoop jobs, but can be used for any batch-processing scripts that you want to run.

Lobster runs as a service (daemon) and regularly checks a schedule in $LOBSTER_DIR/config/schedule.rb. The goal is to run the scheduled jobs in parallel but a job can't run twice at the same time (no overlap). The only configuration needed is the delay between 2 job runs.

Get Started

Install

gem install lobster

Create a config/schedule.rb file

Run lobsterize in the directory where you need to setup and run your jobs (defined by the LOBSTER_DIR variable)

lobsterize

Then add a job in config/schedule.rb

# config/schedule.rb
job "my-job" do
  cmd "runthis && runthat >> log/here.log"
  delay 5 # minutes
end

Additional settings:

  • directory this is set to LOBSTER_DIR by default
  • user the Unix user that will run the job, lobster must have full passwordless sudo access to this user
  • max_duration for monitoring, will log an error if the job has not been successful for max_duration + delay minutes

Run Lobster

Two environment variables are used (with defaults):

  • LOBSTER_DIR is the directory used for all job commands, and where the configuration/schedule files are (default: current directory)

  • LOBSTER_ENV is a variable used to handle different environments (default: development)

  • lobster run will run in the console

  • or lobster start as a deamon

Configuration

The lobster configuration file should be located in $LOBSTER_DIR/config/lobster.yml and has the following layout

# default values

# monitor the daemon and restart it if it stops
monitor: true
# log directory, absolute or relative to the LOBSTER_DIR
log_dir: log
# pids directory, absolute or relative to the LOBSTER_DIR
pid_dir: /var/run/lobster/
# schedule file, absolute or relative to the LOBSTER_DIR
schedule_file: config/schedule.rb

# environment overrides, the env variable LOBSTER_ENV has 
# to be set (default: development)
my_test_env:
  monitor: false
  log_dir: test_los
my_prod_env:
  pid_dir: /var/run/

Any log in stderr/stdout will be written in the log directory as lobster.output, the actual lobster log is in lobster.log

Capistrano Deployment

It is very easy to have a lobster project with a proper config/lobster.yml to handle different environments such as development, testing and production. You can simply have a schedule.rbfile per environment.

Then you can use capistrano to deploy the new jobs and schedule to your servers (the schedule is automatically reloaded).

Releases

No releases published

Packages

 
 
 

Languages