Robots for ingesting objects into SDR Preservation Core -- replaced by preservation_robots

SDR Ingest Workflow Robots

Note: As of Q1 2018, this repository will soon be replaced by https://github.com/sul-dlss/preservation_robots

Authors

  • Alpana Pande
  • Bess Sadler
  • Richard Anderson
  • Darren Weber

See the Wiki for general documentation on the robots infrastructure.

Project Structure

  • config
    • certs : authentication certificates for workflow service
    • deploy : host-level Capistrano configuration
    • environments : configuration for the dev, stage, and prod environments
    • workflows : workflow-specific configuration (steps, dependencies); one directory per workflow
  • lib : Ruby classes needed for your local robots
    • sdrIngest : all of the robots for a particular workflow; one directory per workflow
  • spec
    • lib : specs for library classes
    • sdrIngest : specs for the workflow

An overview of the workflow

Admin Menu

alias sdr2='cd ~/sdr-preservation-core/current/bin'
alias ingest='sdr2 ; bundle exec menu.rb sdrIngestWF; cd $OLDPWD'
alias ingest-log='log sdrIngestWF'
function log() { cd ~/sdr-preservation-core/current/log/$1/current; }

Crontab

See bin/cron_jobs.txt, or ssh into a deployed server and run crontab -e.

Checklist for Deploying Robots to a New Machine

  • clone the code repository to your laptop, using the master branch, and install dependencies:
    git clone git@github.com:sul-dlss/sdr-preservation-core.git
    cd sdr-preservation-core
    bundle install
  • create or update config/deploy/{stage}.rb to specify the server parameters, e.g.
    cp config/deploy/dev.rb config/deploy/{stage}.rb
    # modify the defaults, e.g.
    #ENV['SDR_HOST'] ||= 'sdr-stage1'
    #ENV['SDR_USER'] ||= 'sdr_user'
    #ENV['ROBOT_ENVIRONMENT'] = 'stage'
    # Note that the value of ROBOT_ENVIRONMENT entails the existence of two config files:
    # config/environments/${ROBOT_ENVIRONMENT}.rb
    # config/environments/robots_${ROBOT_ENVIRONMENT}.yml
    • Capistrano can deploy to multiple servers simultaneously
    • the {stage} file name can be any name; it doesn't have to match a ROBOT_ENVIRONMENT
  • create or update config/environments/<ROBOT_ENVIRONMENT>.rb
    • see config/environments/development.rb
  • create or update config/environments/robots_<ROBOT_ENVIRONMENT>.yml
    • This defines robot names, queue lanes they are associated with, and the number of instances of the robot
    • See the extensive comments in the example file at config/environments/robots_development.yml
  • check and initialize the deployment directory structure on each <deploy_server>, e.g.
    cap {stage} deploy:check
    #cap -T # this should display all the available capistrano tasks (and subtasks)
  • configure Puppet to manage the shared_configs for <deploy_server>
  • deploy and restart the robots, e.g.
    cap {stage} deploy
    # to undo a deploy, use:
    #cap {stage} deploy:rollback
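
Because a ROBOT_ENVIRONMENT value implies two config files, it can help to confirm that both exist before deploying. The following is a minimal sketch, not part of this repository: the helper names `expected_config_files` and `missing_config_files` are hypothetical, and the only assumption taken from the checklist is the pair of file names under config/environments.

```ruby
# Sketch: list the config files that a ROBOT_ENVIRONMENT value implies and
# report which of them are missing. Helper names are hypothetical.

# Returns the two config paths implied by a ROBOT_ENVIRONMENT value.
# `root` is the top of the checked-out repository.
def expected_config_files(robot_environment, root = Dir.pwd)
  env_dir = File.join(root, 'config', 'environments')
  [
    File.join(env_dir, "#{robot_environment}.rb"),
    File.join(env_dir, "robots_#{robot_environment}.yml")
  ]
end

# Returns the subset of the expected config files that do not exist yet.
def missing_config_files(robot_environment, root = Dir.pwd)
  expected_config_files(robot_environment, root).reject { |path| File.file?(path) }
end
```

For example, running `missing_config_files('stage')` from the repository root returns an empty array once both config/environments/stage.rb and config/environments/robots_stage.yml are in place.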

Restarting Robots

All of the robots on a server

cap {stage} deploy:restart # restarts all the robots

Individual robots on individual servers

ssh <deploy_server>
cd ~/sdr-preservation-core/current
export ROBOT_ENVIRONMENT={environment}
bundle exec controller status  # shows the status of the robots
bundle exec controller restart # to restart all of them
bundle exec controller restart sdr_sdrIngestWF_register-sdr # to restart just this robot

Safest way to restart

ssh <deploy_server>
cd ~/sdr-preservation-core/current
export ROBOT_ENVIRONMENT={environment}
bundle exec controller stop
bundle exec controller quit
bundle exec controller boot
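
The stop, quit, boot sequence above could be wrapped so that a failed step halts the sequence instead of booting on top of a half-stopped controller. This is a sketch under assumptions: `safe_restart` and its `runner` parameter are hypothetical names, and it assumes each `controller` subcommand exits non-zero on failure, which this README does not confirm.

```ruby
# Sketch: run the stop / quit / boot sequence in order, halting on the first
# failure. `safe_restart` and `runner` are hypothetical names.

SAFE_RESTART_STEPS = %w[stop quit boot].freeze

# Runs each controller subcommand in turn with ROBOT_ENVIRONMENT set.
# `runner` receives the environment hash and the command arguments, and must
# return true on success (the same contract as Kernel#system).
# Returns the list of steps that completed successfully.
def safe_restart(robot_environment, runner: ->(env, *cmd) { system(env, *cmd) })
  env = { 'ROBOT_ENVIRONMENT' => robot_environment }
  completed = []
  SAFE_RESTART_STEPS.each do |step|
    ok = runner.call(env, 'bundle', 'exec', 'controller', step)
    break unless ok

    completed << step
  end
  completed
end
```

Run from ~/sdr-preservation-core/current, `safe_restart('stage')` returns `["stop", "quit", "boot"]` when every step succeeds. The injectable `runner` exists only so the sequence can be exercised without a real controller.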

Kill a specific robot

kill -QUIT $PID # graceful shutdown
kill -9 $PID # kill it now!
bundle exec controller status # the robot should be restarted
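
The two kill commands above are usually combined as "ask politely, then force". The sketch below shows that pattern with Ruby's standard `Process.kill`; the names `process_alive?` and `stop_process` and the 30-second default timeout are illustrative assumptions, not part of the robots code.

```ruby
# Sketch: send QUIT first, wait for the process to exit, then fall back to KILL.
# Method names and the default timeout are illustrative assumptions.

# Returns true if a process with this PID currently exists.
def process_alive?(pid)
  Process.kill(0, pid) # signal 0 checks for existence without sending a signal
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  true # the process exists but belongs to another user
end

def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Returns :quit if the process exited after QUIT, :kill if KILL had to be
# sent, or :not_running if the process was already gone when signalled.
def stop_process(pid, timeout: 30, poll_interval: 0.5)
  Process.kill('QUIT', pid) # graceful shutdown request
  deadline = monotonic_now + timeout
  while monotonic_now < deadline
    return :quit unless process_alive?(pid)

    sleep poll_interval
  end
  Process.kill('KILL', pid) # the process ignored QUIT; kill it now
  :kill
rescue Errno::ESRCH
  :not_running
end
```

If the target is a child of the calling script, it must also be reaped (for example with `Process.detach`), otherwise it lingers as a zombie and still looks alive to `process_alive?`.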

Some things to note

  • the robot machines need to be in the same zone as DOR services for firewall reasons
  • the robot machines need to access the /dor/export filesystem on DOR services
  • the robot machines need to access mount points configured for Moab::Config.storage_roots
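
The filesystem requirements above can be checked on a new robot machine before starting any robots. This is a minimal sketch: `inaccessible_paths` is a hypothetical helper, and the caller supplies the list of paths to check.

```ruby
# Sketch: report which required directories are not accessible on this machine.
# `inaccessible_paths` is a hypothetical helper, not part of the robots code.

# Returns the paths that are not readable directories.
def inaccessible_paths(paths)
  paths.reject { |path| File.directory?(path) && File.readable?(path) }
end
```

A call might look like `inaccessible_paths(['/dor/export'] + Moab::Config.storage_roots)`, assuming the Moab configuration has already been loaded for the target environment. The sketch checks read access only; ingest into the storage roots presumably also needs write access, which `File.writable?` could cover.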

Development

Clone the repository and install dependencies

git clone git@github.com:sul-dlss/sdr-preservation-core.git
cd sdr-preservation-core
bundle install

Run tests

cd sdr-preservation-core
bundle install
bundle exec rspec