File Lock

Thibaut Barrère edited this page May 13, 2020 · 4 revisions

FileLock provides an easy way to ensure that a given block of code runs at most once at any given time on a given server (no overlapping runs), which is often needed for ETL jobs. For instance:

  • You may have long-running incremental data synchronisation jobs that store a last_updated_at timestamp on disk. You would not want two concurrent jobs to modify this timestamp (or the data synchronisation would not work as expected).
  • You may query a remote resource (a server, etc.) for which you need to avoid more than one connection at a time.

Kiba Pro FileLock solves this problem when you run your tasks on a single server.

FileLock will protect you against:

  • Cron jobs which may overlap (e.g. the first run takes longer than usual for some reason, and the second run starts while it is still in progress).
  • Background jobs that may end up running concurrently in a way you didn't expect.

FileLock is not a distributed lock. A Postgres-based lock will be shipped in a future version of Kiba Pro for that purpose.

Requirements and use

Currently tested against: MRI Ruby 2.4-2.7, Linux/Ubuntu, Mac OS X. Other platforms may work (please get in touch to verify).

To use FileLock, wrap your ETL job run with this code:

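require 'kiba-pro/middlewares/file_lock'

# The block only runs if the lock on tmp/process.lock can be acquired immediately.
Kiba::Pro::Middlewares::FileLock.with_lock(lock_file: 'tmp/process.lock') do
  job = prepare_job # your own method, assumed to build and return the Kiba job
  Kiba.run(job)
end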

When this code runs, the middleware immediately attempts to acquire the lock on the file (creating it if it does not exist). If the lock is acquired, the code in the block runs. Otherwise, LockFailedException is raised and the block does not run.

This protects against other processes locking the same file, and also against threads in the same process and forks of the same process.
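
To make the behaviour described above more concrete (immediate attempt, file created if missing, an error raised when the lock is already taken), here is a minimal sketch built on Ruby's standard File#flock. It is not Kiba Pro's implementation, and it assumes an flock-style advisory lock, which this page does not confirm. The names with_file_lock_sketch and SketchLockFailed are hypothetical and exist only for this illustration.

```ruby
require 'tmpdir'

# Hypothetical error class for this sketch (Kiba Pro raises its own LockFailedException).
class SketchLockFailed < StandardError; end

# Sketch only: NOT Kiba Pro's implementation. Assumes an flock-style advisory lock.
def with_file_lock_sketch(lock_file:)
  # Open the lock file, creating it if it does not exist; never truncate or delete it.
  File.open(lock_file, File::RDWR | File::CREAT, 0o644) do |file|
    # LOCK_NB makes flock return false immediately instead of waiting for the lock.
    unless file.flock(File::LOCK_EX | File::LOCK_NB)
      raise SketchLockFailed, "#{lock_file} is already locked"
    end
    yield # the file is closed when File.open's block ends, which releases the lock
  end
end

# Usage: a second attempt made while the lock is held is refused, even in the same process.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'process.lock')
  with_file_lock_sketch(lock_file: path) do
    begin
      with_file_lock_sketch(lock_file: path) { puts 'not reached while the lock is held' }
    rescue SketchLockFailed => e
      puts "skipped: #{e.message}"
    end
  end
end
```

For a cron job, rescuing the failure and skipping the run, as the usage part does, is one reasonable choice; letting the error propagate so that your monitoring reports the overlap is another.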

Important notes

A few important points must be noted:

  • Warning: never delete/unlink the lock file. Doing so prevents FileLock from working properly, because it opens the door to a race condition. If you really need to delete lock files, stop all your processes and crons (e.g. maintenance mode), make sure nothing is running and nothing can be started, then proceed. See the "DON'T unlink" section on this page for more background.
  • If you use Capistrano-style deployment, make sure to store your lock files in a folder which is shared between deployments (typically under shared). The lock works on a given file, so if two releases use different files, the lock will not work!
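
Both notes above come down to the same point: the lock is tied to one specific file, not to a path string. The sketch below illustrates the unlink race with Ruby's standard File#flock, again assuming an flock-style advisory lock rather than showing FileLock's actual internals. The helpers try_lock_sketch and unlink_race_demo are hypothetical names used only here.

```ruby
require 'tmpdir'

# Sketch only: assumes an flock-style advisory lock (not confirmed as FileLock's mechanism).
# Tries to take a non-blocking exclusive lock; returns the open File on success, nil otherwise.
def try_lock_sketch(path)
  file = File.open(path, File::RDWR | File::CREAT, 0o644)
  return file if file.flock(File::LOCK_EX | File::LOCK_NB)

  file.close
  nil
end

# Returns [locked_before_unlink, locked_after_unlink] for a second locker,
# while a first locker keeps holding its lock the whole time.
def unlink_race_demo
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'process.lock')
    holder = try_lock_sketch(path) # first job acquires the lock
    before = try_lock_sketch(path) # second job: same file, so it is refused
    File.unlink(path)              # someone "cleans up" the lock file
    after = try_lock_sketch(path)  # second job: the path now names a brand-new file
    result = [!before.nil?, !after.nil?]
    [holder, before, after].compact.each(&:close)
    result
  end
end

p unlink_race_demo
```

In this sketch the second attempt is refused before the unlink and succeeds after it, even though the first holder never released anything: two jobs now believe they hold the lock. Under the same assumption, lock files stored in a per-release folder fail in the same way, since each release ends up locking a different file.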