Ed Riccardi edited this page Jul 9, 2017 · 7 revisions

JBOV: Just a Bunch of Volumes

(Pronounced Jey-Bob)

If you are feeling lazy, jump two sections down to Solution.

The Problem

Over the years I have built up a collection of hard disks. Unfortunately, neither ZFS nor X-Raid is useful for building redundant storage out of them, because most of them are of different sizes. Also, I don't want to, or can't, plug in and mount all of them at the same time. I would like to have some sort of redundant storage for my data-hoarding hobby. I'm not interested in high specs, but rather in a simple long-term solution using already available disks.

Alternatives

You can skip to the next section; the following considers alternative solutions.

  • Buy more disks of the same size. I can't justify it, given all the relatively useful disks I have lying around.
  • Write some manual rsync scripts. Too manual.
  • Try to use ZFS or RAID by creating smaller but equally sized volumes, e.g. 512GB, so a 2TB disk would have 4 volumes. This can get very complicated very quickly (you have to be careful to maintain redundancy).
  • Use Btrfs... Well, yes... when it becomes stable (status page).
  • Use Ceph, GFS, Lustre... None of them is trivial to maintain, e.g. adding or removing storage requires planning, most of the time you need a few servers, all the disks have to be online, etc.

Solution

Join a bunch of standard volumes (e.g. ext4, exFAT, etc.) together and present them as one. Redundancy is achieved by maintaining copies of the files on multiple volumes based on user-defined rules. Disks don't need to be mounted all the time or at the same time. It is frictionless: a new volume does not need to be empty to join an existing JBOV; just add it and all its files become part of the JBOV.

A visual example:

{vol1} +-movie-1.avi
       |-movie-2.avi
       \-openbsd.iso

{vol2} +-movie-1.avi
       |-movie-2.avi
       \-debian.iso

{vol3} +-openbsd.iso
       \-debian.iso

The unified filesystem will show:

{jbov} +-movie-1.avi
       |-movie-2.avi
       |-openbsd.iso
       \-debian.iso

So, to put it simply, the files are spread across different volumes. Redundancy is achieved by duplicating files (like RAID1), and this simplicity brings its share of convenience: the underlying volumes continue to be normal volumes, so you can grab files from them, or copy files into them, without having to use JBOV at all. In the previous example, {vol3} is the ISO images disk. You can set a rule specifying that every ISO file must have one copy on that particular removable disk {vol3}. That makes installing a new server easier: you can grab the removable disk, knowing that JBOV keeps a copy of all the ISOs there. This also works the other way around: you can copy new ISOs onto the disk, and JBOV will make redundant copies on other disks when the opportunity arises.
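The unified view in the example can be reproduced with a few lines of Python. This is only an illustrative sketch, not JBOV's code: the function name unified_view and the dictionary layout are made up for the example, and the listings are typed in by hand instead of being read from real mount points.

```python
from collections import defaultdict


def unified_view(volumes):
    """Map each file name to the sorted list of volumes holding a copy."""
    view = defaultdict(list)
    for vol_name in sorted(volumes):
        for file_name in volumes[vol_name]:
            view[file_name].append(vol_name)
    return dict(view)


# Listings taken from the visual example (hand-typed, hypothetical layout).
volumes = {
    "vol1": {"movie-1.avi", "movie-2.avi", "openbsd.iso"},
    "vol2": {"movie-1.avi", "movie-2.avi", "debian.iso"},
    "vol3": {"openbsd.iso", "debian.iso"},
}

view = unified_view(volumes)
for file_name in sorted(view):
    print(file_name, "->", ", ".join(view[file_name]))
```

Each of the four files shows up once in the merged view, and the list next to it tells you which volumes you could pull it from.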

Pros

  • You are using trusted filesystems, i.e. fsck works. You don't need to mount every disk; you can use any single disk on its own. JBOV works 'on top' of existing file systems.
  • Partially mounted is fine, and encouraged: adding or removing volumes, or even mounting the JBOV without all the volumes, is possible and will not trigger inconsistency or recovery scenarios (these can be painful, especially with unreliable or old disks).
  • Rule-based system for redundancy control, e.g. the folder /new_movies/ should be copied in full to the {removable} volume-disk... the /family/ folder should keep at least three copies.
  • Because files are small relative to the disks, it is possible to maximise the disk space utilisation.
  • Easy to replace disks: just copy the contents to a new disk, or remove the disk from the JBOV with $ jbov del /volume and then run $ jbov sync /mount.
  • It is absolutely fine to use the underlying file-system while JBOV is mounted.
  • Can be easily used for off-premises backup.
  • Remote storage is a real possibility soon-ish.
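To make the rule idea from the list above more concrete, here is a minimal sketch of how such rules could be represented and checked. Everything in it is an assumption made for illustration: the Rule fields, the longest-prefix matching, and the default of two copies are not taken from JBOV's actual metadata format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    prefix: str                    # folder the rule applies to
    min_copies: int = 2            # assumed default: plain duplication
    pin_to: Optional[str] = None   # volume that must hold one copy


def rule_for(path, rules):
    """Return the rule with the longest matching prefix, or None."""
    matches = [r for r in rules if path.startswith(r.prefix)]
    return max(matches, key=lambda r: len(r.prefix), default=None)


def violations(path, holders, rules):
    """List problems for one file, given the set of volumes holding it."""
    rule = rule_for(path, rules)
    if rule is None:
        return []
    problems = []
    if len(holders) < rule.min_copies:
        problems.append(f"{path}: {len(holders)} of {rule.min_copies} copies")
    if rule.pin_to is not None and rule.pin_to not in holders:
        problems.append(f"{path}: no copy on {rule.pin_to}")
    return problems


# The two rules mentioned in the list, in this hypothetical format.
rules = [
    Rule(prefix="/new_movies/", pin_to="removable"),
    Rule(prefix="/family/", min_copies=3),
]

print(violations("/family/2016/beach.jpg", {"vol1", "vol2"}, rules))
print(violations("/new_movies/a.avi", {"vol1", "removable"}, rules))
```

The first call reports a missing third copy for the /family/ file; the second returns an empty list because the movie already has a copy on the pinned volume.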

Cons

  • It is not meant to be IO intensive, but if you need IO you are free to use the underlying volumes.
  • It does simple data duplication, which means you get half the storage you put in.
  • There is no database tracking of files in the JBOV; only the volumes are tracked. You can tell whether all the volumes are online at the moment, but JBOV will not track their contents or whether something is missing.
  • Synchronisation is not transparent (actually a feature!).
  • No snapshots, no compression, no encryption. All these features are left to the underlying filesystem.
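The "half the storage" point is exact only when the disks can pair up evenly. With mixed sizes and two copies of every file on different volumes, a rough upper bound is the smaller of half the total capacity and the total minus the largest disk, since a very large disk cannot mirror more than the others can hold. The sketch below is back-of-the-envelope arithmetic, not something JBOV computes; the function name is made up, and it ignores filesystem overhead and assumes files are small relative to the disks.

```python
def usable_capacity_two_copies(sizes_gb):
    """Upper bound on data storable with two copies on distinct volumes."""
    total = sum(sizes_gb)
    largest = max(sizes_gb, default=0)
    # Either every byte is stored twice (total / 2), or the largest disk
    # is limited by what all the other disks together can mirror.
    return min(total / 2, total - largest)


balanced = usable_capacity_two_copies([500, 1000, 1500])
lopsided = usable_capacity_two_copies([500, 500, 4000])
print(balanced, lopsided)
```

For 500GB + 1000GB + 1500GB the bound is half of the 3000GB total. For 500GB + 500GB + 4000GB it drops to 1000GB, because the 4000GB disk has only 1000GB of other disks to mirror against.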

Only Two Moving Parts

FUSE Module

It is an optional module that can be used to mount all the volumes together and see them as one. When creating new files, it will follow the redundancy rules. But, given that the JBOV can be partially mounted, it does not force consistency. If not all the volumes are online, you have to trigger it manually later.
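One way to picture what happens on a partially mounted JBOV is the sketch below: a new file gets as many copies as the online volumes allow, and the shortfall is left for a later manual sync. This is a guess at the behaviour described above, not the module's real logic. The function name is hypothetical, and the "most free space first" policy in particular is an assumption.

```python
def place_new_file(required_copies, online_free_bytes, file_size):
    """Pick target volumes for a new file among the volumes currently online.

    Returns (targets, missing), where missing is the number of copies
    that could not be placed and would have to be created by a later sync.
    """
    candidates = [
        name for name, free in online_free_bytes.items() if free >= file_size
    ]
    # Assumed policy: prefer the volumes with the most free space,
    # breaking ties by name so the result is deterministic.
    candidates.sort(key=lambda name: (-online_free_bytes[name], name))
    targets = candidates[:required_copies]
    return targets, required_copies - len(targets)


# Only one volume is plugged in, but the rule asks for two copies.
online = {"vol1": 40_000_000_000}
targets, missing = place_new_file(2, online, 700_000_000)
print(targets, missing)
```

Here the write succeeds on vol1 and one copy is still owed, which matches the idea that consistency is not forced at write time.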

The $ jbov command line tool

It can be used to create a JBOV, to set rules, and to verify and enforce redundancy rules. Because a JBOV can be partially mounted, this is the way to enforce consistency manually, e.g. after using the underlying volumes directly, or after adding or removing volumes. See the command line examples here.
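As a rough idea of what enforcing redundancy could involve, the sketch below scans the listings of the online volumes and produces a list of copy steps for files that have too few copies. It is not how the jbov tool is implemented; the function name, the fixed min_copies value and the tuple format are all assumptions. Because it only sees online volumes, it would schedule an extra copy for a file whose other replica sits on an unplugged disk, and this page does not say how JBOV deals with that case.

```python
def sync_plan(online_listings, min_copies=2):
    """Return (file, source, target) copy steps for under-replicated files.

    online_listings maps each online volume name to the set of files on it.
    """
    holders = {}
    for vol in sorted(online_listings):
        for file_name in online_listings[vol]:
            holders.setdefault(file_name, []).append(vol)

    plan = []
    for file_name in sorted(holders):
        have = holders[file_name]
        spare = [v for v in sorted(online_listings) if v not in have]
        needed = max(0, min_copies - len(have))
        for target in spare[:needed]:
            plan.append((file_name, have[0], target))
    return plan


online_listings = {
    "vol1": {"movie-1.avi"},
    "vol2": {"movie-1.avi"},
    "vol3": {"netbsd.iso"},  # hypothetical file copied straight onto vol3
}
plan = sync_plan(online_listings)
print(plan)
```

The movie already has two copies and is left alone, while the ISO that was dropped directly onto vol3 gets one copy step towards another online volume.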

Internals, metadata

So far the data structure for the metadata files can be seen here.

Feedback welcome

email, chat, me. Thanks for reading!