Structure change supporting faster Backup / Repair #55

matthewvon opened this Issue Oct 19, 2012 · 2 comments



matthewvon commented Oct 19, 2012

Backup / repair: move from a flat directory structure to one directory per level. This allows external backup (if performed in level order, rsync's default) and creates an implied manifest for repair.


matthewvon commented Oct 19, 2012

Design points:

  • Move .sst files into directories by "levels". The directory hierarchy would then emulate the manifest information.
  • Update the repair process to validate one directory / one level of .sst files at a time. Move any pair of files with overlapping key ranges to level 0 as today, or initiate a same-level merge. For all non-overlapping files, simply record the .sst file in a new manifest at the level implied by its directory. Only .sst files that were in the middle of a compaction will need to move to level 0 for reexamination (or initiate a same-level merge).
  • For the first release, ignore issues relating to incomplete / in-flight recovery logs. This is no different from today's repair. Just pointing this out since it is highly likely the backup process will not get complete images (but the recovery logs would likely not be complete in a system crash either).
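The per-level validation step above could be sketched roughly as follows. The directory layout (e.g. "level-3/000123.sst") and the key-range tuples are illustrative assumptions; a real repair would read each table's smallest/largest keys from the .sst file itself.

```python
# Hypothetical sketch of validating one level directory at a time.
# tables: list of (filename, smallest_key, largest_key) found in one level dir.

def validate_level(tables):
    """Return (keep, demote): tables recorded at this level in the new
    manifest, and tables with overlapping ranges to re-examine at level 0
    (or feed into a same-level merge)."""
    keep, demote = [], []
    prev_largest = None
    for name, smallest, largest in sorted(tables, key=lambda t: t[1]):
        if prev_largest is not None and smallest <= prev_largest:
            # Overlap: likely a mid-compaction leftover. A stricter pass
            # might demote both members of the overlapping pair.
            demote.append(name)
        else:
            keep.append(name)          # disjoint range: record in manifest
            prev_largest = largest
    return keep, demote
```

The kept set stays pairwise disjoint because each kept table's smallest key is checked against the largest key of the previously kept table.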


The above design would allow customers to back up files through simple rsync or a more complex backup program such as Retrospect. It will not matter if the manifest files on disk properly represent the .sst files. Any files that are "in-flight" will readily be identified by the level-by-level validation. Today's .sst "move" operation might be slightly more complex, requiring a hard link so the .sst exists temporarily in two directories (or maybe not). But overall this is a really easy change.
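The "move via hard link" idea could look something like this minimal sketch: the .sst briefly exists in both level directories, so a concurrent backup never observes it in neither. Paths and the helper name are assumptions for illustration, not leveldb's actual layout or API.

```python
import os

def move_table(name, src_dir, dst_dir):
    """Move an .sst between level directories without a window
    in which the file exists in zero directories."""
    src = os.path.join(src_dir, name)
    dst = os.path.join(dst_dir, name)
    os.link(src, dst)   # file now visible in both level directories
    os.unlink(src)      # drop the old entry; the data was always reachable
```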

The restore operation will be simple: copy the directory structure back into place and execute repair. Repair will be speedy compared to today. This could also help build new data sets at remote locations before starting replication. Physical drives can be literally shipped to speed the new site's initial data load.

I do believe we can survive without requiring a "stable" file system, i.e. Riak can be live during the backup. The worst-case scenario is that the backup copies level 6 first, and additional .sst files move/compact from 5 to 6 while it runs: those files could be gone from level 5 by the time the copy of level 5 starts, yet were never seen on level 6. Data loss! But manual scripting of rsync to copy lower levels first would instead duplicate the data, which is OK, instead of losing it. The requirement of repair after file restore takes care of the duplication.
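The manual sequencing argued for above can be sketched as copying level directories in ascending level order, so a table compacting downward (e.g. 5 to 6) is seen at its old level, its new level, or both: duplicated at worst, never lost. The "level-N" directory naming is an assumption; a production script would more likely invoke rsync per directory than copy in-process.

```python
import os
import shutil

def backup_levels(db_dir, backup_dir, num_levels=7):
    """Copy each level directory, lowest-numbered level first.
    Duplicates created by concurrent compaction are resolved by
    running repair after restore."""
    copied = []
    for level in range(num_levels):
        src = os.path.join(db_dir, "level-%d" % level)
        dst = os.path.join(backup_dir, "level-%d" % level)
        if os.path.isdir(src):
            shutil.copytree(src, dst, dirs_exist_ok=True)
            copied.append(level)
    return copied
```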

In Linux, LVM can externally freeze a file system view for a backup. Whatever .sst files are in transition at the time of the freeze will still be cleaned up during the restore / repair cycle. I am guessing zfs has a similar feature. These external capabilities would eliminate manual sequencing with rsync and scp (as I proposed in the previous paragraph).

Similarly, big systems like Retrospect can be configured to snapshot images in a defined sequence.

@matthewvon matthewvon closed this Aug 14, 2013
