Added BDB native backup capabilities #54

Merged
merged 1 commit into voldemort:master on Jan 23, 2012

5 participants

@jroper
Contributor
jroper commented Oct 14, 2011

This is a reissue of #51, which I've closed. I rebased the work and fixed some of the problems with the original pull request (such as excess IDE autoformatting) by creating a new branch and applying my changes to that branch in a single commit.

From what I could find on Google, the recommended method for backing up a BDB Voldemort store is to just copy the files while Voldemort is running. Because BDB is an append-only database, this should work, right? Wrong: BDB's cleanup thread can get in the way and corrupt the backups. BDB Java Edition provides a tool to assist with backups. It finishes the current BDB log file and then pauses the cleanup thread while the backup is running, so that a safe backup can be taken. This tool must be run within the same Java process for the correct locking to occur. For detailed documentation, see:

http://download.oracle.com/docs/cd/E17277_02/html/java/com/sleepycat/je/util/DbBackup.html

So I've added an option to the Voldemort admin client to do this, called native backup. Storage engines can declare that they support it by implementing NativeBackupable. I've implemented this in BdbStorageEngine, which does fast NIO copies and also supports incremental backups. Backups are guaranteed to be consistent.
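
As a rough illustration of the copy step described above, the sketch below copies a set of log files into a backup directory with `FileChannel.transferTo`. It assumes the list of file names has already been obtained while the cleanup thread is paused (for example from `DbBackup`, between `startBackup()` and `endBackup()`). The class and method names are hypothetical and are not taken from this pull request.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper for illustration; not the code from this pull request.
final class NioBackupCopier {

    // Copies each named file from sourceDir into backupDir.
    // Returns the total number of bytes copied.
    static long copyFiles(Path sourceDir, Path backupDir, List<String> fileNames)
            throws IOException {
        Files.createDirectories(backupDir);
        long total = 0;
        for (String name : fileNames) {
            total += copyFile(sourceDir.resolve(name), backupDir.resolve(name));
        }
        return total;
    }

    // Copies one file channel-to-channel and returns the number of bytes copied.
    static long copyFile(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                long moved = in.transferTo(position, size - position, out);
                if (moved <= 0) {
                    break; // source shrank or nothing more could be transferred
                }
                position += moved;
            }
            return position;
        }
    }
}
```

A caller would pass the BDB data directory, an absolute backup directory, and the file names reported for the backup set.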

I've written unit tests for the backup itself. There aren't any existing tests for the admin client/protocol, so I didn't write tests for the changes I made to the client.

@eoconnor

James,

I figured out some of the earlier issues I was having with the backup utility. I've got it working now, but I've got one quick question: what is the backup-dir parameter intended to refer to? I'd originally thought it was the directory where backup files were written to, but now I'm thinking that it's intended to refer to the BDB data directory containing the files to be backed up.

I ran the utility with backup-dir set to my BDB data directory, which had the following contents:

-rw-r--r-- 1 eoconnor IUSA\domain users 2355872 Oct 17 11:04 00000000.jdb
-rw-r--r-- 1 eoconnor IUSA\domain users 0 Sep 30 13:54 je.info.0
-rw-r--r-- 1 eoconnor IUSA\domain users 0 Oct 17 11:04 je.info.0.lck
-rw-r--r-- 1 eoconnor IUSA\domain users 0 Sep 21 12:14 je.lck

After running the utility, the directory contained these contents:

-rw-r--r-- 1 eoconnor IUSA\domain users 2355929 Oct 17 11:38 00000000.jdb
-rw-r--r-- 1 eoconnor IUSA\domain users 2529 Oct 17 11:38 00000001.jdb
-rw-r--r-- 1 eoconnor IUSA\domain users 0 Sep 30 13:54 je.info.0
-rw-r--r-- 1 eoconnor IUSA\domain users 0 Oct 17 11:04 je.info.0.lck
-rw-r--r-- 1 eoconnor IUSA\domain users 0 Sep 21 12:14 je.lck

There was no data being written to my Voldemort cluster during this time, so the changes to the contents of that directory were definitely due to the backup utility. It's not clear to me exactly what it did, however. Can you shed any light on what the change in file contents in that directory represents? Is the utility hard-coded to make in-place backups (e.g., as opposed to copying the backup files somewhere else)?

Thanks,
Eric

@eoconnor

OK, I think I figured it out. It looks like the backup-dir parameter is intended to refer to the directory where the backup files are to be written, but it needs to be a full absolute pathname, and not a relative pathname, as I was initially doing. Let me know if that doesn't sound correct.
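
One way a caller could avoid the relative-path confusion described above is to resolve the backup directory to an absolute path before passing it on. This is only a sketch: `BackupDirs.toAbsolute` is a hypothetical helper, and the idea that a relative path would otherwise be resolved against some other working directory is an assumption, not something confirmed in this thread.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of the Voldemort admin client.
final class BackupDirs {

    // Resolves a possibly relative directory name against the current
    // working directory and removes "." and ".." segments.
    static Path toAbsolute(String backupDir) {
        return Paths.get(backupDir).toAbsolutePath().normalize();
    }
}
```

An input that is already absolute comes back unchanged apart from normalization, so the helper is safe to apply unconditionally.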

Thanks,
Eric

@afeinberg
Contributor

Hi,

I am in the process of testing this (for performance impact at scale) and then integrating. May make a few changes. I'll let you know when I have an integrated version in trunk!

Thanks,

Alex

@jroper
Contributor
jroper commented Nov 3, 2011

Thank you, Alex. If you find any major issues that you don't want to deal with yourself, please let me know, and I'll have a look.

@akkumar
Contributor
akkumar commented Dec 30, 2011

Alex -
So, how is this testing coming along? Has this been integrated into trunk since then, or is anything pending/outstanding that we need to look into?

@afeinberg
Contributor

Hey Karthik,

I've been a bit busy working on other performance testing issues, but I'll resume working on this in January. I am cleaning the code up slightly to fit in a bit better with the rest of the Voldemort code base.

I do want to integrate this, however.

@vinothchandar
Collaborator

Hi Karthik,

BDB also uses hexadecimal numbering for the log files. The code breaks at fileNameToNumber() in BDBNativeBackup, since it assumes a decimal number. I have hacked around this for now and am proceeding with the testing.
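
To make the failure concrete: a log file named `0000000a.jdb` cannot be parsed as a decimal number, and `00000010.jdb` would be read as 10 rather than 16. The sketch below shows one possible base-16 parse. It reuses the name `fileNameToNumber` from the comment above, but the body and the enclosing class are assumptions and not the fix that was actually applied.

```java
// Hypothetical illustration; not the actual BDBNativeBackup code.
final class JdbFileNames {

    // Parses a log file name such as "0000001f.jdb" as a hexadecimal
    // file number. Throws NumberFormatException if the stem is not hex.
    static long fileNameToNumber(String fileName) {
        int dot = fileName.indexOf('.');
        String stem = dot < 0 ? fileName : fileName.substring(0, dot);
        return Long.parseLong(stem, 16);
    }
}
```

With a decimal parse, the bug only shows up once a store has grown past its tenth log file, which may be why small test stores did not reveal it.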

Vinoth

@vinothchandar
Collaborator

Also, the timeout in the waitForCompletion call in AdminClient::nativeBackup needs to be an admin client configuration setting or, better, an argument, since the time to wait depends directly on the data size. Right now, the backup times out for 35GB of data.
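
The shape of that change can be sketched as a wait loop that takes the timeout from the caller instead of hard-coding it. Everything below is hypothetical: `CompletionWaiter` and its parameters are illustration only and do not reflect the real `AdminClient` signatures.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical illustration; not the actual AdminClient API.
final class CompletionWaiter {

    // Polls isComplete until it returns true or the caller-supplied
    // timeout elapses. Returns true on completion, false on timeout.
    static boolean waitForCompletion(BooleanSupplier isComplete,
                                     long timeout,
                                     TimeUnit unit,
                                     long pollIntervalMs) throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (true) {
            if (isComplete.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Never sleep past the deadline, and sleep at least 1 ms.
            long remainingMs = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            Thread.sleep(Math.min(pollIntervalMs, remainingMs));
        }
    }
}
```

A caller backing up a large store could then pass a timeout sized to the data, for example hours rather than minutes.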

@vinothchandar
Collaborator

Guys, quick update: I am going to add the incremental backup capabilities and the log verification, and then pull this in.

@akkumar
Contributor
akkumar commented Jan 12, 2012

Thanks, Vinoth.

If there are any numbers / charts comparing the native / jni versions in terms of average read throughput / latency performance metrics, that would be very useful.

@vinothchandar
Collaborator

What do you mean by jni version?

@vinothchandar vinothchandar merged commit fd5dbeb into voldemort:master Jan 23, 2012