
Trac Migration edited this page May 18, 2019 · 1 revision

Future Design Changes

This page is an initial requirements document for possible future Box enhancements. It was collated by Per Thomsen and is taken from his email of 14th March 2005.

1. Symbolic Names for Hosts

In addition to using a hex number to identify backup clients, a mapping must be put in place between the account number and a name the user (backup admin) determines. This name can be up to 255 characters long (the domain name limit from RFC 1034).

For backwards compatibility, account numbers must still be accessible and usable just as before. If needed, command-line switches can be applied to denote one or the other. Additionally, for management purposes it should be possible to explicitly set the account number during the creation of a new client (as well as the symbolic name).

While the name can be arbitrary, the installation process should attempt to determine the full domain name of the host, and use that value as the default.

The bbackupd installation process should be able to use both the symbolic name and the account ID. Certificate files, etc. will be generated using the symbolic name, if available. Otherwise it will fall back to the account ID.
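As a sketch of the mapping described above, a registry could accept either form of identifier and resolve it to an account ID. All class and method names here are hypothetical, not existing Box code:

```python
# Sketch of a symbolic-name <-> account-ID mapping (hypothetical names,
# not part of the current Box codebase).  Account IDs stay usable as
# before; the name is an optional alias resolved before any operation.

class AccountRegistry:
    def __init__(self):
        self._by_name = {}   # symbolic name -> hex account ID
        self._by_id = {}     # hex account ID -> symbolic name (or None)

    def create(self, account_id, name=None):
        """Create an account, optionally with a symbolic name (<= 255 chars)."""
        if name is not None:
            if len(name) > 255:
                raise ValueError("name exceeds 255 characters (RFC 1034 limit)")
            self._by_name[name] = account_id
        self._by_id[account_id] = name

    def resolve(self, name_or_id):
        """Accept either a symbolic name or a raw hex account ID."""
        if name_or_id in self._by_name:
            return self._by_name[name_or_id]
        if name_or_id in self._by_id:
            return name_or_id
        raise KeyError(name_or_id)
```

The fall-back behaviour matches the requirement: a raw account ID always resolves to itself, whether or not a name was ever assigned.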

2. Client Groups

To support the distinction of groups of boxes being backed up, the concept of groups of clients should be implemented. A group is a collection of clients; groups cannot be nested (no group can be a member of another group).

A group has a 'group administrator' associated with it. Messages relating to group member accounts that would go to the backup administrator if there were no groups will also go to the group administrator.

The 'bbstoreaccounts' executable will add functionality to manage groups, including getting statistics from a group. The statistics will be the cumulative values of the same data as for a single client, with the addition of the following:

  • List of group members
  • ?

Note that if a client is a member of multiple groups, its statistical data will be counted in each group.

Code and configuration should be implemented to support optional quotas on groups. Each group will have hard and soft limits, much like client accounts.
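A minimal sketch of how a group quota check might aggregate member usage. The record structure and function names are assumptions, and what to do with a 'hard' result is the open question raised in this section:

```python
# Sketch of optional group quotas (hypothetical structure; the real
# consequence of exceeding a group limit is an open design question).

def group_usage(members):
    """Sum the store usage of all member accounts (in blocks or bytes)."""
    return sum(m["used"] for m in members)

def check_group_quota(members, soft_limit, hard_limit):
    """Return 'ok', 'soft', or 'hard' for the group as a whole."""
    used = group_usage(members)
    if used > hard_limit:
        return "hard"
    if used > soft_limit:
        return "soft"
    return "ok"
```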

I'm not sure what the consequences of exceeding a group quota should be. Aggressive housekeeping on all members when a client hits the group ceiling? Something else?

3. Client Monitoring

The client should be able to send 'heart beat' messages to the bbstored server.

Configuration information for heart beat is kept on the bbstored server. It includes:

For each client:

  • on/off switch. Clients can be monitored, but are not required to be. This is especially useful for mobile users, who are not connected to the internet all the time.
  • Heart beat interval. How often the client sends heart beat information. Given in seconds. Default is 900.
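The per-client settings above could be represented as, for example (the field names are assumptions, not an existing bbstored configuration format):

```python
# Minimal sketch of per-client heart beat configuration as described
# above.  Field names are illustrative only.

from dataclasses import dataclass

@dataclass
class HeartBeatConfig:
    monitored: bool = True      # on/off switch; off e.g. for mobile users
    interval_secs: int = 900    # how often the client sends heart beats
```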

Heart beat messages could be transmitted whenever a client connects to the server for a backup, rather than on a separate schedule. Snapshot backups should transmit the data as well, so that the time of the last backup can be tracked. However, it would be preferable for the heart beat interval to remain a separate setting, as described in the previous paragraph: this gives more consistent data about clients that use snapshot backups or long backup intervals. Long intervals are often used (at least by me) to stop the hourly scan of the disk for eligible files from making a sluggish client machine even slower.

Heart beat messages will not interrupt long-running syncs or restores (large files); instead they will be sent as close to the scheduled time as possible, to ensure that as few false error alerts as possible go to administrators.

When bbackupd starts, it will register with the bbstored server, and request its configuration information. It will use this to send the messages at the appointed times.

Also, bbstored will create a record of the now-running bbackupd (in memory, mmap, or whatever works best), both to hold the data for the statistics and to ensure that only clients that have registered are being monitored. Snapshot backups will not register; instead, data will be kept about the timestamps, etc. of the last backup.

When a bbackupd daemon completes an orderly shutdown, it will 'de-register' itself from the bbstored service, to ensure that no false 'down alerts' go out. However, if bbackupd dies as a result of some failure, the record on the bbstored server will remain, and eventually cause alerts to go out to the backup administrator and, for clients in a group, to the group owner.

The heart beat packet contains the following information:

  • host identifier (name and/or account number)
  • bbackupd version number
  • backup type of the last backup performed (lazy/snapshot)
  • uptime (i.e. how long bbackupd has been running on this host)
  • timestamp of the last connection (not necessarily any files uploaded)
  • timestamp of the last sync (when the last file was uploaded)
  • number of bytes synced since the last heart beat message
  • number of bytes restored since the last heart beat message
  • any significant errors that have occurred since the last heart beat
  • ?
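The packet fields listed above might be serialised as, say, JSON. The field names and encoding here are purely illustrative; the real protocol would define its own:

```python
# Sketch of the heart beat payload, serialised as JSON (illustrative
# field names; the protocol would define the real wire encoding).

import json
import time

def make_heartbeat(account, version, backup_type, started_at,
                   last_connect, last_sync, bytes_synced, bytes_restored,
                   errors):
    return json.dumps({
        "account": account,                  # name and/or account number
        "bbackupd_version": version,
        "last_backup_type": backup_type,     # "lazy" or "snapshot"
        "uptime_secs": int(time.time() - started_at),
        "last_connection": last_connect,
        "last_sync": last_sync,
        "bytes_synced": bytes_synced,        # since last heart beat
        "bytes_restored": bytes_restored,    # since last heart beat
        "errors": errors,                    # significant errors since last beat
    })
```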

On the server side, a daemon (most likely bbstored) receives these heart beat messages, and keeps track of the status of all clients. It will keep a running counter of the byte-count statistics for the client, as well as a log of the significant errors.

When a bbackupd client daemon dies unexpectedly, the bbstored server will notice that no heart beat message has arrived from the client after approximately 2 x the heart beat interval. It will then notify backup administrators using the NotifySysAdmin.sh mechanism, or one very much like it. This mechanism should support notification of a 'group owner' for clients that are in a group.
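The detection rule above (no heart beat for roughly 2 x the configured interval) can be sketched as:

```python
# Sketch of the server-side staleness check: a client is flagged once no
# heart beat has arrived for about 2x its interval (default 900s).

def overdue_clients(last_seen, intervals, now):
    """last_seen and intervals are dicts keyed by client identifier;
    returns the clients for which an alert should be raised."""
    return [c for c, t in last_seen.items()
            if now - t > 2 * intervals.get(c, 900)]
```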

When a significant error occurs, and is logged with the server, a similar notification mechanism will be used to notify the backup administrator.

Optionally, the statistics information can be stored in a database for billing/auditing purposes.

A utility (possibly an updated bbstoreaccounts) will be needed to display this information in ways that will be useful to administrators. For individual accounts this information could include:

  • time/date of last successful sync
  • duration of last successful sync
  • ???

4. Space Use Reporting

Reporting of space consumption is needed at several levels:

  • The entire bbstored server (all RAID volumes being used for backups).
  • Each Volume. Ensuring that one single volume isn't bearing the brunt of the load, as well as for planning purposes.
  • By Group. This relates to item #2 in my list. It has very similar reporting requirements to the individual client, with the same additions as described in the Group section.
  • By Individual. This is already available in the current version.

5. Account Database

The ability to store client account information in a database is crucial to the stability and scalability of the system. The current use of text files should be replaced with a database.

Implement support for storing the client account information for multiple Box servers in one database.

6. Interaction With the Rest of the World

Interfacing in an easy way with other systems for monitoring and reporting purposes. In addition to nicely formatted output, every command should have an option to format its output either for human or for script consumption. This data could then be used by products like Nagios (www.nagios.org).
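One possible shape for the two output modes: the same statistics rendered for humans, and as tab-separated fields for scripts such as a Nagios check. The record layout is illustrative only:

```python
# Sketch of dual-format output for a reporting command (the record
# layout and field names are assumptions, not an existing Box format).

def format_stats(stats, for_script=False):
    if for_script:
        # one record per line, tab-separated, stable field order
        return "\t".join(str(stats[k]) for k in ("account", "used", "limit"))
    pct = 100.0 * stats["used"] / stats["limit"]
    return "Account %s: %d of %d blocks used (%.1f%%)" % (
        stats["account"], stats["used"], stats["limit"], pct)
```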

7. Account Migration Tools

It should be possible to move a client account from one Box server to another.

When the move is complete (not before), the old bbstored server should either redirect (preferred) or proxy the requests to the 'new' server, so the client can continue operations unaffected by the change.

8. Server Redundancy (grabbed from message by Ben on 9/24/04)

Design objectives

  • Failure means the server cannot be contacted by the client. If a server can be contacted by another server but not by the client, that server must still be considered down.
  • No central server. The objective above means server choice must be made by the client.
  • A misbehaving client should not cause the stores to lose synchronisation.
  • Assume that all servers have the same amount of disc space, and identical disc configuration.
  • Allow choice of primary and secondary on a per-account basis.
  • Any connection can be dropped at any time, and the stores should remain in a workable, if non-optimal, state.
  • As simple as possible. Avoid keeping large amounts of state about the accounts on another server.

8.1 Server Groups

The client store marker is defined to change at the end of every sync from the client (if and only if data changed), and should increase each time the store is updated. This allows the servers in a group to determine easily whether they are in sync, and which holds the latest version.

Stores are grouped. Each server is a peer within the group.

On login, the server returns a list of all other servers in the group. The client records this list on disc.

When the client needs to obtain a connection to a store, it uses the following algorithm:

Let S = last server successfully connected
Let P = primary server
Do
{
    Attempt to connect to S
    If(S == P and S is not connected)
    {
        Pause;
        Try connecting to P again.
    }
    If(S is not connected)
    {
        Let S = next server in the group list
    }
} While(S is not connected and not all servers have been tried)

If(S is not connected)
{
    Pause
    Start process again
}

Let CSM_S = client store marker from S

If(S != P)
{
    Attempt to connect to P again, but with a short timeout this time
    If(P is connected)
    {
        Let CSM_P = client store marker from P
        If(CSM_P == expected client store marker)
        {
            Disconnect S
            S = P
        }
        else
        {
            Disconnect P
        }
    }
}

This algorithm ensures that the client prefers to connect to the primary server, but will keep talking to the secondary server for as long as it's available and is at a later state than the primary store. (This gives time for the data to be transferred from the secondary to the primary, and avoids repeat uploads of data.)
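The algorithm above can be sketched as runnable code. Here `connect` and `store_marker` stand in for the real protocol operations, and the pauses and outer retry loop are omitted:

```python
# Simplified, runnable sketch of the client's server-selection algorithm
# (connect and store_marker are stand-ins for real protocol operations;
# the pauses and the outer retry-forever loop are omitted).

def choose_server(servers, primary, connect, store_marker, expected_marker):
    """servers: list ordered starting from the last successfully used server.
    connect(s) -> bool; store_marker(s) -> client store marker held on s.
    Returns the server to use, or None if none could be contacted."""
    connected = None
    for s in servers:
        if connect(s):
            connected = s
            break
    if connected is None:
        return None          # caller pauses and starts the process again
    if connected != primary and connect(primary):
        # Switch back to the primary only once it has caught up.
        if store_marker(primary) == expected_marker:
            return primary
    return connected
```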

Servers within a group use the fast sync protocol to update accounts on a regular basis.

8.2 Observations

The servers are simply peers. The primary server for an account is chosen merely by configuring the client.

If the servers simply use best efforts to keep each other up to date, the client will automatically choose the best server to contact.

Using the existing methods of handling unexpected changes to the client store marker, it doesn't matter whether a server is out of date or not. The existing code handles this occurrence perfectly.

The servers do not need to check whether other servers are down. That fact is actually irrelevant, because it's the client's view of availability which is important.

8.3 Accounts

The accounts database must be identical on each machine. bbstoreaccounts will need to push changes to all servers. It will probably be necessary to change the account database, and store the limits within the database rather than in the stores of data. This is desirable anyway.

Note: If another server is down, it won't be possible to update the account database.

Alternatively, servers could update each other with changes to the accounts database on a lazy basis. This might cause issues with housekeeping unnecessarily deleting files which then have to be retrieved again in the next fast sync.

8.4 Fast Sync Protocol

Compare client store markers. End if they are the same. Otherwise, the server with the greater number becomes the source, and the lesser the target.

Zero client store marker on target.

Send stream of deleted (by housekeeping) object IDs from source to target. Target deletes the objects immediately.

Send stream of object ID + hash of directories on source server to the target.

For each directory on the target server which doesn't exist, or doesn't have the right hash...

  • check objects exist, and transfer them
  • write directory, only if all the objects are correct
  • check for patches. Attempt to transfer by patch if new version exists

Each server records the client store marker it expects on the remote server. If that marker is not as expected, then the contents of the directories are checked as well, sending MD5 hashes across. This allows recovery from partial syncs. [This should probably be optimised for the case when there's an empty store at one end.]

When an object is uploaded, the "last object ID used" value for that account should be kept within the acceptable range to allow recovery when syncing with the client.

Write new client store marker on target.

If a client connects during a fast sync, then that fast sync will be aborted to give the client the lock on the account.
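The directory-compare step above can be sketched as follows. Hashing over sorted entry names is an illustrative stand-in for hashing a real serialised directory, and the function names are hypothetical:

```python
# Sketch of the fast sync's directory compare: the source streams
# (object_id, hash) pairs and the target reports which directories
# it must fetch.  Hashing sorted entry names stands in for hashing
# real serialised directory contents.

import hashlib

def dir_hash(entries):
    """MD5 over a directory's entries (sorted for determinism)."""
    h = hashlib.md5()
    for e in sorted(entries):
        h.update(e.encode())
    return h.hexdigest()

def dirs_to_transfer(source_dirs, target_dirs):
    """source_dirs / target_dirs: dicts of object_id -> directory hash.
    Return IDs missing on the target or whose hash differs."""
    return [oid for oid, h in source_dirs.items()
            if target_dirs.get(oid) != h]
```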

8.5 Optimised Fast Sync

It's undesirable for the fast sync to check every directory when it doesn't have to. During a sync with a client, the store:

  • Keeps a list of changed directories, by writing to disc (and flushing) every time a directory is saved back to disc.
  • Keeps patches from previous versions to send to the remote store.
  • Connects to the remote stores after the backup, and uses the fast sync to send the changes over.

This will allow short-cuts to be taken when syncing, and changes sent by patch.

The cache of patches will need to be managed, deleting them when they are transferred to a peer or are too old.

8.6 Housekeeping

Deleted objects need to be kept in sync too. Housekeeping takes place independently on each server. Since housekeeping is a deterministic process, this should not delete different files on different servers.

A list of deleted objects is kept on each server during the housekeeping process.

In the unlikely event that a server deletes an object that the source server doesn't, this object will be retrieved in the next fast sync. This is unlikely to happen because clients only add data.

In the typical case, housekeeping on non-primary servers will never delete an object in an account that the source server keeps.

9 Pseudo-Clustering of Servers (from Ben on 9/27/04)

It has just occurred to me that using the built-in software RAID, a limited form of redundant servers could be created. Someone suggested this on the list a while back, and I've only just realised the implications.

All you need are three identical servers. On each server, compose the RAID file sets from the local hard drives and the two hard drives from the other servers (mount the discs using NFS or something.)

Run the bbstored daemon on each, and use round-robin DNS with a low TTL to send clients to different machines.

It should then "just work". If any machine goes down, then the software RAID will kick in and no-one will notice, apart from the administrator who will notice the log messages.

The changes required are:

  • Add communications between bbstored servers so that a client can log in even if another server is housekeeping that account.
  • Account database syncing between servers.
  • Raid file disc set restoration tools need to be written (these are still lacking -- right now you have to move the existing files away in case they're needed, then blank every account and wait until the clients have uploaded everything again.)
  • Efficiency: write the raidfile daemon to offload RAID work, and write the temporary files to the local filesystem only.

The advantage over the previous plan is that most of the work is already done -- none of the above is a particularly significant amount of effort. The disadvantage is that it limits clusters to three machines which are connected to each other with fast network connections. However, it is a rather neat and simple solution.

10. No SSL/TLS on the Wire

It should be an option to turn off SSL/TLS after the initial handshake, to lower the overhead of the protocol.
