Commit

Merge pull request #91 from joergsteffens/dev/joergs/bareos-17.2/merge-16.2-droplet

Merge bareos-16.2-droplet into bareos-17.2
pstorz committed Jul 9, 2018
2 parents 000ccba + 38aebda commit 0aadb37
Showing 13 changed files with 374 additions and 71 deletions.
119 changes: 92 additions & 27 deletions README.droplet
@@ -2,52 +2,57 @@ Using droplet S3 as a backingstore for backups.

The droplet S3 storage backend writes chunks of data in an S3 bucket.

-For this you need to install the libdroplet-devel and the storage-droplet packages which contains
-the libbareossd-chunked*.so and libbareossd-droplet*.so shared objects and the droplet storage backend which implements a dynamic loaded
-storage backend.
+For this you need to install the bareos-storage-droplet package, which contains
+the libbareossd-chunked*.so and libbareossd-droplet*.so shared objects, implementing a dynamically loaded storage backend.

In the following example all the backup data is placed in the "bareos-backup" bucket on the defined S3 storage.
-A Volume is a sub-directory in the defined bucket, and every chunk is placed in the Volume directory withe the filename 0000-9999 and a size
-that is defined in the chunksize.
+A volume is a sub-directory in the defined bucket, and every chunk is placed in the volume directory with a filename 0000-9999 and a size that is defined by the chunksize.
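The chunk layout described above can be sketched as follows (a toy Python illustration; the helper name and the 10 MB default chunksize are assumptions based on this README, not Bareos code):

```python
def chunk_object_name(bucket, volume, offset, chunksize=10 * 1024 * 1024):
    """Map a byte offset within a volume to the S3 object that holds it.

    Illustrative only: volumes are sub-directories of the bucket and
    chunks are named 0000-9999, as described above.
    """
    chunk = offset // chunksize
    if chunk > 9999:
        raise ValueError("volume cannot hold more than 10000 chunks")
    return "%s/%s/%04d" % (bucket, volume, chunk)
```

With the default 10 MB chunksize, byte offset 10485760 of volume Full-0085 would land in the object "bareos-bucket/Full-0085/0001".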

The droplet S3 can only be used with virtual-hosted-style buckets like http://<bucket>.<s3_server>/object
Path-style buckets are not supported when using the droplet S3.
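For contrast, the two URL styles look like this (a sketch; the function names and the example hostname are ours):

```python
def virtual_hosted_style(bucket, s3_server, obj):
    # Supported by the droplet backend: the bucket is part of the hostname.
    return "http://%s.%s/%s" % (bucket, s3_server, obj)

def path_style(bucket, s3_server, obj):
    # NOT supported by the droplet backend; shown only for comparison.
    return "http://%s/%s/%s" % (s3_server, bucket, obj)
```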

On the Storage Daemon the following configuration is needed.
-Example bareos-sd.d/device file:
+Example bareos-sd.d/device/S3_ObjectStorage.conf file:

Device {
-Name = "S3_1-00"
-Media Type = "S3_File_1"
-Archive Device = Object S3 Storage
+Name = S3_ObjectStorage
+Media Type = S3_Object1
+Archive Device = S3 Object Storage

#
-# Config options:
-# profile= - Droplet profile to use either absolute PATH or logical name (e.g. ~/.droplet/<profile>.profile
-# location= - AWS location (e.g. us-east etc.)
+# Device Options:
+# profile= - Droplet profile path, e.g. /etc/bareos/bareos-sd.d/droplet/droplet.profile
+# location= - AWS location (e.g. us-east etc.). Optional.
# acl= - Canned ACL
-# storageclass - Storage Class to use.
+# storageclass= - Storage Class to use.
# bucket= - Bucket to store objects in.
# chunksize= - Size of Volume Chunks (default = 10 Mb)
-# iothreads= - Number of IO-threads to use for upload (use blocking uploads if not defined.)
-# ioslots= - Number of IO-slots per IO-thread (default 10)
+# iothreads= - Number of IO-threads to use for upload (use blocking uploads if not defined)
+# ioslots= - Number of IO-slots per IO-thread (0-255, default 10)
# retries= - Number of retries if a write fails (0-255, default = 0, which means unlimited retries)
# mmap - Use mmap to allocate Chunk memory instead of malloc().
#
-Device Options = "profile=/etc/bareos/bareos-sd.d/.droplet/droplet.profile,bucket=backup-bareos,iothreads=3,ioslots=3,chunksize=100M"
+# testing:
+Device Options = "profile=/etc/bareos/bareos-sd.d/droplet/droplet.profile,bucket=bareos-bucket,chunksize=100M,iothreads=0,retries=1"
+
+# performance:
+#Device Options = "profile=/etc/bareos/bareos-sd.d/droplet/droplet.profile,bucket=bareos-bucket,chunksize=100M"

Device Type = droplet
LabelMedia = yes # lets Bareos label unlabeled media
Random Access = yes
AutomaticMount = yes # when device opened, read it
RemovableMedia = no
AlwaysOpen = no
-Description = "Object S3 device. A connecting Director must have the same Name and MediaType."
-Maximum File Size = 500M # 500 MB (Allows for seeking to small portions of the Volume)
+Description = "S3 device"
+Maximum File Size = 500M # 500 MB (allows for seeking to small portions of the Volume)
Maximum Concurrent Jobs = 1
Maximum Spool Size = 15000M
}
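As an aside, the Device Options value is a comma-separated list of key=value pairs; the following sketch parses one (the helper functions are ours, not part of Bareos, and assume values contain no commas):

```python
def parse_device_options(options):
    """Split a droplet Device Options string into a dict (illustrative)."""
    result = {}
    for item in options.split(","):
        key, _, value = item.partition("=")
        result[key.strip()] = value.strip()
    return result

def size_to_bytes(value):
    """Expand a trailing size suffix, such as the 'M' used for chunksize."""
    suffixes = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    if value and value[-1].upper() in suffixes:
        return int(value[:-1]) * suffixes[value[-1].upper()]
    return int(value)

opts = parse_device_options(
    "profile=/etc/bareos/bareos-sd.d/droplet/droplet.profile,"
    "bucket=bareos-bucket,chunksize=100M,iothreads=0,retries=1")
# opts["bucket"] is "bareos-bucket"; size_to_bytes(opts["chunksize"]) is 104857600
```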


The droplet.profile file holds the credentials for the S3 storage.
-Example /etc/bareos/bareos-sd.d/.droplet/droplet.profile file:
+Example /etc/bareos/bareos-sd.d/droplet/droplet.profile file:

Make sure the file is only readable by the bareos user, as it contains the S3 credentials.

@@ -61,20 +66,80 @@ pricing_dir = ""
backend = s3
aws_auth_sign_version = 2

-If the pricing_dir is not empty, it will create an <dir>/droplet.csv file wich
-will record all S3 operations.
-See the 'libdroplet/src/pricing.c' code for an explanation.
+If the pricing_dir is not empty, a <profile_directory>/droplet.csv file will be created, recording all S3 operations.
+See the code at https://github.com/bareos/Droplet/blob/bareos-master/libdroplet/src/pricing.c for an explanation.

-The parameter "aws_auth_sign_version = 2" is for the connection to a CEPH AWS connection.
+The parameter "aws_auth_sign_version = 2" is required for connecting to a CEPH S3 gateway.
For use with AWS S3, aws_auth_sign_version must be set to "4".

On the Director you connect to the Storage Daemon with the following configuration.
-Example bareos-dir.d/storage file:
+Example bareos-dir.d/storage/S3_1-00.conf file:

Storage {
-Name = S3_1-00
+Name = S3_Object
Address = "Replace this by the Bareos Storage Daemon FQDN or IP address"
Password = "Replace this by the Bareos Storage Daemon director password"
Device = S3_ObjectStorage
-Media Type = S3_File_1
+Media Type = S3_Object1
}


Troubleshooting
===============

S3 Backend Unreachable
----------------------

The droplet device can run in two modes:
* direct writing (iothreads = 0)
* cached writing (iothreads >= 1)

If iothreads >= 1, retries = 0 (unlimited retries) and the droplet backend (e.g. S3 storage) is not available, a job will continue running until the backend problem is fixed.
If this is the case and the job is canceled, it is only canceled on the Director. It continues running on the Storage Daemon until the S3 backend is available again or the Storage Daemon itself is restarted.

If iothreads >= 1, retries != 0 and the droplet backend (e.g. S3 storage) is not available, write operations will be silently discarded after the specified number of retries.
*Don't use this combination of options*.

Caching when S3 backend is not available:
This behaviour has not changed, but problems can arise if the backend is not available and all write operations are kept in memory.
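The interaction of iothreads and retries can be modelled as a toy sketch (the function and its arguments are illustrative, not Bareos internals):

```python
def flush_chunk(upload, retries):
    """Toy model of the retry semantics described above (not Bareos code).

    retries == 0 means unlimited retries: keep trying until the backend
    accepts the chunk.  retries == N means give up after N failed attempts,
    which silently discards the write -- the combination warned about above.
    """
    attempt = 0
    while True:
        attempt += 1
        if upload():          # upload() returns True once the backend accepts
            return True
        if retries != 0 and attempt >= retries:
            return False      # chunk dropped after N tries
```

With a backend that is down for the first two attempts, retries=0 eventually succeeds, while retries=2 gives up and loses the data.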

The status of the cache can be determined with the "status storage=..." command.


Pending IO chunks (and inflight chunks):
```
...
Device "S3_ObjectStorage" (S3) is mounted with:
Volume: Full-0085
Pool: Full
Media type: S3_Object1
Backend connection is working.
Inflight chunks: 2
Pending IO flush requests:
/Full-0085/0002 - 10485760 (try=0)
/Full-0085/0003 - 10485760 (try=0)
/Full-0085/0004 - 10485760 (try=0)
...
Attached Jobs: 175
...
```

If try > 0, problems have already occurred; the system will continue retrying.
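Each pending-flush line in the status output above has the form "/&lt;volume&gt;/&lt;chunk&gt; - &lt;bytes&gt; (try=&lt;n&gt;)"; a small sketch for spotting retried chunks (the parser is ours, and assumes exactly that line format):

```python
import re

# One pending-flush line: /<volume>/<chunk> - <bytes> (try=<n>)
PENDING_RE = re.compile(
    r"^/(?P<volume>[^/]+)/(?P<chunk>\d+) - (?P<bytes>\d+) \(try=(?P<try>\d+)\)$")

def retried_chunks(status_lines):
    """Return (volume, chunk, try) for pending chunks with try > 0."""
    hits = []
    for line in status_lines:
        m = PENDING_RE.match(line.strip())
        if m and int(m.group("try")) > 0:
            hits.append((m.group("volume"), m.group("chunk"), int(m.group("try"))))
    return hits
```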


Status without pending IO chunks:
```
Device "S3_ObjectStorage" (S3) is mounted with:
Volume: Full-0084
Pool: Full
Media type: S3_Object1
Backend connection is working.
No Pending IO flush requests.
Configured device capabilities:
EOF BSR BSF FSR FSF EOM !REM RACCESS AUTOMOUNT LABEL !ANONVOLS !ALWAYSOPEN
Device state:
OPENED !TAPE LABEL !MALLOC APPEND !READ EOT !WEOT !EOF !NEXTVOL !SHORT MOUNTED
num_writers=0 reserves=0 block=8
Attached Jobs:
```
8 changes: 5 additions & 3 deletions src/dird/msgchan.c
@@ -468,10 +468,12 @@ extern "C" void *msg_thread(void *arg)
}
if (n == BNET_HARDEOF) {
/*
-     * This probably should be M_FATAL, but I am not 100% sure
-     * that this return *always* corresponds to a dropped line.
+     * A lost connection to the storage daemon is FATAL.
+     * This is required, as otherwise
+     * the job could fail to write data
+     * but still end as JS_Warnings (OK -- with warnings).
*/
-      Qmsg(jcr, M_ERROR, 0, _("Director's comm line to SD dropped.\n"));
+      Qmsg(jcr, M_FATAL, 0, _("Director's comm line to SD dropped.\n"));
}
if (is_bnet_error(sd)) {
jcr->SDJobStatus = JS_ErrorTerminated;
21 changes: 19 additions & 2 deletions src/stored/acquire.c
@@ -3,7 +3,7 @@
Copyright (C) 2002-2013 Free Software Foundation Europe e.V.
Copyright (C) 2011-2012 Planets Communications B.V.
-   Copyright (C) 2013-2016 Bareos GmbH & Co. KG
+   Copyright (C) 2013-2018 Bareos GmbH & Co. KG
This program is Free Software; you can redistribute it and/or
modify it under the terms of version three of the GNU Affero General Public
@@ -492,12 +492,29 @@ bool release_device(DCR *dcr)
char tbuf[100];
int was_blocked = BST_NOT_BLOCKED;

Jmsg(jcr, M_INFO, 0, "Releasing device %s.\n", dev->print_name());

/*
* Capture job statistics now that we are done using this device.
*/
now = (utime_t)time(NULL);
update_job_statistics(jcr, now);

/*
* Some devices do cache write operations (e.g. droplet_device).
* Therefore flushing the cache is required to determine
* if a job has been written successfully.
* As a flush operation can take quite a long time,
* this must be done before acquiring locks.
* A previous implementation did the flush inside dev->close(),
* which resulted in various locking problems.
*/
if (!job_canceled(jcr)) {
if (!dev->flush(dcr)) {
Jmsg(jcr, M_FATAL, 0, "Failed to flush device %s.\n", dev->print_name());
}
}

dev->Lock();
if (!dev->is_blocked()) {
block_device(dev, BST_RELEASING);
@@ -506,7 +523,7 @@ bool release_device(DCR *dcr)
dev->set_blocked(BST_RELEASING);
}
lock_volumes();
-  Dmsg2(100, "release_device device %s is %s\n", dev->print_name(), dev->is_tape() ? "tape" : "disk");
+  Dmsg1(100, "releasing device %s\n", dev->print_name());

/*
* If device is reserved, job never started, so release the reserve here
14 changes: 7 additions & 7 deletions src/stored/append.c
@@ -2,7 +2,7 @@
BAREOS® - Backup Archiving REcovery Open Sourced
Copyright (C) 2000-2012 Free Software Foundation Europe e.V.
-   Copyright (C) 2016-2016 Bareos GmbH & Co. KG
+   Copyright (C) 2016-2018 Bareos GmbH & Co. KG
This program is Free Software; you can redistribute it and/or
modify it under the terms of version three of the GNU Affero General Public
@@ -315,6 +315,11 @@ bool do_append_data(JCR *jcr, BSOCK *bs, const char *what)
commit_data_spool(dcr);
}

+  /*
+   * Release the device -- and send final Vol info to DIR and unlock it.
+   */
+  release_device(dcr);

/*
* Don't use time_t for job_elapsed as time_t can be 32 or 64 bits,
* and the subsequent Jmsg() editing will break
@@ -324,15 +329,10 @@ bool do_append_data(JCR *jcr, BSOCK *bs, const char *what)
job_elapsed = 1;
}

-  Jmsg(dcr->jcr, M_INFO, 0, _("Elapsed time=%02d:%02d:%02d, Transfer rate=%s Bytes/second\n"),
+  Jmsg(jcr, M_INFO, 0, _("Elapsed time=%02d:%02d:%02d, Transfer rate=%s Bytes/second\n"),
job_elapsed / 3600, job_elapsed % 3600 / 60, job_elapsed % 60,
edit_uint64_with_suffix(jcr->JobBytes / job_elapsed, ec));

-  /*
-   * Release the device -- and send final Vol info to DIR and unlock it.
-   */
-  release_device(dcr);

if ((!ok || jcr->is_job_canceled()) && !jcr->is_JobStatus(JS_Incomplete)) {
discard_attribute_spool(jcr);
} else {
10 changes: 5 additions & 5 deletions src/stored/backends/Makefile.in
@@ -80,27 +80,27 @@ STORED_RESTYPES = autochanger device director ndmp messages storage
$(NO_ECHO)$(LIBTOOL_COMPILE) $(CXX) $(DEFS) $(DEBUG) -c $(WCFLAGS) $(CPPFLAGS) $(INCLUDES) $(DINCLUDE) $(CXXFLAGS) $<
if [ -d "$(@:.lo=.d)" ]; then $(MKDIR) $(CONF_EXTRA_DIR); $(CP) -r $(@:.lo=.d)/. $(CONF_EXTRA_DIR)/.; fi

-$(CHEPHFS_LOBJS):
+$(CHEPHFS_LOBJS): $(CHEPHFS_SRCS)
@echo "Compiling $(@:.lo=.c)"
$(NO_ECHO)$(LIBTOOL_COMPILE) $(CXX) $(DEFS) $(DEBUG) -c $(WCFLAGS) $(CPPFLAGS) $(INCLUDES) $(CEPHFS_INC) $(DINCLUDE) $(CXXFLAGS) $(@:.lo=.c)
if [ -d "$(@:.lo=.d)" ]; then $(MKDIR) $(CONF_EXTRA_DIR); $(CP) -r $(@:.lo=.d)/. $(CONF_EXTRA_DIR)/.; fi

-$(DROPLET_LOBJS):
+$(DROPLET_LOBJS): $(DROPLET_SRCS)
@echo "Compiling $(@:.lo=.c)"
$(NO_ECHO)$(LIBTOOL_COMPILE) $(CXX) $(DEFS) $(DEBUG) -c $(WCFLAGS) $(CPPFLAGS) $(INCLUDES) $(DROPLET_INC) $(DINCLUDE) $(CXXFLAGS) $(@:.lo=.c)
if [ -d "$(@:.lo=.d)" ]; then $(MKDIR) $(CONF_EXTRA_DIR); $(CP) -r $(@:.lo=.d)/. $(CONF_EXTRA_DIR)/.; fi

-$(ELASTO_LOBJS):
+$(ELASTO_LOBJS): $(ELASTO_SRCS)
@echo "Compiling $(@:.lo=.c)"
$(NO_ECHO)$(LIBTOOL_COMPILE) $(CXX) $(DEFS) $(DEBUG) -c $(WCFLAGS) $(CPPFLAGS) $(INCLUDES) $(ELASTO_INC) $(DINCLUDE) $(CXXFLAGS) $(@:.lo=.c)
if [ -d "$(@:.lo=.d)" ]; then $(MKDIR) $(CONF_EXTRA_DIR); $(CP) -r $(@:.lo=.d)/. $(CONF_EXTRA_DIR)/.; fi

-$(GFAPI_LOBJS):
+$(GFAPI_LOBJS): $(GFAPI_SRCS)
@echo "Compiling $(@:.lo=.c)"
$(NO_ECHO)$(LIBTOOL_COMPILE) $(CXX) $(DEFS) $(DEBUG) -c $(WCFLAGS) $(CPPFLAGS) $(INCLUDES) $(GLUSTER_INC) $(DINCLUDE) $(CXXFLAGS) $(@:.lo=.c)
if [ -d "$(@:.lo=.d)" ]; then $(MKDIR) $(CONF_EXTRA_DIR); $(CP) -r $(@:.lo=.d)/. $(CONF_EXTRA_DIR)/.; fi

-$(RADOS_LOBJS):
+$(RADOS_LOBJS): $(RADOS_SRCS)
@echo "Compiling $(@:.lo=.c)"
$(NO_ECHO)$(LIBTOOL_COMPILE) $(CXX) $(DEFS) $(DEBUG) -c $(WCFLAGS) $(CPPFLAGS) $(INCLUDES) $(RADOS_INC) $(DINCLUDE) $(CXXFLAGS) $(@:.lo=.c)
if [ -d "$(@:.lo=.d)" ]; then $(MKDIR) $(CONF_EXTRA_DIR); $(CP) -r $(@:.lo=.d)/. $(CONF_EXTRA_DIR)/.; fi
