Add new retry filter.
This filter can be used to transparently reopen/retry when a plugin
fails.  The connection is closed and reopened which for most plugins
causes them to attempt to reconnect to their source.

For example if doing a long or slow SSH copy:

  nbdkit -U - ssh host=remote /var/tmp/test.iso \
    --run 'qemu-img convert -p -f raw $nbd -O qcow2 test.qcow2'

if the SSH connection or network goes down in the middle then the
whole operation will fail.

By adding the retry filter:

  nbdkit -U - ssh --filter=retry host=remote /var/tmp/test.iso \
    --run 'qemu-img convert -p -f raw $nbd -O qcow2 test.qcow2'

this operation can recover from temporary failures in at least some
circumstances.  The NBD connection (a local Unix domain socket in the
example above) is not interrupted during retries, so NBD clients don't
need to be taught how to retry - everything is handled internally by
nbdkit.
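
The reopen-and-retry behaviour described above can be modelled in a few lines. This is only an illustrative sketch in Python, not the filter's actual C code in retry.c: FlakySource and pread_with_retry are hypothetical names, and the parameter defaults are taken from the filter's documented defaults (retries=5, retry-delay=2, retry-exponential=yes).

```python
import time


class FlakySource:
    """Hypothetical stand-in for a plugin connection that fails a few times."""

    def __init__(self, failures):
        self.failures = failures  # how many reads fail before one succeeds
        self.opens = 0
        self.open()

    def open(self):
        self.opens += 1  # count opens so reopen behaviour can be observed

    def close(self):
        pass

    def pread(self, count, offset):
        if self.failures > 0:
            self.failures -= 1
            raise IOError("connection dropped")
        return bytes(count)  # 'count' zero bytes stand in for real data


def pread_with_retry(source, count, offset, retries=5, delay=2,
                     exponential=True, sleep=time.sleep):
    """Retry a failed read, closing and reopening the source each time.

    Assumed model: wait, reopen, retry; the error is passed through once
    'retries' retries have been used up.
    """
    attempt = 0
    while True:
        try:
            return source.pread(count, offset)
        except IOError:
            if attempt >= retries:
                raise  # give up and fail the operation
            attempt += 1
            sleep(delay)
            if exponential:
                delay *= 2  # wait twice as long before the next retry
            source.close()
            source.open()


# Record the waits instead of really sleeping.
waits = []
src = FlakySource(failures=2)
data = pread_with_retry(src, 4, 0, sleep=waits.append)
print(len(data), waits, src.opens)
```

The caller only ever sees the successful read, which mirrors the point made above that the NBD client does not need to know a retry happened.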
rwmjones committed Sep 20, 2019
1 parent fb72fcd commit f0f0ec4
Showing 10 changed files with 516 additions and 0 deletions.
16 changes: 16 additions & 0 deletions TODO
@@ -136,6 +136,22 @@ nbdkit-rate-filter:
* split large requests to avoid long, lumpy sleeps when request size
is much larger than rate limit

nbdkit-retry-filter:

* allow user to specify which errors cause a retry and which ones are
passed through; for example there's probably no point retrying on
ENOMEM

* implement a softer mode (retry-reopen=no) where we don't reopen the
plugin, we just retry the data command that failed

* there are all kinds of extra complications possible here,
eg. specifying a pattern of retrying and reopening:
retry-method=RRRORRRRRORRRRR meaning to retry the data command 3
times, reopen, retry 5 times, etc.

* subsecond times

Filters for security
--------------------

2 changes: 2 additions & 0 deletions configure.ac
@@ -872,6 +872,7 @@ filters="\
partition \
rate \
readahead \
retry \
stats \
truncate \
xz \
@@ -952,6 +953,7 @@ AC_CONFIG_FILES([Makefile
filters/partition/Makefile
filters/rate/Makefile
filters/readahead/Makefile
filters/retry/Makefile
filters/stats/Makefile
filters/truncate/Makefile
filters/xz/Makefile
3 changes: 3 additions & 0 deletions docs/nbdkit-captive.pod
@@ -92,6 +92,9 @@ help performance:
nbdkit -U - --filter=readahead curl https://example.com/disk.img \
--run 'qemu-img convert $nbd disk.img'

If the source suffers from temporary network failures,
L<nbdkit-retry-filter(1)> may help.

To overwrite a file inside an uncompressed tar file (the file being
overwritten must be the same size), use L<nbdkit-tar-plugin(1)> like
this:
1 change: 1 addition & 0 deletions filters/readahead/nbdkit-readahead-filter.pod
@@ -35,6 +35,7 @@ plugin in the normal way.
L<nbdkit(1)>,
L<nbdkit-cache-filter(1)>,
L<nbdkit-curl-plugin(1)>,
L<nbdkit-retry-filter(1)>,
L<nbdkit-ssh-plugin(1)>,
L<nbdkit-vddk-plugin(1)>,
L<nbdkit-filter(3)>,
67 changes: 67 additions & 0 deletions filters/retry/Makefile.am
@@ -0,0 +1,67 @@
# nbdkit
# Copyright (C) 2019 Red Hat Inc.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# * Neither the name of Red Hat nor the names of its contributors may be
# used to endorse or promote products derived from this software without
# specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.

include $(top_srcdir)/common-rules.mk

EXTRA_DIST = nbdkit-retry-filter.pod

filter_LTLIBRARIES = nbdkit-retry-filter.la

nbdkit_retry_filter_la_SOURCES = \
retry.c \
$(top_srcdir)/include/nbdkit-filter.h \
$(NULL)

nbdkit_retry_filter_la_CPPFLAGS = \
-I$(top_srcdir)/include \
-I$(top_srcdir)/common/include \
-I$(top_srcdir)/common/utils \
$(NULL)
nbdkit_retry_filter_la_CFLAGS = $(WARNINGS_CFLAGS)
nbdkit_retry_filter_la_LDFLAGS = \
-module -avoid-version -shared \
-Wl,--version-script=$(top_srcdir)/filters/filters.syms \
$(NULL)
nbdkit_retry_filter_la_LIBADD = \
$(top_builddir)/common/utils/libutils.la \
$(NULL)

if HAVE_POD

man_MANS = nbdkit-retry-filter.1
CLEANFILES += $(man_MANS)

nbdkit-retry-filter.1: nbdkit-retry-filter.pod
$(PODWRAPPER) --section=1 --man $@ \
--html $(top_builddir)/html/$@.html \
$<

endif HAVE_POD
108 changes: 108 additions & 0 deletions filters/retry/nbdkit-retry-filter.pod
@@ -0,0 +1,108 @@
=head1 NAME

nbdkit-retry-filter - reopen connection on error

=head1 SYNOPSIS

nbdkit --filter=retry PLUGIN [retries=N] [retry-delay=N]
[retry-exponential=yes|no]
[retry-readonly=yes|no]

=head1 DESCRIPTION

C<nbdkit-retry-filter> is a filter that transparently reopens the
plugin connection when an error is encountered. It can be used to
make long-running copy operations reliable in the presence of
temporary network failures, without requiring any changes to the
plugin or the NBD client.

Several optional parameters are available to control:

=over 4

=item *

how many times we retry,

=item *

the delay between retries, and whether we wait longer each time (known
as “exponential back-off”),

=item *

whether we reopen the plugin in read-only mode after the first failure.

=back

The default (with no parameters) is designed to offer a happy medium:
recovering from short temporary failures while not doing anything too
bad when permanent or unrecoverable failures happen. The
default behaviour is: we retry 5 times with exponential back-off,
waiting in total about 1 minute before we give up.

=head1 EXAMPLE

In this example we copy and convert a large file using
L<nbdkit-ssh-plugin(1)>, L<qemu-img(1)> and L<nbdkit-captive(1)>.

nbdkit -U - \
ssh host=remote.example.com /var/tmp/test.iso \
--filter=retry \
--run 'qemu-img convert -p -f raw $nbd -O qcow2 test.qcow2'

Without I<--filter=retry> a temporary failure would cause the copy to
fail (for example, the remote host’s firewall is restarted causing the
SSH connection to be dropped). Adding this filter means that it may
be possible to transparently recover.

=head1 PARAMETERS

=over 4

=item B<retries=>N

The number of times any single operation will be retried before we
give up and fail the operation. The default is 5.

=item B<retry-delay=>N

The number of seconds to wait before retrying. The default is 2
seconds.

=item B<retry-exponential=yes>

Use exponential back-off. The retry delay is doubled between each
retry. This is the default.

=item B<retry-exponential=no>

Do not use exponential back-off. The retry delay is the same between
each retry.

=item B<retry-readonly=yes>

As soon as a failure occurs, switch the underlying plugin to read-only
mode for the rest of this connection. (A new NBD client connection
will still open the plugin in the original mode.)

=item B<retry-readonly=no>

Do not change the read-write/read-only mode of the plugin when
retrying. This is the default.

=back

=head1 SEE ALSO

L<nbdkit(1)>,
L<nbdkit-filter(3)>,
L<nbdkit-readahead-filter(1)>.

=head1 AUTHORS

Richard W.M. Jones

=head1 COPYRIGHT

Copyright (C) 2019 Red Hat Inc.
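
The "about 1 minute" figure in the DESCRIPTION section of the POD file can be checked against the documented defaults (retries=5, retry-delay=2, retry-exponential=yes). The sketch below is a back-of-the-envelope calculation, not code from the filter. It assumes the first wait equals retry-delay and that the delay doubles after every retry; retry_delays is a hypothetical helper name.

```python
def retry_delays(retries=5, initial_delay=2, exponential=True):
    """Return the wait (in seconds) before each retry attempt.

    Assumption: the first wait is 'initial_delay', and with exponential
    back-off each later wait is double the previous one.
    """
    delays = []
    delay = initial_delay
    for _ in range(retries):
        delays.append(delay)
        if exponential:
            delay *= 2
    return delays


default_schedule = retry_delays()
print(default_schedule, sum(default_schedule))
# prints: [2, 4, 8, 16, 32] 62

constant_schedule = retry_delays(exponential=False)
print(constant_schedule, sum(constant_schedule))
# prints: [2, 2, 2, 2, 2] 10
```

Under these assumptions the default schedule waits 62 seconds in total, which agrees with the documented "about 1 minute", while retry-exponential=no with the same retries and retry-delay would give up after only 10 seconds of waiting.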
