Skip to content

Commits

Permalink
2019-retry-fil…
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Commits on Sep 20, 2019

  1. retry: Add a test of this filter.

    We use a custom sh plugin to test retries are working.
    rwmjones committed Sep 20, 2019
    Copy the full SHA
    7543498 View commit details
    Browse the repository at this point in the history
  2. Add new retry filter.

    This filter can be used to transparently reopen/retry when a plugin
    fails.  The connection is closed and reopened which for most plugins
    causes them to attempt to reconnect to their source.
    
    For example if doing a long or slow SSH copy:
    
      nbdkit -U - ssh host=remote /var/tmp/test.iso \
        --run 'qemu-img convert -p -f raw $nbd -O qcow2 test.qcow2'
    
    if the SSH connection or network goes down in the middle then the
    whole operation will fail.
    
    By adding the retry filter:
    
      nbdkit -U - ssh --filter=retry host=remote /var/tmp/test.iso \
        --run 'qemu-img convert -p -f raw $nbd -O qcow2 test.qcow2'
    
    this operation can recover from temporary failures in at least some
    circumstances.  The NBD connection (a local Unix domain socket in the
    example above) is not interrupted during retries, so NBD clients don't
    need to be taught how to retry - everything is handled internally by
    nbdkit.
    rwmjones committed Sep 20, 2019
    Copy the full SHA
    f0f0ec4 View commit details
    Browse the repository at this point in the history

Commits on Sep 19, 2019

  1. backend: Implement next_ops->reopen call.

    This is intended for use by the forthcoming retry filter to close and
    reopen the backend chain.  It is handled entirely by server/backend.c
    as no cooperation is needed with the plugin.
    
    Note the explicit readonly parameter: An alternative would be to store
    the previous readonly setting in the b_conn_handle struct.  However
    passing it explicitly allows the retry filter to retry as readonly,
    which might be useful.  This design does however require any filter
    which might call .reopen to save the original readonly parameter from
    the .open call.
    rwmjones committed Sep 19, 2019
    Copy the full SHA
    fb72fcd View commit details
    Browse the repository at this point in the history
  2. Copy the full SHA
    cfbbe17 View commit details
    Browse the repository at this point in the history
  3. tests: rate: Simplify tests of the rate filter using bash $SECONDS.

    Thanks: Eric Blake for this suggestion.
    rwmjones committed Sep 19, 2019
    Copy the full SHA
    ddd90ce View commit details
    Browse the repository at this point in the history
  4. Copy the full SHA
    115f095 View commit details
    Browse the repository at this point in the history
  5. filters: Remove bogus THREAD_MODEL.

    Since commit ee61d23 the THREAD_MODEL
    macro has been ignored by filters.  In any case it defaults to
    parallel, so defining it would be useless.
    rwmjones committed Sep 19, 2019
    Copy the full SHA
    3a4c4d5 View commit details
    Browse the repository at this point in the history
  6. docs: Tidy up paragraphs discussing next_ops.

    Add paragraphs, subheadings.  Just documentation reformatting.
    rwmjones committed Sep 19, 2019
    Copy the full SHA
    c761d0a View commit details
    Browse the repository at this point in the history
  7. Copy the full SHA
    7b7ef59 View commit details
    Browse the repository at this point in the history
  8. server: Remove tricksy initialization of struct b_conn_handle.

    b_conn_handle fields exportsize and can_* have a special meaning when
    they are -1.  It means that the value of the field has not been
    computed and cached yet.
    
    The struct was being initialized using memset (_, -1, _) which does
    indeed have the effect of setting all the fields to -1, although it's
    somewhat non-obvious to say the least.
    
    This commit replaces it with ordinary field initialization, also
    setting h->handle to NULL.  By keeping the initialization function
    next to the struct definition, hopefully they will be updated in
    tandem in future.
    
    GCC 9.2.1 actually optimizes this back into the memset equivalent
    using inline AVX instructions, so good job there!
    
    Simple refactoring, should have no effect on how the code works.
    
    See commit d60d0f4.
    rwmjones committed Sep 19, 2019
    Copy the full SHA
    5cdf4d6 View commit details
    Browse the repository at this point in the history
  9. server: Fix back-to-back SET_META_CONTEXT

    The NBD spec says that if a client requests a 1-query SET_META_CONTEXT
    for context "base:allocation", then changes its mind and requests a
    1-query SET_META_CONTEXT for "other:context", only the final query
    matters; in such a case, since we did not reply with a context the
    second time, the client must not call NBD_CMD_BLOCK_STATUS, and the
    server should fail it with EINVAL.  If the client actually wants two
    contexts, it must request them in a 2-query SET_META_CONTEXT.
    However, our code didn't reset the boolean between two uses of the
    option, so we were not catching an invalid client that requests block
    status in spite of no response to their second SET_META_CONTEXT.
    
    Note that there are no known clients in the wild that can actually
    perform this secondary SET_META_CONTEXT request; this was found by
    inspection.  As nbdkit does not crash in this situation, I don't see
    the point in hacking libnbd to make it possible to become such a
    client.
    
    Fixes: 26455d4
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 19, 2019
    Copy the full SHA
    7ed83d1 View commit details
    Browse the repository at this point in the history
  10. server: Fix regression for NBD_OPT_INFO before NBD_OPT_GO

    Most known NBD clients do not bother with NBD_OPT_INFO (except for
    clients like 'qemu-nbd --list' that don't ever intend to connect), but
    go straight to NBD_OPT_GO.  However, it's not too hard to hack up qemu
    to add in an extra client step (whether info on the same name, or more
    interestingly, info on a different name), as a patch against qemu
    commit 6f214b30445:
    
    | diff --git i/nbd/client.c w/nbd/client.c
    | index f6733962b49b..425292ac5ea9 100644
    | --- i/nbd/client.c
    | +++ w/nbd/client.c
    | @@ -1038,6 +1038,14 @@ int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
    |           * TLS).  If it is not available, fall back to
    |           * NBD_OPT_LIST for nicer error messages about a missing
    |           * export, then use NBD_OPT_EXPORT_NAME.  */
    | +        if (getenv ("HACK"))
    | +            info->name[0]++;
    | +        result = nbd_opt_info_or_go(ioc, NBD_OPT_INFO, info, errp);
    | +        if (getenv ("HACK"))
    | +            info->name[0]--;
    | +        if (result < 0) {
    | +            return -EINVAL;
    | +        }
    |          result = nbd_opt_info_or_go(ioc, NBD_OPT_GO, info, errp);
    |          if (result < 0) {
    |              return -EINVAL;
    
    This works just fine in 1.14.0, where we call .open only once (so the
    INFO and GO repeat calls into the same plugin handle), but in 1.14.1
    it regressed into causing an assertion failure: we are now calling
    .open a second time on a connection that is already opened:
    
    $ nbdkit -rfv null &
    $ hacked-qemu-io -f raw -r nbd://localhost -c quit
    ...
    nbdkit: null[1]: debug: null: open readonly=1
    nbdkit: backend.c:179: backend_open: Assertion `h->handle == NULL' failed.
    
    Worse, on the mainline development, we have recently made it possible
    for plugins to actively report different information for different
    export names; for example, a plugin may choose to report different
    answers for .can_write on export A than for export B; but if we share
    cached handles, then an NBD_OPT_INFO on one export prevents correct
    answers for NBD_OPT_GO on the second export name.  (The HACK envvar in
    my qemu modifications can be used to demonstrate cross-name requests,
    which are even less likely in a real client).
    
    The solution is to call .close after NBD_OPT_INFO, coupled with enough
    glue logic to reset cached connection handles back to the state
    expected by .open.  This in turn means factoring out another backend_*
    function, but also gives us an opportunity to change
    backend_set_handle to no longer accept NULL.
    
    The assertion failure is, to some extent, a possible denial of service
    attack (one client can force nbdkit to exit by merely sending OPT_INFO
    before OPT_GO, preventing the next client from connecting), although
    this is mitigated by using TLS to weed out untrusted clients.  Still,
    the fact that we introduced a potential DoS attack while trying to fix
    a traffic amplification security bug is not very nice.
    
    Sadly, as there are no known clients that easily trigger this mode of
    operation (OPT_INFO before OPT_GO), there is no easy way to cover this
    via a testsuite addition.  I may end up hacking something into libnbd.
    
    Fixes: c05686f
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 19, 2019
    Copy the full SHA
    a6b88b1 View commit details
    Browse the repository at this point in the history

Commits on Sep 18, 2019

  1. server: Saner filter .close calls

    I found a core dump when forcing nbdkit-xz-filter to fail .open:
    
    term1$ nbdkit -fv --filter=xz null xz-max-depth=$((0x4000000000000000))
    term1...
    term2$ qemu-nbd --list
    term1...
    nbdkit: debug: null: open readonly=1
    nbdkit: error: calloc: Cannot allocate memory
    nbdkit: debug: xz: close
    Segmentation fault (core dumped)
    
    Note that the use of --run or daemonizing nbdkit makes it a bit harder
    to see the core dump, but it is still happening. There are five other
    filters that currently override .open: cow, log, partition, rate, and
    truncate, but making them fail .open requires getting a malloc()
    failure rather than having a user parameter that can be supplied with
    a bogusly large value.
    
    The failure happens because xz's .open fails after null's has
    succeeded, but the cleanup code that decides whether to call the
    .close chain looks solely at whether the plugin's handle is non-NULL.
    However, our code would call into a filter's .close even if handle was
    NULL (this is because we did not distinguish between a filter that
    failed .open, like xz, vs. a filter that lacks an override for .open).
    And calling xz.close(NULL) causes a SIGSEGV.  Of course, we could
    patch the xz filter to paper over this by short-circuiting if handle
    is NULL, but that feels dirty compared to never calling .close with a
    handle not returned by a successful .open.  Thus, a more generic fix
    to this is to make filters.c always set a non-NULL handle on .open
    success, then only call .close when the handle was set, because that's
    the only time .open succeeded.  Doing this changes the handle passed
    to filters that don't override .open or which return
    NBDKIT_HANDLE_NOT_NEEDED - but such filters shouldn't care.
    
    Conversely, if a plugin's .open fails, we skip .close for the entire
    backend chain.  This could potentially be problematic (a memory leak)
    if any of our filters had returned success to .open without calling
    the backend's .open - but fortunately all of our filters called next()
    prior to doing anything locally, and that practice matches
    documentation.  Still, we can ensure that things are sane by asserting
    that if .open succeeded, the backend's handle is also set.
    
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 18, 2019
    Copy the full SHA
    fd2deeb View commit details
    Browse the repository at this point in the history
  2. server: Remove useless thread local sockaddr.

    When accepting a connection on a TCP or Unix domain socket we recorded
    the peer address in both the thread_data struct and thread-local
    storage.  But for no reason because it was never used anywhere.  Since
    we were only allocating a ‘struct sockaddr’ (rather than a ‘struct
    sockaddr_storage’) it's likely that some peer addresses would have
    been truncated.
    
    Remove all this code, it had no effect.
    
    Plugins that want to get the peer address can use nbdkit_peer_name()
    which was added in commit 03a2cc3 and doesn't suffer from the
    above truncation problem.
    
    (I considered an alternative where we use the saved address to answer
    nbdkit_peer_name but since that call will in general be used very
    rarely it doesn't make sense to do the extra work for all callers.)
    rwmjones committed Sep 18, 2019
    Copy the full SHA
    c4a9519 View commit details
    Browse the repository at this point in the history
  3. server: Change unreachable oldstyle error to assert

    Ever since we introduced TLS support, main.c has always given a better
    error message for attempting --tls=require with -o, preventing the
    lower-quality message in oldstyle from ever being emitted.
    
    Fixes: c5e7649
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 18, 2019
    Copy the full SHA
    163a54b View commit details
    Browse the repository at this point in the history
  4. server: Fix debug message

    In refactoring .prepare, I inadvertently changed the log message.
    
    Fixes: e4d334e
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 18, 2019
    Copy the full SHA
    48c60c5 View commit details
    Browse the repository at this point in the history
  5. tests: Fix test-cow-null when port 10809 is occupied

    Use a unix socket instead.
    
    Fixes: 7822f8f
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 18, 2019
    Copy the full SHA
    6d2828e View commit details
    Browse the repository at this point in the history

Commits on Sep 17, 2019

  1. tests: Test same set of reflected names

    test-export-name.sh was covering a few more names than
    test-reflection-raw.sh or test-reflection-base64.sh.  Also, testing
    "%%" ensures that there is no accidental printf formatting happening
    on the user string.
    
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 17, 2019
    Copy the full SHA
    d3cce1c View commit details
    Browse the repository at this point in the history
  2. reflection: Note default is exportname in --help output.

    Fixes commit 2a29634
    and commit 1d4c8ab.
    
    Thanks: Eric Blake.
    rwmjones committed Sep 17, 2019
    Copy the full SHA
    7692a24 View commit details
    Browse the repository at this point in the history
  3. reflection: Enhance plugin to support client address mode.

    This adds nbdkit reflection mode=address which reflects the IP address
    and port number back to the client.  Although this is reflecting
    client data back to the server we believe it is implemented securely.
    rwmjones committed Sep 17, 2019
    Copy the full SHA
    1d4c8ab View commit details
    Browse the repository at this point in the history
  4. Add reflection plugin.

    The source for the easter egg example is not included, but it is:
    
            org 0x7c00
            ;  clear screen
            mov ah,0
            mov al,3
            int 0x10
            ; print string
            mov ah,0x13
            mov bl,0xa
            mov al,1
            mov cx,len
            mov dh,0
            mov dl,0
            mov bp,hello
            int 0x10
            hlt
    hello:  db "*** Hello from nbdkit! ***",0xd,0xa
    len:    equ $-hello
            ; pad to end of sector
            times 510-($-$$) db 0
            ; boot signature
            db 0x55,0xaa
    rwmjones committed Sep 17, 2019
    Copy the full SHA
    2a29634 View commit details
    Browse the repository at this point in the history
  5. server: Add nbdkit_peer_name() to return the client address.

    Works essentially just like calling getpeername(2), because that's how
    it is implemented.
    rwmjones committed Sep 17, 2019
    Copy the full SHA
    03a2cc3 View commit details
    Browse the repository at this point in the history
  6. guestfs, libvirt: Rename ‘connect’ global to avoid -Wshadow warning.

    In a future commit we intend to change <nbdkit-common.h> so it
    includes <sys/socket.h>.  However because this exports connect(2) we
    need to rename globals called ‘connect’ to avoid a conflict.
    rwmjones committed Sep 17, 2019
    Copy the full SHA
    e70365b View commit details
    Browse the repository at this point in the history

Commits on Sep 16, 2019

  1. common/bitmap: Don't fail on realloc (ptr, 0)

    The following commands:
    
      nbdkit -fv --filter=cow memory size=512 --run 'qemu-img info $nbd'
      nbdkit -fv --filter=cache memory size=512 --run 'qemu-img info $nbd'
      nbdkit -fv --filter=cow null --run 'qemu-img info $nbd'
    
    all fail with:
    
      nbdkit: memory[1]: error: realloc: Success
    
    Initial git bisect pointed to commit 3166d2b (this is not real
    cause, it merely exposes the bug).
    
    The reason this happens is because the new behaviour after
    commit 3166d2b is to round down the size of the underlying disk
    to the cow/cache filter block size.
    
    => Size of the underlying disk is 512 or 0 bytes,
       block size is 4096 bytes.
    
    => Size of the disk is rounded down to 0.
    
    => The cow/cache filter requests a bitmap of size 0.
    
    => bitmap_resize calls realloc (ptr, 0).
    
    => The glibc implementation of realloc returns NULL + errno == 0.
       (Other realloc implementations can behave differently.)
    
    => This is not an error, but the existing code thinks it is because of
       the NULL return.
    
    This commit changes the code so it doesn't bother to call realloc if
    the new bitmap size would be 0.
    
    There are many other places in nbdkit where we call realloc, and I did
    not vet any of them to see if similar bugs could be present, but it is
    quite likely.
    
    Note in passing that the correct way to use the cow/cache filter with
    a disk which isn't a multiple of the block size is to combine it with
    the truncate filter, eg:
    
      nbdkit -fv --filter=cow --filter=truncate memory size=512 round-up=4096
    
    Thanks: Eric Blake
    rwmjones committed Sep 16, 2019
    Copy the full SHA
    7822f8f View commit details
    Browse the repository at this point in the history

Commits on Sep 15, 2019

  1. tests: Be more robust to arbitrary export names

    We failed to handle export names beginning with - or containing \;
    and our handling of a name ending in newline worked by sheer luck.
    
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 15, 2019
    Copy the full SHA
    c9812cd View commit details
    Browse the repository at this point in the history

Commits on Sep 13, 2019

  1. docs: Mark up nbdkit_error as C<> (code).

    Fixes commit 4f303e4.
    rwmjones committed Sep 13, 2019
    Copy the full SHA
    0f2cba2 View commit details
    Browse the repository at this point in the history
  2. Copy the full SHA
    c6d7356 View commit details
    Browse the repository at this point in the history

Commits on Sep 12, 2019

  1. tests: Add a simple test of nbdkit_export_name.

    Thanks: Eric Blake for simplifying the test.
    rwmjones committed Sep 12, 2019
    Copy the full SHA
    2f35cd2 View commit details
    Browse the repository at this point in the history
  2. sh: Pass export name as an extra parameter to the open method.

    In nbdkit API v3 we will probably add the export name as an extra
    parameter, but while we are using API v2 we can get the same effect by
    calling nbdkit_export_name().
    rwmjones committed Sep 12, 2019
    Copy the full SHA
    01c2379 View commit details
    Browse the repository at this point in the history
  3. server: Add nbdkit_export_name() to allow export name to be read.

    This allows plugins (or filters) to read the export name which was
    passed to the server from the client.
    rwmjones committed Sep 12, 2019
    Copy the full SHA
    4f303e4 View commit details
    Browse the repository at this point in the history
  4. TODO: Mention idea about skipping handshake phase

    The kernel's /dev/nbdX is a client side that starts life with an fd
    already in transmission phase; it may be nice to let nbdkit be a
    counterpart server that likewise skips the standardized NBD handshake
    phase.
    
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 12, 2019
    Copy the full SHA
    8787054 View commit details
    Browse the repository at this point in the history
  5. Version 1.15.1.

    rwmjones committed Sep 12, 2019
    Copy the full SHA
    b43e06b View commit details
    Browse the repository at this point in the history
  6. server: Wait until handshake complete before calling .open callback

    Currently we call the plugin .open callback as soon as we receive a
    TCP connection:
    
      $ nbdkit -fv --tls=require --tls-certificates=tests/pki null \
               --run "telnet localhost 10809"
      [...]
      Trying ::1...
      Connected to localhost.
      Escape character is '^]'.
      nbdkit: debug: accepted connection
      nbdkit: debug: null: open readonly=0       ◀ NOTE
      nbdkit: null[1]: debug: newstyle negotiation: flags: global 0x3
      NBDMAGICIHAVEOPT
    
    In plugins such as curl, guestfs, ssh, vddk and others we do a
    considerable amount of work in the .open callback (such as making a
    remote connection or launching an appliance).  Therefore we are
    providing an easy Denial of Service / Amplification Attack for
    unauthorized clients to cause a lot of work to be done for only the
    cost of a simple TCP 3 way handshake.
    
    This commit moves the call to the .open callback after the NBD
    handshake has mostly completed.  In particular TLS authentication must
    be complete before we will call into the plugin.
    
    It is unlikely that there are plugins which really depend on the
    current behaviour of .open (which I found surprising even though I
    guess I must have written it).  If there are then we could add a new
    .connect callback or similar to allow plugins to get control at the
    earlier point in the connection.
    
    After this change you can see that the .open callback is not called
    from just a simple TCP connection:
    
      $ ./nbdkit -fv --tls=require --tls-certificates=tests/pki null \
                 --run "telnet localhost 10809"
      [...]
      Trying ::1...
      Connected to localhost.
      Escape character is '^]'.
      nbdkit: debug: accepted connection
      nbdkit: null[1]: debug: newstyle negotiation: flags: global 0x3
      NBDMAGICIHAVEOPT
      xx
      nbdkit: null[1]: debug: newstyle negotiation: client flags: 0xd0a7878
      nbdkit: null[1]: error: client requested unknown flags 0xd0a7878
      Connection closed by foreign host.
      nbdkit: debug: null: unload plugin
    
    Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
    rwmjones committed Sep 12, 2019
    Copy the full SHA
    c05686f View commit details
    Browse the repository at this point in the history
  7. server: Add --mask-handshake option for integration testing

    Similar to --no-sr, it can be handy for testing a client
    implementation to have a server that can easily be forced into older
    behaviors, without having to recompile a one-off hack into a server or
    dig up an older server binary that lacked a newer feature.
    
    To see the patch in action, try things like:
    $ ./nbdkit -U - -fv --mask-handshake=0 null \
      --run 'qemu-nbd --list -k $unixsocket'
    
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 12, 2019
    Copy the full SHA
    89b5288 View commit details
    Browse the repository at this point in the history
  8. server: Skip option haggling from client lacking fixed newstyle

    The NBD protocol states that servers may still choose to honor various
    NBD_OPT_* from a client that did not reply with
    NBD_FLAG_C_FIXED_NEWSTYLE; however, for integration testing purposes,
    it's a lot nicer if we reject everything except NBD_OPT_EXPORT_NAME
    from such a client (for example, with this in place, we might have
    spotted the bug fixed in commit e03b34d a bit sooner).  Thus, a
    client that does not claim to understand fixed newstyle can now no
    longer trigger TLS, structured replies, meta contexts, or the nicer
    handling of NBD_OPT_GO.  All well-known clients listed in
    nbdkit-protocol.pod default to requesting fixed newstyle, so this
    shouldn't affect normal usage.
    
    Signed-off-by: Eric Blake <eblake@redhat.com>
    ebblake committed Sep 12, 2019
    Copy the full SHA
    5caab0a View commit details
    Browse the repository at this point in the history
Older