Commits on Nov 26, 2012
  1. @gregkh

    Linux 3.4.20

    gregkh authored
  2. @alexelder @gregkh

    libceph: drop declaration of ceph_con_get()

    alexelder authored gregkh committed
    commit 2610302 upstream.
    
    For some reason the declaration of ceph_con_get() and
    ceph_con_put() did not get deleted in this commit:
        d59315c libceph: drop ceph_con_get/put helpers and nref member
    
    Clean that up.
    
    Signed-off-by: Alex Elder <elder@inktank.com>
    Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. @gregkh

    Revert "serial: omap: fix software flow control"

    Felipe Balbi authored gregkh committed
    commit a4f7438 upstream.
    
    This reverts commit 957ee72
    (serial: omap: fix software flow control).
    
    As Russell has pointed out, that commit isn't fixing
    Software Flow Control at all, and it actually makes
    it even more broken.
    
    It was agreed to revert this commit and use Russell's
    latest UART patches instead.
    
    Signed-off-by: Felipe Balbi <balbi@ti.com>
    Cc: Russell King <linux@arm.linux.org.uk>
    Acked-by: Tony Lindgren <tony@atomide.com>
    Cc: Andreas Bießmann <andreas.devel@googlemail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  4. @GArik @gregkh

    ACPI video: Ignore errors after _DOD evaluation.

    GArik authored gregkh committed
    commit fba4e08 upstream.
    
    There are systems where the video module is known to work fine regardless
    of a broken _DOD, and ignoring the returned value here doesn't cause
    any issues later. This should fix brightness controls on some laptops.
    
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47861
    
    Signed-off-by: Igor Murzov <e-mail@date.by>
    Reviewed-by: Sergey V <sftp.mtuci@gmail.com>
    Signed-off-by: Zhang Rui <rui.zhang@intel.com>
    Signed-off-by: Abdallah Chatila <abdallah.chatila@ericsson.com>
  5. @alexelder @gregkh

    ceph: avoid 32-bit page index overflow

    alexelder authored gregkh committed
    (cherry picked from commit 6285bc2)
    
    A pgoff_t is defined (by default) to have type (unsigned long).  On
    architectures such as i686 that's a 32-bit type.  The ceph address
    space code was attempting to produce 64 bit offsets by shifting a
    page's index by PAGE_CACHE_SHIFT, but the result was not what was
    desired because the shift occurred before the result got promoted
    to 64 bits.
    
    Fix this by converting all uses of page->index used in this way to
    use the page_offset() macro, which ensures the 64-bit result has the
    intended value.
    
    This fixes http://tracker.newdream.net/issues/3112
    
    Reported-by: Mohamed Pakkeer <pakkeer.mohideen@realimage.com>
    Signed-off-by: Alex Elder <elder@inktank.com>
    Reviewed-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
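    
    The promotion problem described in this commit can be reproduced in
    user space. The sketch below is only an illustration, not the kernel
    code: uint32_t stands in for a 32-bit pgoff_t, and SKETCH_PAGE_SHIFT
    and both function names are assumptions made for the example.
    
    ```c
    #include <stdint.h>
    
    #define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */
    
    /* Problem pattern: the shift is evaluated in 32 bits and only the
     * already-truncated result is widened to 64 bits. */
    static uint64_t offset_shift_then_widen(uint32_t index)
    {
        return (uint64_t)(uint32_t)(index << SKETCH_PAGE_SHIFT);
    }
    
    /* Fixed pattern: widen the index first, then shift in 64 bits. */
    static uint64_t offset_widen_then_shift(uint32_t index)
    {
        return (uint64_t)index << SKETCH_PAGE_SHIFT;
    }
    ```
    
    For small indexes both functions agree. Once the index reaches 2^20
    pages (4 GiB with the assumed shift), the first one wraps to 0 while
    the second yields the intended 64-bit offset.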
  6. @liewegas @gregkh

    libceph: check for invalid mapping

    liewegas authored gregkh committed
    (cherry picked from commit d63b77f)
    
    If we encounter an invalid (e.g., zeroed) mapping, return an error
    and avoid a divide by zero.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  7. @gregkh

    ceph: Fix oops when handling mdsmap that decreases max_mds

    Yan, Zheng authored gregkh committed
    (cherry picked from commit 3e8f43a)
    
    When i >= newmap->m_max_mds, ceph_mdsmap_get_addr(newmap, i) returns
    NULL. Passing NULL to memcmp() triggers an oops.
    
    Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
    Signed-off-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
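    
    The general guard pattern (check a possibly-NULL lookup result before
    handing it to memcmp()) might look like the following user-space
    sketch. The struct and function names are hypothetical and this is
    not the actual patch.
    
    ```c
    #include <stddef.h>
    #include <string.h>
    
    struct sketch_addr { unsigned char bytes[16]; };
    struct sketch_map  { int max; const struct sketch_addr *addrs; };
    
    /* Returns NULL when i is outside the map, like the lookup in the text. */
    static const struct sketch_addr *sketch_get_addr(const struct sketch_map *m,
                                                     int i)
    {
        if (i < 0 || i >= m->max)
            return NULL;
        return &m->addrs[i];
    }
    
    /* Returns 1 if entry i is missing from either map or differs, else 0. */
    static int sketch_addr_changed(const struct sketch_map *oldmap,
                                   const struct sketch_map *newmap, int i)
    {
        const struct sketch_addr *oldaddr = sketch_get_addr(oldmap, i);
        const struct sketch_addr *newaddr = sketch_get_addr(newmap, i);
    
        if (!oldaddr || !newaddr) /* never pass NULL to memcmp() */
            return 1;
        return memcmp(oldaddr, newaddr, sizeof(*oldaddr)) != 0;
    }
    ```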
  8. @liewegas @gregkh

    libceph: avoid NULL kref_put when osd reset races with alloc_msg

    liewegas authored gregkh committed
    (cherry picked from commit 9bd9526)
    
    The ceph_con_in_msg_alloc() method drops con->mutex while it allocates a
    message.  If that races with a timeout that resends a zillion messages and
    resets the connection, and the ->alloc_msg() method returns a NULL message,
    it will call ceph_msg_put(NULL) and BUG.
    
    Fix by only calling put if msg is non-NULL.
    
    Fixes http://tracker.newdream.net/issues/3142
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  9. @alexelder @gregkh

    rbd: reset BACKOFF if unable to re-queue

    alexelder authored gregkh committed
    (cherry picked from commit 588377d)
    
    If ceph_fault() is unable to queue work after a delay, it sets the
    BACKOFF connection flag so con_work() will attempt to do so.
    
    In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't
    result in newly-queued work, it simply ignores this condition and
    proceeds as if no backoff delay were desired.  There are two
    problems with this--one of which is a bug.
    
    The first problem is simply that the intended behavior is to back
    off, and if we aren't able to queue the work item to run after a
    delay we're not doing that.
    
    The only reason queue_delayed_work() won't queue work is if the
    provided work item is already queued.  In the messenger, this
    means that con_work() is already scheduled to be run again.  So
    if we simply set the BACKOFF flag again when this occurs, we know
    the next con_work() call will again attempt to hold off activity
    on the connection until after the delay.
    
    The second problem--the bug--is a leak of a reference count.  If
    queue_delayed_work() returns 0 in con_work(), con->ops->put() drops
    the connection reference held on entry to con_work().  However,
    processing is (was) allowed to continue, and at the end of the
    function a second con->ops->put() is called.
    
    This patch fixes both problems.
    
    Signed-off-by: Alex Elder <elder@inktank.com>
    Reviewed-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  10. @alexelder @gregkh

    libceph: only kunmap kmapped pages

    alexelder authored gregkh committed
    (cherry picked from commit 5ce765a)
    
    In write_partial_msg_pages(), pages need to be kmapped in order to
    perform a CRC-32c calculation on them.  As an artifact of the way
    this code used to be structured, the kunmap() call was separated
    from the kmap() call and both were done conditionally.  But the
    conditions under which the kmap() and kunmap() calls were made
    differed, so there was a chance a kunmap() call would be done on a
    page that had not been mapped.
    
    The symptom of this was tripping a BUG() in kunmap_high() when
    pkmap_count[nr] became 0.
    
    Reported-by: Bryan K. Wright <bryan@virginia.edu>
    Signed-off-by: Alex Elder <elder@inktank.com>
    Reviewed-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
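    
    One way to keep map and unmap calls paired is to record whether the
    mapping was actually made and unmap only in that case. The sketch
    below models this in user space with a counter. All names are
    hypothetical and it does not claim to be how the patch restructures
    write_partial_msg_pages().
    
    ```c
    #include <stdbool.h>
    
    struct sketch_page { int map_count; };
    
    static void sketch_kmap(struct sketch_page *p)
    {
        p->map_count++;
    }
    
    /* Returns false if asked to unmap a page that is not mapped. */
    static bool sketch_kunmap(struct sketch_page *p)
    {
        if (p->map_count <= 0)
            return false;
        p->map_count--;
        return true;
    }
    
    /* Map only when a checksum is wanted, and unmap only if we mapped. */
    static bool sketch_write_page(struct sketch_page *p, bool do_crc)
    {
        bool mapped = false;
        bool ok = true;
    
        if (do_crc) {
            sketch_kmap(p);
            mapped = true;
            /* the checksum calculation would go here */
        }
        /* ... the page would be sent here ... */
        if (mapped)
            ok = sketch_kunmap(p);
        return ok;
    }
    ```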
  11. @gregkh

    libceph: avoid truncation due to racing banners

    Jim Schutt authored gregkh committed
    (cherry picked from commit 6d4221b)
    
    Because the Ceph client messenger uses a non-blocking connect, it is
    possible for the sending of the client banner to race with the
    arrival of the banner sent by the peer.
    
    When ceph_sock_state_change() notices the connect has completed, it
    schedules work to process the socket via con_work().  During this
    time the peer is writing its banner, and arrival of the peer banner
    races with con_work().
    
    If con_work() calls try_read() before the peer banner arrives, there
    is nothing for it to do, after which con_work() calls try_write() to
    send the client's banner.  In this case Ceph's protocol negotiation
    can complete successfully.
    
    The server-side messenger immediately sends its banner and addresses
    after accepting a connect request, *before* actually attempting to
    read or verify the banner from the client.  As a result, it is
    possible for the banner from the server to arrive before con_work()
    calls try_read().  If that happens, try_read() will read the banner
    and prepare protocol negotiation info via prepare_write_connect().
    prepare_write_connect() calls con_out_kvec_reset(), which discards
    the as-yet-unsent client banner.  Next, con_work() calls
    try_write(), which sends the protocol negotiation info rather than
    the banner that the peer is expecting.
    
    The result is that the peer sees an invalid banner, and the client
    reports "negotiation failed".
    
    Fix this by moving con_out_kvec_reset() out of
    prepare_write_connect() to its callers at all locations except the
    one where the banner might still need to be sent.
    
    [elder@inktank.com: added note about server-side behavior]
    
    Signed-off-by: Jim Schutt <jaschut@sandia.gov>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  12. @liewegas @gregkh

    libceph: delay debugfs initialization until we learn global_id

    liewegas authored gregkh committed
    (cherry picked from commit d1c338a)
    
    The debugfs directory includes the cluster fsid and our unique global_id.
    We need to delay the initialization of the debug entry until we have
    learned both the fsid and our global_id from the monitor or else the
    second client can't create its debugfs entry and will fail (and multiple
    client instances aren't properly reflected in debugfs).
    
    Reported-by: Yan, Zheng <zheng.z.yan@intel.com>
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  13. @smunaut @gregkh

    libceph: fix crypto key null deref, memory leak

    smunaut authored gregkh committed
    (cherry picked from commit f0666b1)
    
    Avoid crashing if the crypto key payload was NULL, as when it was not correctly
    allocated and initialized.  Also, avoid leaking it.
    
    Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  14. @liewegas @gregkh

    libceph: recheck con state after allocating incoming message

    liewegas authored gregkh committed
    (cherry picked from commit 6139919)
    
    We drop the lock when calling the ->alloc_msg() con op, which means
    we need to (a) not clobber con->in_msg without the mutex held, and (b)
    we need to verify that we are still in the OPEN state when we retake
    it to avoid causing any mayhem.  If the state does change, -EAGAIN
    will get us back to con_work() and loop.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  15. @liewegas @gregkh

    libceph: change ceph_con_in_msg_alloc convention to be less weird

    liewegas authored gregkh committed
    (cherry picked from commit 4740a62)
    
    This function's calling convention is very limiting.  In particular,
    we can't return any error other than ENOMEM (and only implicitly),
    which is a problem (see next patch).
    
    Instead, return a normal 0 or error code, and make the skip a pointer
    output parameter.  Drop the useless in_hdr argument (we have the con
    pointer).
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
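    
    The convention described here (an integer status as the return value,
    with the message and the skip flag as output parameters) can be
    sketched as below. This is a user-space illustration with made-up
    names and parameters, not the real ceph_con_in_msg_alloc() signature.
    EAGAIN is used only as an example of an error other than ENOMEM.
    
    ```c
    #include <errno.h>
    #include <stddef.h>
    #include <stdlib.h>
    
    struct sketch_msg { size_t len; };
    
    /*
     * Returns 0 on success or a negative errno value.  On success either
     * *out points to a new message, or *skip is 1 and *out is NULL.
     * state_changed and want_skip are hypothetical inputs for the demo.
     */
    static int sketch_alloc_msg(size_t len, int state_changed, int want_skip,
                                struct sketch_msg **out, int *skip)
    {
        *out = NULL;
        *skip = 0;
    
        if (state_changed)
            return -EAGAIN; /* an error the old convention could not express */
        if (want_skip) {
            *skip = 1;
            return 0;
        }
        *out = malloc(sizeof(**out));
        if (!*out)
            return -ENOMEM;
        (*out)->len = len;
        return 0;
    }
    ```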
  16. @liewegas @gregkh

    libceph: avoid dropping con mutex before fault

    liewegas authored gregkh committed
    (cherry picked from commit 8636ea6)
    
    The ceph_fault() function takes the con mutex, so we should avoid
    dropping it before calling it.  This fixes a potential race with
    another thread calling ceph_con_close(), or _open(), or similar (we
    don't reverify con->state after retaking the lock).
    
    Add annotation so that lockdep realizes we will drop the mutex before
    returning.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  17. @liewegas @gregkh

    libceph: verify state after retaking con lock after dispatch

    liewegas authored gregkh committed
    (cherry picked from commit 7b862e0)
    
    We drop the con mutex when delivering a message.  When we retake the
    lock, we need to verify we are still in the OPEN state before
    preparing to read the next tag, or else we risk stepping on a
    connection that has been closed.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  18. @liewegas @gregkh

    libceph: revoke mon_client messages on session restart

    liewegas authored gregkh committed
    (cherry picked from commit 4f471e4)
    
    Revoke all mon_client messages when we shut down the old connection.
    This is mostly moot since we are re-using the same ceph_connection,
    but it is cleaner.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  19. @liewegas @gregkh

    libceph: fix handling of immediate socket connect failure

    liewegas authored gregkh committed
    (cherry picked from commit 8007b8d)
    
    If the connect() call immediately fails such that sock == NULL, we
    still need con_close_socket() to reset our socket state to CLOSED.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  20. @liewegas @gregkh

    libceph: clear all flags on con_close

    liewegas authored gregkh committed
    Signed-off-by: Sage Weil <sage@inktank.com>
    (cherry picked from commit 43c7427)
  21. @liewegas @gregkh

    libceph: clean up con flags

    liewegas authored gregkh committed
    (cherry picked from commit 4a86169)
    
    Rename flags with CON_FLAG prefix, move the definitions into the c file,
    and (better) document their meaning.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  22. @liewegas @gregkh

    libceph: replace connection state bits with states

    liewegas authored gregkh committed
    (cherry picked from commit 8dacc7d)
    
    Use a simple set of 6 enumerated values for the socket states (CON_STATE_*)
    and use those instead of the state bits.  All of the con->state checks are
    now under the protection of the con mutex, so this is safe.  It also
    simplifies many of the state checks because we can check for anything other
    than the expected state instead of various bits for races we can think of.
    
    This appears to hold up well to stress testing both with and without socket
    failure injection on the server side.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
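    
    A rough illustration of why a single enumerated state simplifies the
    checks: code can reject anything other than the one state it expects
    instead of testing several bits. The six names below are assumptions
    chosen for the example and are not guaranteed to match the kernel's
    CON_STATE_* values.
    
    ```c
    #include <errno.h>
    
    /* Six hypothetical, mutually exclusive connection states. */
    enum sketch_con_state {
        SKETCH_CON_CLOSED = 1,
        SKETCH_CON_PREOPEN,
        SKETCH_CON_CONNECTING,
        SKETCH_CON_NEGOTIATING,
        SKETCH_CON_OPEN,
        SKETCH_CON_STANDBY
    };
    
    struct sketch_con { enum sketch_con_state state; };
    
    /*
     * The caller is assumed to hold the connection mutex.  Anything other
     * than OPEN means the connection changed underneath us, so back out.
     */
    static int sketch_continue_read(const struct sketch_con *con)
    {
        if (con->state != SKETCH_CON_OPEN)
            return -EAGAIN;
        return 0;
    }
    ```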
  23. @liewegas @gregkh

    libceph: drop unnecessary CLOSED check in socket state change callback

    liewegas authored gregkh committed
    (cherry picked from commit d7353dd)
    
    
    If we are CLOSED, the socket is closed and we won't get these.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  24. @liewegas @gregkh

    libceph: close socket directly from ceph_con_close()

    liewegas authored gregkh committed
    (cherry picked from commit ee76e07)
    
    It is simpler to do this immediately, since we already hold the con mutex.
    It also avoids the need to deal with a not-quite-CLOSED socket in con_work.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  25. @liewegas @gregkh

    libceph: drop gratuitous socket close calls in con_work

    liewegas authored gregkh committed
    (cherry picked from commit 2e8cb10)
    
    If the state is CLOSED or OPENING, we shouldn't have a socket.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  26. @liewegas @gregkh

    libceph: move ceph_con_send() closed check under the con mutex

    liewegas authored gregkh committed
    (cherry picked from commit a59b55a)
    
    Take the con mutex before checking whether the connection is closed to
    avoid racing with someone else closing it.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  27. @liewegas @gregkh

    libceph: move msgr clear_standby under con mutex protection

    liewegas authored gregkh committed
    (cherry picked from commit 0065093)
    
    Avoid dropping and retaking con->mutex in the ceph_con_send() case by
    leaving locking up to the caller.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  28. @liewegas @gregkh

    libceph: fix fault locking; close socket on lossy fault

    liewegas authored gregkh committed
    (cherry picked from commit 3b5ede0)
    
    If we fault on a lossy connection, we should still close the socket
    immediately, and do so under the con mutex.
    
    We should also take the con mutex before printing out the state bits in
    the debug output.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  29. @liewegas @gregkh

    libceph: reset connection retry on successful negotiation

    liewegas authored gregkh committed
    (cherry picked from commit 85effe1)
    
    We exponentially back off when we encounter connection errors.  If several
    errors accumulate, we will eventually wait ages before even trying to
    reconnect.
    
    Fix this by resetting the backoff counter after a successful negotiation/
    connection with the remote node.  Fixes ceph issue #2802.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
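    
    The backoff behavior and the reset can be modelled in a few lines.
    This is a sketch under assumptions: the base and maximum delays, the
    doubling rule, and all names are invented for the example and are not
    taken from the messenger code.
    
    ```c
    #define SKETCH_BASE_DELAY_MS 250UL    /* assumed starting delay */
    #define SKETCH_MAX_DELAY_MS  300000UL /* assumed upper bound */
    
    struct sketch_con { unsigned long delay_ms; };
    
    /* Each fault doubles the delay, up to the maximum; returns the new delay. */
    static unsigned long sketch_on_fault(struct sketch_con *con)
    {
        if (con->delay_ms == 0)
            con->delay_ms = SKETCH_BASE_DELAY_MS;
        else if (con->delay_ms < SKETCH_MAX_DELAY_MS)
            con->delay_ms *= 2;
        if (con->delay_ms > SKETCH_MAX_DELAY_MS)
            con->delay_ms = SKETCH_MAX_DELAY_MS;
        return con->delay_ms;
    }
    
    /* A successful negotiation clears the accumulated backoff. */
    static void sketch_on_negotiated(struct sketch_con *con)
    {
        con->delay_ms = 0;
    }
    ```
    
    Without the reset, a connection that failed several times in the past
    would start its next retry cycle from the already-inflated delay.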
  30. @liewegas @gregkh

    libceph: protect ceph_con_open() with mutex

    liewegas authored gregkh committed
    (cherry picked from commit 5469155)
    
    Take the con mutex while we are initiating a ceph open.  This is necessary
    because the connection may have previously been in use and then closed,
    which could result in a racing workqueue running con_work().
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  31. @liewegas @gregkh

    libceph: (re)initialize bio_iter on start of message receive

    liewegas authored gregkh committed
    (cherry picked from commit a410702)
    
    Previously, we were opportunistically initializing the bio_iter if it
    appeared to be uninitialized in the middle of the read path.  The problem
    is a sequence like:
    
     - start reading message
     - initialize bio_iter
     - read half a message
     - messenger fault, reconnect
     - restart reading message
     - ** bio_iter now non-NULL, not reinitialized **
     - read past end of bio, crash
    
    Instead, initialize the bio_iter unconditionally when we allocate/claim
    the message for read.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
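    
    The sequence listed in this commit can be mimicked with a plain read
    cursor. In this user-space sketch (hypothetical names, no real bio
    structures) the cursor is reset unconditionally whenever a read
    starts, which is the idea behind the fix. Unlike the kernel path, the
    sketch also clamps reads to the buffer so that it cannot overrun.
    
    ```c
    #include <stddef.h>
    
    struct sketch_msg {
        const char *data;
        size_t len;
        size_t cursor; /* stands in for the bio iterator */
    };
    
    /* Called every time a message is allocated or claimed for reading. */
    static void sketch_begin_read(struct sketch_msg *m)
    {
        m->cursor = 0;
    }
    
    /* Consumes up to n bytes and returns how many were consumed. */
    static size_t sketch_read_some(struct sketch_msg *m, size_t n)
    {
        size_t left = m->len - m->cursor;
    
        if (n > left)
            n = left;
        m->cursor += n;
        return n;
    }
    ```
    
    After a simulated fault halfway through, calling sketch_begin_read()
    again lets the restarted read consume the whole message rather than
    continuing from the stale position.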
  32. @liewegas @gregkh

    libceph: resubmit linger ops when pg mapping changes

    liewegas authored gregkh committed
    (cherry picked from commit 6194ea8)
    
    The linger op registration (i.e., watch) modifies the object state.  As
    such, the OSD will reply with success if it has already applied without
    doing the associated side-effects (setting up the watch session state).
    If we lose the ACK and resubmit, we will see success but the watch will not
    be correctly registered and we won't get notifies.
    
    To fix this, always resubmit the linger op with a new tid.  We accomplish
    this by re-registering as a linger (i.e., 'registered') if we are not yet
    registered.  Then the second loop will treat this just like a normal
    case of re-registering.
    
    This mirrors a similar fix on the userland ceph.git, commit 5dd68b95, and
    ceph bug #2796.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  33. @liewegas @gregkh

    libceph: fix mutex coverage for ceph_con_close

    liewegas authored gregkh committed
    (cherry picked from commit 8c50c81)
    
    Hold the mutex while twiddling all of the state bits to avoid possible
    races.  While we're here, make note of why we cannot close the socket
    directly.
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  34. @liewegas @gregkh

    libceph: report socket read/write error message

    liewegas authored gregkh committed
    (cherry picked from commit 3a140a0)
    
    We need to set error_msg to something useful before calling ceph_fault();
    do so here for try_{read,write}().  This is more informative than
    
    libceph: osd0 192.168.106.220:6801 (null)
    
    Signed-off-by: Sage Weil <sage@inktank.com>
    Reviewed-by: Alex Elder <elder@inktank.com>
    Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  35. @gregkh

    libceph: prevent the race of incoming work during teardown

    Guanjun He authored gregkh committed
    (cherry picked from commit a2a3258)
    
    Add an atomic variable 'stopping' as a flag in struct ceph_messenger,
    set this flag to 1 in ceph_destroy_client(), and add a check in
    ceph_data_ready() that tests the flag and simply returns if it is set (1).
    
    Signed-off-by: Guanjun He <gjhe@suse.com>
    Reviewed-by: Sage Weil <sage@inktank.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
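    
    The stopping-flag idea can be shown with C11 atomics in user space.
    This is a minimal sketch, assuming a simplified messenger struct. The
    names and the 'queued' counter are hypothetical, and the kernel uses
    its own atomic_t API rather than <stdatomic.h>.
    
    ```c
    #include <stdatomic.h>
    #include <stdbool.h>
    
    struct sketch_messenger {
        atomic_int stopping; /* set to 1 once teardown has begun */
        int queued;          /* counts work accepted by the callback */
    };
    
    /* Teardown path: mark the messenger as stopping before freeing anything. */
    static void sketch_destroy(struct sketch_messenger *m)
    {
        atomic_store(&m->stopping, 1);
    }
    
    /* Data-ready callback: ignore incoming events once stopping is set. */
    static bool sketch_data_ready(struct sketch_messenger *m)
    {
        if (atomic_load(&m->stopping))
            return false;
        m->queued++;
        return true;
    }
    ```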