Commits on May 24, 2018
  1. Clean out Python 2.6 leftovers from splice.py

    smerritt committed May 24, 2018
    ctypes.c_ssize_t has existed since Python 2.7 (not just some fuzzy
    2.7.x point release, but the initial CPython tag "v2.7"), so we don't
    need to define our own.
    
    Change-Id: Ib3c9b162ceffabd78622ae51c5accc4b7ba1294d
  2. Remove our reimplemented logging.NullHandler

    smerritt committed May 24, 2018
    Python 2.6 didn't have one, so we'd try to find logging.NullHandler
    but fall back to our own. Since 2.7+ has logging.NullHandler, we can
    just use it.
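
    The pattern being deleted was the usual 2.6 compatibility dance,
    roughly like this (a sketch, not the exact code):

        import logging

        try:
            NullHandler = logging.NullHandler      # Python 2.7+
        except AttributeError:
            class NullHandler(logging.Handler):    # home-grown 2.6 fallback
                def emit(self, record):
                    pass

        # Now we can just do this everywhere:
        logging.getLogger('swift').addHandler(logging.NullHandler())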
    
    Change-Id: Ie2c27407efc2882e698abe6e4379a00a1d3f4301
  3. Always pass capitalize_response_headers=False to eventlet.wsgi.server()

    smerritt committed May 24, 2018
    For a while, this was conditional because we supported old Eventlet
    versions that didn't have this keyword arg. Now, we require new-enough
    Eventlet that it's always available, so let's get rid of the
    conditional crud.
    
    The flag was introduced in Eventlet 0.15, and we require >= 0.17.4.
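
    In outline, the change removes a conditional kwargs dance in favor of
    passing the flag unconditionally (a sketch; how the old code detected
    support is an assumption on my part):

        import eventlet
        import eventlet.wsgi

        # Before: only pass the kwarg if this Eventlet understands it.
        kwargs = {}
        if eventlet.version_info >= (0, 15, 0):
            kwargs['capitalize_response_headers'] = False
        # eventlet.wsgi.server(sock, app, **kwargs)

        # After: Eventlet >= 0.17.4 is required anyway, so just pass it.
        # eventlet.wsgi.server(sock, app, capitalize_response_headers=False)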
    
    Change-Id: Id089e29e7ecfc8cec79c520f604aa01bdae0dcf0
Commits on May 2, 2018
  1. s3api: simplify BaseAclHandler.request_with

    smerritt committed May 2, 2018
    It's just constructing a new object; there's no need for it to be a
    context manager.
    
    Change-Id: I9716f6c4e45bcdf80543cf661f922da681d602aa
Commits on Apr 25, 2018
  1. Make reconstructor go faster with --override-devices

    smerritt committed Mar 23, 2018
    The object reconstructor will now fork all available worker processes
    when operating on a subset of local devices.
    
    Example:
      A system has 24 disks, named "d1" through "d24"
      reconstructor_workers = 8
      invoked with --override-devices=d1,d2,d3,d4,d5,d6
    
    In this case, the reconstructor will now use 6 worker processes, one
    per disk. The old behavior was to use 2 worker processes, one for d1,
    d3, and d5 and the other for d2, d4, and d6 (because 24 / 8 = 3, so we
    assigned 3 disks per worker before creating another).
    
    I think the new behavior better matches operators' expectations. If I
    give a concurrent program six tasks to do and tell it to operate on up
    to eight at a time, I'd expect it to do all six tasks at once, not run
    two concurrent batches of three tasks apiece.
    
    This has no effect when --override-devices is not specified. When
    operating on all local devices instead of a subset, the new and old
    code produce the same result.
    
    The reconstructor's behavior now matches the object replicator's
    behavior.
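
    The arithmetic, as a quick sketch (variable names are illustrative,
    not the reconstructor's own):

        reconstructor_workers = 8
        local_devices = 24
        override = ['d1', 'd2', 'd3', 'd4', 'd5', 'd6']

        # Old: split all 24 disks into groups of 24 // 8 = 3 before forking,
        # which left d1..d6 spread across only 2 workers.
        devices_per_worker = local_devices // reconstructor_workers
        old_workers = -(-len(override) // devices_per_worker)    # ceil -> 2

        # New: fork one worker per overridden device, up to the configured cap.
        new_workers = min(len(override), reconstructor_workers)  # 6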
    
    Change-Id: Ib308c156c77b9b92541a12dd7e9b1a8ea8307a30
Commits on Apr 24, 2018
  1. Multiprocess object replicator

    smerritt authored and tipabu committed Mar 23, 2018
    Add a multiprocess mode to the object replicator. Setting the
    "replicator_workers" setting to a positive value N will result in the
    replicator using up to N worker processes to perform replication
    tasks.
    
    At most one worker per disk will be spawned, so one can set
    replicator_workers=99999999 to always get one worker per disk
    regardless of the number of disks in each node. This is the same
    behavior that the object reconstructor has.
    
    Worker process logs will have a bit of information prepended so
    operators can tell which messages came from which worker. It looks
    like this:
    
      [worker 1/2 pid=16529] 154/154 (100.00%) partitions replicated in 1.02s (150.87/sec, 0s remaining)
    
    The prefix is "[worker M/N pid=P] ", where M is the worker's index, N
    is the total number of workers, and P is the process ID. Every message
    from the replicator's logger will have the prefix; this includes
    messages from down in diskfile, but does not include things printed to
    stdout or stderr.
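
    A minimal sketch of one way to attach such a prefix with a
    LoggerAdapter (illustrative only; not necessarily how the replicator
    wires it up):

        import logging
        import os

        class WorkerPrefixAdapter(logging.LoggerAdapter):
            # Prepend "[worker M/N pid=P] " to every message from this logger.
            def process(self, msg, kwargs):
                prefix = '[worker %(index)d/%(total)d pid=%(pid)d] ' % self.extra
                return prefix + msg, kwargs

        logging.basicConfig(level=logging.INFO)
        log = WorkerPrefixAdapter(logging.getLogger('object-replicator'),
                                  {'index': 1, 'total': 2, 'pid': os.getpid()})
        log.info('154/154 (100.00%) partitions replicated in 1.02s')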
    
    Drive-by fix: don't dump recon stats when replicating only certain
    policies. When running the object replicator with replicator_workers >
    0 and "--policies=X,Y,Z", the replicator would update recon stats
    after running. Since it only ran on a subset of objects, it should not
    update recon, much like it doesn't update recon when run with
    --devices or --partitions.
    
    Change-Id: I6802a9ad9f1f9b9dafb99d8b095af0fdbf174dc5
Commits on Apr 20, 2018
  1. py3: port gatekeeper

    smerritt committed Apr 20, 2018
    There were a couple of cleanups in swob as part of this.  First,
    status lines are always native str objects (as PEP 3333 wants), rather
    than being encoded to bytes under py3. Second, _resp_body_property
    now works (only) with bytestrings from the app iter.
    
    In gatekeeper, we now deal with dict.items() returning an object of
    type "dict_items" in py3, not a list. Also fixed a NameError caused by
    py2's list comprehensions leaking variables to function scope where
    py3's don't.
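
    Both differences are easy to demonstrate in isolation (a standalone
    illustration, not the gatekeeper code itself):

        headers = {'X-Timestamp': '1', 'X-Backend-Foo': '2'}

        # py3: dict.items() is a "dict_items" view, not a list, so code that
        # needs list behavior has to ask for it explicitly.
        header_list = list(headers.items())

        def leaky():
            [k for k in headers]    # py2 leaks k into the function's scope...
            return k                # ...so this works on py2, NameError on py3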
    
    Change-Id: I6da8eceb91edb2b47aa345d61b825c7199a5569b
Commits on Apr 11, 2018
  1. Add ability to run specific tests in py35 tox environment.

    smerritt committed Apr 11, 2018
    This lets you run something like
    
        $ tox -e py35 -- -s test.unit.common.test_swob
    
    and just run the swob tests. This is handy if you've got one or two
    test failures to debug or if you're trying to get another test file
    passing under Python 3.
    
    Also removed the old py34 environment. Nothing uses it.
    
    Change-Id: I244b8903ef0a445abd028004bf02d6c581a8afe2
Commits on Apr 10, 2018
  1. Improve check for O_TMPFILE support in unit tests

    smerritt committed Apr 10, 2018
    On Python < 3.4, there is no os.O_TMPFILE, so the check for O_TMPFILE
    support was always reporting no-support. We now use utils.O_TMPFILE,
    which is always defined.
    
    Also, when opening an anonymous tempfile, you must open the directory
    (not a file in the directory) with three flags or-ed together:
    O_DIRECTORY, O_WRONLY, and O_TMPFILE. Calling open() with only a
    subset of those flags fails with either EISDIR or EINVAL. The check
    was trying to open a file in the temp dir, so it always reported
    no-support for O_TMPFILE. We now correctly open the tempdir.
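
    A sketch of a support check along those lines (the fallback constant
    and the errno handling are assumptions of mine, not necessarily the
    exact code in utils or the tests):

        import errno
        import os
        import tempfile

        # os.O_TMPFILE only exists on Python 3.4+; glibc defines the flag as
        # __O_TMPFILE | O_DIRECTORY, hence the fallback value.
        O_TMPFILE = getattr(os, 'O_TMPFILE', 0o20000000 | os.O_DIRECTORY)

        def supports_o_tmpfile(dirpath=None):
            dirpath = dirpath or tempfile.gettempdir()
            try:
                # open the directory itself, not a file inside it
                fd = os.open(dirpath, O_TMPFILE | os.O_WRONLY)
            except OSError as err:
                if err.errno in (errno.EINVAL, errno.EISDIR, errno.EOPNOTSUPP):
                    return False
                raise
            os.close(fd)
            return True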
    
    Change-Id: Iaf48ad9aecea73df4e794818dd15070360bff19f
Commits on Mar 23, 2018
  1. Remove object replicator's lockup detector/mitigator.

    smerritt committed Mar 13, 2018
    Sometimes, an rsync process just won't die. You can send SIGKILL, but
    it isn't very effective. This is sometimes seen due to attempted I/O
    on a failing disk; with some disks, an rsync process won't die until
    Linux finishes the current I/O operation (whether success or failure),
    but the disk can't succeed and will retry forever instead of
    failing. The net effect is an unkillable rsync process.
    
    The replicator was dealing with this by sending SIGKILL to any rsync
    that ran too long, then calling waitpid() in a loop[1] until the rsync
    died so it could reap the child process. This worked pretty well
    unless it met an unkillable rsync; in that case, one greenthread would
    end up blocked for a very long time. Since the replicator's main loop
    works by (a) gathering all replication jobs, (b) performing them in
    parallel with some limited concurrency, then (c) waiting for all jobs
    to complete, an unkillable rsync would block the entire replicator.
    
    There was an attempt to address this by adding a lockup detector: if
    the replicator failed to complete any replication cycle in N seconds
    [2], all greenthreads except the main one would be terminated and the
    replication cycle restarted. It works okay, but only handles total
    failure. If you have 20 greenthreads working and 19 of them are
    blocked on unkillable rsyncs, then as long as the 20th greenthread
    manages to replicate at least one partition every N seconds, the
    replicator will just keep limping along.
    
    This commit removes the lockup detector. Instead, when a replicator
    greenthread happens upon an rsync that doesn't die promptly after
    receiving SIGKILL, the process handle is sent to a background
    greenthread; that background greenthread simply waits for those rsync
    processes to finally die and reaps them. This lets the replicator make
    better progress in the presence of unkillable rsyncs.
    
    [1] It's a call to subprocess.Popen.wait(); the looping and sleeping
    happens in eventlet.
    
    [2] The default is 1800 seconds = 30 minutes, but the value is
    configurable.
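
    The new arrangement amounts to handing the stubborn process off to a
    single waiter, roughly like this (a sketch in plain eventlet, not the
    replicator's exact code):

        import eventlet
        from eventlet.queue import Queue

        _stuck_rsyncs = Queue()

        def _reaper():
            # Background greenthread: collect rsync Popen handles that shrugged
            # off SIGKILL and reap each one whenever it finally exits. The
            # handles are assumed to come from eventlet's green subprocess, so
            # wait() yields instead of blocking the whole process.
            while True:
                proc = _stuck_rsyncs.get()
                proc.wait()

        eventlet.spawn_n(_reaper)

        def hand_off(proc):
            # Called by a worker greenthread that has given up on this rsync.
            _stuck_rsyncs.put(proc)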
    
    Change-Id: If6dc7b003e18ab4e8a5ed687c965025ebd417dfa
Commits on Mar 13, 2018
  1. Don't double-filter replication jobs

    smerritt committed Mar 8, 2018
    ObjectReplicator.collect_jobs() takes and correctly applies the
    various overrides, so there's no need to check the returned jobs
    against the overrides.
    
    Change-Id: I2a59b26410d1732a5f2c8f1f32e397d77550860e
Commits on Mar 6, 2018
  1. Temporarily disable flaky test.

    smerritt committed Mar 6, 2018
    test.unit.obj.test_replicator.TestObjectReplicator.test_replicate_lockup_detector
    is failing a lot in the gate. Let's disable it for now so that other
    patches can continue to land.
    
    Change-Id: I1790ebcbc0c8d075c2786aebca4e8ccf7547b178
  2. Support -d <devs> and -p <partitions> in DB replicators.

    smerritt committed Feb 28, 2018
    Similar to the object replicator and reconstructor, these arguments
    are comma-separated lists of device names and partitions,
    respectively, on which the account or container replicator will
    operate. Other devices and partitions are ignored.
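
    For example (the config path, device names, and partition numbers are
    illustrative; the usual once-mode invocation is assumed):

        $ swift-container-replicator /etc/swift/container-server.conf --once -d sdb1,sdc1 -p 12345,67890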
    
    Change-Id: Ic108f5c38f700ac4c7bcf8315bf4c55306951361
Commits on Feb 17, 2018
  1. Add handoffs-only mode to DB replicators.

    smerritt committed Feb 17, 2018
    The object reconstructor has a handoffs-only mode that is very useful
    when a cluster requires rapid rebalancing, like when disks are nearing
    fullness. This mode's goal is to remove handoff partitions from disks
    without spending effort on primary partitions. The object replicator
    has a similar mode, though it varies in some details.
    
    This commit adds a handoffs-only mode to the account and container
    replicators.
    
    Change-Id: I588b151ee65ae49d204bd6bf58555504c15edf9f
    Closes-Bug: 1668399
  2. Make DB replicators ignore non-partition directories

    smerritt committed Feb 15, 2018
    If a cluster operator has some tooling that makes directories in
    /srv/node/<disk>/accounts, then the account replicator will treat
    those directories as partition dirs and may remove empty
    subdirectories contained therein. This wastes time and confuses the
    operator.
    
    This commit makes DB replicators skip partition directories whose
    names don't look like positive integers. This doesn't completely avoid
    the problem since an operator can still use an all-digit name, but it
    will skip directories like "tmp21945".
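
    The filter boils down to something like this (a sketch, not the DB
    replicator's actual directory walk):

        import os

        def looks_like_partition(name):
            # partition dirs are named with plain non-negative integers;
            # things like "tmp21945" are operator tooling and get skipped
            return name.isdigit()

        def partition_dirs(datadir):
            return [name for name in os.listdir(datadir)
                    if looks_like_partition(name)]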
    
    Change-Id: I8d6682915a555f537fc0ce8c39c3d52c99ff3056
Commits on Feb 16, 2018
  1. Small cleanup in DB replicator's error handling

    smerritt committed Feb 16, 2018
    Generally, we shouldn't compare integers with "is". It happens to work
    because (a) CPython has only one instance of each integer between -5
    and 256, and (b) errno.ENOTEMPTY == 66 on Linux, but it's better not
    to rely on those details.
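
    As a quick illustration of why the "is" comparison only happens to
    work:

        import errno

        err = 60 + 6                     # same value as errno.ENOTEMPTY on Linux
        print(err is errno.ENOTEMPTY)    # True, but only thanks to CPython's
                                         # small-integer cache -- don't rely on it
        print(err == errno.ENOTEMPTY)    # the comparison that is actually meant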
    
    Change-Id: I494a4adbb2bb35adcf83b2d23e790c1b220e75c7
Commits on Feb 2, 2018
  1. Cleanup for iterators in SegmentedIterable

    smerritt committed Jan 27, 2018
    We had a pair of large, complicated iterators to handle fetching all
    the segment data, and they were hard to read and think about. I tried
    to break them out into some simpler pieces:
    
     * one to handle coalescing multiple requests to the same segment
    
     * one to handle fetching the bytes from each segment
    
     * one to check that the download isn't taking too long
    
     * one to count the bytes and make sure we sent the right number
    
     * one to catch errors and handle cleanup
    
    It's more nesting, but each level now does just one thing.
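
    The shape of the result, in a generic sketch (these are stand-ins for
    the idea, not SegmentedIterable's actual helpers):

        import time

        def watchdog(chunks, deadline):
            # one layer: check that the download isn't taking too long
            for chunk in chunks:
                if time.time() > deadline:
                    raise Exception('segment download took too long')
                yield chunk

        def count_bytes(chunks, expected):
            # another layer: count the bytes and make sure we sent the right number
            sent = 0
            for chunk in chunks:
                sent += len(chunk)
                yield chunk
            if sent != expected:
                raise Exception('sent %d bytes, expected %d' % (sent, expected))

        # Each generator does exactly one job; the full pipeline just nests them:
        # body = catch_errors(count_bytes(watchdog(fetch_segments(...), deadline), length))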
    
    Change-Id: If6f5cbd79edeff6ecb81350792449ce767919bcc
Commits on Feb 1, 2018
  1. Remove some cruft from ratelimit tests

    smerritt committed Feb 1, 2018
    The tests were carefully setting up a mock for the 'http_connect'
    function in the ratelimit module, but there is no such function
    imported by the ratelimit module.
    
    As far as I can tell, this has been the case since the ratelimit
    middleware first appeared in 72d40bd (Mon Oct 4 14:11:48 2010 -0700).
    
    Change-Id: If047184c6435aa1647050f50b499dc9feff4318d
Commits on Jan 23, 2018
  1. Make statsd errors correspond to 5xx only

    smerritt committed Jan 23, 2018
    The goal is to make the successful statsd buckets
    (e.g. "object-server.GET.timing") have timing information for all the
    requests that the server handled correctly, while the error buckets
    (e.g. "object-server.GET.errors.timing") have the rest.
    
    Currently, we don't do that great a job of it. We special-case a few
    4xx status codes (404, 412, 416) to not count as errors, but we leave
    some pretty large holes. If you're graphing errors, you'll see spikes
    when clients send bogus requests (400) or fail to re-authenticate
    (403). You'll also see spikes when your drives are unmounted (507)
    and when there are bugs that need fixing (500).
    
    This commit makes .errors.timing be just 5xx in the hope that its
    graph will be more useful.
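
    The new rule amounts to this one-liner (a sketch):

        def counts_as_error(status_int):
            # feed .errors.timing only for server-side failures; 4xx client
            # mistakes now land in the normal timing buckets
            return status_int // 100 == 5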
    
    Change-Id: I92b41bcbb880c0688c37ab231c19ebe984b18215
Commits on Jan 17, 2018
  1. Improve object-updater's stats logging

    smerritt committed Jan 12, 2018
    The object updater has five different stats, but its logging only told
    you two of them (successes and failures), and it only told you after
    finishing all the async_pendings for a device. If you have a cluster
    that's been sick and has millions upon millions of async_pendings
    laying around, then your object-updaters are frustratingly
    silent. I've seen one cluster with around 8 million async_pendings per
    disk where the object-updaters only emitted stats every 12 hours.
    
    Yes, if you have StatsD logging set up properly, you can go look at
    your graphs and get real-time feedback on what it's doing. If you
    don't have that, all you get is a frustrating silence.
    
    Now, the object updater tells you all of its stats (successes,
    failures, quarantines due to bad pickles, unlinks, and errors), and it
    tells you incremental progress every five minutes. The logging at the
    end of a pass remains and has been expanded to also include all stats.
    
    Also included is a small change to what counts as an error: unmounted
    drives no longer do. The goal is that only abnormal things count as
    errors, like permission problems, malformed filenames, and so
    on. These are things that should never happen, but if they do, may
    require operator intervention. Drives fail, so logging an error upon
    encountering an unmounted drive is not useful.
    
    Change-Id: Idbddd507f0b633d14dffb7a9834fce93a10359ab
  2. Remove old post-as-copy leftovers from tests.

    smerritt committed Jan 17, 2018
    Since commit 1e79f82, we no longer need to test with
    post_as_copy=True, because post_as_copy doesn't exist at all.
    
    Change-Id: I9c96ce0b812d877bbe11bdb50eb160d6ffa5933d
  3. Don't make async_pendings during object expiration

    smerritt committed Jan 10, 2018
    After deleting an object, the object expirer deletes the corresponding
    row from the expirer queue by making DELETE requests directly to the
    container servers. The same thing happens after attempting to delete
    an object, but failing because the object has already been deleted. If
    the DELETE requests fail, then the expirer will encounter that row
    again on its next pass and retry the DELETE at that time. Therefore,
    it is not necessary for the object server to write an async_pending
    for that queue row's deletion.
    
    Currently, however, two of the object servers do write such
    async_pendings. Given Rc container replicas, that's 2 * Rc updates
    from async_pendings and another Rc from the object expirer
    directly. Given a typical Rc of 3, that's 9 container updates per
    expiring object.
    
    This commit makes the object server write no async_pendings for DELETE
    requests coming from the object expirer. This reduces the number of
    container server requests to Rc (typically 3), all issued directly
    from the object expirer.
    
    Closes-Bug: 1076202
    Change-Id: Icd63c80c73f864d2561e745c3154fbfda02bd0cc
Commits on Jan 16, 2018
  1. Minor cleanup in monitoring doc.

    smerritt committed Jan 16, 2018
    Change-Id: Ia21f8743bfd745f2579db8658624f888461c2cc2
Commits on Jan 11, 2018
  1. Limit object-expirer queue updates on object DELETE, PUT, POST

    smerritt committed Jan 9, 2018
    Currently, on deletion of an expiring object, each object server
    writes an async_pending to update the expirer queue and remove the row
    for that object. Each async_pending is processed by the object updater
    and results in all container replicas being updated. This is also true
    for PUT and POST requests for existing expiring objects.
    
    If you have Rc container replicas and Ro object replicas (or EC
    pieces), then the number of expirer-queue requests made is Rc * Ro [1].
    
    For a 3-replica cluster, that number is 9, which is not terrible. For
    a cluster with 3 container replicas and a 15+4 EC scheme, that number
    is 57, which is terrible.
    
    This commit makes it so at most two object servers will write out the
    async_pending files needed to update the queue, dropping the request
    count to 2 * Rc [2]. The object server now looks for a header
    "X-Backend-Clean-Expiring-Object-Queue: <true|false>" and writes or
    does not write expirer-queue async_pendings as appropriate. The proxy
    sends that header to 2 object servers.
    
    The queue update is not necessary for the proper functioning of the
    object expirer; if the queue update fails, then the object expirer
    will try to delete the object, receive 404s or 412s, and remove the
    queue entry. Removal on object PUT/POST/DELETE is helpful but not
    required.
    
    [1] assuming no retries needed by the object updater
    
    [2] or Rc, if a cluster has only one object replica
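
    The before/after request counts, as a quick sketch:

        def queue_update_requests(rc, ro):
            # before: every object server wrote an async_pending  -> Rc * Ro
            # after: at most two object servers write one         -> 2 * Rc
            #        (or Rc, when there is only one object replica)
            return {'before': rc * ro, 'after': min(2, ro) * rc}

        print(queue_update_requests(rc=3, ro=3))    # {'before': 9, 'after': 6}
        print(queue_update_requests(rc=3, ro=19))   # 15+4 EC: 57 -> 6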
    
    Change-Id: I4d64f4d1d107c437fd3c23e19160157fdafbcd42
Commits on Jan 10, 2018
  1. proxy: make the right number of container updates

    smerritt committed Jan 6, 2018
    When the proxy is putting X-Container headers into object PUT
    requests, it should put out just enough to make the container update
    durable in the worst case. It shouldn't do more, since that results in
    extra work for the container servers; and it shouldn't do less, since
    that results in objects not showing up in listings.
    
    The current code gets the number right as long as you have 3 container
    replicas and an odd number of object replicas, but it comes up with
    some bogus numbers in other cases. The number it computes is
    (object-quorum + 1).
    
    This patch changes the number to (container-quorum +
    max_put_failures).
    
    Example: given an EC 12+5 policy and 3 container replicas, you can
    lose up to 4 connections and still succeed. Since you need to have 2
    container updates happen for durability, you need 6 connections to
    have X-Container headers. That way, you can lose 4 and still have 2
    left. The current code would put X-Container headers on 14 of the
    connections, resulting in more than double the workload on the
    container servers; this patch changes the number to 6.
    
    Example 2: given a (crazy) EC 3+6 policy and 3 container replicas, you
    can lose up to 5 connections, so you need X-Container headers on
    7. The current code only sends 5, giving a worst-case result of a PUT
    succeeds but never reaches the containers. This patch changes the
    number to 7.
    
    Other examples:
                              |  current  |  this change  |
                            --+-----------+---------------+
    EC 10+4, 3x container     |    12     |      5        |
    EC 10+4, 5x container     |    12     |      6        |
    EC 15+4, 3x container     |    17     |      5        |
    EC 15+4, 5x container     |    17     |      6        |
    EC 4+8, 3x container      |    6      |      9        |
    7x object, 3x container   |    5      |      5        |
    6x object, 3x container   |    4      |      5        |
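
    In code form, the new calculation is roughly this (a sketch; the
    quorum helpers follow the description above and are not necessarily
    Swift's exact functions):

        def container_quorum(container_replicas):
            return (container_replicas + 1) // 2     # e.g. 2 of 3 replicas

        def x_container_header_count(container_replicas, object_nodes,
                                      object_quorum):
            # enough headers that, even after losing every object connection
            # we can afford to lose, a quorum of container updates still lands
            max_put_failures = object_nodes - object_quorum
            return container_quorum(container_replicas) + max_put_failures

        # EC 12+5 (PUT quorum 13) with 3 container replicas: 2 + 4 = 6
        print(x_container_header_count(3, 17, 13))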
    
    Change-Id: I34efd48655b890340912810ab111bb63445e5c8b
Commits on Jan 5, 2018
  1. Fix time skew when using X-Delete-After

    smerritt and alistairncoles committed Jan 5, 2018
    When a client sent "X-Delete-After: <n>", the proxy and all object
    servers would each compute X-Delete-At as "int(time.time() +
    n)". However, since they don't all compute it at exactly the same
    time, the objects stored on disk can end up with differing values for
    X-Delete-At, and in that case, the object-expirer queue has multiple
    entries for the same object (one for each distinct X-Delete-At value).
    
    This commit makes two changes, either one of which is sufficient to
    fix the bug.
    
    First, after computing X-Delete-At from X-Delete-After, X-Delete-After
    is removed from the request's headers. Thus, the proxy computes
    X-Delete-At, and the object servers don't, so there's only a single
    value.
    
    Second, computation of X-Delete-At now uses the request's X-Timestamp
    instead of time.time(). In the proxy, these values are essentially the
    same; the proxy is responsible for setting X-Timestamp. In the object
    server, this ensures that all computed X-Delete-At values are
    identical, even if the object servers' clocks are not, or if one
    object server takes an extra second to respond to a PUT request.
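
    In essence (a sketch of the computation, not the proxy's exact code):

        def compute_x_delete_at(headers):
            # base the expiry on the request's X-Timestamp, not time.time(),
            # so every server handling this request derives the same value;
            # also pop X-Delete-After so nobody downstream recomputes it
            delete_after = int(headers.pop('X-Delete-After'))
            return int(float(headers['X-Timestamp'])) + delete_after

        headers = {'X-Timestamp': '1515110400.00000', 'X-Delete-After': '3600'}
        print(compute_x_delete_at(headers))    # 1515114000 everywhere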
    
    Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
    Change-Id: I9a1b6826c4c553f0442cfe2bb78cdf49508fa4a5
    Closes-Bug: 1741371
  2. Ignore directory .stestr

    smerritt committed Jan 5, 2018
    After running the functional tests, this directory shows up. I don't
    know what's in it, but I'm fairly certain I don't want to commit it.
    
    Change-Id: If9179330c337daf2ae0a01e6c8aa8d349969e737
Commits on Dec 29, 2017
  1. Fix socket leak on 416 EC GET responses.

    smerritt committed Dec 28, 2017
    Sometimes, when handling an EC GET request with a Range header, the
    object servers reply 206 to the proxy, but the proxy (correctly)
    replies 416 to the client[1]. In that case, the connections to the object
    servers were not being closed. This was due to improper error handling
    in ECAppIter.
    
    Since ECAppIter is intended to be a WSGI iterable, it expects to have
    its close() method called when the caller is done with it. In this
    particular case, the caller (ECAppIter.kickoff()) was not calling
    close() when an exception was raised. Now it is.
    
    [1] consider a 4+2 EC policy with segment size 1024, a 20-byte
    object, and a request with "Range: bytes=21-50". The proxy needs whole
    fragments to decode, so it asks the object server for "Range:
    bytes=0-255" [2], the object server says 206, and then the proxy
    realizes that the client's request is unsatisfiable and tells the
    client 416.
    
    [2] segment size 1024 and 4 data fragments means the fragments have
    size 1024 / 4 = 256, hence "bytes=0-255" asks for the first whole
    fragment
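
    The fix boils down to the usual rule for WSGI iterables: whoever stops
    consuming one early must still call close(). A toy illustration (not
    ECAppIter itself):

        class LeakyAppIter(object):
            # stand-in for ECAppIter: a WSGI iterable that owns backend sockets
            def __iter__(self):
                raise ValueError('range turned out to be unsatisfiable')

            def close(self):
                print('backend connections released')

        app_iter = LeakyAppIter()
        try:
            list(app_iter)
        except ValueError:
            # the missing piece: error paths must call close() too, or the
            # object-server connections leak
            app_iter.close()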
    
    Change-Id: Ide2edf8c449c97d45f48c2dbbbff7aebefa4b158
    Closes-Bug: 1738804
Commits on Dec 21, 2017
  1. Fix sometimes-flaky container name functional test.

    smerritt committed Dec 21, 2017
    You've got two test classes: TestContainer and TestContainerUTF8. They
    each try to create the same set of containers with names of varying
    lengths to make sure the container-name length limit is being honored.
    
    Also, each test class tries to clean up pre-existing data in its
    setUpClass method. If TestContainerUTF8 fails to delete a container
    that TestContainer made, then its testContainerNameLimit method will
    fail because the container PUT response has status 202 instead of 201,
    which is because the container still existed from the prior test.
    
    I've made the test consider both 201 and 202 as success. For purposes
    of testing the maximum container name length, any 2xx is fine.
    
    Change-Id: I7b343a8ed0d12537659c051ddf29226cefa78a8f
Commits on Dec 15, 2017
  1. Save ring builder if dispersion changes

    smerritt authored and clayg committed Jun 29, 2017
    There are cases where a rebalance improves dispersion, but doesn't
    improve balance. This is because the balance of a ring builder is
    taken to be the balance of its least-balanced device, so if there's a
    device that has no partitions, wants some, but can't get them, then
    we'll never save the ring builder even if every other device in the
    ring got better.
    
    We can detect this situation by looking at the dispersion number; if it
    changes, then the rebalance needs to be saved in order to continue to
    make progress.
    
    Partial-Bug: #1697543
    
    Change-Id: Ie239b958fc7e0547ffda2bebf61546bd4ef3d829
Commits on Dec 14, 2017
  1. Show missing branches in coverage report.

    smerritt committed Dec 14, 2017
    This used to be the default in coverage 3.x, but 4.0+ requires you to
    explicitly configure it.
    
    Change-Id: I3b06154c7862c300b5a2b3afb14cced1e8411468
Commits on Dec 8, 2017
  1. Fix small error in a doc string

    smerritt committed Dec 8, 2017
    Change-Id: I1c743fdea637ce047d09a49db0a43b2eb37305fa
Commits on Dec 7, 2017
  1. Fix suffix-byte-range responses for zero-byte replicated objects.

    smerritt committed Dec 7, 2017
    Previously, given a GET request like "Range: bytes=-12345" for a
    zero-byte object, Swift would return a 206 response with the header
    "Content-Range: bytes 0--1/0". This is clearly incorrect. Now Swift
    returns a 200 with the whole (zero-byte) object.
    
    Note: this does not fix the bug for EC objects, only for replicated
    ones. The fix for EC objects will follow in a separate commit.
    
    Change-Id: If1edb665b0ae000da78c4efff6faddd94d75da6b
    Partial-Bug: 1736840
Commits on Nov 13, 2017
  1. Add metadata checksums to old objects in auditor.

    smerritt committed Oct 20, 2017
    When the object auditor examines an object, it will now add any
    missing metadata checksums. This goes for both .data and .meta files,
    but not .ts files, as tombstones don't live very long anyway.
    
    Change-Id: I9417a8b0cc5099470845c0504c834746188d89e8
Commits on Nov 3, 2017
  1. Add checksum to object extended attributes

    smerritt authored and thiagodasilva committed Jun 30, 2016
    Currently, our integrity checking for objects is pretty weak when it
    comes to object metadata. If the extended attributes on a .data or
    .meta file get corrupted in such a way that we can still unpickle it,
    we don't have anything that detects that.
    
    This could be especially bad with encrypted etags; if the encrypted
    etag (X-Object-Sysmeta-Crypto-Etag or whatever it is) gets some bits
    flipped, then we'll cheerfully decrypt the cipherjunk into plainjunk,
    then send it to the client. Net effect is that the client sees a GET
    response with an ETag that doesn't match the MD5 of the object *and*
    Swift has no way of detecting and quarantining this object.
    
    Note that, with an unencrypted object, if the ETag metadatum gets
    mangled, then the object will be quarantined by the object server or
    auditor, whichever notices first.
    
    As part of this commit, I also ripped out some mocking of
    getxattr/setxattr in tests. It appears to be there to allow unit tests
    to run on systems where /tmp doesn't support xattrs. However, since
    the mock is keyed off of inode number and inode numbers get re-used,
    there's lots of leakage between different test runs. On a real FS,
    unlinking a file and then creating a new one of the same name will
    also reset the xattrs; this isn't the case with the mock.
    
    The mock was pretty old; Ubuntu 12.04 and up all support xattrs in
    /tmp, and recent Red Hat / CentOS releases do too. The xattr mock was
    added in 2011; maybe it was to support Ubuntu Lucid Lynx?
    
    Bonus: now you can pause a test with the debugger, inspect its files
    in /tmp, and actually see the xattrs along with the data.
    
    Since this patch now uses a real filesystem for testing filesystem
    operations, tests are skipped if the underlying filesystem does not
    support setting xattrs (e.g. tmpfs, or more than 4k of xattrs on ext4).
    
    References to "/tmp" have been replaced with calls to
    tempfile.gettempdir(). This will allow setting the TMPDIR envvar in
    test setup and getting an XFS filesystem instead of ext4 or tmpfs.
    
    THIS PATCH SIGNIFICANTLY CHANGES TESTING ENVIRONMENTS
    
    With this patch, every test environment will require TMPDIR to be
    using a filesystem that supports at least 4k of extended attributes.
    Neither ext4 nor tmpfs supports this. XFS is recommended.
    
    So why all the SkipTests? Why not simply raise an error? We still need
    the tests to run on the base image for OpenStack's CI system. Since
    we were previously mocking out xattr, there wasn't a problem, but we
    also weren't actually testing anything. This patch adds functionality
    to validate xattr data, so we need to drop the mock.
    
    `test.unit.skip_if_no_xattrs()` is also imported into `test.functional`
    so that functional tests can import it from the functional test
    namespace.
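
    A sketch of what such a skip helper can look like (this assumes the
    third-party xattr module and a 4k probe; the real helper may differ):

        import tempfile
        import unittest

        import xattr   # the same third-party module diskfile uses

        def skip_if_no_xattrs():
            # probe TMPDIR: if we can't store a 4k user xattr there, skip the
            # tests that exercise real xattr I/O instead of letting them fail
            with tempfile.NamedTemporaryFile() as tf:
                try:
                    xattr.setxattr(tf.name, 'user.swift.test', b'0' * 4096)
                except (IOError, OSError):
                    raise unittest.SkipTest(
                        'xattrs not supported in %s' % tempfile.gettempdir())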
    
    The related OpenStack CI infrastructure changes are made in
    https://review.openstack.org/#/c/394600/.
    
    Co-Authored-By: John Dickinson <me@not.mn>
    
    Change-Id: I98a37c0d451f4960b7a12f648e4405c6c6716808