Commits

Commits on May 23, 2023

  1. Enable configuration and building of dm-vdo.

    This adds dm-vdo to the drivers/md Kconfig and Makefile.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    1ebc77d
  2. Add dm-vdo-target.c

    This adds the dm-vdo target.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    3808291
  3. Add vdo debugging support.

    Add support for dumping detailed vdo state to the kernel log via a dmsetup
    message. The dump code is not thread-safe and is generally intended for use
    only when the vdo is hung.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    ed68743
  4. Add sysfs support for setting vdo parameters and fetching statistics.

    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    3b41878
  5. Add statistics tracking.

    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    b0fc6a6
  6. Add the on-disk formats and marshalling of vdo structures.

    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    3d9520d
  7. Add the vdo structure itself.

    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    51e2901
  8. Add repair (crash recovery and read-only rebuild) of damaged vdos.

    When a vdo is restarted after a crash, it will automatically attempt to
    recover from its journals.
    
    If a vdo encounters an unrecoverable error, it will enter read-only mode.
    This mode indicates that some previously acknowledged data may have been
    lost. The vdo may be instructed to rebuild as best it can in order to
    return to a writable state. Although some data may be lost, this process
    will ensure that the vdo's own metadata is self-consistent.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    3c6c8d4
  9. Add the vdo recovery journal.

    The recovery journal is used to amortize updates across the block map and
    slab depot. Each write request causes an entry to be made in the journal.
    Entries are either "data remappings" or "block map remappings." For a data
    remapping, the journal records the logical address affected and its old and
    new physical mappings. For a block map remapping, the journal records the
    block map page number and the physical block allocated for it (block map
    pages are never reclaimed, so the old mapping is always 0). Each journal
    entry and the data write it represents must be stable on disk before the
    other metadata structures may be updated to reflect the operation.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    dcb3bf7
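
The two kinds of recovery journal entry described in commit 9 can be pictured with a small C sketch. This is an illustration only: the type and function names are hypothetical, and the struct is an in-memory picture, not the on-disk journal format used by dm-vdo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; not taken from the dm-vdo sources. */
enum journal_entry_kind {
	DATA_REMAPPING,
	BLOCK_MAP_REMAPPING,
};

struct journal_entry {
	enum journal_entry_kind kind;
	/* Logical address for a data remapping, page number for a block map remapping. */
	uint64_t slot;
	uint64_t old_pbn; /* previous physical block; always 0 for block map remappings */
	uint64_t new_pbn; /* newly assigned physical block */
};

struct journal_entry make_data_remapping(uint64_t logical_address,
					 uint64_t old_pbn, uint64_t new_pbn)
{
	struct journal_entry entry = { DATA_REMAPPING, logical_address, old_pbn, new_pbn };
	return entry;
}

struct journal_entry make_block_map_remapping(uint64_t page_number, uint64_t new_pbn)
{
	/* Block map pages are never reclaimed, so there is no old mapping to record. */
	struct journal_entry entry = { BLOCK_MAP_REMAPPING, page_number, 0, new_pbn };
	return entry;
}

/* Check the invariant stated in the commit message for block map remappings. */
void check_entry(const struct journal_entry *entry)
{
	if (entry->kind == BLOCK_MAP_REMAPPING)
		assert(entry->old_pbn == 0);
}

/*
 * Ordering rule from the commit message: other metadata may be updated only
 * once both the journal entry and the data write it describes are stable.
 */
bool may_update_other_metadata(bool entry_stable, bool data_stable)
{
	return entry_stable && data_stable;
}
```

Keeping both kinds in one struct with a kind tag is a simplification chosen for readability; the real encoding is presumably more compact.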
  10. Implement the vdo block map page cache.

    The set of leaf pages of the block map tree is too large to fit in memory,
    so each block map zone maintains a cache of leaf pages. This patch adds the
    implementation of that cache.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    b23468c
  11. Add the vdo block map.

    The block map contains the logical to physical mapping. It can be thought
    of as an array with one entry per logical address. Each entry is 5 bytes:
    36 bits contain the physical block number which holds the data for the
    given logical address, and the remaining 4 bits are used to indicate the
    nature of the mapping. Of the 16 possible states, one represents a logical
    address which is unmapped (i.e. it has never been written, or has been
    discarded), one represents an uncompressed block, and the other 14 states
    are used to indicate that the mapped data is compressed, and which of the
    compression slots in the compressed block this logical address maps to.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    4091392
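
The 5-byte block map entry from commit 11 (36 bits of physical block number plus 4 bits of mapping state) can be sketched as a pack/unpack pair. The bit and byte order below, the state numbering, and all names are assumptions made for illustration; the commit message does not specify them, and the real on-disk layout may differ.

```c
#include <assert.h>
#include <stdint.h>

#define PBN_BITS 36
#define PBN_MASK ((UINT64_C(1) << PBN_BITS) - 1)

/* Assumed numbering of the 16 states; only the counts come from the commit message. */
#define STATE_UNMAPPED 0u
#define STATE_UNCOMPRESSED 1u
#define STATE_COMPRESSED_BASE 2u /* states 2..15 select compression slots 0..13 */

struct packed_entry {
	uint8_t bytes[5]; /* 36 + 4 = 40 bits */
};

struct packed_entry pack_entry(uint64_t pbn, unsigned state)
{
	struct packed_entry entry;
	uint64_t word;

	assert(pbn <= PBN_MASK && state < 16);
	word = (pbn << 4) | state; /* hypothetical layout: state in the low nibble */
	for (int i = 0; i < 5; i++)
		entry.bytes[i] = (uint8_t)(word >> (8 * i));
	return entry;
}

void unpack_entry(struct packed_entry entry, uint64_t *pbn, unsigned *state)
{
	uint64_t word = 0;

	for (int i = 0; i < 5; i++)
		word |= (uint64_t)entry.bytes[i] << (8 * i);
	*state = (unsigned)(word & 0xF);
	*pbn = word >> 4;
}

/* Return the compression slot for a compressed state, or -1 otherwise. */
int compression_slot(unsigned state)
{
	if (state < STATE_COMPRESSED_BASE || state > 15)
		return -1;
	return (int)(state - STATE_COMPRESSED_BASE);
}
```

With 4 state bits, two values are taken by "unmapped" and "uncompressed", which leaves exactly the 14 compression slots the commit message mentions.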
  12. Add the slab depot itself.

    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    489d29d
  13. Add the block allocators and physical zones.

    Each slab is independent of every other. They are assigned to "physical
    zones" in round-robin fashion. If there are P physical zones, then slab n
    is assigned to zone n mod P. The set of slabs in each physical zone is
    managed by a block allocator.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    c795ded
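
The round-robin rule in commit 13 (slab n goes to zone n mod P) is small enough to write out directly. The function names here are hypothetical; the second helper only shows the consequence that zones end up with slab counts differing by at most one.

```c
#include <assert.h>

/* Slab n is assigned to physical zone n mod P. */
unsigned zone_for_slab(unsigned slab_number, unsigned zone_count)
{
	assert(zone_count > 0);
	return slab_number % zone_count;
}

/* Number of slabs that land in a given zone under the rule above. */
unsigned slabs_in_zone(unsigned slab_count, unsigned zone_count, unsigned zone)
{
	assert(zone_count > 0 && zone < zone_count);
	return slab_count / zone_count + (zone < slab_count % zone_count ? 1 : 0);
}
```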
  14. Add the slab summary.

    The slab depot maintains an additional small data structure, the "slab
    summary," which is used to reduce the amount of work needed to come back
    online after a crash. The slab summary maintains an entry for each slab
    indicating whether or not the slab has ever been used, whether it is clean
    (i.e. all of its reference count updates have been persisted to storage),
    and approximately how full it is. During recovery, each physical zone will
    attempt to recover at least one slab, stopping whenever it has recovered a
    slab which has some free blocks. Once each zone has some space (or has
    determined that none is available), the target can resume normal operation
    in a degraded mode. Read and write requests can be serviced, perhaps with
    degraded performance, while the remainder of the dirty slabs are recovered.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    3c03a8f
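
The way the slab summary from commit 14 shortens recovery can be sketched as follows. The struct fields mirror the three facts the commit message lists per slab; the field widths, the `HINT_FULL` constant, the scan order, and the function name are all assumptions, and the sketch trusts the approximate fullness hint as if it were exact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HINT_FULL 255 /* hypothetical encoding: 0 = empty, 255 = no free blocks */

struct summary_entry {
	bool ever_used;        /* has the slab ever been used? */
	bool clean;            /* are all reference count updates persisted? */
	uint8_t fullness_hint; /* approximately how full the slab is */
};

/*
 * For the slabs of one physical zone, count how many dirty slabs are
 * recovered before the zone can resume in degraded mode: recovery stops
 * after the first recovered slab that has some free blocks.
 */
size_t initial_recovery_count(const struct summary_entry *entries, size_t count)
{
	size_t recovered = 0;

	assert(entries != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (!entries[i].ever_used || entries[i].clean)
			continue; /* nothing to recover for this slab */
		recovered++;
		if (entries[i].fullness_hint < HINT_FULL)
			break; /* this slab has free space; the zone can resume */
	}
	return recovered;
}
```

Any dirty slabs not counted here would be recovered later, while the target is already servicing requests.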
  15. Add vdo_slab.

    Most of the vdo volume belongs to the slab depot. The depot contains a
    collection of slabs. The slabs can be up to 32GB, and are divided into
    three sections. Most of a slab consists of a linear sequence of 4K blocks.
    These blocks are used either to store data, or to hold portions of the
    block map (see subsequent patches). In addition to the data blocks, each
    slab has a set of reference counters, using 1 byte for each data block.
    Finally each slab has a journal. Reference updates are written to the slab
    journal, which is written out one block at a time as each block fills. A
    copy of the reference counters is kept in memory and is written out a
    block at a time, in oldest-dirtied order, whenever there is a need to
    reclaim slab journal space. The journal is used both to ensure that the
    main recovery journal (see subsequent patches) can regularly free up space,
    and also to amortize the cost of updating individual reference blocks.
    
    This patch adds the slab structure as well as the slab journal and
    reference counters.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    3ddabe5
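
The sizes quoted in commit 15 imply some simple arithmetic, worked through below. This is a rough upper bound under stated assumptions: "32GB" is read as 32 GiB, and the whole slab is counted as data blocks even though part of it holds the reference counters and the slab journal. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE UINT64_C(4096) /* 4K blocks, per the commit message */

/* Number of 4K blocks in a region of the given size. */
uint64_t blocks_in(uint64_t bytes)
{
	assert(bytes % BLOCK_SIZE == 0);
	return bytes / BLOCK_SIZE;
}

/* Blocks needed to hold one 1-byte reference counter per data block. */
uint64_t reference_blocks_for(uint64_t data_blocks)
{
	return (data_blocks + BLOCK_SIZE - 1) / BLOCK_SIZE;
}
```

Under those assumptions a maximum-size slab has at most `blocks_in(32 GiB)` = 8,388,608 blocks, so its counters occupy at most 8 MiB, or 2,048 blocks: roughly 1/4096 of the slab.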
  16. Add the compressed block bin packer.

    When blocks do not deduplicate, vdo will attempt to compress them. Up to 14
    compressed blocks may be packed into a single data block (this limitation
    is imposed by the block map). The packer implements a simple best-fit
    packing algorithm and also manages the formatting and writing of compressed
    blocks when bins fill.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    f74dd75
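
A generic best-fit placement with the 14-slot limit from commit 16 might look like the sketch below. It shows the textbook best-fit rule (choose the bin with the least free space that still fits), not the packer's actual code; the struct, the names, and the absence of any per-block header are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 14 /* limit imposed by the block map, per the commit message */

struct bin {
	size_t free_bytes;   /* space left in the data block being assembled */
	unsigned slots_used; /* compressed fragments already placed in it */
};

/* Return the index of the tightest bin that can hold `size` bytes, or -1. */
int best_fit(const struct bin *bins, size_t bin_count, size_t size)
{
	int best = -1;

	assert(bins != NULL || bin_count == 0);
	for (size_t i = 0; i < bin_count; i++) {
		if (bins[i].slots_used >= MAX_SLOTS || bins[i].free_bytes < size)
			continue;
		if (best < 0 || bins[i].free_bytes < bins[best].free_bytes)
			best = (int)i;
	}
	return best;
}

/* Place a fragment and update the chosen bin; returns the bin index or -1. */
int add_fragment(struct bin *bins, size_t bin_count, size_t size)
{
	int chosen = best_fit(bins, bin_count, size);

	if (chosen >= 0) {
		bins[chosen].free_bytes -= size;
		bins[chosen].slots_used++;
	}
	return chosen;
}
```

A bin becomes unusable either when it runs out of bytes or when all 14 slots are taken, whichever comes first; in the real packer that is the point at which the block would be formatted and written.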
  17. Add use of the deduplication index in hash zones.

    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    9cdccb5
  18. Add hash locks and hash zones.

    In order to deduplicate concurrent writes of the same data (to different
    locations), data_vios which are writing the same data are grouped together
    in a "hash lock," named for and keyed by the hash of the data being
    written. Each hash lock is assigned to a hash zone based on a portion of
    its hash.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    6fdf41f
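
Commit 18 says each hash lock is assigned to a hash zone "based on a portion of its hash". A minimal sketch of that idea follows; the 16-byte hash size, the choice of the first byte as the portion, and the function name are assumptions, since the commit message does not say which bits are used.

```c
#include <assert.h>
#include <stdint.h>

#define HASH_SIZE 16 /* assumed size of the data hash, in bytes */

/*
 * Pick a hash zone from one byte of the hash. Using a single byte limits
 * this sketch to at most 256 zones.
 */
unsigned hash_zone_for(const uint8_t hash[HASH_SIZE], unsigned zone_count)
{
	assert(zone_count > 0 && zone_count <= 256);
	return hash[0] % zone_count;
}
```

Because the zone depends only on the hash, two concurrent writes of identical data always land in the same zone, which is what lets them find the same hash lock.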
  19. Add the vdo io_submitter.

    The io_submitter handles bio submission from the vdo data store to the
    storage below. It will merge bios when possible.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    32c8bb1
  20. Add flush support to vdo.

    This patch adds support for handling incoming flush and/or FUA bios. Each
    such bio is assigned to a struct vdo_flush. These are allocated as needed,
    but there is always one kept in reserve in case allocations fail. In the
    event of an allocation failure, bios may need to wait for an outstanding
    flush to complete.
    
    The logical address space is partitioned into logical zones, each handled
    by its own thread. Each zone keeps a list of all data_vios handling write
    requests for logical addresses in that zone. When a flush bio is processed,
    each logical zone is informed of the flush. When all of the writes which
    are in progress at the time of the notification have completed in all
    zones, the flush bio is then allowed to complete.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    c620d67
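
The flush rule in commit 20 (a flush completes once every write that was in progress when the zones were notified has finished) can be modelled with two counters per zone. This is a single-threaded toy under stated assumptions: the generation tag, the counter names, and the functions are hypothetical, and where the commit message describes a list of data_vios per zone, the sketch keeps only counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct logical_zone {
	uint64_t generation;   /* bumped each time the zone is told of a flush */
	unsigned before_flush; /* in-progress writes started before that point */
	unsigned after_flush;  /* in-progress writes started since then */
};

/* Begin a write in this zone; the returned tag must be passed to finish_write(). */
uint64_t start_write(struct logical_zone *zone)
{
	zone->after_flush++;
	return zone->generation;
}

void finish_write(struct logical_zone *zone, uint64_t tag)
{
	if (tag < zone->generation) {
		assert(zone->before_flush > 0);
		zone->before_flush--;
	} else {
		assert(zone->after_flush > 0);
		zone->after_flush--;
	}
}

/* Inform one zone of a flush: everything in progress now must drain first. */
void notify_flush(struct logical_zone *zone)
{
	zone->before_flush += zone->after_flush;
	zone->after_flush = 0;
	zone->generation++;
}

/* The flush bio may complete once no zone has older writes outstanding. */
bool flush_may_complete(const struct logical_zone *zones, size_t zone_count)
{
	for (size_t i = 0; i < zone_count; i++) {
		if (zones[i].before_flush != 0)
			return false;
	}
	return true;
}
```

Writes that start after the notification do not delay the flush, which matches the commit message's wording "in progress at the time of the notification".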
  21. Add data_vio, the request object which services incoming bios.

    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    89aaa3e
  22. Add vio, the request object for vdo metadata.

    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    afede43
  23. Add administrative state and scheduling for vdo.

    This patch adds the admin_state structures which are used to track the
    states of individual vdo components for handling of operations like suspend
    and resume. It also adds the action manager which is used to schedule and
    manage cross-thread administrative and internal operations.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    da16c37
  24. Implement external deduplication index interface.

    The deduplication index interface for index clients includes the
    deduplication request and index session structures. This is the interface
    that the rest of the vdo target uses to make requests, receive responses,
    and collect statistics.
    
    This patch also adds sysfs nodes for inspecting various index properties at
    runtime.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    4fbd7e3
  25. Implement top-level deduplication index.

    The top-level deduplication index brings all the earlier components
    together. The top-level index creates the separate zone structures that
    enable the index to handle several requests in parallel, handles
    dispatching requests to the right zones and components, and coordinates
    metadata to ensure that it remains consistent. It also coordinates recovery
    in the event of an unexpected index failure.
    
    If sparse caching is enabled, the top-level index also handles the
    coordination required by the sparse chapter index cache, which (unlike most
    index structures) is shared among all zones.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    a9f3aac
  26. Implement the chapter volume store.

    The volume store structures manage the reading and writing of chapter
    pages. When a chapter is closed, it is packed into a read-only structure,
    split across several pages, and written to storage.
    
    The volume store also contains a cache and specialized queues that sort and
    batch requests by the page they need, in order to minimize latency and I/O
    requests when records have to be read from storage. The cache and queues
    also coordinate with the volume index to ensure that the volume does not
    waste resources reading pages that are no longer valid.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    5e8ef49
  27. Implement the open chapter and chapter indexes.

    Deduplication records are stored in groups called chapters. New records are
    collected in a structure called the open chapter, which is optimized for
    adding, removing, and sorting records.
    
    When a chapter fills, it is packed into a read-only structure called a
    closed chapter, which is optimized for searching and reading. The closed
    chapter includes a delta index, called the chapter index, which maps each
    record name to the record page containing the record and allows the index
    to read at most one record page when looking up a record.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    0d3cf29
  28. Implement the volume index.

    The volume index is a large delta index that maps each record name to the
    chapter which contains the newest record for that name. The volume index
    can contain several million records and is stored entirely in memory while
    the index is operating, accounting for the majority of the deduplication
    index's memory budget.
    
    The volume index is composed of two subindexes in order to handle sparse
    hook names separately from regular names. If sparse indexing is not
    enabled, the sparse hook portion of the volume index is not used or
    instantiated.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    ab63359
  29. Implement the delta index.

    The delta index is a space- and memory-efficient alternative to a hash table.
    Instead of storing the entire key for each entry, the entries are sorted by
    key and only the difference between adjacent keys (the delta) is stored.
    If the keys are evenly distributed, the size of the deltas follows an
    exponential distribution, and the deltas can use a Huffman code to take up
    even less space.
    
    This structure allows the index to use many fewer bytes per entry than a
    traditional hash table, but it is slightly more expensive to look up
    entries, because a request must read and sum every entry in a list of
    deltas in order to find a given record. The delta index reduces this lookup
    cost by splitting its key space into many sub-lists, each starting at a
    fixed key value, so that each individual list is short.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    fe9bdf0
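
The core idea of the delta index in commit 29 (store differences between sorted keys, and split the key space into short lists that each start at a fixed key) is sketched below. The sketch stores deltas as plain integers rather than Huffman-coded bits, keeps keys only (no associated values), and uses hypothetical sizes and names throughout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LIST_COUNT 4      /* hypothetical number of sub-lists */
#define KEYS_PER_LIST 256 /* list i covers keys [i * 256, (i + 1) * 256) */
#define MAX_ENTRIES 16    /* hypothetical per-list capacity */

struct delta_list {
	size_t count;
	uint32_t deltas[MAX_ENTRIES]; /* first delta is relative to the list's base key */
};

struct delta_index {
	struct delta_list lists[LIST_COUNT];
};

/* Build the index from keys given in ascending order. */
void build_index(struct delta_index *index, const uint32_t *sorted_keys, size_t key_count)
{
	uint32_t previous[LIST_COUNT];

	for (size_t i = 0; i < LIST_COUNT; i++) {
		index->lists[i].count = 0;
		previous[i] = (uint32_t)(i * KEYS_PER_LIST);
	}
	for (size_t i = 0; i < key_count; i++) {
		uint32_t key = sorted_keys[i];
		size_t which = key / KEYS_PER_LIST;
		struct delta_list *list;

		assert(which < LIST_COUNT);
		list = &index->lists[which];
		assert(list->count < MAX_ENTRIES && key >= previous[which]);
		list->deltas[list->count++] = key - previous[which];
		previous[which] = key;
	}
}

/* Look up a key by summing deltas along its (short) sub-list. */
bool index_contains(const struct delta_index *index, uint32_t key)
{
	size_t which = key / KEYS_PER_LIST;
	const struct delta_list *list;
	uint32_t current;

	if (which >= LIST_COUNT)
		return false;
	list = &index->lists[which];
	current = (uint32_t)(which * KEYS_PER_LIST);
	for (size_t i = 0; i < list->count; i++) {
		current += list->deltas[i];
		if (current == key)
			return true;
		if (current > key)
			return false; /* keys are sorted, so it cannot appear later */
	}
	return false;
}
```

The lookup cost is linear in the length of one sub-list rather than in the whole index, which is the trade-off the commit message describes.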
  30. Add deduplication index storage interface.

    This patch adds infrastructure for managing reads and writes to the
    underlying storage layer for the deduplication index. The deduplication
    index uses dm-bufio for all of its reads and writes, so part of this
    infrastructure is managing the various dm-bufio clients required. It also
    adds the buffered reader and buffered writer abstractions, which simplify
    reading and writing metadata structures that span several blocks.
    
    This patch also includes structures and utilities for encoding and decoding
    all of the deduplication index metadata, collectively called the index
    layout.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    7acabab
  31. Add deduplication configuration structures.

    Add structures which record the configuration of various deduplication
    index parameters. This also includes facilities for saving and loading the
    configuration and validating its integrity.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    ef5f847
  32. Add basic data structures.

    This patch adds two hash maps, one keyed by integers, the other by
    pointers, and also a priority heap. The integer map is used for locking of
    logical and physical addresses. The pointer map is used for managing
    concurrent writes of the same data, ensuring that those writes are
    deduplicated. The priority heap is used to minimize the search time for
    free blocks.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    ccfb396
  33. Add specialized request queueing functionality.

    This patch adds funnel_queue, a mostly lock-free multi-producer,
    single-consumer queue. It also adds the request queue used by the dm-vdo
    deduplication index, and the work_queue used by the dm-vdo data store. Both
    of these are built on top of funnel queue and are intended to support the
    dispatching of many short-running tasks. The work_queue also supports
    priorities. Finally, this patch adds vdo_completion, the structure which is
    enqueued on work_queues.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    64a7990
  34. Add thread and synchronization utilities.

    This patch adds utilities for managing and using named threads, as well as
    several locking and synchronization utilities. These utilities help dm-vdo
    minimize thread transitions and manage cross-thread interactions.
    
    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    608f4b8
  35. Add vdo type declarations, constants, and simple data structures.

    Signed-off-by: J. corwin Coburn <corwin@redhat.com>
    corwin authored and intel-lab-lkp committed May 23, 2023
    e9b24a5