A pair of kernel modules which provide pools of deduplicated and/or compressed block storage.
corwin Version 6.2.0.273
Note: This is a pre-release version; future versions of VDO may not support
VDO devices created with this version.
- Fixed more error path memory leaks in the uds and kvdo modules.
- Fixed module loading issues with the spec file on Fedora.
- Removed the read cache.
- Fixed error handling in preresume.
- Converted table line parsing to use existing DM functions.
- Fixed a bug which prevented parsing of version 0 table lines.
- In order to properly handle version 0 table lines, made no-op physical
  growth not an error.
- Limited the number of logical zones to 60.
- Converted to use the kernel's bio zeroing method instead of a VDO
  specific one.
- Added a missing call to flush_cache_page() after writing pages which may
  be owned by the page cache or a user as required by the kernel.
- Added a version 2 table line which uses DM-style optional parameters.
- Fixed a bug in the statistics tracking partial I/Os.
- Added a maximum discard size table line parameter and removed the
  corresponding sysfs parameter which applied to all VDO devices.
Latest commit 2f1ca50 Nov 15, 2018
uds Version 6.2.0.273 Nov 16, 2018
vdo Version 6.2.0.273 Nov 16, 2018
CONTRIBUTORS.txt Version 6.2.0.273 Nov 16, 2018
COPYING Initial GPL Release Oct 23, 2017
Makefile Initial GPL Release Oct 23, 2017
README.md Update to version 6.1.0.61 Nov 26, 2017
TODO Initial GPL Release Oct 23, 2017
kvdo.spec Version 6.2.0.273 Nov 16, 2018

README.md

kvdo

A pair of kernel modules which provide pools of deduplicated and/or compressed block storage.

Background

VDO (which includes kvdo and vdo) is software that provides inline block-level deduplication, compression, and thin provisioning capabilities for primary storage. VDO installs within the Linux device mapper framework, where it takes ownership of existing physical block devices and remaps these to new, higher-level block devices with data-efficiency capabilities.

Deduplication is a technique for reducing the consumption of storage resources by eliminating multiple copies of duplicate blocks. Compression takes the individual unique blocks and shrinks them with coding algorithms; these reduced blocks are then efficiently packed together into physical blocks. Thin provisioning manages the mapping from the logical block addresses (LBAs) presented by VDO to where the data has actually been stored, and also eliminates any blocks of all zeroes.
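As an illustration of the thin provisioning and zero-block elimination described above, here is a minimal Python sketch. It is a toy model, not VDO's implementation: the `ThinVolume` class, its attribute names, and the in-memory dictionaries are all hypothetical stand-ins chosen for clarity.

```python
BLOCK_SIZE = 4096
ZERO_BLOCK = bytes(BLOCK_SIZE)


class ThinVolume:
    """Toy thin-provisioned volume (hypothetical; not VDO's data structures)."""

    def __init__(self, logical_blocks):
        self.logical_blocks = logical_blocks  # advertised size, in blocks
        self.block_map = {}   # LBA -> physical block number
        self.physical = {}    # physical block number -> data (stand-in for a device)
        self.next_pbn = 0

    def _check(self, lba):
        if not 0 <= lba < self.logical_blocks:
            raise IndexError("LBA out of range")

    def write(self, lba, data):
        self._check(lba)
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        pbn = self.block_map.get(lba)
        if data == ZERO_BLOCK:
            # All-zero blocks are never stored: drop the mapping and free
            # any physical block this LBA was using.
            if pbn is not None:
                del self.block_map[lba]
                del self.physical[pbn]
            return
        if pbn is None:
            # Physical space is allocated only on the first non-zero write.
            pbn = self.next_pbn
            self.next_pbn += 1
            self.block_map[lba] = pbn
        self.physical[pbn] = data

    def read(self, lba):
        self._check(lba)
        pbn = self.block_map.get(lba)
        # Unmapped LBAs read back as zeroes.
        return ZERO_BLOCK if pbn is None else self.physical[pbn]
```

In this sketch a volume can advertise far more logical blocks than it has physical blocks in use, and writing zeroes over a block returns its physical space.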

With deduplication, VDO does not write the same data more than once; instead, each duplicate block is detected and recorded as a reference to the original block. VDO maintains a mapping from logical block addresses (used by the storage layer above VDO) to physical block addresses (used by the storage layer under VDO). After deduplication, multiple logical block addresses may be mapped to the same physical block address; these are called shared blocks and are reference-counted by the software.
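The logical-to-physical mapping and reference counting can be pictured with a small Python sketch. This is only a conceptual model under stated assumptions: the `DedupStore` class and its fields are hypothetical, a SHA-256 digest stands in for whatever duplicate identification the "uds" module actually performs, and equal digests are simply treated as equal content.

```python
import hashlib


class DedupStore:
    """Toy block store with shared, reference-counted physical blocks."""

    def __init__(self):
        self.block_map = {}   # LBA -> PBA
        self.blocks = {}      # PBA -> block data
        self.refcount = {}    # PBA -> number of LBAs mapped to it
        self.index = {}       # content digest -> PBA (assumed stand-in index)
        self.next_pba = 0

    def _release(self, pba):
        # Drop one reference; free the physical block when none remain.
        self.refcount[pba] -= 1
        if self.refcount[pba] == 0:
            data = self.blocks.pop(pba)
            del self.refcount[pba]
            del self.index[hashlib.sha256(data).digest()]

    def write(self, lba, data):
        digest = hashlib.sha256(data).digest()
        pba = self.index.get(digest)
        if pba is None:
            # New content: allocate a fresh physical block.
            pba = self.next_pba
            self.next_pba += 1
            self.blocks[pba] = data
            self.refcount[pba] = 0
            self.index[digest] = pba
        # Take the new reference before releasing the old one, so that
        # rewriting identical data to the same LBA never frees the block.
        self.refcount[pba] += 1
        old = self.block_map.get(lba)
        self.block_map[lba] = pba
        if old is not None:
            self._release(old)

    def read(self, lba):
        return self.blocks[self.block_map[lba]]
```

Because a write never modifies an existing physical block in place, overwriting one logical address that points at a shared block leaves every other logical address mapped to that block unchanged.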

With VDO's compression, multiple blocks (or shared blocks) are compressed with the fast LZ4 algorithm and binned together where possible so that multiple compressed blocks fit within a 4 KB block on the underlying storage. Each LBA then maps to a physical block address and an index within that block that locates the desired compressed data. All compressed blocks are individually reference-counted for correctness.
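The binning idea can be sketched in Python as follows. Several assumptions apply: `zlib` from the standard library is used purely as a stand-in for LZ4, the first-fit placement policy is an illustrative choice rather than VDO's packer, per-fragment header overhead is ignored, and the function names `pack_blocks` and `read_block` are hypothetical.

```python
import zlib

BLOCK_SIZE = 4096


def pack_blocks(blocks):
    """Compress each logical block and bin the fragments into 4 KB blocks.

    `blocks` maps LBA -> data. Returns (bins, mapping), where `bins` is a
    list of fragment lists (one per physical block) and `mapping` maps each
    LBA to a (physical block number, slot index) pair.
    """
    bins, free, mapping = [], [], {}
    for lba, data in blocks.items():
        fragment = zlib.compress(data)  # stand-in for LZ4
        if len(fragment) > BLOCK_SIZE:
            # A real system would store such a block uncompressed instead.
            raise ValueError("block does not compress; not handled here")
        # First fit: use the first bin with enough room, else open a new one.
        for pba, room in enumerate(free):
            if len(fragment) <= room:
                break
        else:
            pba = len(bins)
            bins.append([])
            free.append(BLOCK_SIZE)
        mapping[lba] = (pba, len(bins[pba]))
        bins[pba].append(fragment)
        free[pba] -= len(fragment)
    return bins, mapping


def read_block(bins, mapping, lba):
    """Follow the (physical block, slot) mapping and decompress the fragment."""
    pba, slot = mapping[lba]
    return zlib.decompress(bins[pba][slot])
```

Reading a block needs both parts of the mapping: the physical block address selects the packed 4 KB block, and the index selects the fragment inside it.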

Block sharing and block compression are invisible to applications using the storage, which read and write blocks as they would if VDO were not present. When a shared block is overwritten, a new physical block is allocated for storing the new block data to ensure that other logical block addresses that are mapped to the shared physical block are not modified.

This public source release of VDO includes two kernel modules and a set of userspace tools for managing them. The "kvdo" module implements fine-grained storage virtualization, thin provisioning, block sharing, and compression; the "uds" module provides memory-efficient duplicate identification. The userspace tools include a pair of Python scripts: "vdo" for creating and managing VDO volumes, and "vdostats" for extracting statistics from those volumes.

Documentation

Project documentation is being converted from its proprietary form and will be added to this repository at a later date.

Status

VDO was originally developed by Permabit Technology Corp. as a proprietary set of kernel modules and userspace tools. This software and technology has been acquired by Red Hat and relicensed under the GPL (v2 or later). This repository begins the process of preparing for integration with the upstream kernel.

While this software has been relicensed, a number of issues must still be addressed before it is ready for upstream. These include:

  • Conformance with kernel coding standards
  • Use of existing EXPORT_SYMBOL_GPL kernel interfaces where appropriate
  • Refactoring of primitives (e.g. cryptographic) to appropriate kernel subsystems
  • Support for non-x86-64 platforms
  • Refactoring of platform layer abstractions and other changes requested by upstream maintainers

We expect addressing these issues to take some time. In the meantime, this project allows interested parties to begin using VDO immediately. The technology itself is thoroughly tested and mature, and has been in production use since 2014 in its previous proprietary form.

Building

In order to build the kernel modules, invoke the following command from the top directory of this tree:

    make -C /usr/src/kernels/`uname -r` M=`pwd`

Communication channels

Community feedback, participation, and patches are welcome on the vdo-devel@redhat.com mailing list (subscribe here).

Contributing

This project is currently a stepping stone towards integration with the Linux kernel. As such, contributions are welcome via a process similar to that for Linux kernel development. Patches should be submitted to the vdo-devel@redhat.com mailing list, where they will be considered for inclusion. This project does not accept pull requests.

Licensing

GPL v2.0 or later. Contributors retain ownership of their contributions, but all contributions must also be licensed under GPL v2.0 or later to be merged.