From 30464f128b1d31c16c1b798d7b193b90971e1342 Mon Sep 17 00:00:00 2001
From: Paul Khuong
Date: Tue, 7 Sep 2021 21:51:43 -0400
Subject: [PATCH] more documentation

More linkage, and add some sample usage to `lib.rs`.

TESTED=it's all comments.
---
 src/lib.rs           | 246 ++++++++++++++++++++++++++++++++++++++++++-
 src/plain.rs         |  15 +--
 src/raw_cache.rs     |   6 +-
 src/readonly.rs      |  51 ++++++---
 src/second_chance.rs |  11 +-
 src/sharded.rs       |  21 ++--
 src/stack.rs         |  91 +++++++++++-----
 7 files changed, 374 insertions(+), 67 deletions(-)

diff --git a/src/lib.rs b/src/lib.rs
index c9ede62..786fa06 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -1,3 +1,235 @@
+//! Kismet implements multiprocess lock-free[^lock-free-fs]
+//! application-crash-safe (roughly) bounded persistent caches stored
+//! in filesystem directories, with a
+//! [Second Chance (Clock)](https://en.wikipedia.org/wiki/Page_replacement_algorithm#Second-chance)
+//! eviction strategy. The maintenance logic is batched and invoked
+//! at periodic jittered intervals to make sure accesses amortise to a
+//! constant number of filesystem system calls and logarithmic (in the
+//! number of cached files) time complexity. That's good for performance,
+//! and enables lock-freedom,[^unlike-ccache] but does mean that
+//! caches are expected to temporarily grow past their capacity
+//! limits, although rarely by more than a factor of 2 or 3.
+//!
+//! [^lock-free-fs]: Inasmuch as anything that makes a lot of syscalls
+//! can be "lock-free." The cache access algorithms implement a
+//! protocol that makes a bounded number of file open, rename, or link
+//! syscalls; in other words, reads and writes are as wait-free as
+//! these syscalls. The batched second-chance maintenance algorithm,
+//! on the other hand, is merely lock-free: it could theoretically get
+//! stuck if writers kept adding new files to a cache (sub)directory.
+//! Again, this guarantee is in terms of file access and directory
+//! enumeration syscalls, and maintenance is only as lock-free as the
+//! underlying syscalls. However, we can assume the kernel is
+//! reasonably well designed, and doesn't let any sequence of syscalls
+//! hold on to kernel locks forever.
+//!
+//! [^unlike-ccache]: This design choice is different from, e.g.,
+//! [ccache](https://ccache.dev/)'s, which attempts to maintain
+//! statistics per shard with locked files. Under high load, the lock
+//! to update ccache statistics becomes a bottleneck. Yet, despite
+//! taking this hit, ccache batches evictions like Kismet, because
+//! cleaning up a directory is slow; up-to-date access statistics
+//! aren't enough to enforce tight cache capacity limits.
+//!
+//! In addition to constant per-cache space overhead, each Kismet
+//! cache maintains a variable-length [`std::path::PathBuf`] for the
+//! directory, one byte of lock-free metadata per shard, and no other
+//! non-heap resource (i.e., Kismet caches do not hold on to file
+//! objects). This holds for individual cache directories; when
+//! stacking multiple caches in a [`Cache`], the read-write cache and
+//! all constituent read-only caches will each have their own
+//! `PathBuf` and per-shard metadata.
+//!
+//! When a Kismet cache triggers second chance evictions, it will
+//! allocate temporary data. That data's size is proportional to the
+//! number of files in the cache shard subdirectory undergoing
+//! eviction (or the whole directory for a plain unsharded cache), and
+//! includes a copy of the name (without the path prefix) for each
+//! cached file in the subdirectory (or plain cache directory). This
+//! eviction process is linearithmic-time in the number of files in
+//! the cache subdirectory (or directory), and is invoked periodically,
+//! so as to amortise the maintenance time overhead to logarithmic
+//! per write to a cache subdirectory.
+//!
+//! Kismet does not pre-allocate any long-lived file object, so it may
+//! need to temporarily open file objects. However, each call into
+//! Kismet will always bound the number of concurrently allocated file
+//! objects; the current logic never allocates more than two
+//! concurrent file objects.
+//!
+//! The load (number of files) in each cache may exceed the cache's
+//! capacity because there is no centralised accounting, except for
+//! what filesystems provide natively. This design choice forces
+//! Kismet to amortise maintenance calls with randomisation, but also
+//! means that any number of threads or processes may safely access
+//! the same cache directories without any explicit synchronisation.
+//!
+//! Filesystems can't be trusted to provide much; Kismet only relies
+//! on file modification times (`mtime`), and on file access times
+//! (`atime`) that are either less than or equal to the `mtime`, or
+//! greater than the `mtime` (i.e., `relatime` is acceptable). This
+//! implies that cached files should not be linked in multiple Kismet
+//! cache directories. It is however safe to hardlink cached files in
+//! multiple places, as long as the files are not modified, or their
+//! `mtime` otherwise updated, through these non-Kismet links.
+//!
+//! Kismet cache directories are plain (unsharded) or sharded.
+//!
+//! Plain Kismet caches are simply directories where the cache entry for
+//! "key" is the file named "key." These are most effective for
+//! read-only access to cache directories managed by some other
+//! process, or for small caches of up to ~100 cached files.
+//!
+//! Sharded caches scale to higher capacities, by indexing into one of
+//! a constant number of shard subdirectories with a hash, and letting
+//! each shard manage fewer files (ideally 10-100 files). They are
+//! also much less likely to grow to multiples of the target capacity
+//! than plain (unsharded) cache directories.
+//!
+//! Simple usage should be covered by the [`ReadOnlyCache`] or
+//! [`Cache`] structs, which wrap [`plain::Cache`] and
+//! [`sharded::Cache`] in a convenient type-erased interface. The
+//! caches *do not* invoke [`std::fs::File::sync_all`] or [`std::fs::File::sync_data`]:
+//! the caller should sync files before letting Kismet persist them in
+//! a cache if necessary. File synchronisation is not automatic
+//! because it makes sense to implement persistent filesystem caches
+//! that are erased after each boot, e.g., via
+//! [tmpfiles.d](https://www.freedesktop.org/software/systemd/man/tmpfiles.d.html),
+//! or by tagging cache directories with a
+//! [boot id](https://man7.org/linux/man-pages/man3/sd_id128_get_machine.3.html).
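+//!
+//! For example, one way to tie a cache directory to the current boot
+//! on Linux could look like the following sketch (the base path and
+//! helper name are illustrative, not part of Kismet):
+//!
+//! ```no_run
+//! // Derive a per-boot cache directory by appending the Linux boot id
+//! // to an application-chosen base path.
+//! fn per_boot_cache_dir() -> std::io::Result<std::path::PathBuf> {
+//!     let boot_id = std::fs::read_to_string("/proc/sys/kernel/random/boot_id")?;
+//!     Ok(std::path::Path::new("/tmp/my_app_cache").join(boot_id.trim()))
+//! }
+//! ```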
+//!
+//! The cache code also does not sync the parent cache directories: we
+//! assume that it's safe, if unfortunate, for caches to lose data or
+//! revert to an older state after kernel or hardware crashes. In
+//! general, the code attempts to be robust against direct manipulation
+//! of the cache directories. It's always safe to delete cache files
+//! from kismet directories (ideally not recently created files in
+//! `.kismet_temp` directories), and even *adding* files should mostly
+//! do what one expects: they will be picked up if they're in the
+//! correct place (in a plain unsharded cache directory or in the
+//! correct shard subdirectory), and eventually evicted if useless or
+//! in the wrong shard.
+//!
+//! It is however essential to only publish files atomically to the
+//! cache directories, and it probably never makes sense to modify
+//! cached file objects in place. In fact, Kismet always sets files
+//! read-only before publishing them to the cache and always returns
+//! read-only [`std::fs::File`] objects for cached data.
+//!
+//! # Sample usage
+//!
+//! One could access a list of read-only caches with a [`ReadOnlyCache`].
+//!
+//! ```no_run
+//! const NUM_SHARDS: usize = 10;
+//!
+//! let read_only = kismet_cache::ReadOnlyCacheBuilder::new()
+//!     .plain("/tmp/plain_cache")                 // Read first here
+//!     .sharded("/tmp/sharded_cache", NUM_SHARDS) // Then try there.
+//!     .build();
+//!
+//! // Attempt to read the file for key "foo", with primary hash 1
+//! // and secondary hash 2, first from `/tmp/plain_cache`, and then
+//! // from `/tmp/sharded_cache`. In practice, the hashes should
+//! // probably be populated by implementing the `From<&'a T>`
+//! // trait, and passing a `&T` to the cache methods.
+//! read_only.get(kismet_cache::Key::new("foo", 1, 2));
+//! ```
+//!
+//! Read-write accesses should use a [`Cache`]:
+//!
+//! ```no_run
+//! struct CacheKey {
+//!     // ...
+//! }
+//!
+//! fn get_contents(key: &CacheKey) -> Vec<u8> {
+//!     // ...
+//! # unreachable!()
+//! }
+//!
+//! impl<'a> From<&'a CacheKey> for kismet_cache::Key<'a> {
+//!     fn from(key: &CacheKey) -> kismet_cache::Key {
+//!         // ...
+//! # unreachable!()
+//!     }
+//! }
+//!
+//!
+//! // It's easier to increase the capacity than the number of shards,
+//! // so, when in doubt, prefer a few too many shards with a lower
+//! // capacity. It's not incorrect to increase the number of shards,
+//! // but will result in lost cached data (eventually deleted), since
+//! // Kismet does not assign shards with a consistent hash.
+//! const NUM_SHARDS: usize = 100;
+//! const CAPACITY: usize = 1000;
+//!
+//! # fn main() -> std::io::Result<()> {
+//! use std::io::Read;
+//! use std::io::Write;
+//!
+//! let cache = kismet_cache::CacheBuilder::new()
+//!     .sharded_writer("/tmp/root_cache", NUM_SHARDS, CAPACITY)
+//!     .plain_reader("/tmp/extra_cache") // Try to fill cache misses here
+//!     .build();
+//!
+//! let key: CacheKey = // ...
+//! # CacheKey {}
+//!     ;
+//!
+//! // Fetches the current cached value for `key`, or populates it with
+//! // the closure argument if missing.
+//! let mut cached_file = cache
+//!     .ensure(&key, |file| {
+//!         file.write_all(&get_contents(&key))?;
+//!         file.sync_all()
+//!     })?;
+//! let mut contents = Vec::new();
+//! cached_file.read_to_end(&mut contents)?;
+//! # Ok(())
+//! # }
+//! ```
+//!
+//! # Cache directory structure
+//!
+//! Plain (unsharded) cache directories simply store the value for
+//! each `key` under a file named `key`. They also have a single
+//! `.kismet_temp` subdirectory, for temporary files.
+//!
+//! The second chance algorithm relies on mtime / atime (`relatime`
+//! updates suffice), so merely opening a file automatically updates
+//! the relevant read tracking metadata.
+//!
+//! Sharded cache directories store the values for each `key` under
+//! one of two shard subdirectories. The primary and secondary potential
+//! shards are respectively determined by multiplying `Key::hash` and
+//! `Key::secondary_hash` by different odd integers before mapping the
+//! result to the shard universe with fixed-point scaling.
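+//!
+//! As an illustration of that mapping (the constant below is an
+//! arbitrary odd multiplier, not the one Kismet actually uses):
+//!
+//! ```no_run
+//! // Fixed-point scaling: treat `hash * ODD` as a fraction of 2^64,
+//! // then scale by the shard count to get an index in `0..num_shards`.
+//! fn shard_index(hash: u64, num_shards: usize) -> usize {
+//!     const ODD: u64 = 0x9e37_79b9_7f4a_7c15; // illustrative odd constant
+//!     let mixed = hash.wrapping_mul(ODD);
+//!     ((mixed as u128 * num_shards as u128) >> 64) as usize
+//! }
+//! ```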
+//!
+//! Each subdirectory is named `.kismet_$HEX_SHARD_ID`, and contains
+//! cached files with name equal to the cache key, and a
+//! `.kismet_temp` subsubdirectory, just like plain unsharded caches.
+//! In fact, each such shard is managed exactly like a plain cache.
+//!
+//! Sharded caches attempt to balance load between two potential
+//! shards for each cache key in an attempt to make all shards grow at
+//! roughly the same rate. Once all the shards have reached their
+//! capacity, the sharded cache will slowly revert to storing cache
+//! keys in their primary shards.
+//!
+//! This scheme lets plain cache directories easily interoperate with
+//! other programs that are not aware of Kismet, and also lets an
+//! application use the same directory to back both a plain and a
+//! sharded cache (concurrently or not) without any possibility of
+//! collision between cached files and Kismet's internal directories.
+//!
+//! Kismet will always store its internal data in files or directories
+//! that start with a `.kismet` prefix, and cached data lives in
+//! files with names equal to their keys. Since Kismet sanitises
+//! cache keys to forbid them from starting with `.`, `/`, or `\\`, it
+//! is always safe for an application to store additional data in
+//! files or directories that start with a `.`, as long as they do not
+//! collide with the `.kismet` prefix.
 mod cache_dir;
 pub mod plain;
 pub mod raw_cache;
@@ -18,10 +250,16 @@ pub use stack::CacheHitAction;
 /// subdirectory.
 pub const KISMET_TEMPORARY_SUBDIRECTORY: &str = ".kismet_temp";
 
-/// Sharded cache keys consist of a filename and two hash values. The
-/// two hashes should be computed by distinct functions of the key's
-/// name, and each hash function must be identical for all processes
-/// that access the same sharded cache directory.
+/// Cache keys consist of a filename and two hash values. The two
+/// hashes should ideally be computed by distinct functions of the
+/// key's name, but Kismet will function correctly if the `hash` and
+/// `secondary_hash` are the same. Each hash function **must** be
+/// identical for all processes that access the same sharded cache
+/// directory.
+///
+/// The `name` should not be empty nor start with a dot, a forward
+/// slash, or a backslash: caches will reject any operation on such
+/// names with an `ErrorKind::InvalidInput` error.
 #[derive(Clone, Copy, Debug)]
 pub struct Key<'a> {
     pub name: &'a str,
diff --git a/src/plain.rs b/src/plain.rs
index 1895e3c..59c40d2 100644
--- a/src/plain.rs
+++ b/src/plain.rs
@@ -1,11 +1,14 @@
-//! A `plain::Cache` stores all cached file in a single directory, and
-//! periodically scans for evictions with a second chance strategy.
-//! This implementation does not scale up to more than a few hundred
-//! files per cache directory (a `sharded::Cache` can go higher),
-//! but interoperates seamlessly with other file-based programs.
+//! A [`crate::plain::Cache`] stores all cached files in a single
+//! directory (there may also be a `.kismet_temp` subdirectory for
+//! temporary files), and periodically scans for evictions with a
+//! second chance strategy. This implementation does not scale up to
+//! more than a few hundred files per cache directory (a
+//! [`crate::sharded::Cache`] can go higher), but interoperates
+//! seamlessly with other file-based programs that store cache files
+//! in flat directories.
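+//!
+//! For instance, because entries are just files named after their
+//! keys, other programs can read a cached entry directly (the cache
+//! directory below is illustrative):
+//!
+//! ```no_run
+//! # fn main() -> std::io::Result<()> {
+//! // The cached value for key "foo" is the regular file "foo" in the
+//! // cache directory.
+//! let bytes = std::fs::read("/tmp/plain_cache/foo")?;
+//! # let _ = bytes;
+//! # Ok(())
+//! # }
+//! ```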
 //!
 //! This module is useful for lower level usage; in most cases, the
-//! `Cache` is more convenient and just as efficient.
+//! [`crate::Cache`] is more convenient and just as efficient.
 //!
 //! The cache's contents will grow past its stated capacity, but
 //! should rarely reach more than twice that capacity.
diff --git a/src/raw_cache.rs b/src/raw_cache.rs
index 2466eb9..e2c5873 100644
--- a/src/raw_cache.rs
+++ b/src/raw_cache.rs
@@ -1,12 +1,12 @@
 //! The raw cache module manages directories of read-only files
 //! subject to a (batched) Second Chance eviction policy. Calling
-//! `prune` deletes files to make sure a cache directory does not
+//! [`prune`] deletes files to make sure a cache directory does not
 //! exceed its capacity, in file count. The deletions will obey a
 //! Second Chance policy as long as insertions and updates go through
-//! `insert_or_update` or `insert_or_touch`, in order to update the
+//! [`insert_or_update`] or [`insert_or_touch`], in order to update the
 //! cached files' modification times correctly. Opening the cached
 //! file will automatically update its metadata to take that access
-//! into account, but a path can also be `touch`ed explicitly.
+//! into account, but a path can also be [`touch`]ed explicitly.
 //!
 //! This module implements mechanisms, but does not hardcode any
 //! policy... except the use of a second chance strategy.
diff --git a/src/readonly.rs b/src/readonly.rs
index de70d06..8b51443 100644
--- a/src/readonly.rs
+++ b/src/readonly.rs
@@ -4,6 +4,8 @@
 //! and easy-to-use interface that erases the difference between plain
 //! and sharded caches.
 use std::fs::File;
+#[allow(unused_imports)] // We refer to this enum in comments.
+use std::io::ErrorKind;
 use std::io::Result;
 use std::path::Path;
 use std::sync::Arc;
@@ -49,7 +51,7 @@ impl ReadSide for ShardedCache {
     }
 }
 
-/// Construct a `ReadOnlyCache` with this builder. The resulting
+/// Construct a [`ReadOnlyCache`] with this builder. The resulting
 /// cache will access each constituent cache directory in the order
 /// they were added.
 ///
@@ -59,13 +61,20 @@ pub struct ReadOnlyCacheBuilder {
     stack: Vec<Box<dyn ReadSide>>,
 }
 
-/// A `ReadOnlyCache` wraps an arbitrary number of caches, and
-/// attempts to satisfy `get` and `touch` requests by hitting each
-/// constituent cache in order. This interface hides the difference
-/// between plain and sharded cache directories, and should be the
-/// first resort for read-only uses.
+/// A [`ReadOnlyCache`] wraps an arbitrary number of
+/// [`crate::plain::Cache`] and [`crate::sharded::Cache`], and attempts
+/// to satisfy [`ReadOnlyCache::get`] and [`ReadOnlyCache::touch`]
+/// requests by hitting each constituent cache in order. This
+/// interface hides the difference between plain and sharded cache
+/// directories, and should be the first resort for read-only uses.
 ///
 /// The default cache wraps an empty set of constituent caches.
+///
+/// [`ReadOnlyCache`] objects are stateless and cheap to clone; don't
+/// put an [`Arc`] on them. Avoid creating multiple
+/// [`ReadOnlyCache`]s for the same stack of directories: there is no
+/// internal state to maintain, so multiple instances simply waste
+/// memory without any benefit.
 #[derive(Clone, Debug)]
 pub struct ReadOnlyCache {
     stack: Arc<[Box<dyn ReadSide>]>,
 }
@@ -114,7 +123,7 @@ impl ReadOnlyCacheBuilder {
         self
     }
 
-    /// Returns a fresh `ReadOnlyCache` for the builder's search list
+    /// Returns a fresh [`ReadOnlyCache`] for the builder's search list
     /// of constituent cache directories.
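+    ///
+    /// For example, to search a plain cache directory before a
+    /// sharded one (the paths are illustrative):
+    ///
+    /// ```no_run
+    /// // Search the plain directory first, then the sharded one.
+    /// let cache = kismet_cache::ReadOnlyCacheBuilder::new()
+    ///     .plain("/tmp/plain_cache")
+    ///     .sharded("/tmp/sharded_cache", 10)
+    ///     .build();
+    /// # let _ = cache;
+    /// ```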
     pub fn build(self) -> ReadOnlyCache {
         ReadOnlyCache::new(self.stack)
@@ -135,15 +144,19 @@ impl ReadOnlyCache {
     }
 
     /// Attempts to open a read-only file for `key`. The
-    /// `ReadOnlyCache` will query each constituent cache in order of
-    /// registration, and return a read-only file for the first hit.
+    /// [`ReadOnlyCache`] will query each constituent cache in order
+    /// of registration, and return a read-only file for the first
+    /// hit.
     ///
-    /// Fails with `ErrorKind::InvalidInput` if `key.name` is invalid
-    /// (empty, or starts with a dot or a forward or back slash).
+    /// Fails with [`ErrorKind::InvalidInput`] if `key.name` is
+    /// invalid (empty, or starts with a dot or a forward or back slash).
     ///
-    /// Returns `None` if no file for `key` can be found in any of the
-    /// constituent caches, and bubbles up the first I/O error
+    /// Returns [`None`] if no file for `key` can be found in any of
+    /// the constituent caches, and bubbles up the first I/O error
     /// encountered, if any.
+    ///
+    /// In the worst case, each call to `get` attempts to open two
+    /// files for each cache directory in the `ReadOnlyCache` stack.
     pub fn get<'a>(&self, key: impl Into<Key<'a>>) -> Result<Option<File>> {
         fn doit(stack: &[Box<dyn ReadSide>], key: Key) -> Result<Option<File>> {
             for cache in stack.iter() {
@@ -163,14 +176,18 @@ impl ReadOnlyCache {
     }
 
     /// Marks a cache entry for `key` as accessed (read). The
-    /// `ReadOnlyCache` will touch the same file that would be returned
-    /// by `get`.
+    /// [`ReadOnlyCache`] will touch the same file that would be
+    /// returned by `get`.
    ///
-    /// Fails with `ErrorKind::InvalidInput` if `key.name` is invalid
-    /// (empty, or starts with a dot or a forward or back slash).
+    /// Fails with [`ErrorKind::InvalidInput`] if `key.name` is
+    /// invalid (empty, or starts with a dot or a forward or back slash).
     ///
     /// Returns whether a file for `key` could be found, and bubbles
     /// up the first I/O error encountered, if any.
+    ///
+    /// In the worst case, each call to `touch` attempts to update the
+    /// access time on two files for each cache directory in the
+    /// `ReadOnlyCache` stack.
     pub fn touch<'a>(&self, key: impl Into<Key<'a>>) -> Result<bool> {
         fn doit(stack: &[Box<dyn ReadSide>], key: Key) -> Result<bool> {
             for cache in stack.iter() {
diff --git a/src/second_chance.rs b/src/second_chance.rs
index 2610039..06a42b5 100644
--- a/src/second_chance.rs
+++ b/src/second_chance.rs
@@ -1,8 +1,9 @@
-//! The Second Chance or Clock page replacement policy is a simple
-//! approximation of the Least Recently Used policy. Kismet uses the
-//! second chance policy because it can be easily implemented on top
-//! of the usual file modification and access times that we can trust
-//! operating systems to update for us.
+//! The [Second Chance or Clock](https://en.wikipedia.org/wiki/Page_replacement_algorithm#Second-chance)
+//! page replacement policy is a simple approximation of the Least
+//! Recently Used policy. Kismet uses the second chance policy
+//! because it can be easily implemented on top of the usual file
+//! modification and access times that we can trust operating systems
+//! to update for us.
 //!
 //! This second chance implementation is optimised for *batch*
 //! maintenance: the caller is expected to perform a number of
diff --git a/src/sharded.rs b/src/sharded.rs
index bb38167..cd9ca42 100644
--- a/src/sharded.rs
+++ b/src/sharded.rs
@@ -1,13 +1,18 @@
-//! A `sharded::Cache` uses the same basic file-based second chance
-//! strategy as a `plain::Cache`. However, while the simple plain
-//! cache is well suited to small caches (down to 2-3 files, and up
-//! maybe one hundred), this sharded version can scale nearly
-//! arbitrarily high: each shard should have fewer than one hundred or
-//! so files, but there may be arbitrarily many shards (up to
-//! filesystem limits, since each shard is a subdirectory).
+//! A [`crate::sharded::Cache`] uses the same basic file-based second
+//! chance strategy as a [`crate::plain::Cache`]. However, while the
+//! simple plain cache is well suited to small caches (down to 2-3
+//! files, and up to maybe one hundred), this sharded version can scale
+//! nearly arbitrarily high: each shard should have fewer than one
+//! hundred or so files, but there may be arbitrarily many shards (up
+//! to filesystem limits, since each shard is a subdirectory).
+//!
+//! A sharded cache directory consists of shard subdirectories (with
+//! name equal to the shard index printed as `%04x`), each of which
+//! contains the cached files for that shard, under their `key` name,
+//! and an optional `.kismet_temp` subdirectory for temporary files.
 //!
 //! This module is useful for lower level usage; in most cases, the
-//! `Cache` is more convenient and just as efficient.
+//! [`crate::Cache`] is more convenient and just as efficient.
 //!
 //! The cache's contents will grow past its stated capacity, but
 //! should rarely reach more than twice that capacity, especially
diff --git a/src/stack.rs b/src/stack.rs
index dcef42d..fc850e6 100644
--- a/src/stack.rs
+++ b/src/stack.rs
@@ -1,8 +1,9 @@
-//! We expect most callers to interact with Kismet via the `Cache`
-//! struct defined here. A `Cache` hides the difference in behaviour
-//! between plain and sharded caches via late binding, and lets
-//! callers transparently handle misses by looking in a series of
-//! secondary cache directories.
+//! We expect most callers to interact with Kismet via the [`Cache`]
+//! struct defined here. A [`Cache`] hides the difference in
+//! behaviour between [`crate::plain::Cache`] and
+//! [`crate::sharded::Cache`] via late binding, and lets callers
+//! transparently handle misses by looking in a series of secondary
+//! cache directories.
 use std::borrow::Cow;
 use std::fs::File;
 use std::io::Error;
@@ -98,20 +99,29 @@ impl FullCache for ShardedCache {
     }
 }
 
-/// Construct a `Cache` with this builder. The resulting cache will
+/// Construct a [`Cache`] with this builder. The resulting cache will
 /// always first access its write-side cache (if defined), and, on
-/// misses, will attempt to service `get` and `touch` calls by
-/// iterating over the read-only caches.
+/// misses, will attempt to service [`Cache::get`] and
+/// [`Cache::touch`] calls by iterating over the read-only caches.
 #[derive(Debug, Default)]
 pub struct CacheBuilder {
     write_side: Option<Box<dyn FullCache>>,
     read_side: ReadOnlyCacheBuilder,
 }
 
-/// A `Cache` wraps either up to one plain or sharded read-write cache
-/// in a convenient interface, and may optionally fulfill read
+/// A [`Cache`] wraps either up to one plain or sharded read-write
+/// cache in a convenient interface, and may optionally fulfill read
 /// operations by deferring to a list of read-only cache when the
 /// read-write cache misses.
+///
+/// The default cache has no write-side and an empty stack of backup
+/// read-only caches.
+///
+/// [`Cache`] objects are cheap to clone and lock-free; don't put an
+/// [`Arc`] on them.
+/// Avoid opening multiple caches for the same set
+/// of directories: using the same [`Cache`] object improves the
+/// accuracy of the write cache's lock-free in-memory statistics, when
+/// it's a sharded cache.
 #[derive(Clone, Debug, Default)]
 pub struct Cache {
     write_side: Option<Arc<dyn FullCache>>,
@@ -129,7 +139,7 @@ pub enum CacheHit<'a> {
     Secondary(&'a mut File),
 }
 
-/// What to do with a cache hit in a `get_or_update` call?
+/// What to do with a cache hit in a [`Cache::get_or_update`] call?
 pub enum CacheHitAction {
     /// Return the cache hit as is.
     Accept,
@@ -209,7 +219,7 @@ impl CacheBuilder {
         self
     }
 
-    /// Returns a fresh `Cache` for the builder's write cache and
+    /// Returns a fresh [`Cache`] for the builder's write cache and
     /// additional search list of read-only cache directories.
     pub fn build(self) -> Cache {
         Cache {
@@ -225,12 +235,16 @@ impl Cache {
     /// additional read-only cache, in definition order, and return a
     /// read-only file for the first hit.
     ///
-    /// Fails with `ErrorKind::InvalidInput` if `key.name` is invalid
+    /// Fails with [`ErrorKind::InvalidInput`] if `key.name` is invalid
     /// (empty, or starts with a dot or a forward or back slash).
     ///
-    /// Returns `None` if no file for `key` can be found in any of the
+    /// Returns [`None`] if no file for `key` can be found in any of the
     /// constituent caches, and bubbles up the first I/O error
     /// encountered, if any.
+    ///
+    /// In the worst case, each call to `get` attempts to open two
+    /// files for the [`Cache`]'s read-write directory and for each
+    /// read-only backup directory.
     pub fn get<'a>(&self, key: impl Into<Key<'a>>) -> Result<Option<File>> {
         fn doit(
             write_side: Option<&dyn FullCache>,
@@ -257,8 +271,10 @@ impl Cache {
     /// populates the cache with a file filled by `populate`. Returns
     /// a file in all cases (unless the call fails with an error).
     ///
-    /// Fails with `ErrorKind::InvalidInput` if `key.name` is invalid
-    /// (empty, or starts with a dot or a forward or back slash).
+    /// Fails with [`ErrorKind::InvalidInput`] if `key.name` is
+    /// invalid (empty, or starts with a dot or a forward or back slash).
+    ///
+    /// See [`Cache::get_or_update`] for more control over the operation.
     pub fn ensure<'a>(
         &self,
         key: impl Into<Key<'a>>,
@@ -276,12 +292,24 @@ impl Cache {
     /// filled by `populate`; otherwise obeys the value returned by
     /// `judge` to determine what to do with the hit.
     ///
-    /// Fails with `ErrorKind::InvalidInput` if `key.name` is invalid
-    /// (empty, or starts with a dot or a forward or back slash).
+    /// Fails with [`ErrorKind::InvalidInput`] if `key.name` is
+    /// invalid (empty, or starts with a dot or a forward or back slash).
     ///
     /// When we need to populate a new file, `populate` is called with
     /// a mutable reference to the destination file, and the old
-    /// cached file, if available.
+    /// cached file (in whatever state `judge` left it), if available.
+    ///
+    /// See [`Cache::ensure`] for a simpler interface.
+    ///
+    /// In the worst case, each call to `get_or_update` attempts to
+    /// open two files for the [`Cache`]'s read-write directory and
+    /// for each read-only backup directory, and fails to find
+    /// anything. `get_or_update` then publishes a new cached file
+    /// (in a constant number of file operations), but not before
+    /// triggering a second chance maintenance (time linearithmic in
+    /// the number of files in the directory chosen for maintenance,
+    /// but amortised to logarithmic).
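+    ///
+    /// For example, this sketch accepts any cache hit as-is and fills
+    /// misses with fixed contents (the directory, key, and hash
+    /// values are illustrative):
+    ///
+    /// ```no_run
+    /// # fn main() -> std::io::Result<()> {
+    /// use std::io::Write;
+    ///
+    /// let cache = kismet_cache::CacheBuilder::new()
+    ///     .sharded_writer("/tmp/sharded_cache", 10, 1_000)
+    ///     .build();
+    ///
+    /// let file = cache.get_or_update(
+    ///     kismet_cache::Key::new("foo", 1, 2),
+    ///     // Keep whatever we find in the cache.
+    ///     |_hit| kismet_cache::CacheHitAction::Accept,
+    ///     // On a miss, write the new cached file's contents.
+    ///     |file, _old| file.write_all(b"fresh contents"),
+    /// )?;
+    /// # let _ = file;
+    /// # Ok(())
+    /// # }
+    /// ```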
     pub fn get_or_update<'a>(
         &self,
         key: impl Into<Key<'a>>,
@@ -357,13 +385,18 @@ impl Cache {
     /// Inserts or overwrites the file at `value` as `key` in the
     /// write cache directory. This will always fail with
-    /// `Unsupported` if no write cache was defined.
+    /// [`ErrorKind::Unsupported`] if no write cache was defined.
     ///
-    /// Fails with `ErrorKind::InvalidInput` if `key.name` is invalid
+    /// Fails with [`ErrorKind::InvalidInput`] if `key.name` is invalid
     /// (empty, or starts with a dot or a forward or back slash).
     ///
     /// Always consumes the file at `value` on success; may consume it
     /// on error.
+    ///
+    /// Executes in a bounded number of file operations, except for
+    /// the lock-free maintenance, which needs time linearithmic in
+    /// the number of files in the directory chosen for maintenance,
+    /// amortised to logarithmic.
     pub fn set<'a>(&self, key: impl Into<Key<'a>>, value: impl AsRef<Path>) -> Result<()> {
         match self.write_side.as_ref() {
             Some(write) => write.set(key.into(), value.as_ref()),
@@ -376,13 +409,19 @@ impl Cache {
     /// Inserts the file at `value` as `key` in the cache directory if
     /// there is no such cached entry already, or touches the cached
-    /// file if it already exists.
+    /// file if it already exists. This will always fail with
+    /// [`ErrorKind::Unsupported`] if no write cache was defined.
     ///
-    /// Fails with `ErrorKind::InvalidInput` if `key.name` is invalid
+    /// Fails with [`ErrorKind::InvalidInput`] if `key.name` is invalid
     /// (empty, or starts with a dot or a forward or back slash).
     ///
     /// Always consumes the file at `value` on success; may consume it
     /// on error.
+    ///
+    /// Executes in a bounded number of file operations, except for
+    /// the lock-free maintenance, which needs time linearithmic in
+    /// the number of files in the directory chosen for maintenance,
+    /// amortised to logarithmic.
     pub fn put<'a>(&self, key: impl Into<Key<'a>>, value: impl AsRef<Path>) -> Result<()> {
         match self.write_side.as_ref() {
             Some(write) => write.put(key.into(), value.as_ref()),
@@ -393,14 +432,18 @@ impl Cache {
         }
     }
 
-    /// Marks a cache entry for `key` as accessed (read). The `Cache`
+    /// Marks a cache entry for `key` as accessed (read). The [`Cache`]
     /// will touch the same file that would be returned by `get`.
     ///
-    /// Fails with `ErrorKind::InvalidInput` if `key.name` is invalid
+    /// Fails with [`ErrorKind::InvalidInput`] if `key.name` is invalid
     /// (empty, or starts with a dot or a forward or back slash).
     ///
     /// Returns whether a file for `key` could be found, and bubbles
     /// up the first I/O error encountered, if any.
+    ///
+    /// In the worst case, each call to `touch` attempts to update the
+    /// access time on two files for each cache directory in the
+    /// `ReadOnlyCache` stack.
     pub fn touch<'a>(&self, key: impl Into<Key<'a>>) -> Result<bool> {
         fn doit(
             write_side: Option<&dyn FullCache>,