Commit

Merge 2a26142 into 30464f1
pkhuong authored Sep 12, 2021
2 parents 30464f1 + 2a26142 commit 1f3c9be
Showing 4 changed files with 243 additions and 59 deletions.
61 changes: 27 additions & 34 deletions src/lib.rs
@@ -1,7 +1,7 @@
//! Kismet implements multiprocess lock-free[^lock-free-fs]
//! application-crash-safe (roughly) bounded persistent caches stored
//! crash-safe and (roughly) bounded persistent caches stored
//! in filesystem directories, with a
//! [Second Chance (Clock)](https://en.wikipedia.org/wiki/Page_replacement_algorithm#Second-chance)
//! [Second Chance](https://en.wikipedia.org/wiki/Page_replacement_algorithm#Second-chance)
//! eviction strategy. The maintenance logic is batched and invoked
//! at periodic jittered intervals to make sure accesses amortise to a
//! constant number of filesystem system calls and logarithmic (in the
@@ -36,20 +36,21 @@
//! directory, one byte of lock-free metadata per shard, and no other
//! non-heap resource (i.e., Kismet caches do not hold on to file
//! objects). This holds for individual cache directories; when
//! stacking multiple caches in a [`Cache`], the read-write cache and
//! all constituent read-only caches will each have their own
//! `PathBuf` and per-shard metadata.
//! stacking multiple caches in a [`Cache`] or [`ReadOnlyCache`], the
//! read-write cache and all constituent read-only caches will each
//! have their own `PathBuf` and per-shard metadata.
//!
//! When a Kismet cache triggers second chance evictions, it will
//! allocate temporary data. That data's size is proportional to the
//! number of files in the cache shard subdirectory undergoing
//! eviction (or the whole directory for a plain unsharded cache), and
//! includes a copy of the name (without the path prefix) for each
//! includes a copy of the basename (without the path prefix) for each
//! cached file in the subdirectory (or plain cache directory). This
//! eviction process is linearithmic-time in the number of files in
//! the cache subdirectory (directory), and is invoked periodically,
//! so as to amortise the maintenance time overhead to logarithmic
//! per write to a cache subdirectory.
//! so as to amortise the maintenance overhead to logarithmic (in the
//! total number of files in the subdirectory) time per write to a
//! cache subdirectory, and constant file operations per write.
//!
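The Second Chance policy named in the paragraph above can be sketched in memory as follows. This is a generic illustration of the algorithm, not Kismet's on-disk implementation: the `Entry` type, its `referenced` flag, and `second_chance_evict` are hypothetical names, and the sketch ignores how a real cache would record accesses in the filesystem.

```rust
use std::collections::VecDeque;

/// Hypothetical in-memory cache entry: a basename plus a flag that is
/// set whenever the entry has been accessed since the last scan.
struct Entry {
    name: String,
    referenced: bool,
}

/// Evicts entries from the front of `queue` until at most `capacity`
/// remain, and returns the names of the evicted entries.
///
/// A referenced entry gets a "second chance": its flag is cleared and
/// it moves to the back of the queue instead of being evicted.
fn second_chance_evict(queue: &mut VecDeque<Entry>, capacity: usize) -> Vec<String> {
    let mut evicted = Vec::new();
    // Each iteration either evicts an entry or clears one flag, so the
    // loop terminates after at most two passes over the queue.
    while queue.len() > capacity {
        if let Some(mut entry) = queue.pop_front() {
            if entry.referenced {
                entry.referenced = false;
                queue.push_back(entry);
            } else {
                evicted.push(entry.name);
            }
        }
    }
    evicted
}
```

With a queue holding a referenced entry followed by two unreferenced ones and a capacity of two, the referenced entry is spared and the first unreferenced entry is the one evicted.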
//! Kismet does not pre-allocate any long-lived file object, so it may
//! need to temporarily open file objects. However, each call into
@@ -73,6 +74,8 @@
//! multiple places, as long as the files are not modified, or their
//! `mtime` otherwise updated, through these non-Kismet links.
//!
//! # Plain and sharded caches
//!
//! Kismet cache directories are plain (unsharded) or sharded.
//!
//! Plain Kismet caches are simply directories where the cache entry for
@@ -88,27 +91,20 @@
//!
//! Simple usage should be covered by the [`ReadOnlyCache`] or
//! [`Cache`] structs, which wrap [`plain::Cache`] and
//! [`sharded::Cache`] in a convenient type-erased interface. The
//! caches *do not* invoke [`std::fs::File::sync_all`] or [`std::fs::File::sync_data`]:
//! the caller should sync files before letting Kismet persist them in
//! a cache if necessary. File synchronisation is not automatic
//! because it makes sense to implement persistent filesystem caches
//! that are erased after each boot, e.g., via
//! [tmpfiles.d](https://www.freedesktop.org/software/systemd/man/tmpfiles.d.html),
//! or by tagging cache directories with a
//! [boot id](https://man7.org/linux/man-pages/man3/sd_id128_get_machine.3.html).
//!
//! The cache code also does not sync the parent cache directories: we
//! assume that it's safe, if unfortunate, for caches to lose data or
//! revert to an older state after kernel or hardware crashes. In
//! general, the code attempts to be robust against direct manipulation
//! of the cache directories. It's always safe to delete cache files
//! from kismet directories (ideally not recently created files in
//! `.kismet_temp` directories), and even *adding* files should mostly
//! do what one expects: they will be picked up if they're in the
//! correct place (in a plain unsharded cache directory or in the
//! correct shard subdirectory), and eventually evicted if useless or
//! in the wrong shard.
//! [`sharded::Cache`] in a convenient type-erased interface.
//!
//! While the cache code syncs cached data files by default, it does
//! not sync the parent cache directories: we assume that it's safe,
//! if unfortunate, for caches to lose data or revert to an older
//! state after kernel or hardware crashes. In general, the code
//! attempts to be robust against direct manipulation of the cache
//! directories. It's always safe to delete cache files from kismet
//! directories (ideally not recently created files in `.kismet_temp`
//! subdirectories), and even *adding* files should mostly do what one
//! expects: they will be picked up if they're in the correct place
//! (in a plain unsharded cache directory or in the correct shard
//! subdirectory), and eventually evicted if useless or in the wrong
//! shard.
//!
//! It is however essential to only publish files atomically to the
//! cache directories, and it probably never makes sense to modify
@@ -180,10 +176,7 @@
//! // Fetches the current cached value for `key`, or populates it with
//! // the closure argument if missing.
//! let mut cached_file = cache
//! .ensure(&key, |file| {
//! file.write_all(&get_contents(&key))?;
//! file.sync_all()
//! })?;
//! .ensure(&key, |file| file.write_all(&get_contents(&key)))?;
//! let mut contents = Vec::new();
//! cached_file.read_to_end(&mut contents)?;
//! # Ok(())
@@ -226,7 +219,7 @@
//! Kismet will always store its internal data in files or directories
//! that start with a `.kismet` prefix, and cached data lives in
//! files with names equal to their keys. Since Kismet sanitises
//! cache keys to forbid them from starting with `.`, `/`, or `\\`, it
//! cache keys to forbid them from starting with `.`, `/`, or `\`, it
//! is always safe for an application to store additional data in
//! files or directories that start with a `.`, as long as they do not
//! collide with the `.kismet` prefix.
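The naming rule in this paragraph can be expressed as two small predicates. This is a minimal sketch under stated assumptions, not Kismet's actual sanitiser: both function names are hypothetical, and the sketch only checks names rather than rewriting them.

```rust
/// Hypothetical check mirroring the documented rule: a cache key must
/// not start with `.`, `/`, or `\`. (An empty key passes this check;
/// whether empty keys are otherwise valid is outside this sketch.)
fn key_starts_safely(key: &str) -> bool {
    !key.starts_with(|c: char| matches!(c, '.' | '/' | '\\'))
}

/// Hypothetical check for application-owned files stored next to the
/// cached data: the name must start with `.` (so that it can never be
/// mistaken for a key) and must not collide with the `.kismet` prefix.
fn is_safe_sidecar_name(name: &str) -> bool {
    name.starts_with('.') && !name.starts_with(".kismet")
}
```

Under these assumptions, `.my_app_state` is a safe place for extra application data, while `.kismet_temp` and any name that does not start with a dot are not.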
6 changes: 5 additions & 1 deletion src/plain.rs
@@ -8,7 +8,11 @@
//! in flat directories.
//!
//! This module is useful for lower level usage; in most cases, the
//! [`crate::Cache`] is more convenient and just as efficient.
//! [`crate::Cache`] is more convenient and just as efficient. In
//! particular, a `crate::plain::Cache` *does not* invoke
//! [`std::fs::File::sync_all`] or [`std::fs::File::sync_data`]: the
//! caller should sync files before letting Kismet persist them in a
//! directory, if necessary.
//!
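Since the lower-level caches leave syncing to the caller, a caller that needs durability might write and sync a file before handing it to the cache. The sketch below uses only the standard library and does not call any Kismet API; `write_synced_then_publish` and its parameters are hypothetical, and the final rename stands in for whatever step publishes the file.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `contents` to `temp`, flushes the data and metadata to stable
/// storage with `sync_all`, then renames the file to `dest`.
///
/// Assumption: `temp` and `dest` live on the same filesystem, so that
/// the rename replaces `dest` in a single step.
fn write_synced_then_publish(temp: &Path, dest: &Path, contents: &[u8]) -> io::Result<()> {
    let mut file = File::create(temp)?;
    file.write_all(contents)?;
    // Sync before publishing: once the file is visible under its final
    // name, readers should never observe partially written contents.
    file.sync_all()?;
    drop(file);
    fs::rename(temp, dest)
}
```

Note that this does not sync the parent directory, which matches the documented assumption that a cache may lose entries or revert to an older state after a kernel or hardware crash.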
//! The cache's contents will grow past its stated capacity, but
//! should rarely reach more than twice that capacity.
6 changes: 5 additions & 1 deletion src/sharded.rs
@@ -12,7 +12,11 @@
//! and an optional `.kismet_temp` subdirectory for temporary files.
//!
//! This module is useful for lower level usage; in most cases, the
//! [`crate::Cache`] is more convenient and just as efficient.
//! [`crate::Cache`] is more convenient and just as efficient. In
//! particular, a `crate::sharded::Cache` *does not* invoke
//! [`std::fs::File::sync_all`] or [`std::fs::File::sync_data`]: the
//! caller should sync files before letting Kismet persist them in a
//! directory, if necessary.
//!
//! The cache's contents will grow past its stated capacity, but
//! should rarely reach more than twice that capacity, especially