This repository has been archived by the owner on May 8, 2021. It is now read-only.

Make Image Caches plugins #5

Open
sdpeters opened this issue Feb 13, 2019 · 0 comments
Labels
enhancement New feature or request
Milestone

Comments

@sdpeters
Owner

Jason and Sage want RWL to be a plugin. I interpreted that to mean that all ImageCache modules should be plugins. This would include PassThroughImageCache.

A plugin is loaded separately, and only when a client uses an image that's configured to use that feature. This implies that opens must fail if the plugin is not available. There are some nuances to what that means that we should discuss. (Does the open really fail, or do we just fail to initialize the cache if we don't have the plugin? Probably the latter. If so, can we still discard a dirty image cache without its plugin? We can for RWL. I think it should always be true.)
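To make the "open succeeds, cache initialization fails" option concrete, here is a minimal sketch in Python. It is a model of the behaviour, not librbd code: `LOADERS`, `open_image`, `PluginUnavailable`, and the plugin names are all hypothetical, and a real loader would load a shared object rather than return a lambda.

```python
class PluginUnavailable(Exception):
    """Raised when a cache plugin cannot be loaded (hypothetical)."""


def load_pass_through():
    # Assumed to ship with librbd, so loading always succeeds in this model.
    return lambda image: f"PassThroughImageCache({image})"


def load_rwl():
    # Simulates a host where the separate RWL package is not installed.
    raise PluginUnavailable("rwl plugin not installed")


# Hypothetical registry: plugin name -> loader. Nothing is loaded until asked for.
LOADERS = {"pass_through": load_pass_through, "rwl": load_rwl}


def open_image(image, cache_plugin):
    """Open the image; initialize its cache only if the plugin loads."""
    result = {"image": image, "opened": True, "cache": None, "cache_error": None}
    try:
        factory = LOADERS[cache_plugin]()  # loaded only when an image needs it
        result["cache"] = factory(image)
    except (KeyError, PluginUnavailable) as err:
        # The open itself still succeeds; only the cache is unavailable.
        result["cache_error"] = str(err)
    return result
```

In this model the caller can inspect `cache_error` and decide whether running without the cache is acceptable, which is one of the nuances that still needs discussion.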

A plugin can be distributed in a separate package. PassThroughImageCache should be included with librbd. A statically linked (with PMDK) RWL plugin could be included in the librbd rpm, but that might not be a good idea. We'd like RWL to be distributed in its own rpm (deb, whatever) so that rpm can depend on libpmemobj, libfabric, etc. That way RWL can use the installed PMDK (otherwise it really can't).

As noted in Here, there's a Ceph build option related to this (WITH_PMEM_PKG). What that means and does needs to be nailed down as part of this. That implies working out how an RWL plugin can get built and tested in every Ceph PR, even if those machines don't have PMDK or libfabric installed.

Making ImageCache modules plugins has the additional implication that enabling the image-cache feature should be generalized. Right now that's just a single bit. On open, the image-cache feature bit causes librbd to enable RWL and the ImageWriteBack object under it. Enabling the feature should instead take arguments, and allow configuring a combination of (stacked) write back caches. For example, a user might stack an instance of RWL using 256M of pmem on top of another instance of RWL using a mirrored NVMe-oF SSD (or some future SSD-based HA write back cache). I'd suggest that initially we support configuring RWL, PassThrough, or both. Unit tests that don't build RWL can at least build and test the enabling of PassThrough. If we allow combining multiple instances of the same cache plugin with different parameters, the ImageCache constructor will need to gain a "layer" argument, which the modules can use in the names of files (etc.) they create to implement the cache.
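As a rough illustration of a layered configuration and the "layer" constructor argument, the following Python sketch builds a stack from a list of specs. Everything here is assumed for illustration: `LayerSpec`, `ImageCacheLayer`, `build_stack`, the parameter names, and the file-name pattern are hypothetical and do not reflect any existing librbd interface.

```python
from dataclasses import dataclass, field


@dataclass
class LayerSpec:
    plugin: str                                  # e.g. "rwl" or "pass_through"
    params: dict = field(default_factory=dict)   # plugin-specific arguments


@dataclass
class ImageCacheLayer:
    plugin: str
    layer: int      # the "layer" argument: position in the stack, 0 = top
    params: dict
    lower: object   # next layer down, or the ImageWriteBack stand-in

    def backing_file(self, image_id):
        # The layer index keeps two instances of one plugin from colliding.
        return f"{image_id}.{self.plugin}.layer{self.layer}.cache"


def build_stack(specs, bottom="ImageWriteBack"):
    """Build layers bottom-up; specs[0] ends up as the top of the stack."""
    lower = bottom
    for index in range(len(specs) - 1, -1, -1):
        spec = specs[index]
        lower = ImageCacheLayer(spec.plugin, index, dict(spec.params), lower)
    return lower


# Two RWL instances with different parameters, as in the example above.
top = build_stack([
    LayerSpec("rwl", {"size": "256M", "media": "pmem"}),
    LayerSpec("rwl", {"media": "nvmeof-ssd"}),
])
```

With this shape, a unit test that does not build RWL could still exercise `build_stack` with a single `pass_through` spec.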

Allowing ImageCache modules to be configured in layers may mean that the "image-cache discard-dirty" command needs to selectively remove just one layer. If someone stacked an unreplicated RWL on top of a replicated one for some reason, they might want to discard only the one on top (flushing the next one down from its replica).
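A per-layer discard could look something like the sketch below. It models each layer as a plain dict with a `dirty` map of offset to data; that representation and the `discard_dirty` function are assumptions made for illustration, not the actual command implementation.

```python
def discard_dirty(stack, layer):
    """Drop the dirty blocks of one layer only; return how many were dropped.

    stack[0] is the top layer. Other layers are left untouched, so a
    replicated layer further down can still be flushed afterwards.
    """
    if not 0 <= layer < len(stack):
        raise IndexError(f"no cache layer {layer}")
    target = stack[layer]
    dropped = len(target["dirty"])
    target["dirty"].clear()
    return dropped


# An unreplicated layer stacked on a replicated one (hypothetical names).
stack = [
    {"name": "rwl-unreplicated", "dirty": {0: b"a", 8: b"b"}},
    {"name": "rwl-replicated", "dirty": {16: b"c"}},
]
dropped = discard_dirty(stack, 0)  # discard only the top layer
```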

Stacking replicated write back caches introduces some other issues. If they don't replicate to the same node, then how do the replicas ever flush when the client fails? The replica for the top cache layer can't flush unless it can write to the cache layer below. If the cache layer below is replicated to a different node, that won't be possible. One solution is to require all replicated caches to replicate to the same node, and all fail over at the same time (making that replica of all layers the master for its layer, and enabling writes down through the stack). Another solution is to flush the replica from the lowest layer first, then remove that layer before the replica for the next layer up is flushed. Each layer's replica writes directly to ImageWriteBack. The simplest solution might be to allow only one layer to use replication, or only one layer to be write-back.
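The "flush the lowest layer first" option can be sketched as follows. This is a toy model under stated assumptions: each replica is a dict with a `dirty` map, the image is a plain dict standing in for ImageWriteBack, and `flush_replicas_bottom_up` is a hypothetical name. It also assumes that an upper layer holds newer data than the layers below it, so flushing upward lets newer writes overwrite older ones.

```python
def flush_replicas_bottom_up(replica_layers, image):
    """Flush replicas lowest layer first, removing each layer once flushed.

    replica_layers[0] is the top layer. Each replica writes directly to the
    image (the ImageWriteBack stand-in), never to another cache layer.
    Returns the names of the layers in the order they were flushed.
    """
    order = []
    while replica_layers:
        layer = replica_layers.pop()      # lowest remaining layer
        image.update(layer["dirty"])      # write this replica's dirty blocks
        order.append(layer["name"])       # layer is gone before the next flush
    return order


layers = [
    {"name": "top", "dirty": {0: "new"}},
    {"name": "bottom", "dirty": {0: "old", 8: "x"}},
]
image = {}
order = flush_replicas_bottom_up(layers, image)
```

Because no replica ever needs the layer beneath it, this ordering works even when the layers replicate to different nodes, at the cost of serializing the flushes.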

@sdpeters sdpeters added the enhancement New feature or request label Feb 13, 2019
@sdpeters sdpeters added this to the RWL phase 1 milestone Feb 13, 2019