Description
[Moved from the spec issues repository, as it describes a new use case: handling multiple storage roots that make up one repository. It covers both the aggregation of content across multiple storage roots and, possibly, the replication of content.]
This may be a part of issue OCFL/spec#22 and it certainly follows on from the comment.
My institution can't provide a single 200TB volume (!), but they can give me two 70TB volumes and one 60TB volume. So for my use case I now need three OCFL filesystems that my service interacts with as a single unit.
Given this, it would be nice to be able to define metadata at the repository level that says this filesystem is part of a larger set of peers. Nice-to-haves would include defining a priority for each peer and perhaps its storage tier. That way, clients can make smart decisions by ranking peers first by tier and then by priority (I imagine these are properties defined by the administrators provisioning the storage).
The justification for this is that any connecting service or user inspecting the filesystem can identify that it is part of a larger set.
For example, a `storage.json` (or some such) with content like:
```jsonc
{
  "peers": [
    {
      "type": "filesystem",
      "mountpoint": "/mnt/ocfl-repo1",
      "priority": 1,
      "tier": "hot"
    },
    {
      "type": "filesystem",
      "mountpoint": "/mnt/ocfl-repo2",
      "priority": 2,
      "tier": "cold"
    },
    {
      "type": "s3",
      // "endpointUrl": omitted means AWS S3; a URL means something
      // like a local minio instance
      // "forcePathStyle": true, false, or omitted (= false);
      // true is required for minio
      "priority": 2,
      "tier": "warm"
    },
    {
      "type": "filesystem",
      "mountpoint": "/mnt/ocfl-repo3",
      "priority": 1,
      "tier": "hot"
    }
  ]
}
```
In this model, `priority` can be any sequential number, and `tier` can be `'hot'`, `'warm'`, or `'cold'` to dovetail with typical nomenclature used in the industry.
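As a minimal sketch of the tier-then-priority ranking described above, a client could sort the `peers` array like this. The `Peer` type, the `tierRank` mapping, and the `rankPeers` function are illustrative assumptions, not part of any spec:

```typescript
// Illustrative sketch (assumed shapes, not from any spec): rank peers
// by storage tier first, then by administrator-assigned priority.
type Peer = {
  type: string;
  mountpoint?: string; // present for filesystem peers
  priority: number;
  tier: "hot" | "warm" | "cold";
};

// Lower rank = preferred; maps tier names to an ordering.
const tierRank: Record<string, number> = { hot: 0, warm: 1, cold: 2 };

// Returns a new array sorted by tier, breaking ties by priority.
function rankPeers(peers: Peer[]): Peer[] {
  return [...peers].sort(
    (a, b) => tierRank[a.tier] - tierRank[b.tier] || a.priority - b.priority
  );
}
```

A connecting service could then try peers in the returned order when reading or replicating content.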