Creating a repository
In this tutorial we'll create a basic setup for a bev repository, then we'll discuss more advanced usage scenarios.
A storage is basically a folder that stores the files added to a repository. Several repositories can share a storage, so you can skip this step if you already have one.
We'll use the folder /mount/hdd1/storage for that.
- Create the folder:
  mkdir /mount/hdd1/storage
- In that folder, create a storage config named config.yml:
  hash:
    name: sha256
  levels: [1, 31]
Here the name key under hash (sha256 in this case) is the hashing algorithm. You can use any name available in Python's hashlib library. In most cases sha256 or blake2b are good default choices.
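Before putting a name into config.yml, you may want to check that hashlib actually recognizes it. The snippet below is a minimal sketch using only the standard library; the helper name is_valid_hash_name is hypothetical and not part of bev.

```python
import hashlib


def is_valid_hash_name(name: str) -> bool:
    """Return True if hashlib can build a hash object for `name`.

    Hypothetical helper for illustration only, not part of bev.
    """
    try:
        hashlib.new(name)
    except (ValueError, TypeError):
        return False
    return True


# Check the two suggested defaults and one deliberately wrong name
for candidate in ("sha256", "blake2b", "not-a-hash"):
    print(candidate, is_valid_hash_name(candidate))

# digest_size is the length of the resulting hash in bytes
print(hashlib.new("sha256").digest_size)
```

The set of accepted names can differ between Python builds; hashlib.algorithms_available lists the ones your interpreter supports.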
Now we can create a bev repository, which is simply a folder with a .bev.yml file in it.
This is a config file that specifies which storage locations to use. Let's point it to our storage:
lv-426:
  storage:
    - root: /mount/hdd1/storage
Here lv-426 is the configuration name (more on that later) and storage is the list of storage locations to use.
This is a basic configuration. You can now start using bev.
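The setup steps so far can also be scripted, which is handy when the same layout has to be recreated in several places. The following is a rough sketch, not a bev feature: create_storage and create_repository are hypothetical helpers, the nesting of the YAML keys is an assumption based on the description above, and the demo writes into a temporary directory instead of /mount/hdd1.

```python
import tempfile
from pathlib import Path
from typing import Sequence

# Storage config as described in this tutorial (key nesting is an assumption)
STORAGE_CONFIG = """\
hash:
  name: sha256
levels: [1, 31]
"""


def create_storage(root: Path) -> Path:
    """Create the storage folder and its config.yml, keeping an existing config."""
    root.mkdir(parents=True, exist_ok=True)
    config = root / "config.yml"
    if not config.exists():
        config.write_text(STORAGE_CONFIG)
    return config


def create_repository(repo: Path, config_name: str, storage_roots: Sequence[Path]) -> Path:
    """Create the repository folder and a .bev.yml that lists the given storages."""
    repo.mkdir(parents=True, exist_ok=True)
    lines = [f"{config_name}:", "  storage:"]
    lines += [f"    - root: {root}" for root in storage_roots]
    bev_config = repo / ".bev.yml"
    bev_config.write_text("\n".join(lines) + "\n")
    return bev_config


# Demo in a throwaway directory so nothing on the real disks is touched
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    create_storage(base / "storage")
    bev_yml = create_repository(base / "repo", "lv-426", [base / "storage"])
    print(bev_yml.read_text())
```

The YAML is written as plain text to keep the sketch free of third-party dependencies.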
If you want to handle massive amounts of data, you may want to store them in different locations, such as multiple HDDs. Let's suppose we have another location: /mount/hdd2/storage. Repeat the steps from Create a Storage to create a new storage there.
Now let's add it to our config:
lv-426:
  storage:
    - root: /mount/hdd1/storage
    - root: /mount/hdd2/storage
We're almost there. Let's limit the storage size by setting a lower limit on free disk space. In each storage config add the free_disk_size key:
hash:
  name: sha256
levels: [1, 31]
free_disk_size: 10GB
Now your data will be spread across 2 storage locations: when hdd1 has less than 10GB of free space, bev will start writing files to the second storage.
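To make the spill-over behaviour concrete, here is a small sketch of the selection rule described above: take the first storage whose free space is still at or above the limit. This is an illustration, not bev's actual implementation; pick_storage and free_bytes are hypothetical names, and treating 10GB as 10 * 1000**3 bytes is an assumption.

```python
import shutil
from typing import Callable, Optional, Sequence

GB = 1000 ** 3  # assumption: decimal gigabytes


def free_bytes(path: str) -> int:
    """Free space, in bytes, on the disk that holds `path`."""
    return shutil.disk_usage(path).free


def pick_storage(
    roots: Sequence[str],
    min_free: int,
    get_free: Callable[[str], int] = free_bytes,
) -> Optional[str]:
    """Return the first root with at least `min_free` bytes free, or None."""
    for root in roots:
        if get_free(root) >= min_free:
            return root
    return None


# Simulated free space, so the example runs without the real mount points
fake_free = {
    "/mount/hdd1/storage": 4 * GB,    # below the 10GB limit
    "/mount/hdd2/storage": 250 * GB,  # plenty of room
}
print(pick_storage(list(fake_free), 10 * GB, fake_free.__getitem__))
# prints /mount/hdd2/storage
```

Passing the free-space lookup as a parameter keeps the rule testable; by default it asks the operating system through shutil.disk_usage.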
Let's push this a little further. What if you have several machines, and each machine has a different number of HDDs? The same .bev.yml would no longer be valid on the second machine.
But fear not, we can simply create another configuration in the same file:
lv-426:
  storage:
    - root: /mount/hdd1/storage
    - root: /mount/hdd2/storage
fury-161:
  storage:
    - root: /nfs/vol1/data
    - root: /nfs/vol2/data
    - root: /nfs/vol3/data
Here we assume that you have already created the /nfs/vol{1,2,3}/data storage locations.
But how can bev tell which configuration to use?
By default it selects the appropriate configuration based on the current hostname. So on a machine named lv-426 the first config will be selected.
Finally, what if several machines have the same configuration but different hostnames? You can explicitly provide a list of valid hostnames like so:
lv-426:
  storage:
    - root: /mount/hdd1/storage
    - root: /mount/hdd2/storage
fury-161:
  hostname:
    - pandora
    - titanic
  storage:
    - root: /nfs/vol1/data
    - root: /nfs/vol2/data
    - root: /nfs/vol3/data
Now fury-161 will be used on 2 machines: pandora and titanic.
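The hostname matching described above can be pictured with a short sketch. It is not bev's real code: select_config is a hypothetical function, the config is written as a plain dict rather than parsed from .bev.yml, and it assumes that an entry without a hostname list matches only a machine named like the entry itself, while an explicit list replaces that default.

```python
import socket
from typing import Any, Mapping, Optional

# The final .bev.yml from this tutorial, written out as a plain dict
CONFIGS = {
    "lv-426": {
        "storage": [
            {"root": "/mount/hdd1/storage"},
            {"root": "/mount/hdd2/storage"},
        ],
    },
    "fury-161": {
        "hostname": ["pandora", "titanic"],
        "storage": [
            {"root": "/nfs/vol1/data"},
            {"root": "/nfs/vol2/data"},
            {"root": "/nfs/vol3/data"},
        ],
    },
}


def select_config(
    configs: Mapping[str, Mapping[str, Any]],
    hostname: Optional[str] = None,
) -> Optional[str]:
    """Return the name of the first configuration valid for `hostname`."""
    if hostname is None:
        hostname = socket.gethostname()
    for name, entry in configs.items():
        # Assumption: without an explicit list, only the entry's own name matches
        valid_hostnames = entry.get("hostname", [name])
        if hostname in valid_hostnames:
            return name
    return None


for machine in ("lv-426", "pandora", "titanic"):
    print(machine, "->", select_config(CONFIGS, machine))
```

Calling select_config(CONFIGS) without a hostname uses the name of the current machine, so it returns None on any machine that is not listed.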