Creating a repository

Max edited this page Aug 2, 2021 · 1 revision

In this tutorial we'll create a basic setup for a bev repository, then we'll discuss more advanced usage scenarios.

Create a storage

A storage is a folder that holds the files added to a repository. Several repositories can share a storage, so you can skip this step if you already have one.

We'll choose the folder /mount/hdd1/storage for that.

  1. Create the folder
mkdir /mount/hdd1/storage
  2. In that folder create a storage config named config.yml
hash:
  name: sha256

levels: [1, 31]

Here hash: name: sha256 sets the hashing algorithm. You can use any algorithm name available in Python's hashlib library. In most cases sha256 or blake2b is a good default choice.
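
To check which algorithm names your Python build accepts, you can query hashlib directly. The sketch below uses only the standard library. Note that sha256 produces a 32-byte digest, which equals 1 + 31, the sum of the levels in the example config; whether bev requires the levels to add up to the digest size is an assumption here, so check that before switching to an algorithm with a different digest size.

```python
import hashlib

# Names that hashlib.new() accepts on this interpreter (depends on the build).
available = sorted(hashlib.algorithms_available)

# Digest sizes in bytes for the two algorithms mentioned above.
sizes = {name: hashlib.new(name).digest_size for name in ("sha256", "blake2b")}

print(available)
print(sizes)  # {'sha256': 32, 'blake2b': 64}
```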

Create the repository

Now we can create a bev repository, which is simply a folder with a .bev.yml file in it. This config file specifies which storage locations to use. Let's point it to our storage:

lv-426:
  storage:
    - root: /mount/hdd1/storage

Here lv-426 is the configuration name (more on that later) and storage is the list of storage locations to use.

This is a basic configuration. You can now start using bev.

Using multiple storage locations

If you want to handle massive amounts of data, you may want to store it in different locations, such as multiple HDDs. Let's suppose we have another location: /mount/hdd2/storage. Repeat the steps from "Create a storage" to create a new storage there.
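
If you set up several storage locations, repeating the two manual steps gets tedious. Here is a minimal sketch of a helper that does both steps for one location. The function name create_storage is hypothetical and not part of bev; the config contents are copied from the "Create a storage" step.

```python
from pathlib import Path

# Contents of config.yml, copied from the "Create a storage" step.
STORAGE_CONFIG = """\
hash:
  name: sha256

levels: [1, 31]
"""


def create_storage(root):
    """Create the storage folder and write config.yml unless it already exists."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    config = root / "config.yml"
    if not config.exists():
        # Never overwrite the config of an existing storage.
        config.write_text(STORAGE_CONFIG)
    return config
```

For the location above you would call create_storage("/mount/hdd2/storage"). An existing config.yml is left untouched, so running it again is safe.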

Now let's add it to our config:

lv-426:
  storage:
    - root: /mount/hdd1/storage
    - root: /mount/hdd2/storage

We're almost there. Let's limit the storage size by setting a lower limit on free disk space. In each storage config add the free_disk_size key:

hash:
  name: sha256

levels: [1, 31]
free_disk_size: 10GB

Now your data will be spread across two storage locations: once hdd1 has less than 10GB of free space, bev will start writing files to the second storage.
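
The fallback rule can be illustrated with a short sketch. This is not bev's actual implementation, only the idea described above: take the first location that still has at least the configured amount of free space. The names parse_size, free_bytes and pick_storage are hypothetical, and treating 1GB as 10**9 bytes is an assumption; bev may interpret the unit differently.

```python
import shutil

# Assumption: decimal units (1GB = 10**9 bytes).
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text):
    """Convert a string such as '10GB' to a number of bytes."""
    text = text.strip().upper()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(float(text[:-len(suffix)]) * factor)
    return int(text)  # plain number of bytes


def free_bytes(root):
    """Free space, in bytes, on the disk that holds `root`."""
    return shutil.disk_usage(root).free


def pick_storage(roots, min_free, get_free=free_bytes):
    """Return the first root with at least `min_free` bytes of free space."""
    for root in roots:
        if get_free(root) >= min_free:
            return root
    raise OSError("no storage location has enough free space")
```

The get_free parameter exists only so the rule can be tried without real disks, for example by passing a dictionary lookup instead of free_bytes.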

Using multiple configurations

Let's push this a little further. What if you have several machines, each with a different number of HDDs? The same .bev.yml would no longer be valid on the second machine.

But fear not, we can simply create another configuration in the same file:

lv-426:
  storage:
    - root: /mount/hdd1/storage
    - root: /mount/hdd2/storage

fury-161:
  storage:
    - root: /nfs/vol1/data
    - root: /nfs/vol2/data
    - root: /nfs/vol3/data

Here we assume that you have already created the /nfs/vol{1,2,3}/data storage locations.

But how can bev tell which configuration to use? By default it selects the configuration whose name matches the current hostname. So on a machine named lv-426 the first config will be selected.
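
If you are unsure which name a machine will match, you can print its hostname from Python. This sketch assumes that bev compares against the same value that socket.gethostname() returns, which is not confirmed here; the list of names is taken from the example config.

```python
import socket

# Configuration names from the .bev.yml example above.
config_names = ["lv-426", "fury-161"]

hostname = socket.gethostname()
if hostname in config_names:
    print(f"{hostname}: a configuration with this name exists")
else:
    print(f"{hostname}: no configuration with this name")
```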

Sharing configurations

Finally, what if several machines have the same configuration but different hostnames? You can explicitly provide a list of valid hostnames like so:

lv-426:
  storage:
    - root: /mount/hdd1/storage
    - root: /mount/hdd2/storage

fury-161:
  hostname:
    - pandora
    - titanic

  storage:
    - root: /nfs/vol1/data
    - root: /nfs/vol2/data
    - root: /nfs/vol3/data

Now fury-161 will be used on 2 machines: pandora and titanic.
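
The selection rule with explicit hostnames can be sketched as follows. This is an illustration, not bev's code: select_config is a hypothetical name, the configs dictionary mirrors the YAML above, and the sketch assumes that an explicit hostname list replaces matching by configuration name.

```python
def select_config(configs, hostname):
    """Return the name of the configuration that applies to `hostname`, or None."""
    for name, config in configs.items():
        # Assumption: an explicit `hostname` list replaces matching by name.
        allowed = config.get("hostname", [name])
        if hostname in allowed:
            return name
    return None


# The same structure as the .bev.yml example above.
configs = {
    "lv-426": {
        "storage": [{"root": "/mount/hdd1/storage"}, {"root": "/mount/hdd2/storage"}],
    },
    "fury-161": {
        "hostname": ["pandora", "titanic"],
        "storage": [
            {"root": "/nfs/vol1/data"},
            {"root": "/nfs/vol2/data"},
            {"root": "/nfs/vol3/data"},
        ],
    },
}

print(select_config(configs, "pandora"))  # fury-161
print(select_config(configs, "lv-426"))   # lv-426
```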