
R&D: New 'storage' stage/code as future replacement of the 'layout' stage/code #2510

jsmeix opened this issue Nov 6, 2020 · 2 comments
jsmeix commented Nov 6, 2020

The overall goal of a new 'storage' stage/code is to
make "rear recover" behave reasonably dynamically
depending on what there is on the replacement hardware
and depending on how the replacement hardware behaves.

The current 'layout' stage/code works rather statically
depending on what there is on the original system.
What there is on the original system
(like disks, partitions, volumes, filesystems, mountpoints)
is stored as parameter values in a disklayout.conf file.
When "rear recover" runs on the replacement hardware
it reads the disklayout.conf file contents and generates
a diskrestore.sh script with basically static setup commands
to recreate partitions, volumes, filesystems, mountpoints.
Currently there is only rather limited support when disk names change.
E.g. when /dev/sda on the original system becomes /dev/sdb on the replacement hardware
the same static setup commands get generated with /dev/sdb instead of /dev/sda as parameter.
But in general things fall apart during "rear recover" when the replacement hardware
is different and/or behaves differently compared to the original system in less predictable ways.
A consequence of the current rather static dependency on what there is on the original system
is that currently "rear recover" must run on fully compatible replacement hardware.
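
For illustration, here is a minimal sketch of the current static behaviour
(simplified and hedged: these are not byte-exact ReaR file contents).
disklayout.conf stores entries along the lines of

    disk /dev/sda 21474836480 gpt
    part /dev/sda 21473787904 1048576 rear-noname boot /dev/sda1
    fs /dev/sda1 / ext4

and the generated diskrestore.sh then contains hardcoded setup calls like

    create_disk_label /dev/sda gpt
    parted -s /dev/sda mkpart rear-noname 1048576B 21474836479B
    mkfs -t ext4 /dev/sda1

where the kernel device names /dev/sda and /dev/sda1 are baked in.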

The intent of the new 'storage' stage/code is not to make "rear recover"
'simply just work' even on arbitrarily different replacement hardware,
but to make "rear recover" behave better
(in particular more gracefully) on reasonably similar replacement hardware.

I got the official SUSE task that, starting next week,
I can spend a reasonable amount of my working time
(cf. #799 (comment) )
on research and development (R&D) for a
new 'storage' stage/code that is intended
as a future replacement of the 'layout' stage/code.

The most important driving issue behind this is
#2254
"ReaR must work independent of kernel device names"

Another related driving goal is to improve ReaR for system migrations
when there are different kernel device names on the replacement hardware
e.g. from "usual" disks to NVMe disks or from hardware disks to virtual disks
(but not for advanced system migrations with a totally different disk structure).

As long as the new 'storage' stage/code is under research and development
(which will likely take some years) the stable 'layout' stage/code will still
be actively maintained and still be used by default as described in
#791 (comment)

When ReaR must work independent of kernel device names it means that the
kernel device names on the replacement hardware are no longer known in advance.
E.g. when the original system was on /dev/sda we can no longer assume
there will also be a /dev/sda on the replacement hardware.

Simply put:
Kernel device names on the replacement hardware are unpredictable.
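
For example (illustrative commands only, not actual ReaR code) disks can be
identified on the replacement hardware by stable attributes
instead of by their kernel device names:

    # persistent identifiers (bus, WWN, serial) that survive name changes:
    ls -l /dev/disk/by-id/
    # match disks by size, transport and model instead of by name:
    lsblk -dno NAME,SIZE,TRAN,MODEL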

When ReaR must work with unpredictable kernel device names it means that
it is no longer possible with reasonable effort to generate a diskrestore.sh script
from data in disklayout.conf (that is partially "old/outdated" data from the original system)
and then run that generated fixed/hardcoded diskrestore.sh script
to create partitions/volumes/filesystems/mountpoints/...
with a predefined static sequence of setup commands and parameters.

When ReaR must work with unpredictable kernel device names it means
we must create partitions/volumes/filesystems just in time, step by step as needed,
based on some possibly "old/outdated" data from the original system
that is used more as guidance for what to set up than as a strict specification.

We can no longer rely on a newly set up "storage thingy" appearing
with a kernel device name that is known in advance.
Instead we set up a new "storage thingy" and then wait and see
with what kernel device name that new "storage thingy" appears.
As a positive side-effect this new storage setup behaviour will solve
#791
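
A minimal sketch of that "wait and see" behaviour (illustrative only,
assuming a $disk variable and plain GNU parted, not actual ReaR code):

    # record which block devices exist before the setup step:
    before=$( lsblk -lnpo NAME "$disk" )
    # create the new "storage thingy":
    parted -s "$disk" mkpart rear-noname ext4 8MiB 4GiB
    # wait until the kernel and udev have processed the change:
    udevadm settle
    after=$( lsblk -lnpo NAME "$disk" )
    # the new partition is whatever appeared in 'after' but not in 'before':
    new_partition=$( comm -13 <( sort <<<"$before" ) <( sort <<<"$after" ) )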

Finally #799 will also be implemented
for the new 'storage' stage/code because it is needed as a precondition
for the new storage setup behaviour, which must run on fully clean disks
where no old leftover "storage thingies" exist,
because when we want to wait and see with what kernel device name
a new "storage thingy" appears there must not be any old "storage thingies".

In the end the new storage setup behaviour should be the same
as what an admin would do manually when setting up a system from scratch.
So for an admin the new storage setup behaviour should be more predictable
because it should behave the same as a manual setup.

jsmeix added a commit that referenced this issue Nov 11, 2020
New layout/recreate/default/150_wipe_disks.sh to wipe disks.
The disks that will be completely wiped are those disks
where in diskrestore.sh the create_disk_label function is called
(the create_disk_label function calls "parted -s $disk mklabel $label")
i.e. the disks that will be completely overwritten by diskrestore.sh.
This implements #799
"Clean up disks before recreating partitions/volumes/filesystems/..."
The intent is for it to also be used later as a precondition for the
new 'storage' stage/code that is intended as future replacement of the 'layout' stage/code,
cf. #2510
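
A simplified illustration of what such wiping amounts to
(hedged: the actual 150_wipe_disks.sh may do more than this):

    # remove filesystem, RAID and partition-table signatures from the disk
    # so that no old leftover "storage thingies" can reappear:
    wipefs --all --force "$disk"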
jsmeix commented Nov 12, 2020

Regarding the above
"[recreate] based on some possibly "old/outdated" data from the original system
that is more used as a guidance what to set up than as a strict specification"
see #2514 (comment)
that reads (excerpts):

When migrating from rotating rust disks to SSDs
new disk alignment is usually needed so that
the partition boundaries on the SSDs match the
physical "sector size" or "segment size" or "erase block size"
(i.e. the physical read/write unit size) of the SSD,
which is 4 MiB or 8 MiB or even more on today's SSDs,
cf. USB_PARTITION_ALIGN_BLOCK_SIZE in default.conf
https://github.com/rear/rear/blob/master/usr/share/rear/conf/default.conf#L853

So when migrating from rotating rust disks to SSDs
the partition begin values (which are usually multiples
of 1 MiB or less on rotating rust disks)
must be adapted to multiples of the physical read/write unit size of the SSD,
i.e. shifted to multiples of 4 MiB or 8 MiB or even more,
which means in the end recreating the disk partitions with an offset.
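
A worked example of that shifting (plain bash arithmetic, the names and the
assumed 8 MiB physical read/write unit size are only illustrative):

    align_bytes=$(( 8 * 1024 * 1024 ))   # assumed SSD read/write unit: 8 MiB
    old_start=1048576                    # original partition begin: 1 MiB
    # round the begin value up to the next multiple of align_bytes:
    new_start=$(( ( old_start + align_bytes - 1 ) / align_bytes * align_bytes ))
    echo $new_start                      # prints 8388608, i.e. 8 MiB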

So an option to recreate the disk partitions with an offset
is needed anyway in ReaR, and such an option can then
also be (mis)used to avoid recreating the same storage objects
at the exact same positions on a used disk,
in particular when recreating on the same hardware (i.e. when
the replacement hardware is the original system hardware).

Of course when we recreate the disk partitions with an offset
the next needed feature is to also adapt the sizes of nested
storage objects inside disk partitions like LVM volumes
which is yet another feature that we have needed for a long time
to make migration easier for the user.

But it is not always possible to change the size of a storage object
so there must be some way for the user to specify which sizes
could be changed and which sizes must be strictly kept.

In particular in some cases one cannot shrink the size at all
(not even by a single byte) for example when the content is
not stored as a usual backup of files but as some kind of
"whole storage object image" e.g. a 'dd' of a partition
in case of the BACKUP=BLOCKCLONE method, cf.
https://github.com/rear/rear/blob/master/doc/user-guide/12-BLOCKCLONE.adoc

github-actions commented

Stale issue message