
LVM RAID #286

Merged: 25 commits merged into storaged-project:master from master-lvm_raid on Jan 7, 2016
Conversation

vpodzime (Contributor)

This adds elementary support for non-linear LVs to Blivet. The first patch is unrelated, but also useful. Tests will follow later.

@vpodzime force-pushed the master-lvm_raid branch 6 times, most recently from c848e67 to e89bf61 on November 25, 2015 at 10:39
```diff
@@ -668,7 +676,7 @@ class Single(ErsatzRAID):

 class Dup(RAIDLevel):

-    """ A RAID level which expresses one way btrfs metadata may be distributed.
+    """ A RAID level which expresses one way btrfs metadata may be distriuted.
```
Contributor

Please fix this small typing error.

vpodzime (Contributor, Author)

Will do, good catch.

@jkonecny12 (Contributor)

Otherwise seems good to me as far as I can tell.

@jkonecny12 added the ACK label on Nov 30, 2015
@vpodzime force-pushed the master-lvm_raid branch 3 times, most recently from 9f9ffa6 to bf55cd9 on December 2, 2015 at 15:08
@dwlehman (Contributor)

Should we support the "striped" and "mirror" segment types? I think at least one of those is the old lvm implementation and not using the md driver. My suspicion is the lvm team would steer you and anyone else away from using that.

@dwlehman (Contributor)

This looks good overall. It would be nice if md, btrfs, and lvm all accepted the same raid_level ctor argument (it could be an alias/alternative to seg_type for lvm).

@vpodzime (Contributor, Author)

> Should we support the "striped" and "mirror" segment types? I think at least one of those is the old lvm implementation and not using the md driver. My suspicion is the lvm team would steer you and anyone else away from using that.

Since there's no such thing as LVM RAID0, striped should definitely be supported. And while mirror is superseded by raid1, it's still quite commonly used, AFAICT, especially because it is the default on RHEL 5 (if you just specify --mirrors=1 and leave the decision about the type to LVM). So I think this decision should probably be left to anaconda and blivet-gui, but not to blivet, which also needs to deal with existing configurations.

@vpodzime (Contributor, Author)

> It would be nice if md, btrfs, and lvm all accepted the same raid_level ctor argument (it could be an alias/alternative to seg_type for lvm).

Yeah, that would be nice. But I'd like to fully support the seg_type argument -- e.g. once all the LV classes are unified by the single class and mixins, the thin-pool segment type will be used to create a thin pool LV.
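
For illustration only, a minimal sketch of how such an alias could work; the class and its parameters are hypothetical stand-ins, not blivet's actual constructor:

```python
# A constructor that accepts seg_type directly and treats raid_level as an
# alias for it, along the lines of the discussion above (purely illustrative).
class FakeLV:
    def __init__(self, name, size, seg_type="linear", raid_level=None):
        if raid_level is not None:
            seg_type = str(raid_level)   # e.g. raid_level="raid1" -> seg_type "raid1"
        self.name = name
        self.size = size
        self.seg_type = seg_type

lv1 = FakeLV("data", size=1024**3, raid_level="raid1")
assert lv1.seg_type == "raid1"
lv2 = FakeLV("pool", size=1024**3, seg_type="thin-pool")
assert lv2.seg_type == "thin-pool"
```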

@vpodzime (Contributor, Author) commented on Jan 6, 2016

Added unit tests and fixed 3 minor things they discovered. Unless somebody complains somehow in the next two days, I'm gonna merge this PR.

Commit messages from the 25 commits in this pull request:

It's just more common these days and the definition is easier to find and recognize in the sources.

This allows for more precise configuration of what the LVM setup should look like. We already have some rudimentary support for specifying PVs for caches, which is a must, but it's useful for all LVs in general.

Thanks to dlehman@redhat.com for pieces of code and ideas for this patch!
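
As a rough illustration of the idea (the class and keyword below are hypothetical, not blivet's actual API), an LV request that pins the LV to specific PVs might look like this:

```python
# Hypothetical request object illustrating "this LV/cache should live on
# these PVs"; blivet's real device objects carry much more state than this.
class LVRequest:
    def __init__(self, name, size, pvs=None):
        self.name = name
        self.size = size          # bytes
        self.pvs = pvs or []      # empty list means "allocate from any PV"

# pin a cache (or any LV) to two fast PVs
req = LVRequest("fast_lv", size=10 * 1024**3, pvs=["/dev/sdb1", "/dev/sdc1"])
```
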
This really only takes care of passing the type down to libblockdev and thus lvcreate. We need to do a lot more to actually fully support various types of LVs.

If we want to support LV types other than linear, we need to know how much space we have on each PV. For example, a 1GiB RAID1 LV requires not only 2GiB of total space in a VG with 2 PVs, it also requires 1GiB of space on each of the PVs.

Striped LVs are essentially RAID0 LVs as far as all the calculations go.
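
A back-of-the-envelope sketch of the calculation described in the two messages above; the helper is purely illustrative and ignores metadata overhead:

```python
# Rough per-PV space requirement for an LV of the given segment type,
# assuming the data is spread evenly over num_pvs PVs (illustrative only).
def per_pv_space(requested_gib, seg_type, num_pvs):
    if seg_type == "raid1":                 # every PV holds a full copy
        return requested_gib
    if seg_type in ("raid0", "striped"):    # data is split across the PVs
        return requested_gib / num_pvs
    raise ValueError("unhandled segment type: %s" % seg_type)

# a 1GiB RAID1 LV on 2 PVs needs 1GiB on *each* PV (2GiB total in the VG)
assert per_pv_space(1.0, "raid1", 2) == 1.0
# a 1GiB striped LV on 2 PVs needs only 0.5GiB on each PV
assert per_pv_space(1.0, "striped", 2) == 0.5
```
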
We already support various segment types for LVs to some extent. However, we need to get a better picture of how much space on such LVs' PVs is required. We already have code for that, so let's just use it.

This unfortunately requires the LVPVSpec to have read-write attributes/fields, and thus it cannot be a namedtuple. We should probably come up with some "read-write namedtuple" thing for cases like this in the future.
LVM creates small internal LVs for RAID LVs that hold the necessary metadata. We
need to account for that in order to be able to do calculations of PV/VG free
space etc. This requires us to do some of the calculations in a more granular
manner -- separating data and metadata parts.

Non-linear LVs require space on specific PVs and thus have stronger restrictions than linear LVs, which can be allocated from anywhere.

Useful for some manual testing as well as for people wondering how to do something like that.

A useful simplification of what we have to check in a few places.

Right now, we only support creation of non-linear LVs under some conditions that allow us to avoid trying to do anything too crazy. Let's make sure these conditions are met when a new LVMLogicalVolumeDevice object is being created.

We need that information in order to do checks when adding more LVs, and users need this information to decide where to place their LVs.

Let's not bother with existing LVs' allocations for now. We can just ignore those and only care about newly added (non-existing) LVs, which we need to place somewhere.
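
A simplified model of that bookkeeping, under the assumption stated above that only newly added LVs are tracked (all names here are illustrative):

```python
# Start from each PV's free space and subtract only the space requested by
# new (non-existing) LVs; existing LVs' allocations are ignored on purpose.
def remaining_pv_space(pv_free, new_lv_requests):
    """pv_free: {pv_name: free_bytes}; new_lv_requests: [(pv_name, bytes), ...]"""
    remaining = dict(pv_free)
    for pv_name, nbytes in new_lv_requests:
        remaining[pv_name] -= nbytes
        if remaining[pv_name] < 0:
            raise ValueError("not enough space on %s" % pv_name)
    return remaining

left = remaining_pv_space({"sdb1": 10, "sdc1": 10}, [("sdb1", 4), ("sdc1", 4)])
assert left == {"sdb1": 6, "sdc1": 6}
```
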
The word "copies" is accurate together with mirror/RAID1 RAID, but it's
misleading with other RAID levels LVM supports. The property doesn't seem to be
accessed anywhere outside the class so let's just replace it with a private
property with a more accurate name.

Also give incomplete/inaccurate information if we have incomplete/inaccurate
information instead of erroring out.

lv.size reports the size of the LV, not the space occupied in the VG; that's what data_vg_space_used is for. By the same logic, lv.metadata_size should report the size of the metadata space the LV has available, leaving lv.metadata_vg_space_used to report how much space in the VG the metadata part(s) of the LV take up.

If the LV exists, we should just go through the internal metadata LVs and sum their sizes, because that's the actual/real value.

Also document the property.

Please note that no changes are needed outside of these two properties, because they are already used properly and this just fixes the values such places in the code calculate with (like the metadata_size passed to blockdev.thpoolcreate() or the calculation of the pmspare LV's size).
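
A condensed sketch of the intended semantics of the two properties; the attribute names and the trivial metadata_vg_space_used here are assumptions, and the real blivet code is more involved:

```python
class SketchLV:
    def __init__(self, requested_md_size=0, internal_md_lv_sizes=()):
        self._requested_md_size = requested_md_size
        # sizes of the internal metadata LVs, known once the LV exists
        self._internal_md_lv_sizes = list(internal_md_lv_sizes)

    @property
    def metadata_size(self):
        """Metadata space the LV has available (actual sizes if it exists)."""
        if self._internal_md_lv_sizes:
            return sum(self._internal_md_lv_sizes)
        return self._requested_md_size

    @property
    def metadata_vg_space_used(self):
        """VG space occupied by the metadata part(s) of the LV."""
        # in the real code this can differ from metadata_size (e.g. RAID copies)
        return self.metadata_size

lv = SketchLV(internal_md_lv_sizes=[4 * 1024**2, 4 * 1024**2])
assert lv.metadata_size == 8 * 1024**2
```
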
Now that we keep track of the available space in the PVs, we need to take LVM caches into account, because those specify PVs and so we need to make sure they really fit somewhere. Also, users need to know how much space they still have available in their PVs if they add a cache to their LV(s).

It is shorter, faster and more reliable. Using pv.name was a remnant of a development version of the LVM cache support that worked with PV names instead of PV (StorageDevice) objects.

LVM complains about a PV appearing multiple times in the list of PVs to use.

Add and use a function for deduplicating things in a list (keeping the ordering
of the items).
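
One way such a helper can be written (an illustration, not blivet's exact function):

```python
# Deduplicate a list while keeping the original ordering of the items,
# so the same PV is never passed to LVM twice.
def dedup_list(items):
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

assert dedup_list(["sdb1", "sdc1", "sdb1"]) == ["sdb1", "sdc1"]
```
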
Useful for testing as well as for users wondering how to do something like this.

In order to make sure we are working with something we understand.

Creating an LV means some extents were allocated from its VG's PVs. In order to prevent us from working with stale values, we need to make sure fresh values are fetched.

…e size

When an LVM cache is created, an internal metadata LV is created for it. For LVM, that also means a special pmspare LV with a size greater than or equal to the size of the metadata LV has to exist (and thus may be created) in the same VG. We don't want to bother user code with these calculations, so we should subtract this space from the requested cache's size.
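
Roughly, the adjustment described above amounts to something like the following; the helper and its exact accounting are a sketch, while real blivet works with extent-aligned Size objects:

```python
# Reserve room for the pmspare LV, which must be at least as big as the
# cache's metadata LV, out of the size the user asked for (illustrative;
# whether the metadata LV itself is also taken out of the requested size
# is a detail of the real implementation).
def cache_data_size(requested_mib, md_size_mib):
    return requested_mib - md_size_mib

# e.g. 1024 MiB requested with an 8 MiB metadata LV leaves 1016 MiB of data
assert cache_data_size(1024, 8) == 1016
```
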
We can treat linear LVs as RaidDevice instances using the Linear RAID level, and so make all (non-thin) LVs instances of the RaidDevice class.

RAID LVs have/need internal metadata LVs that are one extent big. Instead of bothering the user code with this, let's just make the RAID LVs one extent smaller than requested and use that space for the metadata.

Also reserve the space for the 'mirror' segment type, which is a different implementation of RAID1 in Device Mapper.
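
As a small worked example of the adjustment (the 4 MiB extent size is just the common LVM default, and the helper is illustrative):

```python
EXTENT = 4 * 1024**2   # 4 MiB, LVM's usual default extent size

# Round the requested size down to whole extents, then give one extent back
# to be used for the RAID LV's internal metadata LV.
def adjusted_data_size(requested):
    extents = requested // EXTENT
    return (extents - 1) * EXTENT

# asking for 1024 MiB yields a 1020 MiB data area plus one 4 MiB metadata extent
assert adjusted_data_size(1024 * 1024**2) == 1020 * 1024**2
```
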
For example "mirror" is the nick of the "RAID1Level" the name of which is
"raid1".
This adds tests for the commits, changes, and new things implemented for the support of non-linear LVs (aka LVM RAID).
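
In the spirit of those tests, a self-contained toy test case (the helper under test is a local stand-in, not blivet's real code):

```python
import unittest

def round_to_extents(size, extent=4 * 1024**2):
    # helper used only by this example: round a size down to whole extents
    return (size // extent) * extent

class RoundToExtentsTestCase(unittest.TestCase):
    def test_exact_multiple_is_unchanged(self):
        self.assertEqual(round_to_extents(8 * 1024**2), 8 * 1024**2)

    def test_partial_extent_is_dropped(self):
        self.assertEqual(round_to_extents(5 * 1024**2), 4 * 1024**2)

if __name__ == "__main__":
    unittest.main()
```
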
@jkonecny12 (Contributor)

My quick look found nothing bad, so I think it should be good for pushing :)

vpodzime added a commit that referenced this pull request Jan 7, 2016
@vpodzime merged commit 0b62859 into storaged-project:master on Jan 7, 2016