-
Notifications
You must be signed in to change notification settings - Fork 575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge redhat lvm.sh feature set into heartbeat LVM agent #222
Conversation
Hopefully this branch will help the review process go smoother. The first patch sets up the current LVM heartbeat agent for the feature set merger that happens in the second patch. Look at the commit message for the first patch for details on where functionality got moved. The refactoring is necessary to keep this agent manageable with the inclusion of the redhat lvm.sh features. I structured the agent the way I did so I could both re-use the existing logic with the new feature set and isolate the original behavior with as little impact as possible. |
There are no significant differences between the original pull request and this one. Do we need to reiterate the discussion and a plea to keep the changes minimal and to give explanation for those changes ("this is needed for the feature xy" is not only vague, but hard to verify too). This is a complete rewrite of the existing RA and as such difficult to validate. The at length discussed topic (on the linux-ha-dev ML, IIRC) on using lvm tools in the monitor operation and the resulting patch was also not observed. |
Good, this is something we can discuss. The original LVM agent only did an 'ls' on the /dev/volgrpname directory to see if something existed in it. This new logic actually figures out what logical volumes are present in the volume group and verifies they are up individually. This is proven logic we've supported in the redhat agent for years. I'd be happy to address any technical concerns you have with this approach. If you could link me to the ML discussion I'd like to understand why something similar to this was disagreed on. -- Vossel |
On 4/19/2013 4:09 PM, Dejan Muhamedagic wrote:
So ok.. just that we agree upfront.. what kind of level of details would Example (not even sure if it´s touched by the patch.. just to get to the replace LVM_status() with merged_LVM_status() old LVM_status supports both lvm1 and lvm2 and gather status via new merged_LVM_status supports both lvm1 and lvm2, lay down the path for god_of_vg_display then automatically detects... blablabla... ---- code/patch ----- now.. redhat lvm provide has tons of pre-made functions that´s probably code path expanded would make it easier to read? For example: god_of_vg_display calls into file_x:function_y:line to gather info Z. Does that help? it´s a lot of work to be done this way, but doable if it Fabio |
The relevant discussion ... I don't have that readily available. At any rate, the resulting changes are these two: a405e78 and b4b8775 The crux of the matter is that just about any lvm command goes through all VGs on the system. If one of them is borked in such a way that it is inaccessible, the command will hang and the operation time out. Thus, an unrelated storage can affect the storage actually managed in this resource instance. This is what was observed in the field and reported in several instances, both by customers as well as on the public ML. |
interesting. I'll rework the status function so it attempts to avoid using the lvm tools. There are situations where this will be unavoidable, but I believe I can restore this behavior for the pre-merger LVM use-cases. I'll give it a go and see what I come up with. |
Alright, take a look at the new vg_status_normal function in the first commit. It is very similar to the old status function. The only difference I made is that if it looks like an empty volume group, we will fall back to using lvm tools to read the metadata just to verify the vg is in fact empty. In normal use cases, the lvm tools will not be executed though. sound reasonable? |
I really appreciate that code merging is happening, kudos for that. And the new exclusive activation feature also rocks. Given that it is a merge of two mature code bases and not incremental improvements, slightly less-than-perfectly incremental patches are to be expected. Moving code from the RA to a common shell library probably makes sense if those are shared between multiple implementations, but I agree that this makes it harder to review than in-place updates. Oh well. ;-) Calling out to "lvs" in the monitor function though is dangerous, unfortunately. I'm surprised you've not hit that on RHEL. If IO hangs (which is particularly common with multipath), that'll hang and time out - and, like Dejan commented, this can be triggered by any path within the LVM filter spec, since lvs will open/read them all. I've never seen a VG directory (/dev/cvg/) exist if no LVs inside it were activated; i.e., never seen the directory exist but empty. Can this actually happen? In short, does this test - except for checking if device nodes exist - actually make sense? It seems harmless enough now though, especially if it can't happen. As far as the exclusive activation of individual LVs is concerned, I admit I have second thoughts on lumping this in with the general LVM RA that manages whole volume groups. I was thinking that perhaps a dedicated LV RA would make more sense. But it does share a lot of code. I agree with Dejan that such a merge will want plenty of testing, but we seem to gain some good functionality here |
I've updated the status function to further remove all references to 'lvs'. I tested the vg activation without any logical volumes and it appears the /dev/vg dir doesn't exist until a logical volume is created. For some reason I thought I had seen it before. |
Thank you! Looks good to me. |
To clarify, will the code in resource-agents/rgmanager/src/resources/ also switch to using these shared libraries? |
On 04/30/2013 01:38 PM, Lars Marowsky-Brée wrote:
Those libraries come from there. As the agents are merged, Fabio |
Anything I can do to move this along? |
As already mentioned in the first comment, it is difficult to validate the first changeset (davidvossel@0ef1449) because of the number of changes within. If you can split that in smaller chunks which are self-sufficient and well explained, that would certainly help. Further observations in the first changeset. |
The exclusive activation using tags is only questionable if the pacemaker deployment does not have fencing enabled. |
|
I looked into portability more. I'll see what I can do to port the changes to work with dash. |
I've ported the changes to work under with both bash and dash. checkbashisms passes for all the changed files as well. edit I've also gone through and validated all use cases (normal, tags, exclusive with clvm) in my test environment again. |
Functionality from the original heartbeat agent has moved into new function names and files. This is a map to understanding where the original functionality lives within the new merged agent. LVM_monitor and LVM_status --> vg_status_normal() in lvm_vg.sh The new monitor function does additional checking to verify both the volume group and each logical volume is correctly activated. LVM_start() ---> vg_start_normal() in lvm_vg.sh LVM_stop() ---> vg_stop_normal() in lvm_vg.sh LVM_validate_all() ---> verify_setup() in lvm_functions.sh
On Wed, May 08, 2013 at 08:10:10AM -0700, David Vossel wrote:
If you want to rely on the cluster resource manager and fencing |
It doesn't appear to have anything to do with trying to prevent an outside source from activating the volume. If that happens, the agent will just strip the tag from the outside source and claim the volume for the cluster. It looks like this functionality is present to prevent a node from potentially stripping another cluster node's ownership of the volume. It seems like for the stop action this would be important. |
I believe the tagging/stripping and the way it's implemented is designed to
"Don't do that" only goes so far, and is little comfort to an administrator There's probably some other bits and pieces I've forgotten; Jon Brassow On Mon, May 13, 2013 at 1:59 PM, David Vossel notifications@github.comwrote:
Lon Hohberger |
It also appears to protect against manual invocation of the LVM commands. I'm not perfectly sure the added validation against the host list is mandatory, but it also doesn't seem to hurt. |
I have re-written the portability patch to avoid using a subshell while parsing lvm tool output. |
I've made 2 new updates to these patches.
|
…heartbeat LVM agent This patch leaves the original LVM heartbeat functionality intact while adding these addition features from the redhat agent. 1. Exclusive activation using volume group tags. This feature allows a volume group to live on shared storage within the cluster without requiring the use of cLVM for metadata locking. 2. individual logical volume activation for local and cluster volume groups by using the new 'lvname' option. 3. Better setup validation when the 'exclusive' option is enabled. This patch validates that when exclusive activation is enabled, either a cluster volume group is in use with cLVM, or the tags variant is configured correctly. These new checks also makes it impossible to enable the exclusive activation for cloned resources.
Thanks to Lars Ellenberg for his review and help on this patch.
I'm closing this pull request and opening a new one with a new set of patches. |
No description provided.