HA error with 7.5 #49
Comments
Maybe |
here you are, I can see an error in the log at "Aug 14 12:18:16"
|
Can you trim the output down to the part you think is interesting? It's hard to read that huge pile of text "as is". Thanks! |
Installing hosts in VMs to see if I can reproduce the issue. |
Okay I can't reproduce. I had 3 XCP-ng 7.5 VMs freshly installed (VM1, VM2 and VM3).
So it seems there is no issue with NFS. Note that GFS2 is not supported because the XenServer code for it is not open source. |
It matters. GFS2 is not supported. |
But if I choose GFS2 it asks to create a cluster to manage it... 🤔 And it creates it |
But some GFS2 packages aren't Open Source, hence not included in XCP-ng, so there is a good chance that it won't work in the end. |
So, if I have block device storage (FC, iSCSI) I cannot do thin provisioning over it? Only on file-based storage such as NFS? |
That's correct. |
Thanks |
Have to second this. Tried NFS and iSCSI (LVM only based on the above dialog), generating the same errors. |
I can't reproduce the issue here in the lab. Do you have NTP correctly set on all your hosts? (this is vital to get HA working) |
My firewall is the NTP server, NFS permissions verified, and both NFS and iSCSI retried after manually resyncing the servers to the firewall (all were around 0.0004s off). |
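For anyone repeating the NTP check above, here is a minimal Python sketch of the comparison. It assumes you have already collected each host's clock offset (in seconds) against the NTP server by whatever means you prefer. The host names, the `hosts_out_of_sync` helper and the 0.05 s tolerance are all illustrative assumptions, not values taken from XCP-ng or its HA daemon.

```python
from typing import Dict, List


def hosts_out_of_sync(offsets_s: Dict[str, float], tolerance_s: float = 0.05) -> List[str]:
    """Return the hosts whose absolute clock offset exceeds the tolerance.

    tolerance_s is an assumed threshold for illustration only.
    """
    return sorted(host for host, off in offsets_s.items() if abs(off) > tolerance_s)


# Hypothetical measurements, roughly matching the "around 0.0004s off" report.
offsets = {"host1": 0.0004, "host2": -0.0003, "host3": 0.0004}
print(hosts_out_of_sync(offsets))
```

With offsets this small the list comes back empty, which is consistent with NTP not being the cause here. |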
How many hosts? edit: 3, my bad, didn't see that edit2: tried again in the lab, worked perfectly 🤔 |
Figured it out. Clustering enabled on the pool is preventing HA from being set up. EDIT: Removed follow-up comment. PEBCAK on that one. |
That's good to know! But do not expect GFS2 to work in the next XCP-ng release, it's NOT Open Source. To enjoy thin provisioning, use NFS. We'll probably work in the future on a solution for iSCSI to use a FS on top; until then, NFS is the best choice. |
@olivierlambert What exactly is it in GFS2, that is not opensource??? Am I mistaken, thinking that it's the same GFS2, that is in the Linux kernel?? Or is it some tools around it that is missing?? |
GFS2 support in XenServer is not opensource. GFS2 itself is. |
so, it's basically the module/utility that creates and mounts the filesystem on the disks? |
Several packages are proprietary: xapi-clusterd, xapi-storage-plugins*... |
hmm.. xapi-clusterd is probably just some integration with lvm clusterd.. xapi-storage-plugins I don't know... Think I'm going to look into this.. I've been doing a lot of LVM/GFS2 setup.. maybe I can reuse some of that knowledge to implement this |
If someone has a complete list of the proprietary packages, that would help a lot.. |
Related to GFS2 support: Other non-free stuff: |
ok.. I will look into it.. thanks |
Any idea where the xapi packages in this repo come from?: |
Probably from an early version of the XCP-ng 7.4 installation ISO, or from the final 7.4 ISO itself. |
Did we have the source for those packages? or were they just copied from XenServer? |
They hadn't split parts of xapi into closed-source components yet at that time, so those are free. The source RPMs are those from XenServer 7.4 source ISO. |
ahh.. could we use those?.. might not be as optimized, but should be good enough? |
The code base has probably evolved a lot (clustering was an experimental feature and is being developed across several versions), so I foresee a lot of work to adapt it. |
hmmm.. you might be right.. maybe we could just use clvm and thin provisioned lv's... That has been stable for years |
sorry, no clvm, lvmlockd.. which supports thin provisioned lv's |
I will try to set up a POC.. |
I really wonder why it has never been implemented. I mean the LVM stuff, overall, works okay. The biggest problem is the size of snapshots - but from what I read, you can set the size rather small and tell it to grow when it fills up. So a configurable variable for the maximum initial size of snapshots should be pretty doable. |
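To illustrate the "start small and grow on demand" idea from the comment above, here is a small Python model of a threshold-based auto-extend policy. It is loosely modelled on LVM's `snapshot_autoextend_threshold` and `snapshot_autoextend_percent` settings in `lvm.conf`, on the assumption that this is the mechanism the comment refers to. The function name, the default values and the repeat-until-below-threshold loop are simplifications for illustration, not how dmeventd is actually implemented.

```python
def autoextend(size_mb: int, used_mb: int, threshold_pct: int = 70, extend_pct: int = 20) -> int:
    """Grow a snapshot until its usage no longer exceeds threshold_pct.

    Each step adds extend_pct percent of the current size (at least 1 MB),
    using integer arithmetic so the result is exact.
    """
    if size_mb <= 0 or extend_pct <= 0:
        raise ValueError("size_mb and extend_pct must be positive")
    while used_mb * 100 > threshold_pct * size_mb:
        size_mb += max(1, size_mb * extend_pct // 100)
    return size_mb


# A 1000 MB snapshot holding 800 MB is 80% full, above the 70% threshold,
# so it is extended once by 20%.
print(autoextend(1000, 800))
```

The point of the model is that the initial snapshot size can stay small, with growth triggered only when usage crosses the threshold. The hard part discussed in this thread, coordinating such growth across several hosts, is not covered by this sketch. |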
Actually.. I remember that there WAS iSCSI LVM thin provisioning at one time (around XenServer 5).. and then it changed to GFS2?.. @stormi or others.. I need the softdog.ko kernel module built.. what is the easiest way to build a kernel module for XCP-ng?? |
There has been discussion on the forum about that kind of thing. The issue is clustering. LVM thin provisioning on a single host is easy. It is not when you have several hosts that need to synchronize. Hence distributed systems such as GFS2. To build a kernel module, see https://github.com/xcp-ng/xcp-ng-build-env Pinging @Wescoeur who might want to elaborate about how we see the future of storage in XCP-ng. |
Clustering with lvmlockd instead of clvmd actually gives features like thin provisioning.. lvmlockd is now the default in SUSE 15 and has made it into RHEL8.. So I would consider it pretty stable.. As I see it, GFS2 would only be needed for shared volumes.. I have set up a POC, I just need to compile the softdog module to see how it works.. |
They decided to switch away from block-based storage (all LVM-ish solutions: shared iSCSI, local LVM, HBA) for multiple reasons:
|
sanlock is definitely not the way to go.. will try to do a setup with dlm/corosync.. |
corosync will be a pain to integrate properly (because it's already partially integrated by XAPI, but the rest is closed source). I foresee a lot of pain if you try to do it. Good luck! |
pacemaker is maybe a better alternative?? Thinking of adopting something like this: https://www.suse.com/documentation/sle-ha-15/book_sleha_guide/data/sec_ha_clvm_config.html |
It's very complicated because you'll probably need to integrate your work into XAPI. This is why it's not a matter of just grabbing a tech, but integration. However, as I said, contributions/PoC are VERY welcome! |
Hello, I have a similar problem - after I disabled HA and did some HW maintenance, rebooted a couple of hosts and reassembled them back into the pool, I am no longer able to re-enable HA, with exactly the same error. I tried changing the HA SR from iSCSI to NFS etc.; it's always the same error |
These are the logs from xensource - pretty useless to be honest: "Not_found" with no context whatsoever
|
So if I re-read correctly, this issue was related to "clustering" being enabled. This is documented in our official docs now, so I'm closing this issue. Feel free to reopen, or better yet create a new one (that would be more readable) if it's needed. |
already tested xcp 7.4 and 7.4.1 with no issues.
installed 7.5 from scratch and I cannot enable HA.
Tried with NFS and FC, also with GFS2 and LVM.
the log from /var/log/SMlog