Need more graceful handling of duplicate machine GUIDs #5488
Comments
It seems that the current implementation warns the user but, at the same time, replaces the stored host information with the information that just arrived. Offending line: Line 364 in 3b28e0f
From some brief research, I could see the following possible solutions:
The machine GUID is supposed to be a fairly permanent identifier of a host. The registry depends on it more than it does on hostnames and URLs, so we can't be changing it all the time. The other two options seem better. If we already identify the situation (you said you saw a warning), then we could auto-correct it on the master. We don't want to change the GUID format, but we could regenerate the GUID. There is a question, though, of whether by that time we will already have some corrupted data. So it may actually be necessary for the master to refuse to process any of the slave's data until the duplicate GUID situation is resolved, either manually or automatically.
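The regeneration idea mentioned above could look roughly like this. It is a minimal Python sketch, not Netdata's C implementation. It assumes the machine GUID is a standard 36-character hyphenated UUID string; `regenerate_guid` and `known_guids` are hypothetical names introduced only for illustration.

```python
import uuid


def regenerate_guid(known_guids):
    """Return a fresh random GUID in the usual 36-character UUID text form.

    `known_guids` is a hypothetical collection of GUIDs already tracked
    (for example, by the master); the new GUID is guaranteed not to be in it.
    """
    known = {g.lower() for g in known_guids}
    while True:
        candidate = str(uuid.uuid4())  # lowercase, hyphenated, same format
        if candidate not in known:
            return candidate
```

The format stays the same, only the value changes. Anything keyed on the old GUID would still need to be handled separately.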
Fixed in #5497.
@paulkatsoulakis please test that PR and we'll merge.
The fix does not seem to get us all the way there. Line 874 in 2ee2d94
If I understand the change correctly, we do block the eventual data corruption, but we still need to make sure we avoid modifying the master's host details. So this check should probably go at the very beginning, to avoid any further processing entirely.
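To illustrate why the position of the check matters, here is a small Python sketch of a receiver that rejects a colliding GUID before it touches any host record. It is a toy model under stated assumptions, not the code in either PR: `Master`, `on_stream`, `DuplicateGuidError`, and the `hosts` dictionary are all hypothetical.

```python
class DuplicateGuidError(Exception):
    """Raised when a slave presents the master's own machine GUID."""


class Master:
    def __init__(self, guid, hostname):
        self.guid = guid
        # Host records keyed by machine GUID; the master's own entry lives here too.
        self.hosts = {guid: {"hostname": hostname, "samples": []}}

    def on_stream(self, slave_guid, slave_hostname, sample):
        # Guard first: nothing below runs for a GUID that collides with ours,
        # so the master's own record cannot be modified by the slave.
        if slave_guid == self.guid:
            raise DuplicateGuidError(
                f"slave '{slave_hostname}' reuses the master's machine GUID"
            )
        host = self.hosts.setdefault(
            slave_guid, {"hostname": slave_hostname, "samples": []}
        )
        host["hostname"] = slave_hostname
        host["samples"].append(sample)
```

If the guard sat after the `setdefault` and hostname update instead, a cloned slave would find the master's own record and overwrite its hostname before being rejected.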
OK, can you open a separate PR based on #5497 to take care of the UI as well?
Good morning, sure, I'm on it.
I think we can resolve this now (I can't resolve it myself).
Bug report summary
We spent a lot of time trying to identify the root cause of some problems in a master/slave setup. The root cause turned out to be that one of the slaves had been created by cloning the master, so it carried the same machine GUID. We should at least detect such cases so we can warn users about them.
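As a rough illustration of what such detection could mean, the following Python sketch flags any GUID reported under more than one hostname. It is not how Netdata does it; `find_duplicate_guids` and the `(guid, hostname)` report format are assumptions made for the example.

```python
from collections import defaultdict


def find_duplicate_guids(reports):
    """Map each GUID claimed by more than one hostname to those hostnames.

    `reports` is a hypothetical iterable of (guid, hostname) pairs, one per
    host that announced itself. GUIDs are compared case-insensitively.
    """
    seen = defaultdict(set)
    for guid, hostname in reports:
        seen[guid.lower()].add(hostname)
    return {g: sorted(names) for g, names in seen.items() if len(names) > 1}
```

Comparing hostnames is only a heuristic: a clone that also keeps the original hostname would not be caught this way and would need another signal.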
Steps To Reproduce
Master and slave with the same GUID.
The master hostname configured in netdata.conf is overwritten by the slave's hostname and we have issues with the charts as well.
Expected behavior
Report an error, or have a self-cleaning behavior (e.g. regenerate a machine GUID when a duplicate is reported).