[2.2] Empty hostgroup now cause Arbiter to fail to start #1468
Comments
Argh, yes, this is not good. I'll have a look; it's indeed a regression from
On Tue, Jan 20, 2015 at 9:36 PM, lostmimic notifications@github.com wrote:
|
Hmm... I cannot reproduce it with the tests (look at
I'll look with a full launch.
On Tue, Jan 20, 2015 at 10:09 PM, nap naparuba@gmail.com wrote:
|
OK, empty host groups are fine, so is it that we applied a service to empty groups?
On Tue, Jan 20, 2015 at 10:13 PM, nap naparuba@gmail.com wrote:
|
OK, in fact the test did have this. Can you provide us with a configuration sample? (Or better, a pull request for a
Thanks
On Tue, Jan 20, 2015 at 10:17 PM, nap naparuba@gmail.com wrote:
|
This used to work in 2.0.3 (we are upgrading because another bug, in a module we need, was fixed in 2.2). We are using the mod-import-aws plugin to generate the hosts. Here is a sample one that threw an error:
define host{ register 0 ... define service{
and the /tmp/bad_start_for_arbiter file had this in it:
ESC[31m[1421782614] ERROR: [Shinken] [service::UNKNOWN-SERVICE] the hostgroup '[u'nginx-app-hosts']' is unknownESC[0m
I think what is going on is that the hostgroup is not being created at all because no instance in that EC2 environment matches it. |
So I was able to fix the unknown hostgroup error by explicitly adding all the missing hostgroups to one of the hostgroup configs (I did not have to do that on 2.0.3). But we still see these types of errors:
If a service is a template, do we still need a service_description? (I didn't get blocked by this on 2.0.3.)
It seems that if we define a service that is not attached to any hosts (because the hostgroup is an empty set), we get these errors. |
It looks like it is not splitting up the hosts in the host_group and is instead taking them as one large string:
|
Don't bother, I made those commits. I may have a look this weekend and fix it quickly with the provided doc :) |
Too late, I dug a little haha... Seems I found the change in shinken/complexexpression.py as part of "Merge : Rework-Parsing-Clean branch after drift" (2f0e42f)
It seems that is because the get_hosts() function changed to return a list instead of an empty string (which I'm guessing was the oddity we originally noticed with empty groups).
And this is because the Itemgroup class changed its members attribute from a string to a list, though it should have split on the commas...
But I haven't gone further than that. |
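To illustrate the splitting step discussed above, here is a minimal sketch in Python 3. It is not Shinken's actual code: `normalize_members` is a hypothetical helper, and it only assumes that a members value may arrive as a comma-joined string, as a list, or as a list whose entries are themselves still comma-joined.

```python
def normalize_members(members):
    """Return a flat list of member names from a string or a list.

    Hypothetical helper for illustration; not part of Shinken.
    """
    if members is None:
        return []
    if isinstance(members, str):
        members = [members]
    names = []
    for entry in members:
        # An entry may itself still be a comma-joined string such as "web1,web2".
        for name in entry.split(","):
            name = name.strip()
            if name:  # skip empty pieces so "" yields an empty list
                names.append(name)
    return names


print(normalize_members("web1, web2"))
print(normalize_members(["web1,web2", "db1"]))
print(normalize_members(""))
```

With this kind of normalization an empty group becomes an empty list rather than an empty string, and a joined value is never looked up as a single host name.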
Yeah, that's the right way to do it. Members should be a list; that's not an issue. To avoid bailing out, the trick is to put the error into the config warnings instead of the config errors. See the function append_unknown_host in itemgroup (or something like that). |
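A rough sketch of the "warning instead of error" idea, in Python 3. Everything here is hypothetical (the `ConfigCheck` class, `resolve_hostgroups`, and the attribute names are made up for illustration and are not Shinken's API); it only assumes that startup is blocked when the error list is non-empty.

```python
class ConfigCheck:
    """Collects problems found while linking objects (hypothetical)."""

    def __init__(self):
        self.errors = []    # anything here would block startup
        self.warnings = []  # reported, but startup continues

    def is_correct(self):
        return not self.errors


def resolve_hostgroups(service_name, wanted_groups, known_groups, check):
    """Return the hosts of all wanted groups; unknown groups only warn."""
    hosts = []
    for group in wanted_groups:
        if group not in known_groups:
            # Record a warning and move on instead of failing the whole config.
            check.warnings.append(
                "[service::%s] the hostgroup '%s' is unknown" % (service_name, group)
            )
            continue
        hosts.extend(known_groups[group])
    return hosts


check = ConfigCheck()
hosts = resolve_hostgroups(
    "http", ["web", "nginx-app-hosts"], {"web": ["web1", "web2"]}, check
)
print(hosts, check.warnings, check.is_correct())
```

A service whose groups are all unknown or empty simply ends up attached to no hosts, which matches the 2.0.3 behaviour described in this issue.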
Hey Sebastion, what about this error:
It seems like it took the list, joined it into a single host_name, and then failed because that host didn't exist. |
Yeah, because host_name should be split on commas. I'm pretty sure it was defined like that in the conf. The thing is, the doc was not clear on whether this should be a list or a string. |
Any update on this bug? |
We should rework the code a bit more to handle that properly. As we are in freeze mode, we will do it in the next release. |
This is related to issue #1050
I run multiple environments, and in the service configs I had a large set of services that may or may not be used in a given environment. Before, if the hostgroup was empty, it just ignored it and moved along. With issue #1050, it now causes Arbiter to fail to start.
Looking through other issues, I found #851 where this exact scenario was basically discussed when someone wanted a nagios feature brought over. Naparuba said "But it should just be a warning, not a full block of the start." and we are now seeing a full block of the start in this scenario :(