Implementation of Weights Data structures #4468
Conversation
Force-pushed from 5c69eff to d1e17cd
Adding this data structure will allow us to resolve issues hashicorp#1088 and hashicorp#4198. The new structure defaults to:

```
{ Passing: 1, Warning: 0 }
```

which means: use a weight of 0 for a service in warning state, and a weight of 1 for a healthy service. Thus it remains compatible with previous Consul versions.
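A minimal Go sketch of the data structure and defaults described in this commit message (illustrative only; field names follow the commit message, not necessarily the exact Consul source):

```go
package main

import "fmt"

// Weights mirrors the structure described above: one weight applied while
// the service's checks are passing, another while they are in warning.
// (Sketch based on the commit message, not the exact Consul source.)
type Weights struct {
	Passing int
	Warning int
}

// defaultWeights returns the defaults from the commit message: weight 1
// for a healthy service and weight 0 for one in warning state, matching
// the behaviour of previous Consul versions.
func defaultWeights() Weights {
	return Weights{Passing: 1, Warning: 0}
}

func main() {
	w := defaultWeights()
	fmt.Printf("Passing=%d Warning=%d\n", w.Passing, w.Warning)
}
```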
Force-pushed from d1e17cd to dac209a
@pearkes Ok, this first implementation is already working for SRV records. I think this implementation is good enough. In a later PR, I will add support for applying weights to A/AAAA DNS queries.

Testing

Register nodes

We register 1 old node without the new Weight attribute, which is equivalent to registering with the default weights. We also register 1 large node with a higher weight, 2 working nodes, 1 warning node, 1 critical node and 1 maintenance node.

# Old way, regular node, without weights
curl -vfs --request PUT -d '{"Node": "noweight-host-redis-1-0", "Address": "192.168.12.1", "Service": {"Service": "redis", "Tags": ["http", "tag_host-redis-1-1.test.acme.com", "start_20184431"], "Port": 8000, "Checks":[{"id": "canard", "ServiceID": "redis", "Name":"script", "Interval": "10m", "Args":["/tmp/check_script.sh"],"Notes":"hello", "Status": "passing"}]}}' http://localhost:8500/v1/catalog/register
# Large node, weight 600
curl -vfs --request PUT -d '{"Node": "large-host-redis-1-0", "Address": "192.168.10.1", "Service": {"Service": "redis", "Tags": ["http", "tag_host-redis-1-1.test.acme.com", "start_20184431"], "Port": 8000, "Weights":{"Passing": 600, "Warning":30}}, "Checks":[{"id": "canard", "ServiceID": "redis", "Name":"script", "Interval": "10m", "Args":["/tmp/check_script.sh"],"Notes":"hello", "Status": "passing"}]}}' http://localhost:8500/v1/catalog/register
# 2 working nodes
curl -vfs --request PUT -d '{"Node": "working-host-redis-1-1", "Address": "192.168.1.1", "Service": {"Service": "redis", "Tags": ["http", "tag_host-redis-1-1.test.acme.com", "start_20184431"], "Port": 8000, "Weights":{"Passing": 100, "Warning":7}}, "Checks":[{"id": "canard", "ServiceID": "redis", "Name":"script", "Interval": "10m", "Args":["/tmp/check_script.sh"],"Notes":"hello", "Status": "passing"}]}}' http://localhost:8500/v1/catalog/register
curl -vfs --request PUT -d '{"Node": "working-host-redis-1-2", "Address": "192.168.1.2", "Service": {"Service": "redis", "Tags": ["http", "tag_host-redis-1-2.test.acme.com", "start_20184431"], "Port": 8000, "Weights":{"Passing": 100, "Warning":7}}, "Checks":[{"id": "canard", "ServiceID": "redis", "Name":"script", "Interval": "10m", "Args":["/tmp/check_script.sh"],"Notes":"hello", "Status": "passing"}]}}' http://localhost:8500/v1/catalog/register
# 1 warning node
curl -vfs --request PUT -d '{"Node": "warning-host-redis-1-3", "Address": "192.168.1.3", "Service": {"Service": "redis", "Tags": ["http", "tag_host-redis-1-3.test.acme.com", "start_20184431"], "Port": 8000, "Weights":{"Passing": 100, "Warning":7}}, "Checks":[{"id": "canard", "ServiceID": "redis", "Name":"script", "Interval": "10m", "Args":["/tmp/check_script.sh"],"Notes":"hello", "Status": "warning"}]}}' http://localhost:8500/v1/catalog/register
# 1 critical node
curl -vfs --request PUT -d '{"Node": "critical-host-redis-1-4", "Address": "192.168.1.4", "Service": {"Service": "redis", "Tags": ["http", "tag_host-redis-1-4.test.acme.com", "start_20184431"], "Port": 8000, "Weights":{"Passing": 100, "Warning":7}}, "Checks":[{"id": "canard", "ServiceID": "redis", "Name":"script", "Interval": "10m", "Args":["/tmp/check_script.sh"],"Notes":"hello", "Status": "critical"}]}}' http://localhost:8500/v1/catalog/register
# 1 maintenance node
curl -vfs --request PUT -d '{"Node": "maint-host-redis-1-5", "Address": "192.168.1.5", "Service": {"Service": "redis", "Tags": ["http", "tag_host-redis-1-4.test.acme.com", "start_20184431"], "Port": 8000, "Weights":{"Passing": 100, "Warning":7}}, "Checks":[{"id": "canard", "ServiceID": "redis", "Name":"_maint", "Interval": "10m", "Args":["/tmp/check_script.sh"],"Notes":"hello", "Status": "maintenance"}]}}' http://localhost:8500/v1/catalog/register

Result
This looks awesome Pierre. Just 1 question from your high level description before I give the code a proper look:
This is not intuitive behaviour for me. The rest makes total sense, but why would we treat that state this way? The main reason I think this is an issue, other than it being new and a bit magical, is that most DNS implementations ignore the weights, and this change would cause a behaviour change for people relying on the current defaults. What do you think? Did I misunderstand the current behaviour somehow?

edit: I'm actually puzzled how that last example is working. Our built-in maintenance checks use Critical: `api.HealthMaint = "maintenance"` is defined (line 3046 in 870a6ad) but it's never actually used anywhere as far as I can see 😄. And I don't see any explicit behaviour in your diff changing the handling for that state.
@banks Basically, it is just that HealthCheck Status does not enforce values, since statuses are represented as strings. Thus, any value different from "passing" or "warning" is considered to have weight = 0. Critical instances are not seen because they are already "cleaned up" when requesting health: they are not visible since they are not returned by the RPC call anyway. In case of an unknown status, we apply that rule.

I had to implement a rule for unknown statuses since the API lets you specify whatever you want; the maintenance example was there to illustrate this behaviour. BTW, should I also compare statuses against "PASSING" and "Passing" as well, since in most Consul APIs case sensitivity is sometimes a bit confusing, especially in JSON payloads?

Possible improvements

Another remark, and I would like your input: upgrading an existing system might be tricky, since when you start adding weights, the default values apply (meaning 1 for "passing", 0 for "warning"). The problem is that on a production system, if you have 100 nodes running a service, it is hard, if not impossible, to update all Consul definitions at once. So, if you choose a weight of 100 for your service, when you update one node it gets weight = 100 while all the others keep weight = 1 => trouble.

So an improvement would be the ability to provide, in the Consul server configuration, default values for Weights.Passing and Weights.Warning which would be applied by default. Thus, when you upgrade your Consul cluster, you set this value to 100, and adding weights to a service would be less likely to break your production. If you agree with such a proposal, I might implement it in another PR.
… (backwards compatibility)
When using the DNS interface with only_passing = false, all nodes with non-Critical health checks used to get a weight of 1. Keeping weights.Warning = 0 as the default value is probably a bad idea, as it breaks backwards compatibility. Thus, we use a default value of 1 to be consistent with existing behaviour.
Some doc nitpicks.
I'll try to dig through this PR in more depth this week and get it moved forwards.
Thanks as always for the work and patience!
DNS SRV responses. If this field is not specified, its default value is:
`"weights": {"passing": 1, "warning": 1}`.
When a service is `critical`, it is excluded from DNS responses, `warning` services
will be in DNS responses as well unless the `only_passing = true` is set in the
nit: this reads a little confusingly and might be interpreted backwards; instead we could simplify it into multiple sentences:

When a service is `critical`, it is excluded from DNS responses. Services with `warning` checks are included in responses by default, but excluded if the optional param `only_passing = true` is present in agent DNS configuration or `?passing` is used via the API.
DONE
When DNS SRV requests are issued, response will include the weight in the SRV
response accordingly to the state of service. It allows to put more weights
on services running on larger nodes and optionnally to put less pressure on
services being in `warning` state.
When DNS SRV requests are made, the response will include the weights specified given the state of the service. This allows some instances to be given a higher weight if they have more capacity, and optionally allows reducing load on services with checks in `warning` status by giving `passing` instances a higher weight.
@banks DONE as well in next patch
@mkeeler @banks if you have remarks regarding my comment #4468 (comment), do not hesitate. Kind regards
Yep I agree with this. I think this is an issue with the PR/feature as it stands. We could land it with good docs to point that out though. I'd like to hold off adding centralised config and all the machinery needed to support that for now though for a couple of reasons:
I will talk it through with the team on our call later, but I suspect the answer is either to land this as-is and document the gotcha well, or to hold it for a while (at least a few months) so we can build it as part of our bigger change. @pierresouchay are you using this in your deployment already? Are you keen to see it land soon for reasons other than just to get it done 😄?
@banks Thank you, here are my plans.

Migration to weights

For the default value, I planned the following:

Define a WeightDefaults (Agent setting)

The value is applied to all services that have no weight set. It allows setting a config on a given host, for instance: 300 on a large instance, 50 on a tiny one.

Define a ClusterDefaultWeight (Server setting / DNS client setting)

When a server receives an RPC request without a weight, it uses these values to fill Weights => ensures that old clients CAN have default settings that are not the hard-coded defaults. When an agent answering a DNS query finds no Weights set on a service, it uses these values => ensures that existing services will have similar values.

=> This will ensure that it is possible to work with upgraded DNS agents while servers are not yet upgraded OR services have not yet been migrated.

About our status

I am heavily pushing for weight standardisation internally, because I plan to use this for:

This is crucial for us since we are moving from pure bare-metal systems to virtualised ones, and thus there are very heterogeneous hosts for a given kind of service. We currently live without it by using tags/metadata, but it is not sustainable to add this to each of our layers (for instance, it is complicated for systems ONLY using DNS); that is why I think this is a very important step. Of course, once HashiCorp agrees on this, I plan to implement:
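The two-level defaulting proposed above could be sketched roughly as follows. The names `WeightDefaults` and `ClusterDefaultWeight` come from the comment itself; this is a hypothetical sketch of the proposal, not code from this PR:

```go
package main

import "fmt"

// Weights holds the per-state weights for a service instance.
type Weights struct {
	Passing int
	Warning int
}

// resolveWeights sketches the fallback chain proposed above: a service's
// own weights win; otherwise fall back to the agent-level default, then
// to the cluster-level default, then to the built-in 1/1 DNS-compatible
// default. (Hypothetical sketch of a proposal in this thread.)
func resolveWeights(service, agentDefault, clusterDefault *Weights) Weights {
	for _, w := range []*Weights{service, agentDefault, clusterDefault} {
		if w != nil {
			return *w
		}
	}
	return Weights{Passing: 1, Warning: 1}
}

func main() {
	// A large host's agent-level default, as in the 300/tiny-50 example above.
	large := &Weights{Passing: 300, Warning: 30}
	fmt.Println(resolveWeights(nil, large, nil)) // falls back to agent default
}
```

The point of the chain is that a rolling upgrade can set cluster-wide defaults first, so un-migrated services keep weights comparable to newly weighted ones.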
Ah thanks. I forgot that all these queries are served by a server already, so there is no need to "push" config out to agents to make this work. In that case I agree that having a way to set default weights in agent config, applied by servers in responses to RPCs where the stored instances have no weights defined, makes sense and mitigates the issue you described with rolling out new weights.

There will still be potential gotchas with changing those defaults, or the relative size of weights in general. If you start with weights where 10 is normal, for example, and then want nodes that get less than 10% of the load and decide to make 100 the standard instead, you have the same issue: all the current nodes that defined a weight relative to 10 need to update in sync, or their traffic will be skewed for some time while the change rolls out. I think that's fundamentally unavoidable with weights defined this way, so it's not a blocker, but we probably do need to take care to spell it out in the documentation for service weights. (This maybe warrants a few paragraphs in a section somewhere rather than just notes on the service definition fields for weights.)

Anything else => 0

Regarding the default behaviour for non-standard check status strings: even though those are kind of outside our expected values and don't mean much, I'm not sure it's OK to change load-balancing behaviour out of the box for those. For example, if someone has a bad script that sets "OK" instead of "passing" as the status, right now I think (please correct!) DNS will still list their instances. I think your change here will cause those instances to suddenly be listed with weight = 0, so some DNS clients will just not send any traffic to any "OK" instances if there are others with warnings!

I realise it's super edge-casey, but it seems safer to me to return a weight of 1 for anything we don't understand; then the behaviour now is preserved, and we don't penalise any checks that for some reason are in a state we don't understand. What do you think?
@banks I am OK, it will be fixed in the next patch.
@banks DONE. I also handled all the possible pre-defined states in my last patch (maint/critical) => even if in theory they are not possible here.
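The status-to-weight rule agreed above can be sketched in Go. This is an illustrative sketch of the behaviour described in the thread, not the exact Consul implementation; critical and maintenance instances are filtered out of results before weights matter, so only the remaining statuses are mapped here:

```go
package main

import "fmt"

// Weights holds the per-state weights from the service definition.
type Weights struct {
	Passing int
	Warning int
}

// effectiveWeight maps a health-check status to the weight used in SRV
// responses, per the rule agreed in this thread: "passing" and "warning"
// use the configured weights, and any status we don't understand falls
// back to weight 1 so that pre-existing behaviour is preserved.
func effectiveWeight(status string, w Weights) int {
	switch status {
	case "passing":
		return w.Passing
	case "warning":
		return w.Warning
	default:
		// e.g. a script that sets "OK" instead of "passing":
		// don't penalise states we don't understand.
		return 1
	}
}

func main() {
	w := Weights{Passing: 100, Warning: 7}
	fmt.Println(effectiveWeight("passing", w)) // 100
	fmt.Println(effectiveWeight("warning", w)) // 7
	fmt.Println(effectiveWeight("OK", w))      // 1
}
```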
Overall the changes look good to me.
One question we came up with when talking about this out of band is: what happens when you have more SRV responses than can fit in a single message? Even with TCP DNS it's still possible to run into this scenario. When we can't return all the data, should we attempt to choose which records to return? Maybe return those with the highest weight first until we run out of space in the message.

For making load-balancing decisions, not having all the information would be problematic regardless, as any record not returned would essentially be hidden and would not get any traffic.
I am open to discussing what the best solution for this is. One way or another we need to be explicit about how we operate in the case of truncation.
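The "highest weight first" idea floated above could be sketched as follows. This is hypothetical: as discussed later in the thread, the PR ultimately documents truncation order as effectively random instead:

```go
package main

import (
	"fmt"
	"sort"
)

// SRVRecord is a minimal stand-in for one SRV answer and its weight.
type SRVRecord struct {
	Target string
	Weight int
}

// truncateByWeight sketches the suggestion above: when only max records
// fit in the DNS message, keep the highest-weight ones, so the hidden
// records are those that would have received the least traffic anyway.
// (Hypothetical sketch of a reviewer suggestion, not Consul's behaviour.)
func truncateByWeight(records []SRVRecord, max int) []SRVRecord {
	sort.SliceStable(records, func(i, j int) bool {
		return records[i].Weight > records[j].Weight
	})
	if len(records) > max {
		records = records[:max]
	}
	return records
}

func main() {
	recs := []SRVRecord{{"a", 1}, {"b", 600}, {"c", 100}}
	for _, r := range truncateByWeight(recs, 2) {
		fmt.Println(r.Target, r.Weight)
	}
}
```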
agent/structs/structs.go
Outdated
return fmt.Errorf("Passing must be greater than 0")
}
if weights.Warning < 0 {
return fmt.Errorf("Warning must be greater of equal to 0")
typo: "of" -> "or"
@mkeeler DONE
@mkeeler quickly from my phone: the size of DNS SRV responses is the reason for my patches a few weeks ago regarding compression, as well as performance (binary search for truncation). For now, with compression, EDNS and TCP, you can easily send around 600-700 nodes in a single response, which is good enough for many usages. It could go even further, for instance by removing TXT metadata when the size is too big. Note that I implemented the truncation bit during my patches, even for TCP, which would allow for clever clients.

During my initial patches I thought about having a predictable order when sending SRV responses, which would allow implementing paging for instance, but I first wanted my initial patches to get merged quickly; that would be an option.

For most shops it should not be an issue anyway (the biggest LB-aware services we have are around 500 nodes; we have some services with more than 6k nodes, but none we need to perform load balancing on, and in those cases we use HTTP queries). I am quite confident nobody uses DNS SRV on very large services right now, since I discovered the issue (it was crashing the DNS server and sending no results prior to my patch), so I think we have some time to talk with vendors such as HAProxy to discuss possible advanced strategies. At the very least this patch will let all vendors use weights with HTTP discovery requests and speak the same language, avoiding reliance on tags (Fabio, or some of our implementations) or service meta (like we do a bit as well); everybody using DNS or HTTP will use the same data, allowing more compatibility between vendors. Kind regards
Force-pushed from f6b0d01 to 63f13ff
Force-pushed from 63f13ff to 25eae9f
Force-pushed from a01191e to c327158
Force-pushed from ed8b28a to bea7d7e
@pierresouchay All that sounds reasonable. With TCP DNS and compression it should be much more difficult to get into a scenario where you can't get all the SRV RRs for your query. Honestly the "problem" is probably more theoretical than practical. I think it's probably sufficient to just add a sentence in the docs stating that when the results are truncated, no attempt is made to select which records get returned; the selection should be considered effectively random.
@mkeeler DONE, added documentation in
Will allow master branch to be merged properly
Force-pushed from d7fff46 to 3ef126c
@pierresouchay Matt's on vacation until tomorrow and I'm waiting to check in with him. It looks like between you two this got into good shape while I was away; hopefully he'll be able to catch up with things over the next couple of days. Thanks for your patience!
@banks Ok, no problem, I'll wait :-) Regarding weights, there is also this one: #4576 => it will make it easy to leverage this feature without having to modify existing health checks (for instance, it easily allows treating a health check that is too slow as 'Too Many Requests' and thus applying the new warning weights). I think this is quite interesting for all apps whose health checks people cannot modify.
Looks good. Thanks for the doc updates.
Couple of super nitpicky comments about the docs but I think this is good to go.
…hts as requested by @banks
@banks DONE
Awesome. Great work on this Pierre. Will merge a little later today.
References: * hashicorp/consul#4468 * hashicorp/consul#4198 Change-Id: I808fc90f3dd87b43bcf6325edb9cab8bbadffe60
It implements setting Warning state for HTTP 200, or when a TCP connection succeeds but is too slow, as described in hashicorp#4522. Combined with hashicorp#4468 it will very easily allow reducing calls/sec on slow nodes in a completely automatic way.
@pierresouchay : you mentioned the possibility of "applying weights to A/AAAA DNS queries" a few times in this thread. Do you remember what behavior you intended to achieve with that? (I ask because A/AAAA records don't include weight, so I'm curious what it would mean to apply weights to them.) I realize it's been a long time since this PR, so I don't expect that you'll remember! I also know (per your GH profile) that your current position no longer involves working with Consul. But I came across your comment recently while touching the Consul DNS code and wanted to ask you about it in case there's a remaining enhancement for us to consider making here. Whether or not you're in a position to respond, I also want to thank you for your many contributions to Consul over the years :) cheers!
Hello @jkirschner-hashicorp

Weights have been implemented in SRV requests, which do support weights. My idea at the time was to return results for A/AAAA queries statistically according to the weights (and the current passing/warning status). So, if you have weights:

And given node2 being in warning state, you would return node2 2 times every 22 calls, and node1 & node3 10 times each every 22 calls. That was the idea. In my previous company we never used that (we either used SRV records with DNS or HTTP APIs in the end), but I found this basic solution very interesting (while it has many implications on the caching side, that is an issue that already exists with the current implementation).
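That weighted A/AAAA idea could be sketched as below, assuming effective weights of 10/2/10 as in the example (node2's weight already reduced to 2 by its warning state). This is a hypothetical sketch of the never-implemented proposal, not Consul code:

```go
package main

import (
	"fmt"
	"math/rand"
)

// Node pairs a name with its effective weight, already resolved from the
// service's passing/warning state. Names and weights follow the example
// in the comment above (hypothetical, not from Consul).
type Node struct {
	Name   string
	Weight int
}

// pickWeighted returns the node selected by r, where 0 <= r < sum of
// weights. Each node owns a contiguous band of size Weight, so over many
// uniformly random draws node2 is returned 2 times out of every 22 on
// average, and node1/node3 10 times each, matching the example above.
func pickWeighted(nodes []Node, r int) Node {
	for _, n := range nodes {
		if r < n.Weight {
			return n
		}
		r -= n.Weight
	}
	return nodes[len(nodes)-1] // unreachable for valid r
}

func main() {
	nodes := []Node{{"node1", 10}, {"node2", 2}, {"node3", 10}}
	total := 0
	for _, n := range nodes {
		total += n.Weight
	}
	// One uniformly random A-record answer per query:
	fmt.Println(pickWeighted(nodes, rand.Intn(total)).Name)
}
```

As the comment notes, DNS caching would distort these proportions in practice, since resolvers may reuse a cached answer for many client requests.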