Commit

Merge 82ed78e into 9a59486
blblack committed Dec 11, 2017
2 parents 9a59486 + 82ed78e commit 0bf3515
Showing 5 changed files with 121 additions and 23 deletions.
40 changes: 27 additions & 13 deletions docs/gdnsd-plugin-multifo.podin
@@ -56,10 +56,10 @@ RR-sets.

 =head1 TOP-LEVEL PLUGIN CONFIG

-At the top level of the plugin's configuration stanza, two special
-parameters C<up_thresh> and C<service_types> are supported. These set
-default per-resource options of the same name for any resources which do
-not define them explicitly.
+At the top level of the plugin's configuration stanza, three special
+parameters C<up_thresh>, C<service_types>, and C<ignore_health> are supported.
+These set default per-resource options of the same name for any resources which
+do not define them explicitly.

 The rest of the hash entries at the top level are the names of the
 resources you define. Each resource gets a configuration hash of its own
@@ -73,8 +73,9 @@ specify a set of C<label =E<gt> address> pairs which are all the same
 family (IPv4 or IPv6), or you can use the sub-stanzas C<addrs_v4> and/or
 C<addrs_v6> to specify one or both families in the same resource.

-The C<up_thresh> and C<service_types> parameters are inherited through
-every level, and can be overridden at any level (even per-address-family):
+The C<up_thresh>, C<service_types>, and C<ignore_health> parameters are
+inherited through every level, and can be overridden at any level (even
+per-address-family):

 =over 4

@@ -91,16 +92,24 @@ provided, all will be monitored for each address, and the net monitored
 state will be the minimum (worst) of the set. See L<gdnsd.config(8)> for
 more details on service_types.

+=item B<ignore_health>
+
+Boolean, default false. If set to true, the health of individual addresses
+will not affect whether multifo adds them to the set of output addresses, but
+it will still be checked and used for the C<up_thresh> calculation which is
+consumed by meta-plugins like geoip and metafo, which might use that
+information to fail over to a completely different datacenter as a result.
+
 =back

 =head1 SHORTCUT CONFIG

-If you have no parameters (service_types, up_thresh) to configure in a
-given stanza (single-family direct resource config, or addrs_v[46]), and do
-not care about the descriptive per-address labels used in monitoring, you
-can replace the hash with an array of addresses. The labels will be
-generated for you as a series of integers starting with C<1>. For example,
-the following are equivalent:
+If you have no parameters (service_types, up_thresh, ignore_health) to
+configure in a given stanza (single-family direct resource config, or
+addrs_v[46]), and do not care about the descriptive per-address labels used in
+monitoring, you can replace the hash with an array of addresses. The labels
+will be generated for you as a series of integers starting with C<1>. For
+example, the following are equivalent:

     res1 => { addrs_v4 => [ 192.0.2.1, 192.0.2.2 ] }
     res1 => { addrs_v4 => { 1 => 192.0.2.1, 2 => 192.0.2.2 } }
@@ -116,7 +125,12 @@ determine the set of response addresses for that address family: 1) Add all
 non-DOWN addresses to the result set. 2) If the set of non-DOWN addresses
 fail the up_thresh check, add *all* addresses to the result set as a
 fallback. 3) If any address is in the DOWN state, cut the
-zonefile-specified TTL in half
+zonefile-specified TTL in half.
+
+If C<ignore_health> is true, all addresses are added to the result set
+regardless of health, but the up_thresh and TTL effects still happen, and the
+final resource-level state still reflects the overall state as it would without
+C<ignore_health>.

 This process is repeated independently for each of the IPv4 and IPv6
 address subsets, in the case that a resource has both address families
37 changes: 28 additions & 9 deletions plugins/multifo.c
@@ -46,6 +46,7 @@ typedef struct {
     unsigned num_svcs;
     unsigned count;
     unsigned up_thresh;
+    bool ignore_health;
 } addrset_t;

 typedef struct {
@@ -91,6 +92,7 @@ static vscf_data_t* addrs_hash_from_array(vscf_data_t* ary, const char* resname,

     vscf_hash_inherit(parent, newhash, "up_thresh", false);
     vscf_hash_inherit(parent, newhash, "service_types", false);
+    vscf_hash_inherit(parent, newhash, "ignore_health", false);
     return newhash;
 }

@@ -177,6 +179,14 @@ static void config_addrs(const char* resname, const char* stanza, addrset_t* ase
             log_fatal("plugin_multifo: resource %s (%s): 'up_thresh' must be a floating point value in the range (0.0 - 1.0]", resname, stanza);
     }

+    aset->ignore_health = false;
+    vscf_data_t* ignore_health_cfg = vscf_hash_get_data_byconstkey(cfg, "ignore_health", true);
+    if(ignore_health_cfg) {
+        num_addrs--;
+        if(!vscf_is_simple(ignore_health_cfg) || !vscf_simple_get_as_bool(ignore_health_cfg, &aset->ignore_health))
+            log_fatal("plugin_multifo: resource %s (%s): 'ignore_health' must have a boolean value", resname, stanza);
+    }
+
     if(!num_addrs)
         log_fatal("plugin_multifo: resource '%s' (%s): must define one or more 'desc => IP' mappings, either directly or inside a subhash named 'addrs'", resname, stanza);

@@ -219,6 +229,7 @@ static void config_auto(res_t* res, const char* stanza, vscf_data_t* auto_cfg) {
     // mark parameters
     vscf_hash_get_data_byconstkey(auto_cfg, "up_thresh", true);
     vscf_hash_get_data_byconstkey(auto_cfg, "service_types", true);
+    vscf_hash_get_data_byconstkey(auto_cfg, "ignore_health", true);

     // clone down to just address-label keys
     vscf_data_t* auto_cfg_noparams = vscf_clone(auto_cfg, true);
@@ -265,6 +276,7 @@ static bool config_res(const char* resname, unsigned resname_len V_UNUSED, vscf_
     // inherit params downhill if applicable
     vscf_hash_bequeath_all(opts, "up_thresh", true, false);
     vscf_hash_bequeath_all(opts, "service_types", true, false);
+    vscf_hash_bequeath_all(opts, "ignore_health", true, false);

     addrs_v4_cfg = vscf_hash_get_data_byconstkey(opts, "addrs_v4", true);
     addrs_v6_cfg = vscf_hash_get_data_byconstkey(opts, "addrs_v6", true);
@@ -307,6 +319,8 @@ void plugin_multifo_load_config(vscf_data_t* config, const unsigned num_threads
         num_resources--;
     if(vscf_hash_bequeath_all(config, "service_types", true, false))
         num_resources--;
+    if(vscf_hash_bequeath_all(config, "ignore_health", true, false))
+        num_resources--;

     resources = xcalloc(num_resources, sizeof(res_t));
     unsigned residx = 0;
@@ -333,26 +347,31 @@ static gdnsd_sttl_t resolve(const gdnsd_sttl_t* sttl_tbl, const addrset_t* aset,
     dmn_assert(aset->count);

     gdnsd_sttl_t rv = GDNSD_STTL_TTL_MAX;
-    unsigned added = 0;
+    unsigned notdown = 0;
     for(unsigned i = 0; i < aset->count; i++) {
         const addrstate_t* as = &aset->as[i];
         const gdnsd_sttl_t as_sttl = gdnsd_sttl_min(sttl_tbl, as->indices, aset->num_svcs);
         rv = gdnsd_sttl_min2(rv, as_sttl);
         if(!(as_sttl & GDNSD_STTL_DOWN)) {
             gdnsd_result_add_anysin(result, &as->addr);
-            added++;
+            notdown++;
         }
+        else if(aset->ignore_health) {
+            gdnsd_result_add_anysin(result, &as->addr);
+        }
     }

     // if up_thresh was not met, signal upstream failure through rv and add all addresses
-    if(added < aset->up_thresh) {
+    if(notdown < aset->up_thresh) {
         rv |= GDNSD_STTL_DOWN;
-        if(isv6)
-            gdnsd_result_wipe_v6(result);
-        else
-            gdnsd_result_wipe_v4(result);
-        for(unsigned i = 0; i < aset->count; i++)
-            gdnsd_result_add_anysin(result, &aset->as[i].addr);
+        if(!aset->ignore_health) {
+            if(isv6)
+                gdnsd_result_wipe_v6(result);
+            else
+                gdnsd_result_wipe_v4(result);
+            for(unsigned i = 0; i < aset->count; i++)
+                gdnsd_result_add_anysin(result, &aset->as[i].addr);
+        }
     }
     // else force non-down response in retval, even if "rv" currently has the down flag from
     // the min/min2 operations on the individual addrs
53 changes: 52 additions & 1 deletion t/011upthresh/037up_thresh.t
@@ -2,7 +2,7 @@

 use _GDT ();
 use File::Temp qw/tmpnam/;
-use Test::More tests => 9;
+use Test::More tests => 14;

 # We use dns_port_2 as a custom http listener
 # for something to monitor
@@ -72,6 +72,57 @@ _GDT->test_dns(
     ],
 );

+###### mmih -> metafo+multifo using "ignore_health"
+
+# All are up by default, so we get all 3x DCA IPs
+_GDT->test_dns(
+    qname => 'mmih.example.com', qtype => 'A',
+    answer => [
+        'mmih.example.com 86400 A 192.0.2.70',
+        'mmih.example.com 86400 A 192.0.2.71',
+        'mmih.example.com 86400 A 192.0.2.72',
+    ],
+);
+
+_GDT->write_statefile('admin_state', qq{
+    192.0.2.70/up => DOWN/42
+});
+_GDT->test_log_output([
+    q{admin_state: state of '192.0.2.70/up' forced to DOWN/42, real state is UP/MAX},
+]);
+
+# Marking one down doesn't fail the default 0.5 up_thresh check, but would
+# normally remove the failing IP from results here without "ignore_health".
+# Note the TTLs are still affected (this can be controlled by clamping the
+# minimum dynamic TTL at the zonefile level, if desired)
+_GDT->test_dns(
+    qname => 'mmih.example.com', qtype => 'A',
+    answer => [
+        'mmih.example.com 43200 A 192.0.2.70',
+        'mmih.example.com 43200 A 192.0.2.71',
+        'mmih.example.com 43200 A 192.0.2.72',
+    ],
+);
+
+_GDT->write_statefile('admin_state', qq{
+    192.0.2.70/up => DOWN/42
+    192.0.2.71/up => DOWN/42
+});
+_GDT->test_log_output([
+    q{admin_state: state of '192.0.2.71/up' forced to DOWN/42, real state is UP/MAX},
+]);
+
+# Now we've marked 2/3 down, which will fail the default 0.5 up_thresh, causing
+# failover to the DCB datacenter.
+_GDT->test_dns(
+    qname => 'mmih.example.com', qtype => 'A',
+    answer => [
+        'mmih.example.com 43200 A 192.0.2.80',
+        'mmih.example.com 43200 A 192.0.2.81',
+        'mmih.example.com 43200 A 192.0.2.82',
+    ],
+);
+
 _GDT->test_kill_daemon($pid);
 _GDT->test_kill_daemon($http_pid);

13 changes: 13 additions & 0 deletions t/011upthresh/etc/config.tmpl
@@ -55,4 +55,17 @@ plugins => {
             four = [ 127.0.0.1, 10 ]
         }
     }
+    # metafo with multifo underneath using "ignore_health"
+    metafo => {
+        service_types => up
+        resources => {
+            meta_multi_ignore_health => {
+                datacenters => [ DCA, DCB ]
+                dcmap => {
+                    DCA => { lb01 => 192.0.2.70, lb02 => 192.0.2.71, lb03 => 192.0.2.72, ignore_health => true }
+                    DCB => { lb01 => 192.0.2.80, lb02 => 192.0.2.81, lb03 => 192.0.2.82 }
+                }
+            }
+        }
+    }
 }
1 change: 1 addition & 0 deletions t/011upthresh/etc/zones/example.com
@@ -14,3 +14,4 @@ m3dn DYNA multifo!multi_3dead_normal
 m3dl DYNA multifo!multi_3dead_lowthresh
 wlow DYNA weighted!w_low
 wnorm DYNA weighted!w_norm
+mmih DYNA metafo!meta_multi_ignore_health
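
Note on the resolve() change in plugins/multifo.c: the address-selection rules can be sketched as a small standalone C function. This is a simplified illustration, not the plugin's code. The names `sketch_addrset_t`, `sketch_result_t`, and `sketch_resolve` are hypothetical, per-address health is reduced to one boolean, TTL handling is left out, and `up_thresh` is assumed to already be an absolute count of addresses rather than the configured fraction.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the plugin's addrset_t. */
typedef struct {
    unsigned count;      /* number of configured addresses */
    unsigned up_thresh;  /* ASSUMPTION: minimum non-DOWN addresses, as a count */
    bool ignore_health;  /* the option added by this commit */
} sketch_addrset_t;

/* Hypothetical summary of what a resolve pass produces. */
typedef struct {
    unsigned num_addrs;  /* addresses placed in the answer */
    bool down;           /* resource-level DOWN reported to the caller */
} sketch_result_t;

/* addr_down[i] is true when address i is currently DOWN. */
static sketch_result_t sketch_resolve(const sketch_addrset_t* aset,
                                      const bool* addr_down) {
    assert(aset->count);

    unsigned notdown = 0; /* healthy addresses, used for the threshold check */
    unsigned added = 0;   /* addresses actually placed in the answer */
    for (unsigned i = 0; i < aset->count; i++) {
        if (!addr_down[i]) {
            added++;
            notdown++;
        } else if (aset->ignore_health) {
            added++; /* DOWN address is still answered, but not counted as up */
        }
    }

    sketch_result_t res = { .num_addrs = added, .down = false };
    if (notdown < aset->up_thresh) {
        res.down = true; /* signalled upstream in both modes */
        if (!aset->ignore_health)
            res.num_addrs = aset->count; /* fallback: answer with all addresses */
    }
    return res;
}
```

With three addresses and a threshold count of 2 (an assumed scaling of the default 0.5), one DOWN address gives three answers and an UP resource when `ignore_health` is set, and two answers without it. Two DOWN addresses give three answers and a DOWN resource in both modes, which is what lets metafo fail over from DCA to DCB in the new test.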
