This repository has been archived by the owner on Feb 27, 2020. It is now read-only.

Merge pull request #10 from Metaswitch/dns_failover
[Reviewer: Ellie] Document the new DNS resiliency behaviour
rkd-msw committed Mar 9, 2015
2 parents 60d7117 + 01c4dac commit 427e29e
Showing 3 changed files with 37 additions and 4 deletions.
4 changes: 3 additions & 1 deletion docs/Clearwater_Configuration_Options_Reference.md
@@ -56,8 +56,9 @@ This section describes optional configuration options, particularly for ensuring
* `ralf_listen_port` - the Diameter port which Ralf listens on. Defaults to 3869 to avoid clashes when colocated with Homestead.
* `alias_list` - this defines additional hostnames and IP addresses which Sprout or Bono will treat as local for the purposes of SIP routing (e.g. when removing Route headers).
* `default_session_expires` - determines the Session-Expires value which Sprout will add to INVITEs, to force UEs to send keepalive messages during calls so they can be tracked for billing purposes.
* `enum_server` - a comma-separated list of DNS servers which can handle ENUM queries.
* `enum_suffix` - determines the DNS suffix used for ENUM requests (after the digits of the number). Defaults to "e164.arpa".
* `enum_file` - if set (to a file path), Sprout will use this local JSON file for ENUM lookups rather than a DNS server. An example file is at https://github.com/Metaswitch/clearwater-docs/wiki/ENUM#deciding-on-enum-rules.
* `enum_file` - if set (to a file path), and if `enum_server` is not set, Sprout will use this local JSON file for ENUM lookups rather than a DNS server. An example file is at https://github.com/Metaswitch/clearwater-docs/wiki/ENUM#deciding-on-enum-rules.
* `icscf_uri` - the SIP address of the external I-CSCF integrated with your Sprout node (if you have one).
* `scscf_uri` - the SIP address of the Sprout S-CSCF. This defaults to `sip:$sprout_hostname:$scscf;transport=TCP` - this includes a specific port, so if you need NAPTR/SRV resolution, it must be changed to not include the port.
* `additional_home_domains` - this option defines a set of home domains which Sprout and Bono will regard as locally hosted (i.e. allowing users to register, not routing calls via an external trunk). It is a comma-separated list.
@@ -82,6 +83,7 @@ This section describes optional configuration options, particularly for ensuring
memento_disk_limit=45% # Percentage of available disk

* `memento_threads` - determines the number of threads dedicated to adding call list fragments to the call list store. This defaults to 25 threads. This is only relevant if the node includes a Memento AS.
* `signaling_dns_server` - a comma-separated list of DNS servers for non-ENUM queries. Defaults to 127.0.0.1 (i.e. uses `dnsmasq`).
* `exception_max_ttl` - determines the maximum time before a process exits if it crashes. This defaults to 600 seconds.

## Experimental options
24 changes: 24 additions & 0 deletions docs/Clearwater_DNS_Usage.md
@@ -22,6 +22,30 @@ Clearwater makes heavy use of DNS to refer to its nodes. It uses it for

Clearwater also supports using DNS for identifying non-Clearwater nodes. In particular, it supports DNS for identifying SIP peers using NAPTR and SRV records, as described in [RFC 3263](http://tools.ietf.org/rfc/rfc3263.txt).

## Resiliency

By default, Clearwater routes all DNS requests through an instance of [dnsmasq](http://www.thekelleys.org.uk/dnsmasq)
running on localhost. dnsmasq round-robins requests between the servers in `/etc/resolv.conf`,
as described in [its FAQ](http://www.thekelleys.org.uk/dnsmasq/docs/FAQ):

> By default, dnsmasq treats all the nameservers it knows about as
> equal: it picks the one to use using an algorithm designed to avoid
> nameservers which aren't responding.

If the `signaling_dns_server` option is set in `/etc/clearwater/config` (which is mandatory when using
[traffic separation](Multiple_Network_Support.md)), Clearwater will not use dnsmasq. Instead, resiliency
comes from specifying up to three servers in a comma-separated list (e.g.
`signaling_dns_server=1.2.3.4,10.0.0.1,192.168.1.1`). Clearwater fails over between them as follows:

* It will always query the first server in the list first
* If this returns SERVFAIL or times out (which happens after a randomised 500ms-1000ms period), it will resend the query to the second server
* If this returns SERVFAIL or times out, it will resend the query to the third server
* If all servers return SERVFAIL or time out, the DNS query will fail
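
The failover order above can be sketched in Python. This is only an illustration of the documented behaviour, not Clearwater's actual resolver code: `query_with_failover`, `send_query` and `DnsTimeout` are hypothetical names, and `send_query` stands in for whatever function really sends one DNS query to one server.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

SERVFAIL = "SERVFAIL"

# (rcode, answers) as returned by the hypothetical send_query callback.
Reply = Tuple[str, List[str]]


class DnsTimeout(Exception):
    """Raised by the hypothetical send_query when no reply arrives in time."""


def query_with_failover(
    name: str,
    servers: Sequence[str],
    send_query: Callable[[str, str, float], Reply],
) -> Optional[Reply]:
    """Try each server in list order; return the first usable reply, or None."""
    for server in servers:
        # Each attempt gets its own randomised timeout of 500ms-1000ms.
        timeout_s = random.uniform(0.5, 1.0)
        try:
            rcode, answers = send_query(server, name, timeout_s)
        except DnsTimeout:
            continue  # timed out: resend to the next server in the list
        if rcode == SERVFAIL:
            continue  # SERVFAIL: resend to the next server in the list
        return rcode, answers
    # Every server returned SERVFAIL or timed out, so the query fails.
    return None
```

Note that only SERVFAIL and timeouts trigger failover in this sketch; any other reply from a server is returned to the caller as-is.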

Clearwater caches DNS responses for several minutes (to reduce the load on DNS servers, and the latency introduced by querying them). If a cache entry is stale,
but the DNS servers return SERVFAIL or time out when Clearwater attempts to refresh it, Clearwater will continue to use the cached value until the DNS servers
become responsive again. This minimises the impact of a DNS server failure on calls.
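
The stale-cache behaviour described above might look roughly like the following Python sketch. It is not Clearwater's implementation: `StaleTolerantCache` and the `resolve` callback are hypothetical, and the 300-second lifetime is an assumed stand-in for "several minutes". `resolve` is expected to return `None` when all servers return SERVFAIL or time out.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class StaleTolerantCache:
    """Cache that keeps serving an expired entry while refreshes fail."""

    def __init__(
        self,
        resolve: Callable[[str], Optional[List[str]]],
        ttl_s: float = 300.0,  # assumed value; the docs only say "several minutes"
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve = resolve
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: Dict[str, Tuple[List[str], float]] = {}

    def lookup(self, name: str) -> Optional[List[str]]:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: no DNS query needed
        fresh = self._resolve(name)
        if fresh is not None:
            self._entries[name] = (fresh, now + self._ttl_s)
            return fresh
        # Refresh failed: fall back to the stale value if one exists.
        return entry[0] if entry is not None else None
```

Because the stale entry is kept rather than evicted, the next lookup retries the refresh, so the cache recovers as soon as a DNS server responds again.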

## Requirements

### DNS Server
13 changes: 10 additions & 3 deletions docs/ENUM.md
@@ -400,9 +400,16 @@ you can instead change the suffix, e.g. to .e164.arpa.ngv.example.com, by

ENUM and Sprout
---------------
To enable ENUM lookups on Sprout, edit `/etc/clearwater/user_settings` and add the following configuration to use either an ENUM server (recommended) or an ENUM file
To enable ENUM lookups on Sprout, edit `/etc/clearwater/config` and add the following configuration to use either an ENUM server (recommended) or an ENUM file:

enum_server=<hostname of enum server>
enum_server=<IP addresses of enum servers>
enum_file=<location of enum file>

If you use the ENUM file, enter the ENUM rules in the JSON format (shown above).
If you use the ENUM file, enter the ENUM rules in the JSON format (shown above).

It's possible to configure Sprout with secondary and tertiary ENUM servers, by providing a comma-separated list (e.g. `enum_server=1.2.3.4,10.0.0.1,192.168.1.1`). If this is done:

* Sprout will always query the first server in the list first
* If this returns SERVFAIL or times out (which happens after a randomised 500ms-1000ms period), Sprout will resend the query to the second server
* If this returns SERVFAIL or times out, Sprout will resend the query to the third server
* If all servers return SERVFAIL or time out, the ENUM query will fail
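
As a rough illustration of how the two settings interact, the Python sketch below parses a comma-separated `enum_server` value and falls back to `enum_file` only when no server is configured, matching the `enum_file` description in the configuration options reference. The function name `choose_enum_source` and the dictionary-based `config` argument are assumptions made for this example, not part of Sprout.

```python
from typing import Dict, List, Tuple


def choose_enum_source(config: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return ("dns", servers), ("file", [path]) or ("none", [])."""
    # enum_server is a comma-separated list; ignore blanks and stray spaces.
    servers = [s.strip() for s in config.get("enum_server", "").split(",") if s.strip()]
    if servers:
        # A configured server list takes precedence over enum_file.
        return "dns", servers
    enum_file = config.get("enum_file", "").strip()
    if enum_file:
        return "file", [enum_file]
    return "none", []
```

The returned server list preserves the configured order, which matters because the first entry is always queried first.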
