# Preparation (install *dig*)

In [None]:
!sudo apt-get install -y -qq dnsutils

# Emulating a client

In this section we will use the *dig* command for emulating the behavior of any application that operates as a *DNS client*.

The dig command takes a *domain name* and a *question type* as input, specified as parameters in the command line. It sends a DNS request to the default DNS server and prints the DNS response in a human-readable way. Keep in mind that dig is executed by the Linux machine associated with this notebook, thus it will interact only with the default name server of that machine.
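To make the "domain name plus question type" input more concrete, the following is a minimal Python sketch of how those two parameters could be packed into the bytes of a DNS request, following the message layout of RFC 1035. The function name `build_query`, the `QTYPE` table and the request identifier `0x1234` are illustrative assumptions; the real dig adds further options (for example EDNS) that are ignored here.

```python
import struct

# Numeric codes of a few RR types (values defined in RFC 1035).
QTYPE = {"A": 1, "NS": 2, "SOA": 6, "MX": 15}

def build_query(name: str, qtype: str = "A", query_id: int = 0x1234,
                recursion_desired: bool = True) -> bytes:
    """Return the bytes of a DNS request carrying a single question."""
    flags = 0x0100 if recursion_desired else 0x0000  # only the RD bit is used here
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels, terminated by a zero byte.
    labels = [part.encode("ascii") for part in name.rstrip(".").split(".") if part]
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    question = qname + struct.pack("!HH", QTYPE[qtype], 1)  # class IN = 1
    return header + question

request = build_query("www.units.it", "A")
print(len(request), request.hex())
```

Such a byte string is what would travel, typically over UDP port 53, to the default name server; dig does this for you and then decodes the response.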

## Finding the IP address of a server

By default, dig asks for RRs of type A. Execute the code cell below and you will obtain the IP address of the web site of our University.


In [None]:
!dig www.units.it

The following cases are more complex. Execute the code cell below and try to understand what the output means.

In [None]:
!dig www.realmadrid.es
!dig bartoli.inginf.units.it

## Finding the mail server of a mail domain

If you want to locate mail servers then you have to specify explicitly that you are interested in MX RRs.

The order of the command-line arguments is irrelevant: *dig MX gmail.com* and *dig gmail.com MX* are equivalent.

Note that the additional section (i.e., the A RRs of the mail servers) is not present in these DNS responses. Understanding when this section must or may be present is difficult. We will neglect this issue and just assume that it may or may not be there.


In [None]:
!dig MX gmail.com

The mail servers of these two mail domains are quite different:

In [None]:
!dig units.it MX
!dig studenti.units.it MX

## Keep in mind! (1)

An application does ***not*** search the required RRs in the DNS tree.

It just sends a DNS request to its default name server and obtains the required information in the corresponding DNS response.

It is up to the name server to find which of the hundreds of thousands of name servers distributed around the world contains the required information. Such a search usually requires interacting with several of those name servers. Applications are *not* involved in this search and need *not* know anything about it: they just send a question and obtain the required answer.

## Keep in mind! (2)

Applications only need to ask for A RRs and MX RRs.

As you have seen above, applications do not need to ask for RRs of any other type. Why should they ask for an NS or an SOA RR?

## Practice yourself

A few interesting questions:

*   Web server of Netflix?
*   Web server of Facebook?
*   Web server of POTUS (the President of the United States: whitehouse.gov)?
*   Web server of Senato della Repubblica?

Find the corresponding IP addresses.


In [None]:
!dig whitehouse.gov A

Consider the "Reti di calcolatori" mailing list. The mailing list address is 20212022-reti-di-calcolatori-i@googlegroups.com (the part before @ changes every year):

*   Mail server names?
*   IP addresses?



# Understanding what happened

## Default name server

If no DNS server is specified in the command line, then dig sends its request to the *default name server*.

In the Linux machine associated with this notebook, the default name server is a name server that does not manage any zone and acts as follows:
1.   If its local DNS cache contains RRs for answering the DNS request, then it sends back a DNS response based on the locally available information. Otherwise,
2.   It performs an *iterative navigation* in the DNS tree for obtaining the RRs required for answering the DNS request. Having obtained those RRs, it caches them locally (if allowed by their TTL values) and then sends back a DNS response.

A name server that behaves as described above is called a *DNS resolver*.
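The two cases above (answer from the cache, otherwise navigate and then cache) can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class `RRCache`, the function `resolve` and the stand-in `fake_navigation` are hypothetical names, the address 192.0.2.10 and the TTL of 300 seconds are invented, and a real resolver handles many more details.

```python
import time

class RRCache:
    """Toy cache: maps (name, type) to (value, expiry time)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, name, rtype, value, ttl):
        if ttl > 0:  # a TTL of 0 means "do not cache"
            self._entries[(name, rtype)] = (value, self._clock() + ttl)

    def get(self, name, rtype):
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # the TTL has elapsed: forget the RR
            del self._entries[(name, rtype)]
            return None
        return value

def resolve(cache, name, rtype, navigate):
    """Case 1: answer from the cache. Case 2: navigate, cache, then answer."""
    cached = cache.get(name, rtype)
    if cached is not None:
        return cached, "cache"
    value, ttl = navigate(name, rtype)
    cache.put(name, rtype, value, ttl)
    return value, "navigation"

def fake_navigation(name, rtype):
    # Stand-in for the iterative navigation; address and TTL are invented.
    return "192.0.2.10", 300

cache = RRCache()
print(resolve(cache, "www.example.org", "A", fake_navigation))
print(resolve(cache, "www.example.org", "A", fake_navigation))
```

The second call is served from the cache, so `fake_navigation` is invoked only once until the TTL expires.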

The name server of an organization usually manages a DNS zone (containing the names of the organization) and acts as a DNS resolver.

### Colab implementation detail

The actual implementation of the local DNS resolver in Colab notebooks is a bit tricky:

*  The IP specification reserves a whole block of addresses that a machine can use for sending packets to itself: all those in 127.0.0.0/8, i.e., from 127.0.0.1 to 127.255.255.254. How these addresses are actually used depends on the specific configuration of the specific operating system; 127.0.0.1 is the one conventionally used.
* The Linux machine associated with this notebook is configured with a process running at the IP address 127.0.0.11, listening on port 53 and acting as a DNS resolver.
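As a quick check of the two bullets above, Python's standard `ipaddress` module can tell whether an address is one of those that a machine uses for talking to itself. The helper name `is_loopback` is an illustrative assumption; 198.41.0.4 is the address of a name server of the root zone, included only for contrast.

```python
import ipaddress

def is_loopback(text: str) -> bool:
    """True if the address is reserved for a machine talking to itself."""
    return ipaddress.ip_address(text).is_loopback

for text in ("127.0.0.1", "127.0.0.11", "198.41.0.4"):
    print(text, "loopback" if is_loopback(text) else "not loopback")
```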



## Recursive resolution vs iterative resolution

The examples above were structured as follows:
1.   dig sends a *recursive* DNS request to the local DNS resolver: "*if you do not have the required RR locally available, please find it for me*".
2.   The DNS resolver finds the required RRs by searching for them in the DNS tree. Such a search consists of a navigation in the DNS tree starting from a name server of the root zone. Such a procedure is called *iterative* resolution (explained in detail below).

Thus, dig acted exactly as any application, whereas the local DNS resolver acted exactly as the name server of any organization.

We emphasize again that applications need not know anything about iterative resolution, zones, replication, glue information and so on: they *always* send *one* single request to their default name server; that name server is always the same and it will always find the required information on behalf of the application.


# Emulating a name server

In this section we will use *dig* for emulating the behavior of the name server of an organization that has to obtain an RR not locally available.

That is, we will execute dig with options that will let it behave (almost) identically to a name server that executes *iterative resolution*. Iterative resolution consists in navigating in the DNS tree, starting from a name server of the root zone, until finding the required RR (or a name server that can tell with certainty that the required RR does not exist).

The only difference between the iterative resolution performed by dig and the one performed by a name server is at the first step, as clarified below.


## Iterative resolution

We need to use the *+trace* option. The execution of *dig +trace www.units.it* occurs as follows:

1. Send a DNS request "www.units.it A" with "recursion not desired" to the *local DNS resolver*; this name server will respond with the NS RRs of the name servers of the root zone; it will also respond with the A RR of one of them.
2. Send a DNS request "www.units.it A" with "recursion not desired" to a *name server of the root zone*, at the IP address obtained above; this name server will respond with the NS/A pairs of the name servers of a TLD zone. Choose one of them.
3. Send a DNS request "www.units.it A" with "recursion not desired" to a *name server of a TLD zone*, at the IP address obtained above; this name server will respond with the NS/A pairs of the name servers of a second-level zone. Choose one of them.
4. ...and so on until receiving either the required RR or a DNS response stating "the RR you are searching for does not exist" (NXDOMAIN).

The only difference between the iterative resolution performed by dig and the one performed by a name server is at the first step:
*   A name server would start from step 2; it would obtain the IP address of a name server of the root zone from its local cache; every name server has the NS/A RRs of all the name servers of the root zone locally available, as part of its configuration.
*   dig instead executes step 1 for obtaining the IP address of a name server of the root zone from its default name server (configured as in the previous bullet).

Note that the *same* DNS request is sent at *every* step. In particular, *no* DNS requests of type NS are ever sent. NS RRs are contained in DNS responses, not in DNS requests.
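The navigation described above can be emulated offline with a toy model. The sketch below is only an illustration under loud assumptions: the table `SERVERS`, the functions `ask` and `iterative_resolution`, and every address in it are invented (192.0.2.0/24 is a block reserved for documentation), so they are not the real name servers of these zones. What the sketch does preserve is the key point: the same question is sent to one name server after another.

```python
# Toy "Internet": each name server IP maps to what that server would answer.
# All addresses are invented for illustration; they are NOT the real servers.
SERVERS = {
    "192.0.2.1": {"referrals": {"it.": "192.0.2.2"}, "records": {}},        # root zone
    "192.0.2.2": {"referrals": {"units.it.": "192.0.2.3"}, "records": {}},  # zone it
    "192.0.2.3": {"referrals": {},                                          # zone units.it
                  "records": {("www.units.it.", "A"): "192.0.2.80"}},
}
ROOT_SERVER = "192.0.2.1"

def ask(server_ip, name, rtype):
    """Emulate one request without recursion: answer, referral or NXDOMAIN."""
    server = SERVERS[server_ip]
    if (name, rtype) in server["records"]:
        return "answer", server["records"][(name, rtype)]
    for child_zone, child_ip in server["referrals"].items():
        if name == child_zone or name.endswith("." + child_zone):
            return "referral", child_ip  # "ask this name server instead"
    return "nxdomain", None

def iterative_resolution(name, rtype, start=ROOT_SERVER):
    """Send the SAME question to one server after another, following referrals."""
    path, server_ip = [], start
    while True:
        path.append(server_ip)
        kind, value = ask(server_ip, name, rtype)
        if kind != "referral":
            return kind, value, path
        server_ip = value

print(iterative_resolution("www.units.it.", "A"))
```

The returned path lists the name servers contacted, in order: one of the root zone, one of the TLD zone, one of the second-level zone.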

Execute the following command and analyze its output by following the text below.

Note: the output of dig +trace *without* the +additional option may be confusing. What happens behind the scenes is the same irrespective of whether +additional is specified or not, but the output is harder to follow without it: do not omit +additional (until you pass this exam).



In [None]:
!dig +trace +additional www.units.it

The command output does not show any DNS request and summarizes every DNS response received by dig. Let us analyze them in detail.

1.   The first answer was sent by 127.0.0.11, port 53 (local DNS resolver) and contains the NS RRs of the root zone (unfortunately, the A RRs contained in this answer are *not* shown). One of those name servers is chosen and the next DNS request will be sent to that name server.
2.   The second answer was sent by IP-1#53(NAME-1), which is the name server of the root zone selected at the previous step (as just pointed out, the value IP-1 was contained in an A RR in the response of the previous step that unfortunately is not shown). This second answer contains the NS/A RRs of the name servers of zone .it. One of those name servers is chosen and the next DNS request will be sent to that name server.
3. The third answer was sent by IP-2#53(NAME-2), which is the name server of zone .it selected at the previous step. This third answer contains the NS/A RRs of the name servers of zone units.it. One of those name servers is chosen and the next DNS request will be sent to that name server.
4. The fourth answer was sent by IP-3#53(NAME-3), which is the name server of zone units.it selected at the previous step. This fourth answer contains the required RR.

Keep in mind that, as pointed out above, a real name server does not execute step 1 and starts from step 2 directly because it has the NS/A RRs of the name servers of the root zone locally available.
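When reading the trace, the lines of the form ";; Received ... bytes from IP#53(NAME) in ... ms" tell you who sent each answer. If you capture the output of dig as text, a small Python helper can list those senders in order. This is a sketch under assumptions: the regular expression reflects the footer format as dig usually prints it, the names `RECEIVED` and `responders` are hypothetical, and the sample text (byte counts, timings, 192.0.2.2 and NAME-2) is invented for illustration.

```python
import re

# Footer that dig prints after each response, for example:
# ";; Received 239 bytes from 198.41.0.4#53(a.root-servers.net) in 23 ms"
RECEIVED = re.compile(
    r"^;; Received (?P<size>\d+) bytes from "
    r"(?P<ip>[^#\s]+)#(?P<port>\d+)\((?P<name>[^)]*)\) in (?P<ms>\d+) ms"
)

def responders(trace_output: str):
    """Return the (ip, name) of the sender of each response, in order."""
    found = []
    for line in trace_output.splitlines():
        match = RECEIVED.match(line.strip())
        if match:
            found.append((match["ip"], match["name"]))
    return found

# Invented sample, shaped like the footers of a trace.
SAMPLE = """\
;; Received 239 bytes from 127.0.0.11#53(127.0.0.11) in 1 ms
;; Received 512 bytes from 198.41.0.4#53(a.root-servers.net) in 20 ms
;; Received 300 bytes from 192.0.2.2#53(NAME-2) in 30 ms
"""
for ip, name in responders(SAMPLE):
    print(ip, name)
```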

## Keep in mind!

Iterative resolution is *not* performed by *applications*!

Iterative resolution is performed *only* by *DNS resolvers* (in particular, by name servers of the organizations, i.e., the default name servers contacted by applications).

Applications always require *recursive* resolution: send a question and obtain an answer.

DNS resolvers perform instead *iterative* resolution: given a question, they navigate along the DNS tree for finding the answer. To do that, they usually need to send several DNS requests to different DNS servers.

## Practice yourself



Execute by yourself all the steps required for *iterative resolution* of *A www.realmadrid.es* (or of another A/MX RR of your choice). Add a set of code cells at the end of this notebook and insert in those cells the necessary invocations of dig.

To do that, you need to be able to:

1.   Send DNS requests to a name server of your choice (rather than to the default name server, i.e., the local DNS resolver); and,
2.   Specify that you do not want recursion (because by default dig sends DNS requests with the "recursion desired" flag).

To specify a name server you use the *@IP-address* option in the dig command line. To specify that you do not want recursion you use *+norecurse* in the dig command line.

In principle +norecurse should not be necessary because name servers of other organizations will refuse to find RRs that they do not have. If you specify it explicitly, however, you will not risk being confused by some non-default behavior by some name server.
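Since every step of your manual navigation uses the same command with a different name server, you may find it convenient to generate the command lines with a small Python helper. The function name `dig_step` is a hypothetical choice; the options are exactly those described above, and 198.41.0.4 is the address of a name server of the root zone that this notebook also uses.

```python
import shlex

def dig_step(server_ip: str, name: str, rtype: str = "A") -> str:
    """Return the dig command line for one request without recursion."""
    args = ["dig", f"@{server_ip}", "+norecurse", "+additional", name, rtype]
    return shlex.join(args)

# Command for the first step of the navigation (paste it in a code cell).
print("!" + dig_step("198.41.0.4", "www.realmadrid.es"))
```

For each following step, replace the IP address with one taken from the response to the previous command.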

In order to start your navigation from a name server of the root zone (i.e., step 2 of the above examples), you need to know the A RRs of the name servers of the root zone. The following command gives you that information: it sends a DNS request of type NS to the name server 198.41.0.4, which is a name server of the root zone; +additional tells dig to show the additional information in the response, i.e., the matching A RRs.




In [None]:
!dig @198.41.0.4 +additional . NS

# Some more details

## A note about DNS caching

The +trace option does *not* take caching into account: it *always* shows the *full* path along the DNS tree that has to be followed for obtaining a specified RR.

In practice, a DNS resolver may skip one or more steps of this path depending on the content of its DNS cache (and maybe of those of the name servers that it contacts).
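One way to picture how the cache shortens the path: the resolver can start from the deepest zone whose name servers it already knows, rather than from the root. The Python sketch below illustrates this idea only; the function name `closest_cached_zone` and the representation of the cache as a set of zone names are assumptions made for the example.

```python
def closest_cached_zone(name: str, cached_zones: set) -> str:
    """Return the longest suffix of `name` found in the cache ('.' is the root)."""
    labels = name.rstrip(".").split(".")
    for start in range(len(labels)):
        candidate = ".".join(labels[start:]) + "."
        if candidate in cached_zones:
            return candidate  # deepest known zone: start the navigation here
    return "."  # nothing useful in the cache: start from the root zone

print(closest_cached_zone("www.units.it", set()))
print(closest_cached_zone("www.units.it", {"it."}))
print(closest_cached_zone("www.units.it", {"it.", "units.it."}))
```

With the name servers of units.it in the cache, the requests to the root zone and to zone it are skipped.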

## Recursive resolution vs iterative resolution

By default, dig sends DNS requests with the flag "*recursion desired*" set. This flag means "*if you do not have the required RR, please navigate in the DNS tree and find it for me*".

When dig sends a DNS request to the local DNS resolver, the DNS resolver analyzes the IP address that sent the request and then decides, based on its configuration, to indeed execute an iterative navigation in the DNS tree. The DNS response will thus contain the flags "*recursion desired, recursion available*".

The DNS resolver will instead send DNS requests (to external name servers) with the flag "*recursion not desired*". This means "*if you do not have the required RR, please give me the RRs for continuing my navigation in the DNS tree*". The external name server will respond either with the required RR or with NS/A pairs that allow continuing the iterative navigation.

Requests sent by the DNS resolver might also have the flag "*recursion desired*" set. In that case the DNS responses will have the flags "*recursion desired, recursion not available*".
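The flags discussed above are single bits of a 16-bit field in the header of every DNS message (layout defined in RFC 1035). The following Python sketch decodes the three bits that matter here; the function name `describe_flags` is a hypothetical choice and the remaining bits of the field are ignored.

```python
def describe_flags(flags: int) -> dict:
    """Decode three bits of the 16-bit flags field of a DNS header."""
    return {
        "is_response": bool(flags & 0x8000),          # QR bit
        "recursion_desired": bool(flags & 0x0100),    # RD bit
        "recursion_available": bool(flags & 0x0080),  # RA bit
    }

print(describe_flags(0x0100))  # request with "recursion desired"
print(describe_flags(0x8180))  # response: recursion desired, recursion available
print(describe_flags(0x8100))  # response: recursion desired, recursion not available
```

In the output of dig these bits appear in the line that starts with "flags:", abbreviated as qr, rd and ra.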


## Name servers specified by name

You can tell dig to send the DNS request:

*   To the local DNS resolver (default)
*   To a specified name server (@IP-address)
*   To a specified name server (@name). In this case, obviously, dig will first have to translate that name to an IP-address. The corresponding DNS traffic will *not* be shown. See the example below: the name server is specified by name but the DNS traffic for translating that name to an IP address is not shown.

In [None]:
!dig @ns1.units.it MX units.it

Specifying a name server by name can be useful in some cases. But for a student that may lead to a disaster: one could forget what actually happens behind the scenes.

For example, in the command below, how does dig find out the IP address of the name m.dns.it? That is complex (iterative navigation and so on). For the exams, you must prove that you really understand all the necessary steps.

In [None]:
!dig @m.dns.it +authority MX units.it

## An excellent exercise for you

An excellent way of checking your understanding of the DNS implementation is this:
*   Choose the name server of a certain zone.
*   Think of a question for that name server.
*   Ask yourself how that name server should respond: does it know the answer? If not, why? And what should it send back?
*   Send the question to that name server with +norecurse and see whether the answer is the one you expected.

For example:
*   If you send question "bartoli.inginf.units.it NS" to a name server of zone it, what do you expect to receive?
*   And the same question to a name server of zone units.it?
*   And the same question to a name server of zone uk?
*   Question "gov.uk A" to a name server of zone uk?
*   ...and of the root zone?
*   ...and of zone gov.uk?
*   ...and of zone it?
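A useful first step when reasoning about each of these questions is to check whether the name in the question lies below the zone of the name server you are asking. The Python helper below does only that check; the name `in_zone_subtree` is a hypothetical choice, and the function deliberately does not try to predict the full response, which is the part you should work out yourself.

```python
def in_zone_subtree(name: str, zone: str) -> bool:
    """True if `name` is the zone name itself or lies below it in the DNS tree."""
    name = name.rstrip(".").lower()
    zone = zone.rstrip(".").lower()
    if zone == "":
        return True  # every name lies below the root zone
    return name == zone or name.endswith("." + zone)

for zone in ("it", "units.it", "uk"):
    print("bartoli.inginf.units.it", zone, in_zone_subtree("bartoli.inginf.units.it", zone))
for zone in ("uk", ".", "gov.uk", "it"):
    print("gov.uk", zone, in_zone_subtree("gov.uk", zone))
```

When the check is false, the name server has no reason to know anything about the name; when it is true, it should either know the answer or know which name servers to point you to.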

If you want to do exercises of this kind, always set these options: +norecurse +additional +authority.

The reasons are somewhat tricky (and outlined below). Just set those options.

In this kind of exercises you might want to specify name servers by name rather than by IP address, as it is easier. But make sure you understand what you are doing (see previous section).

*   A DNS response is composed of 4 sections: question, answer, authority section, additional section.
*    The rules that dig follows for displaying the content of these sections are very tricky. Sometimes it shows them by default, sometimes it does not; sometimes it shows them as a separate portion of the DNS response, sometimes it mixes several portions together.
*    The *authority section* contains NS RRs for continuing the navigation.
*    The *additional section* contains information that is not strictly associated with the query but that the name server believes could be useful. Determining in advance what will be there is tricky as the rules are complex.
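How many RRs each section contains is written in the 12-byte header that opens every DNS message (RFC 1035). The Python sketch below reads those four counters; the function name `section_counts` is hypothetical and the sample header is hand-built for illustration (a response with 1 question, no answer, 2 RRs in the authority section and 2 in the additional section, as in a typical referral).

```python
import struct

def section_counts(message: bytes) -> dict:
    """Read the four counters stored in the 12-byte header of a DNS message."""
    _id, _flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {"question": qd, "answer": an, "authority": ns, "additional": ar}

# Hand-built header: ID 0x1234, flags 0x8000 (response), counters 1, 0, 2, 2.
sample_header = struct.pack("!HHHHHH", 0x1234, 0x8000, 1, 0, 2, 2)
print(section_counts(sample_header))
```

dig reports the same counters in its output, in the line that contains "QUERY:", "ANSWER:", "AUTHORITY:" and "ADDITIONAL:".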







In [None]:
!dig uk @198.41.0.4 +norecurse +additional +authority