## Domain Name System (DNS)

### Overview
* DNS is an Internet naming service that maps human-friendly domain names to machine-readable IP addresses
* DNS is transparent to users
* How it works in general
  + user enters a domain name in browser
  + browser translates domain name to IP address or an IP address list by asking DNS infrastructure
  + user's request is forwarded to destination web server(s) by browser
  + browser may cache some frequently used mappings
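
As a rough illustration of the lookup step above, the sketch below asks the operating system's resolver for the addresses behind a name using Python's standard `socket.getaddrinfo`. The helper name `resolve_addresses` and the injectable `lookup` parameter are assumptions made for this example (the parameter only exists so the function can be tried without network access); this is not how any particular browser implements it.

```python
import socket


def resolve_addresses(hostname, lookup=socket.getaddrinfo):
    """Return the unique IP addresses for hostname, in the order received.

    `lookup` defaults to the OS resolver via socket.getaddrinfo. It is a
    parameter only so a stub can be passed in (an assumption of this sketch).
    """
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    results = lookup(hostname, None)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        ip = sockaddr[0]
        if ip not in addresses:  # the same IP can appear once per socket type
            addresses.append(ip)
    return addresses
```

On a networked machine, `resolve_addresses("www.google.com")` returns one or more addresses; the exact values depend on the resolver and on where the query is made.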
  
### Terminology
* Name servers
  + DNS is not a single server. It is a complete infrastructure with numerous servers.
  + DNS servers that respond to users' queries are called name servers
* Resource Records
  + The DNS database stores domain name to IP address mappings in the form of Resource Records (RRs). An RR is the smallest unit of information that users request from the name servers. There are different types of RRs
    + A record provides the hostname to IP address mapping
      + example: A, relay1.main.educative.io, 104.18.2.109
    + NS (name server) record specifies which server is authoritative for a domain name (domain name to hostname mapping)
      + example: NS, example.com, ns1.example.com (the server ns1.example.com is authoritative for domain example.com)
    + CNAME (canonical name) record provides the alias to canonical hostname mapping
      + the canonical name is the real name of a host; an alias is a nickname that points to it
        + for example, a CNAME record that maps blog.example.com to example.com lets you use blog.example.com as an alias for example.com
      + example: CNAME, educative.io, server1.primary.educative.io (the alias educative.io maps to the canonical hostname server1.primary.educative.io)
    + MX record specifies the mail server responsible for accepting email messages on behalf of a domain name
      + example: MX, mail.educative.io, mailserver1.backup.educative.io
* Caching
  + DNS uses caching at different layers to reduce request latency for the user and to reduce the burden on DNS infrastructure
* Hierarchy
  + DNS name servers are organized in a hierarchy
  + The hierarchical structure lets DNS scale as its size and query load keep growing
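
To make the record types above concrete, here is a minimal sketch that stores resource records in a list and resolves a name by following CNAME aliases until an A record is found. The `ResourceRecord` class, the `find` and `resolve_a` helpers, the default TTL, and the A record for example.com (192.0.2.10, an address reserved for documentation) are all hypothetical and exist only for illustration; real name servers store and match records in more elaborate ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    rtype: str      # "A", "NS", "CNAME", or "MX"
    name: str       # the name being asked about
    value: str      # what the name maps to
    ttl: int = 300  # seconds a cache may keep the record (assumed default)


# Records from the examples above, plus one hypothetical A record so that
# the blog.example.com alias has something to resolve to.
RECORDS = [
    ResourceRecord("A", "relay1.main.educative.io", "104.18.2.109"),
    ResourceRecord("NS", "example.com", "ns1.example.com"),
    ResourceRecord("CNAME", "blog.example.com", "example.com"),
    ResourceRecord("A", "example.com", "192.0.2.10"),  # hypothetical
    ResourceRecord("MX", "mail.educative.io", "mailserver1.backup.educative.io"),
]


def find(records, rtype, name):
    """Return the values of all records of the given type for name."""
    return [r.value for r in records if r.rtype == rtype and r.name == name]


def resolve_a(records, name, max_hops=8):
    """Resolve name to IP addresses, following CNAME aliases when needed."""
    for _ in range(max_hops):  # bound the chain so alias loops terminate
        addresses = find(records, "A", name)
        if addresses:
            return addresses
        aliases = find(records, "CNAME", name)
        if not aliases:
            return []
        name = aliases[0]  # continue the lookup with the canonical name
    return []
```

With this data, `resolve_a(RECORDS, "blog.example.com")` first finds the CNAME pointing at example.com and then returns that name's A record.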
  
### DNS Hierarchy
* DNS is an infrastructure with name servers at different levels of a hierarchy. There are 4 types of servers in the DNS hierarchy
  + DNS resolver
    + initiates the query sequence and forwards requests to other DNS name servers
    + usually sits within the premises of the user's network
    + can also answer users' DNS queries from its cache; also called the local or default server
  + root-level name servers
    + receive requests from local servers. Root name servers keep track of the name servers for each top-level domain, such as .edu, .com, .us
    + when a user requests the IP address of educative.io, root-level name servers return a list of top-level domain (TLD) servers that are responsible for the .io domain
  + Top-level domain name servers
    + hold the IP addresses of authoritative name servers
    + the querying party gets a list of IP addresses that belong to the authoritative servers of the organization
  + Authoritative name servers
    + organization's DNS name servers that provide IP addresses of the web or application servers
* Root name servers return the list of IP addresses of TLD servers, which return the IP addresses of authoritative name servers, which return the IP addresses of web or application servers
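
The chain of referrals described above can be sketched with three small lookup tables, one per level. Every server name and address below is hypothetical (the IP comes from a range reserved for documentation), and the function `referral_chain` is an illustrative helper, not part of any DNS library.

```python
# Toy delegation data, one table per level of the hierarchy.
ROOT = {"io": ["tld-io.example"]}  # TLD -> TLD name servers
TLD_SERVERS = {"tld-io.example": {"educative.io": ["ns1.educative.example"]}}
AUTHORITATIVE = {"ns1.educative.example": {"educative.io": ["192.0.2.10"]}}


def referral_chain(domain):
    """Follow root -> TLD -> authoritative; return (servers asked, IPs)."""
    labels = domain.split(".")
    tld = labels[-1]               # rightmost label, e.g. "io"
    zone = ".".join(labels[-2:])   # registered domain, e.g. "educative.io"

    asked = ["root"]
    tld_servers = ROOT[tld]        # root answers with the TLD servers

    asked.append(tld_servers[0])
    # The TLD server answers with the organization's authoritative servers.
    auth_servers = TLD_SERVERS[tld_servers[0]][zone]

    asked.append(auth_servers[0])
    # The authoritative server answers with the web/application server IPs.
    ips = AUTHORITATIVE[auth_servers[0]][domain]
    return asked, ips
```

Calling `referral_chain("educative.io")` returns the three servers consulted, in order, together with the final address list. A name missing from any table raises `KeyError`, which keeps the sketch short; a real resolver would return an error response instead.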

### Resolving DNS Names
* DNS names are resolved from right to left
* There are two ways to resolve DNS names, iterative and recursive
  + Iterative: 
    + client sends request to local server (resolver)
    + local server sends request to root server, which returns the IP of a TLD server
    + based on the IP returned, local server sends request to TLD server, which returns the IP of an authoritative server
    + based on the IP returned, local server sends request to authoritative server, which returns the IP of the web server
    + local server returns the IP to the client, which then sends its requests to the web server
  + Recursive:
    + client sends request to ISP/local server, which sends request to root name server
    + root name server sends request to TLD name server, which sends request to authoritative name server
    + authoritative server sends the IP address of web server back to TLD name server as the base case of recursive call
    + TLD name server sends the IP address back to root name server
    + root name server sends the IP address to local server, which sends the IP address back to client
* Typically, iterative queries are preferred to reduce query load on DNS infrastructure
* Third-party public DNS resolvers offered by Google, Cloudflare, and OpenDNS may provide quicker responses
* Caching
  + temporary storage of frequently requested RRs
  + caching can be implemented in the browser, OS, local name servers within user's network or ISP's DNS resolvers
  + the sequence of a client's request is
    + from client to browser, to OS, then to local DNS resolver, then ISP and finally DNS infrastructure
    + caching can be implemented in each of these stages
  + Even if the requested record itself is not cached, caching still helps: cached IPs of TLD or authoritative servers let the resolver skip the root-level name servers
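
The difference between the two resolution styles is mainly about who sends each request. The toy simulation below records every request as a `(sender, receiver)` pair. The server labels, the `REFERRALS` and `ANSWERS` tables, and the address are hypothetical, and `querier` stands for whichever party drives the lookup.

```python
# Each server refers to the next one; the last server holds the answer.
REFERRALS = {"root": "tld", "tld": "authoritative"}
ANSWERS = {"educative.io": "192.0.2.10"}  # documentation address, not real


def iterative(domain):
    """The querier contacts every server itself, following referrals."""
    log = []
    server = "root"
    while server in REFERRALS:
        log.append(("querier", server))  # ask, receive a referral
        server = REFERRALS[server]
    log.append(("querier", server))      # authoritative server answers
    return ANSWERS[domain], log


def recursive(domain, server="root", asker="querier", log=None):
    """Each server forwards the query and passes the answer back up."""
    log = [] if log is None else log
    log.append((asker, server))
    if server not in REFERRALS:          # base case: authoritative server
        return ANSWERS[domain], log
    return recursive(domain, REFERRALS[server], server, log)
```

Both functions return the same address. In the iterative log every request is sent by the querier, while in the recursive log the root and TLD servers send requests themselves, which is the extra work on the DNS infrastructure that iterative resolution avoids.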

### Distributed System
* DNS is a distributed system
  + avoids being a single point of failure
  + low query latency by getting responses from a nearby server
  + higher degree of flexibility during maintenance and updates/upgrades, since other servers keep responding to users
  + 13 logical root name servers, named A to M, with many instances spread across the globe and managed by 12 organizations
* Scalable
  + 13 root-level servers spread throughout the world with about 1000 replicated instances
  + workload is divided between TLD and root servers
  + authoritative servers are managed by organizations themselves to make the entire system work
  + different services handle different portions of the DNS hierarchy tree, which enables scalability and manageability of the system
* Reliability
  + Three things make it reliable
  + Caching
    + the caching is done in browser, OS, local name server and ISP DNS resolvers
    + even if some DNS servers are temporarily down, cached results can be served to make DNS a reliable system
  + Server replication
    + DNS has replicated copies of each logical server spread systematically across the globe to serve user requests at low latency. The redundant servers improve the reliability of the overall system
  + Protocol
    + Many clients use UDP for DNS, which improves performance because UDP needs no connection setup
    + A DNS resolver can resend the UDP request if it does not get a reply to the previous one
    + DNS can use TCP when the message size exceeds 512 bytes
    + Some clients prefer DNS over TCP so that they can use Transport Layer Security (TLS) for privacy
* Consistency
  + DNS uses various protocols to update and transfer information among replicated servers in a hierarchy
  + DNS compromises on strong consistency to achieve high performance, because records are read far more often than they are written
  + DNS provides eventual consistency and updates records on replicated servers lazily (an update can take from 3 seconds to 3 days to propagate)
  + consistency can suffer because of caching, too
  + authoritative servers are within the organization. When a server goes down and the authoritative servers are updated, cached RRs at default/local and ISP servers may be outdated.
    + This can be mitigated by setting a TTL (time to live) on each cached record
  + To improve availability, use a short TTL (about 120s), so that users do not keep contacting the outdated server
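
The interaction between caching and TTL can be sketched with a small cache whose entries expire. The `TTLCache` class and its injectable `clock` parameter are assumptions made for this example (the clock is injectable only so expiry can be simulated without waiting); real resolvers manage their caches in more sophisticated ways.

```python
import time


class TTLCache:
    """Minimal record cache whose entries expire after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be simulated
        self._entries = {}   # name -> (ip, expiry time)

    def put(self, name, ip, ttl):
        """Cache ip for name for ttl seconds."""
        self._entries[name] = (ip, self._clock() + ttl)

    def get(self, name):
        """Return the cached IP, or None if it is missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # stale: force a fresh lookup
            return None
        return ip
```

With a TTL of 120 seconds, a record that changes at the authoritative server can be served stale for at most two minutes from this cache; a longer TTL reduces query load but widens that window.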
  
### Tools for finding IP address  
* Non-authoritative server
  + nslookup www.google.com returns a non-authoritative answer
    + this means the answer comes from second-, third-, or fourth-hand name servers configured to answer our DNS query, not from Google's authoritative name servers
    + for example, our university or office DNS resolver, ISP name server etc.
    + this can be considered a cached version of the response from Google's authoritative name servers
    + if we run this command several times, the order of IPs may change, since DNS indirectly performs load balancing
    
* dig www.google.com
  + the answer lists a TTL (300 seconds in this example), the time the record may stay cached at the DNS resolver
  + query time: 4 ms is the time it takes to get a response from the DNS resolver
  
* how to find the DNS resolver's IP?
  + OS has the IP addresses of resolvers in config files
  + DHCP provides default DNS resolver's IP
  + DNS resolvers run software, such as Berkeley Internet Name Domain (BIND), to resolve queries through the DNS infrastructure
  + InterNIC (Internet Network Information Center) maintains an updated list of the 13 root servers
  + Windows stores static IP address to domain name/hostname mappings in C:\Windows\System32\drivers\etc\hosts
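
A hosts file is plain text: each line holds an IP address followed by one or more hostnames, and `#` starts a comment. The sketch below parses such text into a dictionary. The function name `parse_hosts` and the choice to keep the first entry when a hostname appears twice are assumptions of this example.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Hostnames are case-insensitive; keep the first entry seen
            # (an assumption of this sketch).
            mapping.setdefault(name.lower(), ip)
    return mapping
```

To try it on a real file, read the file's contents into a string and pass that string to `parse_hosts`.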