## Performance vs. Scalability

* a service is __scalable__ if it results in increased __performance__ in a manner proportional to resources added.
* another way to look at it:
    - performance problem = system is slow for a single user
    - scalability problem = system is fast for a single user but slow under heavy load
    

## Latency vs Throughput

* __latency__ is the time to perform some action or produce some result
* __throughput__ is the number of such actions or results per unit of time
* generally want __maximal throughput__ with __acceptable latency__
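
As a rough illustration of the difference, the sketch below computes both numbers from a list of request timings. The function name `summarize` and the sample values are made up for this example.

```python
def summarize(durations_s, window_s):
    """Return (mean latency in seconds, throughput in requests/second).

    durations_s: how long each completed request took (hypothetical samples)
    window_s:    wall-clock length of the observation window
    """
    mean_latency = sum(durations_s) / len(durations_s)
    throughput = len(durations_s) / window_s
    return mean_latency, throughput


# 4 requests, each taking 0.5 s, all completed within a 1 s window
# (i.e. they were handled concurrently).
latency, throughput = summarize([0.5, 0.5, 0.5, 0.5], window_s=1.0)
```

Handling the same requests one after another would keep latency at 0.5 s per request but stretch the window to 2 s, halving throughput.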

## Availability vs Consistency

### CAP Theorem

* in a distributed system, you can only support two of the following guarantees:
    - __consistency__ = every read receives the most recent write or an error
        * every store, no matter the branch/location, will have the same items for sale
    - __availability__ = every request receives a response, without guarantee that it contains the most recent version of the information
        * the store is always open
    - __partition tolerance__ = the system continues to operate despite arbitrary partitioning due to network failures
        * meaning, there is an interruption of communication between nodes in the system
* networks aren't reliable, so you __must support partition tolerance__
    - the tradeoff, then, is between availability and consistency

* Consistent and Partition Tolerant (CP)
    - waiting for a response from the partitioned node might result in a timeout error
    - good if business needs atomic reads/writes 
        * atomic read/writes = only one read/write can be performed at a time
* Available and Partition Tolerant (AP)
    - responses return the most readily available version of the data on any node, which might not be the latest
    - writes might take some time to propagate when the partition is resolved
    - good choice if the business needs to allow for eventual consistency or when the system needs to continue working despite external errors
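
The CP/AP tradeoff can be sketched with a toy replica that is cut off from its peer. This is a simplified model, not a real database; the `Replica` class and its `mode` switch are hypothetical.

```python
class Replica:
    """A toy replica that holds one value and may be cut off from its peer."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical switch)
        self.value = "v1"
        self.partitioned = False  # True = cannot reach the other replica

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: refuse to answer rather than risk returning stale data.
            raise TimeoutError("cannot confirm latest value during partition")
        # AP (or no partition): answer with whatever is stored locally.
        return self.value


cp, ap = Replica("CP"), Replica("AP")
cp.partitioned = ap.partitioned = True  # assume a write of "v2" happened elsewhere

try:
    cp_result = cp.read()
except TimeoutError as exc:
    cp_result = f"error: {exc}"

ap_result = ap.read()  # possibly stale, but the request succeeds
```

The CP replica gives up availability (the caller gets an error), while the AP replica gives up consistency (the caller gets `"v1"` even though a newer value exists).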

## Consistency Patterns

* how do we synchronize multiple copies/versions of the same data so that clients have a consistent view of the data?

### Weak Consistency

* after a write, reads may or may not see it
* works well in real time use cases like video chat and multiplayer games
    - e.g. if you lose reception in the middle of a phone call for a few seconds and regain connection, you don't hear what was said during the connection loss

### Eventual Consistency

* after a write, reads will eventually see it (typically within milliseconds)
* data is replicated asynchronously
* seen in DNS and email
* works well in highly available systems

### Strong Consistency

* after a write, reads will see it
* data is replicated synchronously
* seen in file systems and RDBMSes
* works well in systems that need transactions
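
The difference between eventual and strong consistency largely comes down to when a write reaches the replicas. Below is a minimal sketch, assuming a single primary and a single replica; the `Store` class and its methods are illustrative, not taken from any real system.

```python
class Store:
    """Toy primary with one replica; names are illustrative only."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # Strong consistency: replicate before acknowledging the write.
            self.replica[key] = value
        else:
            # Eventual consistency: acknowledge now, replicate later.
            self.pending.append((key, value))

    def replicate(self):
        """Apply queued writes to the replica (the 'eventually' step)."""
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def read_from_replica(self, key):
        return self.replica.get(key)


strong = Store(synchronous=True)
strong.write("x", 1)
strong_read = strong.read_from_replica("x")   # sees the write immediately

eventual = Store(synchronous=False)
eventual.write("x", 1)
stale_read = eventual.read_from_replica("x")  # replica has not caught up yet
eventual.replicate()
fresh_read = eventual.read_from_replica("x")  # now the write is visible
```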

## Availability Patterns

* there are 2 complementary patterns to support high availability: __fail-over__ and __replication__

### Fail-over
#### Active-passive

* aka master-slave failover
* heartbeats are sent between the active server and the passive server on standby
    - if heartbeat is interrupted, the passive server takes over the active's IP address and resumes service
* length of downtime is determined by whether the passive server is already running in 'hot' standby or whether it needs to start up from 'cold' standby
* only the active server handles traffic
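
The heartbeat check can be sketched as follows. This is a simplified model: timestamps are passed in explicitly instead of read from a clock, and the `PassiveServer` class and its 3-second timeout are assumptions made for the example.

```python
class PassiveServer:
    """Toy standby that promotes itself when heartbeats stop arriving."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s      # assumed threshold, tune per system
        self.last_heartbeat = 0.0
        self.active = False

    def on_heartbeat(self, now):
        """Record the time of the latest heartbeat from the active server."""
        self.last_heartbeat = now

    def check(self, now):
        """Take over if the active server has been silent for too long."""
        if not self.active and now - self.last_heartbeat > self.timeout_s:
            self.active = True  # a real system would also claim the IP address
        return self.active


standby = PassiveServer(timeout_s=3.0)
standby.on_heartbeat(now=1.0)
before = standby.check(now=2.0)   # heartbeat seen 1 s ago: stay passive
after = standby.check(now=10.0)   # silent for 9 s: take over
```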

#### Active-active

* aka master-master failover
* both servers manage traffic and spread the load between them
* if servers are public-facing, the DNS would need to know about the public IPs of both servers
    - if they're internal-facing, app logic would need to know about both servers

#### Disadvantages: failover

* adds more hardware and additional complexity
* potential loss of data if active system fails before any newly written data can be replicated to the passive

### Replication

#### Master-slave

#### Master-master

## Availability in Numbers

* quantified by uptime (or downtime) as a percentage of time the service is available
* availability is measured in number of 9s
    - 99.99% availability = four 9s
* 99.9% availability - three 9s (downtime)
    - year = 8h 45m 57s
    - month = 43m 49.7s
    - week = 10m 4.8s
    - day = 1m 26.4s
* 99.99% availability - four 9s (downtime)
    - year = 52m 35.7s
    - month = 4m 23s
    - week = 1m 0.5s
    - day = 8.6s
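
The downtime figures follow directly from the availability percentage. Here is a small sketch of the arithmetic; the function name is made up, and the year length of 365.2425 days is an assumption chosen because it matches the yearly figures listed.

```python
def downtime_seconds(availability_pct, period_s):
    """Allowed downtime in seconds for a given availability percentage."""
    return (1 - availability_pct / 100) * period_s


DAY = 24 * 60 * 60
WEEK = 7 * DAY
YEAR = 365.2425 * DAY  # average Gregorian year (assumption)

three_nines_day = downtime_seconds(99.9, DAY)     # about 86.4 s, i.e. 1m 26.4s
three_nines_year = downtime_seconds(99.9, YEAR)   # about 31557 s, i.e. 8h 45m 57s
four_nines_day = downtime_seconds(99.99, DAY)     # about 8.6 s
```

Each extra 9 cuts the allowed downtime by a factor of ten.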

#### Availability in parallel vs sequence

* if a service consists of multiple components prone to failure, the service's overall availability depends on whether the components are in sequence or in parallel

##### In Sequence
* overall availability decreases when 2 components with availability < 100% are in sequence
* Availability (total) = Availability (Foo) * Availability (Bar)
    - if both Foo and Bar each had 99.9% availability, their total availability in sequence would be 99.8%

##### In Parallel
* overall availability increases when 2 components with availability < 100% are in parallel
* Availability (total) = 1 - (1 - Availability (Foo)) * (1 - Availability (Bar))
    - if both Foo and Bar each had 99.9% availability, their total availability in parallel would be 99.9999%
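
Both formulas can be checked with a few lines of code. The sketch below assumes component failures are independent, which the formulas themselves rely on; the function names are illustrative.

```python
def in_sequence(*availabilities):
    """All components must be up: multiply the availabilities."""
    total = 1.0
    for a in availabilities:
        total *= a
    return total


def in_parallel(*availabilities):
    """At least one component must be up: 1 minus the chance all are down."""
    all_down = 1.0
    for a in availabilities:
        all_down *= 1 - a
    return 1 - all_down


seq = in_sequence(0.999, 0.999)   # roughly 0.998, i.e. 99.8%
par = in_parallel(0.999, 0.999)   # roughly 0.999999, i.e. 99.9999%
```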

## Domain Name System

* a domain name system (DNS) translates a domain name, like www.example.com, to an IP address
* DNS is hierarchical
    - router/ISP provides info on which DNS server(s) to contact when doing a lookup
* lower level DNS servers cache domain name - IP address mappings
* DNS results can also be cached by browser/OS for a time until the TTL (time to live) expires

***
* NS record (name server) - specifies the DNS servers for your domain/subdomain
* MX record (mail exchange) - specifies the mail servers for accepting messages
* A record (address) - points a name to an IP address
* CNAME (canonical) - points a name to another name or CNAME or to an A record
    - e.g. example.com to www.example.com
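
To show how A and CNAME records interact, here is a toy resolver over an in-memory record table. The table contents, the `blog.example.com` name, and the `resolve` function are hypothetical; a real resolver queries name servers instead of a dict.

```python
# Hypothetical zone data: (name, record type) -> value.
# 192.0.2.1 is an address reserved for documentation, not a real host.
RECORDS = {
    ("www.example.com", "A"): "192.0.2.1",
    ("blog.example.com", "CNAME"): "www.example.com",
}


def resolve(name, records, max_hops=8):
    """Follow CNAME records until an A record is found; None if unresolved."""
    for _ in range(max_hops):          # hop limit guards against CNAME loops
        if (name, "A") in records:
            return records[(name, "A")]
        if (name, "CNAME") in records:
            name = records[(name, "CNAME")]   # continue with the target name
        else:
            return None
    return None


direct = resolve("www.example.com", RECORDS)
via_cname = resolve("blog.example.com", RECORDS)
missing = resolve("nope.example.com", RECORDS)
```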
***
* services such as CloudFlare and Route 53 provide managed DNS services
* DNS services can route traffic through various methods:
    - Weighted round robin:
        * prevent traffic from going to servers under maintenance
        * balance between varying cluster sizes
        * A/B testing
    - Latency-based
    - Geolocation-based
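
Weighted round robin can be sketched in a few lines. This is a deliberately simple version that assumes small integer weights and at least one non-zero weight; the server names and weights are made up.

```python
import itertools


def weighted_round_robin(weights):
    """Return an endless iterator of server names, repeated by weight.

    A weight of 0 removes a server from rotation (e.g. during maintenance).
    """
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


# Hypothetical pool: "a" is a larger cluster, "c" is under maintenance.
rotation = weighted_round_robin({"a": 3, "b": 1, "c": 0})
first_eight = [next(rotation) for _ in range(8)]
```

With these weights, "a" receives three of every four requests, "b" receives one, and "c" receives none. The same mechanism covers A/B testing by giving the experimental variant a small weight.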

### Disadvantages: DNS

* accessing a DNS server introduces a slight delay, although mitigated by caching
* DNS server management could be complex and is generally managed by governments, ISPs and large companies
* DNS services have recently come under DDoS attack, preventing users from accessing websites such as Twitter without knowing Twitter's IP address(es)

## Content Delivery Network

* a content delivery network (CDN) is a globally distributed network of proxy servers that serve content from locations closer to the user
* generally static files like HTML/CSS/JS, photos, and videos are served from the CDN but some can also serve dynamic content
* the site's DNS resolution will tell clients which server to contact
* serving content from CDNs can significantly improve performance in 2 ways:
    - users receive content from data centers closer to them
    - your servers do not have to serve requests that the CDN fulfills

### Push CDNs

* Push CDNs receive new content whenever changes occur on the server
* you take full responsibility for:
    - providing content
    - uploading directly to the CDN
    - rewriting URLs to point to the CDN
* you can configure when the content expires and when it is updated
* content is uploaded only when it is new/changed
    - this minimizes traffic but maximizes storage
* sites with a small amount of traffic or sites with content that isn't regularly updated work well with push CDNs
* content is only pushed to the CDNs once instead of being re-pulled at regular intervals

### Pull CDNs

* Pull CDNs grab new content from your server when the first user requests the content
* can leave content on your server and rewrite URLs to point to the CDN
* results in slower requests until the content is cached on the CDN
* a time-to-live (TTL) determines how long content is cached
* Pull CDNs minimize storage space on the CDN but can create redundant traffic if files expire and are pulled before they have actually changed
* sites with heavy traffic work well with pull CDNs, since traffic is spread out more evenly with only recently-requested content remaining on the CDN
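
The pull behaviour, including the TTL, can be modelled with a small cache in front of an origin. This is a sketch only: the `PullCDN` class is hypothetical, the origin is a plain dict, and timestamps are passed in explicitly rather than read from a clock.

```python
class PullCDN:
    """Toy edge cache; `origin` is a dict standing in for your server."""

    def __init__(self, origin, ttl_s):
        self.origin = origin
        self.ttl_s = ttl_s
        self.cache = {}        # path -> (content, expires_at)
        self.origin_hits = 0   # how often the origin had to serve a request

    def get(self, path, now):
        entry = self.cache.get(path)
        if entry is not None and now < entry[1]:
            return entry[0]                    # cached and not yet expired
        self.origin_hits += 1                  # slow path: pull from origin
        content = self.origin[path]
        self.cache[path] = (content, now + self.ttl_s)
        return content


origin = {"/logo.png": "v1"}
cdn = PullCDN(origin, ttl_s=300)

first = cdn.get("/logo.png", now=0)      # first request pulls from origin
origin["/logo.png"] = "v2"               # file changes on the origin
stale = cdn.get("/logo.png", now=100)    # TTL not expired: old copy served
fresh = cdn.get("/logo.png", now=301)    # TTL expired: content is re-pulled
```

The second request illustrates the staleness risk: the origin already holds `"v2"`, but the CDN keeps serving `"v1"` until the TTL runs out.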

### Disadvantages: CDN

* costs can be significant depending on traffic but should be weighed against the additional costs you would incur if not using a CDN
* content might be stale if it is updated before the TTL expires
* CDNs require changing URLs for static content to point to the CDN

## Load Balancer