## Performance vs. Scalability

* a service is __scalable__ if it results in increased __performance__ in a manner proportional to resources added.
* another way to look at it:
    - performance problem = system is slow for a single user
    - scalability problem = system is fast for a single user but slow under heavy load
    

## Latency vs Throughput

* __latency__ is the time to perform some action or produce some result
* __throughput__ is the number of such actions or results per unit of time
* generally want __maximal throughput__ with __acceptable latency__
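
The relationship between the two definitions above can be sketched with a little arithmetic. This is a minimal illustration, assuming every worker handles actions back to back; the function name `summarize` and the `workers` parameter are hypothetical.

```python
def summarize(durations_s, workers=1):
    """Return (mean latency in seconds, throughput in actions per second).

    Assumes each of the `workers` processes actions one after another.
    """
    mean_latency = sum(durations_s) / len(durations_s)
    throughput = workers / mean_latency
    return mean_latency, throughput


single = summarize([0.5, 0.5, 0.5, 0.5])               # one worker
parallel = summarize([0.5, 0.5, 0.5, 0.5], workers=4)  # four workers
```

Under these assumptions, adding workers raises throughput while the latency of each individual action stays the same.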

## Availability vs Consistency

### CAP Theorem

* in a distributed system, you can only support two of the following guarantees:
    - __consistency__ = every read receives the most recent write or an error
        * every store, no matter the branch/location, will have the same items for sale
    - __availability__ = every request receives a response, without guarantee that it contains the most recent version of the information
        * the store is always open
    - __partition tolerance__ = the system continues to operate despite arbitrary partitioning due to network failures
        * meaning, there is an interruption of communication between nodes in the system
* networks aren't reliable, so you __must support partition tolerance__
    - the tradeoff, then, is between availability and consistency

* Consistent and Partition Tolerant (CP)
    - waiting for a response from the partitioned node might result in a timeout error
    - good if business needs atomic reads/writes 
        * atomic read/writes = only one read/write can be performed at a time
* Available and Partition Tolerant (AP)
    - responses return the most readily available version of the data available on any node, which might not be the latest
    - writes might take some time to propagate when the partition is resolved
    - good choice if the business needs to allow for eventual consistency or when the system needs to continue working despite external errors
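
The CP and AP behaviors described above can be sketched with a toy replica. This is an illustrative model only, not a real replication protocol; the class names and the `"CP"`/`"AP"` mode strings are assumptions made for the example.

```python
class PartitionError(Exception):
    """Raised when a CP replica cannot confirm it holds the latest write."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical labels)
        self.value = None
        self.partitioned = False  # True while cut off from the other nodes

    def replicate(self, value):
        # Writes from peers only arrive while the network is healthy.
        if not self.partitioned:
            self.value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: return an error rather than a stale value.
            raise PartitionError("cannot reach peers")
        # Availability first: answer with whatever is stored locally.
        return self.value


cp, ap = Replica("CP"), Replica("AP")
for replica in (cp, ap):
    replica.replicate("v1")
    replica.partitioned = True
    replica.replicate("v2")  # never arrives: the replica is unreachable
```

After the partition, `ap.read()` still answers with the older `"v1"`, while `cp.read()` raises `PartitionError`.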

## Consistency Patterns

* how do we synchronize multiple copies/versions of the same data so that clients have a consistent view of the data?

### Weak Consistency

* after a write, reads may or may not see it
* works well in real time use cases like video chat and multiplayer games
    - e.g. if you lose reception in the middle of a phone call for a few seconds and regain connection, you don't hear what was said during the connection loss

### Eventual Consistency

* after a write, reads will eventually see it (typically within milliseconds)
* data is replicated asynchronously
* seen in DNS and email
* works well in highly available systems

### Strong Consistency

* after a write, reads will see it
* data is replicated synchronously
* seen in file systems and RDBMSes
* works well in systems that need transactions

## Availability Patterns

* there are 2 complementary patterns to support high availability: __fail-over__ and __replication__

### Fail-over
#### Active-passive

* aka master-slave failover
* heartbeats sent between active and passive server on standby
    - if heartbeat is interrupted, the passive server takes over the active's IP address and resumes service
* length of downtime is determined by whether the passive server is already running in 'hot' standby or whether it needs to start up from 'cold' standby
* only the active server handles traffic
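
The heartbeat check described above could be sketched as follows. This is a simplified model with time passed in explicitly; the class name, the `timeout_s` parameter, and the server labels are hypothetical, and taking over the IP address is out of scope.

```python
class FailoverMonitor:
    """Promote the standby when the active server's heartbeat goes quiet."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_heartbeat = 0.0
        self.serving = "active"

    def heartbeat(self, now):
        # Called each time the active server reports in.
        self.last_heartbeat = now

    def check(self, now):
        # If the heartbeat is overdue, the passive server takes over.
        if self.serving == "active" and now - self.last_heartbeat > self.timeout_s:
            self.serving = "passive"
        return self.serving


monitor = FailoverMonitor(timeout_s=3.0)
monitor.heartbeat(now=10.0)
on_time = monitor.check(now=12.0)  # heartbeat still fresh
overdue = monitor.check(now=14.0)  # more than 3 s of silence
```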

#### Active-active

* aka master-master failover
* both servers manage traffic and spread the load between them
* if servers are public-facing, the DNS would need to know about the public IPs of both servers
    - if they're internal-facing, app logic would need to know about both servers

#### Disadvantages: failover

* adds more hardware and additional complexity
* potential loss of data if active system fails before any newly written data can be replicated to the passive

### Replication

#### Master-slave

#### Master-master

## Availability in Numbers

* quantified by uptime (or downtime) as a percentage of time the service is available
* availability is measured in number of 9s
    - 99.99% availability = four 9s
* 99.9% availability - three 9s (downtime)
    - year = 8h 45min 57s
    - month = 43m 49.7s
    - week = 10m 4.8s
    - day = 1m 26.4s
* 99.99% availability - four 9s (downtime)
    - year = 52min 35.7s
    - month = 4m 23s
    - week = 1m 5s
    - day = 8.6s
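
The downtime figures above follow directly from the availability percentage. A minimal sketch, assuming a 365.25-day year and a month of one twelfth of a year (which is what reproduces the numbers listed); the names `PERIOD_SECONDS` and `downtime_seconds` are illustrative.

```python
PERIOD_SECONDS = {
    "year": 365.25 * 24 * 3600,
    "month": 365.25 * 24 * 3600 / 12,
    "week": 7 * 24 * 3600,
    "day": 24 * 3600,
}


def downtime_seconds(availability_pct, period):
    """Allowed downtime, in seconds, for an availability level over a period."""
    return (1 - availability_pct / 100) * PERIOD_SECONDS[period]


three_nines_day = downtime_seconds(99.9, "day")   # about 86.4 s = 1m 26.4s
four_nines_day = downtime_seconds(99.99, "day")   # about 8.6 s
```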

#### Availability in parallel vs sequence

* if a service consists of multiple components prone to failure, the service's overall availability depends on whether the components are in sequence or in parallel

##### In Sequence
* overall availability decreases when 2 components with availability < 100% are in sequence
* Availability (total) = Availability (Foo) * Availability (Bar)
    - if both Foo and Bar each had 99.9% availability, their total availability in sequence would be 99.8%

##### In Parallel
* overall availability increases when 2 components with availability < 100% are in parallel
* Availability (total) = 1 - (1 - Availability (Foo)) * (1 - Availability (Bar))
    - if both Foo and Bar each had 99.9% availability, their total availability in parallel would be 99.9999%
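
The two formulas above can be checked numerically. A small sketch, assuming component failures are independent; the function names are illustrative.

```python
import math


def in_sequence(*availabilities):
    """Every component must be up, so the availabilities multiply."""
    return math.prod(availabilities)


def in_parallel(*availabilities):
    """The service is down only if every component is down at once."""
    return 1 - math.prod(1 - a for a in availabilities)


sequence_total = in_sequence(0.999, 0.999)  # about 0.998001
parallel_total = in_parallel(0.999, 0.999)  # about 0.999999
```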

## Domain Name System

* a domain name system (DNS) translates a domain name, like www.example.com, to an IP address
* DNS is hierarchical
    - router/ISP provides info on which DNS server(s) to contact when doing a lookup
* lower level DNS servers cache domain name - IP address mappings
* DNS results can also be cached by browser/OS for a time until the TTL (time to live) expires

***
* NS record (name server) - specifies the DNS servers for your domain/subdomain
* MX record (mail exchange) - specifies the mail servers for accepting messages
* A record (address) - points a name to an IP address
* CNAME (canonical) - points a name to another name or CNAME or to an A record
    - e.g. example.com to www.example.com
***
* services such as CloudFlare and Route 53 provide managed DNS services
* DNS services can route traffic through various methods:
    - Weighted round robin:
        * prevent traffic from going to servers under maintenance
        * balance between varying cluster sizes
        * A/B testing
    - Latency-based
    - Geolocation-based

### Disadvantages: DNS

* accessing a DNS server introduces a slight delay, although mitigated by caching
* DNS server management could be complex and is generally managed by governments, ISPs and large companies
* DNS services have come under DDoS attack, preventing users from accessing websites such as Twitter without knowing Twitter's IP address(es)

## Content Delivery Network

* a content delivery network (CDN) is a globally distributed network of proxy servers that serve content from locations closer to the user
* generally static files like HTML/CSS/JS, photos, and videos are served from the CDN but some can also serve dynamic content
* the site's DNS resolution will tell clients which server to contact
* serving content from CDNs can significantly improve performance in 2 ways:
    - users receive content from data centers closer to them
    - your servers do not have to serve requests that the CDN fulfills

### Push CDNs

* Push CDNs receive new content whenever changes occur on the server
* you take full responsibility for:
    - providing content
    - uploading directly to the CDN
    - rewriting URLs to point to the CDN
* you can configure when the content expires and when it is updated
* content is uploaded only when it is new/changed
    - this minimizes traffic but maximizes storage
* sites with a small amount of traffic or sites with content that isn't regularly updated work well with push CDNs
* content is only pushed to the CDNs once instead of being re-pulled at regular intervals

### Pull CDNs

* Pull CDNs grab new content from your server when the first user requests the content
* can leave content on your server and rewrite URLs to point to the CDN
* results in slower requests until the content is cached on the CDN
* a time-to-live (TTL) determines how long content is cached
* Pull CDNs minimize storage space on the CDN but can create redundant traffic if files expire and are pulled before they have actually changed
* sites with heavy traffic work well with pull CDNs, since traffic is spread out more evenly with only recently-requested content remaining on the CDN

### Disadvantages: CDN

* costs can be significant depending on traffic but should be weighed against the additional costs you would incur if not using a CDN
* content might be stale if it is updated before the TTL expires
* CDNs require changing URLs for static content to point to the CDN

## Load Balancer

* distribute incoming client requests to computing resources like application servers and databases
* load balancers return the response from the resource to the appropriate client and are effective at:
    - preventing requests from going to unhealthy servers
    - preventing overloading resources
    - helping eliminate a single point of failure
* can be implemented with hardware (expensive) or with software like HAProxy
* other benefits:
    - SSL termination: decrypt incoming requests and encrypt server responses so backend servers don't have to perform them
    - Session persistence: issue cookies and route specific client's requests to same instance if web apps do not keep track of sessions
* to protect against failures, there can be multiple load balancers in active-passive or active-active mode
* load balancers can route traffic based on various metrics:
    - random
    - least loaded
    - sessions/cookies
    - round robin/weighted round robin
        * round robin = at least everyone gets a turn
    - layer 4
    - layer 7
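
Weighted round robin, one of the routing methods listed above, can be sketched in a few lines. This is a naive version that simply repeats each server in proportion to its weight; the server names and weights are made up for the example.

```python
import itertools


def weighted_round_robin(weights):
    """Cycle through servers forever, each appearing `weight` times per round."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


picker = weighted_round_robin({"a": 2, "b": 1})
first_six = [next(picker) for _ in range(6)]  # ['a', 'a', 'b', 'a', 'a', 'b']
```

A server with weight 0 never appears in the rotation, which is one way to keep traffic away from a server under maintenance.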

### Layer 4 Load Balancing

* looks at info at the __transport layer__ to decide how to distribute requests
    - involves looking at source, destination IP addresses, ports in the header, but not contents of the packet
    - forward network packets to/from upstream server, performing Network Address Translation (NAT)

### Layer 7 Load Balancing

* looks at the __application layer__ to decide how to distribute requests
    - involves contents of the header, message, and cookies
* a layer 7 load balancer terminates network traffic, reads the message, makes a load-balancing decision, then opens a connection to the selected server
    - e.g. direct video traffic to servers that host videos while directing more sensitive user billing traffic to security-hardened servers
* at the cost of flexibility, layer 4 load balancing requires less time/computing resources than layer 7 but the performance impact can be minimal on modern commodity hardware
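
A layer 7 decision like the video/billing example above might look like the following sketch, which routes on the request path. The path prefixes and pool names are hypothetical; a layer 4 balancer could not make this choice because it never inspects the request contents.

```python
# Hypothetical mapping from URL path prefix to a pool of backend servers.
POOLS = {
    "/video": ["video-1", "video-2"],
    "/billing": ["hardened-1"],
}
DEFAULT_POOL = ["web-1", "web-2"]


def choose_pool(path):
    """Pick a backend pool by inspecting the application-layer request path."""
    for prefix, pool in POOLS.items():
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```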

### Horizontal Scaling

* load balancers help with horizontal scaling, improving performance and availability
* more cost efficient than vertical scaling
    - easier to hire talent for it as well
* Disadvantages: horizontal scaling
    - introduces complexity and involves cloning servers
        * servers should be stateless, meaning no user-related data like sessions/profile pictures
        * sessions stored in centralized data store like a database (SQL, NoSQL) or a persistent cache (Redis, Memcached)
    - downstream servers like caches and databases need to handle more simultaneous connections as upstream servers scale out

### Disadvantages: load balancer

* can be a performance bottleneck if it doesn't have enough resources or is not configured properly
* increases complexity of the system in exchange for eliminating single point of failure
* but a single load balancer = single point of failure, so you must configure multiple load balancers, which further increases complexity

## Reverse Proxy (Web Server)

* web server that centralizes internal services and provides unified interfaces to the public
* requests from clients are forwarded to a server that can fulfill it before the reverse proxy returns the server's response to the client
* benefits:
    - increased security - hide info about backend servers, blacklist IPs, limit # of connections per client
    - increased scalability and flexibility - clients only see the reverse proxy's IP, allowing you to scale servers or change their configuration
    - SSL termination - decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations
    - compression - compress server responses
    - caching - return response for cached requests
    - static content - serve static content directly
        * HTML/CSS/JS
        * photos/videos

### Load Balancer vs Reverse Proxy

* use load balancer when you have multiple servers
    - load balancers route traffic to a set of servers serving the same function
* reverse proxies can be useful even with just 1 web server or app server
* NGINX and HAProxy can support both layer 7 reverse proxying and load balancing

### Disadvantage: reverse proxy

* increases complexity of system
* single reverse proxy = single point of failure
    - but configuring multiple reverse proxies (i.e. failover) further increases complexity

## Application Layer

* separating web layer from application layer allows you to scale and configure both layers independently
* adding API results in adding application servers without necessarily adding additional web servers

### Microservices

* suite of independently deployable, small, modular services
* each one runs a unique process and communicates through a well-defined, lightweight mechanism to serve a business goal
* e.g. Pinterest could have the following microservices
    - user profile
    - follower
    - feed
    - search
    - photo upload

### Service Discovery

* systems like Consul can help services find each other by keeping track of registered names, addresses, and ports
* health checks verify service integrity and are done using an HTTP endpoint

### Disadvantage: application layer

* adding an application layer with loosely coupled services requires a different approach from an architectural, operations, and process viewpoint (vs a monolithic system)
* can add complexity in terms of deployments and operations

# Database

## Relational Database Management System (RDBMS)

* a relational database, typically queried with SQL, is a collection of data items organized in tables
* __ACID__ is a set of properties of relational database transactions
    - Atomicity: each transaction is all or nothing
    - Consistency: any transaction will bring the database from one valid state to another
    - Isolation: executing transactions concurrently has the same results as if the transactions were executed serially
    - Durability: once a transaction has been committed, it will remain so
* many techniques to scale a relational database:
    - master-slave replication
    - master-master replication
    - federation
    - sharding
    - denormalization
    - SQL tuning
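
Atomicity, from the ACID list above, can be demonstrated with Python's built-in `sqlite3` module. This is a small sketch using an in-memory database; the `accounts` table, the names in it, and the `transfer` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction; undo both updates if either fails."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If the second update violates the `CHECK` constraint, the first update is rolled back as well, so the transfer is all or nothing.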

### Master-Slave Replication

* master serves reads and writes, replicating writes to one or more slaves, which serve only reads
* slaves can also replicate to additional slaves in a tree-like fashion
* if master goes offline, system can operate in read-only mode until a slave is promoted to a master or a new master is provisioned

#### Disadvantages: master-slave replication

* additional logic needed to promote a slave to a master

### Master-Master Replication

* both masters serve reads/writes and coordinate with each other on writes
* if either master goes down, the system can continue to operate with both reads and writes

#### Disadvantages: master-master replication

* need a load balancer or you'll need to make changes to app logic to determine where to write
* most master-master systems are either loosely consistent (violating ACID) or have increased write latency due to synchronization
* conflict resolution comes more into play as more write nodes are added and as latency increases

### Disadvantages: replication

* potential for loss of data if master fails before any newly written data can be replicated to other nodes
* writes are replayed to the read replicas.
    - if there are a lot of writes, read replicas can get bogged down with replaying writes and can't do as many reads
* more read slaves = more replication = more replication lag
* on some systems, writing to master can spawn multiple threads to write in parallel whereas read replicas only support writing sequentially with a single thread
* replication adds more hardware and additional complexity

### Federation

* federation (functional partitioning) splits up databases by function
* e.g. instead of a single, monolithic database, you have 3 databases: forums, users, and products
    - results in less read/write traffic to each database and therefore less replication lag
* smaller databases = more data that can fit in memory = more cache hits due to improved cache locality
* can write in parallel, increasing throughput b/c no single central master serializing the writes

#### Disadvantages: federation

* not effective if schema requires huge functions/tables
* need to update application logic to determine which database to read/write
* joining data from 2 databases is more complex with a server link
* federation adds more hardware and additional complexity

### Sharding

* distributes data across different databases such that each database manages only a subset of the data
* e.g. a users database
    - \# of users increases, more shards are added to the cluster
* similar to advantages of federation, sharding results in less read/write traffic, less replication, more cache hits
    - index size is also reduced which improves performance with faster queries
    - if one shard goes down, the other shards are still operational but requires some form of replication to avoid data loss
    - similar to federation, there is no single central master serializing writes which allows you to write in parallel with increased throughput
* common ways to shard a table of users is through the user's last name initial or the user's geographical location
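
Sharding by last-name initial, as mentioned above, could be routed with a lookup like this sketch. The letter ranges and shard names are arbitrary assumptions for the example.

```python
# Hypothetical ranges of last-name initials mapped to shard names.
SHARD_RANGES = [
    ("a", "h", "shard-0"),
    ("i", "p", "shard-1"),
    ("q", "z", "shard-2"),
]


def shard_for(last_name):
    """Return the shard that stores users with this last name."""
    initial = last_name[0].lower()
    for low, high, shard in SHARD_RANGES:
        if low <= initial <= high:
            return shard
    raise ValueError(f"no shard configured for initial {initial!r}")
```

Because some initials are far more common than others, a scheme like this can easily produce uneven shards.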

#### Disadvantages: sharding

* need to update application logic to work with shards which could result in complex SQL queries
* data distribution can become lopsided in a shard
    - e.g. set of power users on a shard could result in increased load to that shard compared to others (celebrity problem aka hotspot key problem)
        * rebalancing adds additional complexity
        * sharding function based on consistent hashing can reduce amount of transferred data
* joining data from multiple shards is more complex
* sharding adds more hardware and additional complexity
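
The consistent hashing idea mentioned above can be sketched with a sorted ring of hashed points. This is a minimal illustration, not a production implementation; the class name `HashRing`, the use of MD5, and the choice of 100 virtual points per node are assumptions.

```python
import bisect
import hashlib


def _hash(key):
    # MD5 is used only to spread keys evenly, not for security.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node
        self._ring = []           # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def get(self, key):
        # Walk clockwise to the first point at or after the key's hash.
        idx = bisect.bisect(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["s0", "s1", "s2"])
keys = [f"user{i}" for i in range(1000)]
before = {k: ring.get(k) for k in keys}
ring.add("s3")
after = {k: ring.get(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only the keys listed in `moved` change shards when `s3` joins, and all of them move to `s3`; a plain `hash(key) % shard_count` scheme would typically reshuffle most keys instead.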

### Denormalization

* attempts to improve read performance at expense of write performance
* redundant copies of data are written in multiple tables to avoid expensive joins
* some RDBMS like PostgreSQL support materialized views which handle storing redundant information and keeping redundant copies consistent
* once data becomes distributed with techniques like federation and sharding, managing joins across data centers increases complexity
    - but denormalization helps to circumvent that b/c the queries can be done in one table
* in most systems, reads can heavily outnumber writes
    - a read resulting in complex database joins can be very expensive, spending a significant amount of time on disk operations

#### Disadvantages: denormalization

* data is duplicated
* constraints can help redundant copies of information stay in sync, which increases complexity of the database design
* denormalized database under heavy write load might perform worse than its normalized counterpart

### SQL Tuning

* important to __benchmark__ and __profile__ to simulate and uncover bottlenecks
    - benchmark: simulate high-load situations with tools such as ab
    - profile: enable tools such as the slow query log to help track performance issues
* benchmarking/profiling can point you to the following optimizations

#### Tighten up the schema

* MySQL dumps to disk in contiguous blocks for fast access
* use char instead of varchar for fixed-length fields
    - char allows for fast, random access whereas varchar must find the end of the string before moving onto the next one
* use text for large blocks of text like a blog post
    - text also allows for boolean searches
    - using a text fields results in storing a pointer on disk that is used to locate the text block
* use int for larger numbers up to 4 billion ($2^{32}$)
* use decimal for currency to avoid floating point representation errors
* avoid storing large blobs, store the location of where to get the object instead
* varchar(255) is the largest number of characters that can be counted in an 8 bit number, often maximizing the use of a byte in some RDBMS
* set the not null constraint where applicable to improve search performance
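
The advice above to use decimal for currency is about binary floating point rounding. The same effect can be seen outside the database with Python's `decimal` module; this is an analogy for illustration, not SQL.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_total = 0.1 + 0.2
float_is_exact = float_total == 0.3

# Decimal stores base-10 digits exactly, like a SQL decimal column.
decimal_total = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = decimal_total == Decimal("0.30")
```

Here `float_is_exact` is `False` while `decimal_is_exact` is `True`.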

#### Use good indices

* columns that you are querying (SELECT, GROUP BY, ORDER BY, JOIN) could be faster with indices
* indices usually represented as self-balancing B-tree that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time
* placing an index can keep the data in memory, requiring more space
* writes could also be slower since the index also needs to be updated
* when loading large amounts of data, it might be faster to disable indices, load the data, then rebuild the indices

#### Avoid expensive joins

* denormalize where performance demands it

#### Partition tables

* break up a table by putting hot spots in a separate table to help keep it in memory

#### Tune the query cache

* in some cases, the query cache could lead to performance issues

## NoSQL

* collection of data items represented in a key-value store, document store, wide column store, or a graph database
* data is denormalized and joins are generally done in the application code
* most NoSQL stores lack true ACID transactions and favor eventual consistency
* __BASE__ is often used to describe the properties of NoSQL databases. In comparison with the CAP theorem, BASE chooses availability over consistency
    - Basically available: the system guarantees availability
    - Soft state: state of the system may change over time, even without input
    - Eventual consistency: the system will become consistent over a period of time, given that the system doesn't receive input during that period

### Key-Value Store

* abstraction: hash table
* allows for O(1) reads and writes and is often backed by memory or SSD
* data stores can maintain keys in lexicographic order, allow efficient retrieval of key ranges
* can allow for storing of metadata with a value
* provide high performance and often used for simple data models or for rapidly-changing data such as an in-memory cache layer
* offer only a limited set of operations so complexity is shifted to the application layer
* key-value store is the basis for more complex systems such as a document store and in some cases, a graph database

### Document Store

* abstraction: key-value store with documents stored as values
* centered around documents (XML, JSON, binary, etc.) where a document stores all information for a given object
* document stores provide APIs or a query language to query based on the internal structure of the document itself
* documents are organized by collections, tags, metadata, or directories
* documents can be organized/grouped together but may have fields that are different from each other
* provide high flexibility and are often used for working with occasionally changing data

### Wide Column Store

* abstraction: nested map ColumnFamily<RowKey, Columns<ColKey, Value, Timestamp>>
* wide column store's basic unit of data is a column (name/value pair)
* a column can be grouped in column families (analogous to a SQL table)
* super column families further group column families
* can access each column independently with a row key and columns with the same row key form a row
    - each value contains a timestamp for versioning and for conflict resolution
* offer high availability and high scalability
    - often used for very large data sets

### Graph Database

* abstraction: graph
* each node is a record and each arc is a relationship between 2 nodes
* graph databases are optimized to represent complex relationships with many foreign keys or many-to-many relationships
* offer high performance for data models with complex relationships such as a social network
    - e.g. Facebook
* relatively new and not widely-used yet
    - more difficult to find development tools/resources
    - many graphs can only be accessed with REST APIs

## SQL or NoSQL

* Reasons for SQL:
    - structured data
    - strict schema
    - relational data
    - need for complex joins
    - transactions
    - clear patterns of scaling
    - more established: developers, community, code, tools, etc
    - lookups by index are very fast

* Reasons for NoSQL:
    - semi-structured data
    - dynamic or flexible schema
    - non-relational data
    - no need for complex joins
    - store many TB (or PB) of data
    - very data intensive workload
    - very high throughput for IOPS

* Sample data well-suited for NoSQL:
    - rapid ingest of clickstream and log data
    - leaderboard or scoring data
    - temporary data, such as a shopping cart
    - frequently accessed ('hot') tables
    - metadata/lookup tables

## Cache

* caching improves page load times and can reduce load on servers/databases
* databases benefit from a uniform distribution of reads/writes across its partitions
    - popular items can skew distribution, causing bottlenecks
    - putting a cache in front of a database can help absorb uneven loads and spikes in traffic

### Client Caching

* caches can be located on the client side (OS or browser), server side, or in a distant cache layer

### CDN Caching

* CDNs are considered a type of cache

### Web Server Caching

* reverse proxies and caches can serve static and dynamic content directly
* web servers can also cache requests, returning responses without having to contact application servers

### Database Caching

* database usually includes some level of caching in a default configuration, optimized for a generic use case
* tweaking these settings for specific usage patterns can further boost performance

### Application Caching

* in-memory caches like Memcached and Redis are key-value stores between your application and your data storage
* since data is held in RAM, it is much faster than typical databases where data is stored on disk
    - RAM is more limited than disk, so cache invalidation algorithms such as least recently used (LRU) can help invalidate 'cold' entries and keep 'hot' data in RAM
* Redis has the following additional features:
    - persistence option
    - built-in data structures such as sorted sets and lists
* multiple levels you can cache that fall into 2 categories: database queries and objects:
    - row level
    - query level
    - fully-formed serializable objects
    - fully-rendered HTML
* generally, should try avoiding file-based caching b/c it makes cloning and auto-scaling more difficult
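
The LRU eviction mentioned above can be sketched with an `OrderedDict`. This is a toy in-process cache for illustration, not how Memcached or Redis implement eviction internally; the class name and `capacity` parameter are assumptions.

```python
from collections import OrderedDict


class LRUCache:
    """Keep at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the coldest entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now hotter than "b"
cache.put("c", 3)  # over capacity, so "b" is evicted
```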

#### Caching at the database query level

* whenever you query the database, hash the query as a key and store the result to the cache
* suffers from expiration issues:
    - hard to delete a cached result with complex queries
    - if one piece of data changes such as a table cell, you need to delete all cached queries that might include the changed cell
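
Hashing the query as the cache key, as described above, might look like this sketch. The dictionary stands in for a cache such as Memcached or Redis, and `run_query` is a hypothetical callable that actually hits the database.

```python
import hashlib

query_cache = {}  # stand-in for an external cache


def cached_query(sql, params, run_query):
    """Return a cached result for (sql, params), querying only on a miss."""
    key = hashlib.sha256(repr((sql, params)).encode()).hexdigest()
    if key not in query_cache:
        query_cache[key] = run_query(sql, params)
    return query_cache[key]
```

Note that nothing here knows which rows a cached result depends on, which is exactly why invalidation is hard at this level.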

#### Caching at the object level

* see your data as an object, similar to what you do with your application code
* have your app assemble the dataset from the database into a class instance or a data structure(s):
    - remove the object from cache if its underlying data has changed
    - allows for asynchronous processing: workers assemble objects by consuming the latest cached object

***
* suggestions of what to cache:
    - user sessions
    - fully rendered web pages
    - activity streams
    - user graph data

### When to update the cache

* need to determine cache update strategy that works best for your use case

#### Cache-aside

* app is responsible for reading/writing from storage
* cache does not interact with storage directly
* the app does the following
    - look for entry in cache; if it is not there, the lookup results in a cache miss
    - load entry from the database
    - add entry to cache
    - return entry
* memcached is generally used in this manner
* subsequent reads of data added to cache are fast
* also referred to as lazy loading
* only requested data is cached, which avoids filling up cache with data that isn't requested
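
The cache-aside steps above can be sketched as follows. Plain dictionaries stand in for the cache and the database, and the `user:{id}` key format is an assumption made for the example.

```python
def get_user(user_id, cache, db):
    """Cache-aside read: the app checks the cache, then falls back to the DB."""
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:       # cache miss
        user = db[user_id]  # load entry from the database
        cache[key] = user   # add entry to cache
    return user


cache, db = {}, {1: "Ada"}
first = get_user(1, cache, db)   # miss: loaded from db, then cached
db[1] = "Ada Lovelace"           # database changes behind the cache's back
second = get_user(1, cache, db)  # hit: returns the cached (now stale) value
```

The second read shows the staleness risk: without a TTL or explicit invalidation, the cache keeps serving the old value.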

##### Disadvantages: cache-aside

* each cache miss results in three trips, which can cause a noticeable delay
* data can become stale if it is updated in the database
    - issue is mitigated by setting a time-to-live (TTL) which forces an update of the cache entry, or by using write-through
* when a node fails, it is replaced by a new, empty node, increasing latency

#### Write-through

* application uses the cache as the main data store, reading/writing data to it, while the cache is responsible for reading/writing to the database:
    - app adds/updates entry in cache
    - cache synchronously writes entry to data store
    - return
* write-through is a slow overall operation due to the write operation but subsequent reads of just written data are fast
* users generally more tolerant of latency when updating data than reading data
* data in cache is not stale
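
A minimal sketch of the write-through flow above, assuming a dictionary-like backing store. The class and method names are illustrative.

```python
class WriteThroughCache:
    """The app talks only to the cache; the cache keeps the store in sync."""

    def __init__(self, store):
        self.store = store  # stand-in for the database
        self.cache = {}

    def set(self, key, value):
        self.cache[key] = value
        self.store[key] = value  # synchronous write before returning

    def get(self, key):
        if key not in self.cache and key in self.store:
            self.cache[key] = self.store[key]  # fill from the store on a miss
        return self.cache.get(key)


store = {}
wt = WriteThroughCache(store)
wt.set("user:1", "Ada")
```

After `set` returns, both the cache and the store hold the new value, which is why reads of just-written data are fast and not stale.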

##### Disadvantages: write-through

* when a new node is created due to failure or scaling, the new node will not cache entries until the entry is updated in the database
* cache-aside in conjunction with write-through can mitigate this issue
* most data written might never be read, which can be minimized with a TTL

#### Write-behind (write-back)

* application does the following:
    - add/update entry in cache
    - asynchronously write entry to the data store, improving write performance
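
The write-behind steps above can be sketched with a pending-writes queue. Here `flush` is called by hand; in a real system it would presumably run in a background worker. All names are illustrative.

```python
from collections import deque


class WriteBehindCache:
    """Acknowledge writes from the cache and persist them later."""

    def __init__(self, store):
        self.store = store      # stand-in for the database
        self.cache = {}
        self.pending = deque()  # writes not yet persisted

    def set(self, key, value):
        self.cache[key] = value
        self.pending.append((key, value))  # returns without touching the store

    def flush(self):
        while self.pending:
            key, value = self.pending.popleft()
            self.store[key] = value


store = {}
wb = WriteBehindCache(store)
wb.set("user:1", "Ada")
persisted_before_flush = dict(store)  # still empty at this point
wb.flush()
```

Anything still in `pending` when the cache process dies is lost, which is the data-loss risk of this pattern.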

##### Disadvantages: write-behind

* there could be data loss if the cache goes down prior to its contents hitting the data store
* more complex to implement write-behind than it is to implement cache-aside or write-through

#### Refresh-ahead

* can configure the cache to automatically refresh any recently accessed cache entry prior to its expiration
* can result in reduced latency vs read-through if the cache can accurately predict which items are likely to be needed in the future

##### Disadvantages: refresh-ahead

* not accurately predicting which items are likely to be needed in the future can result in worse performance than without refresh-ahead

### Disadvantages: cache

* need to maintain consistency between caches and the source of truth such as the database through cache invalidation
* cache invalidation is a difficult problem, and deciding when to update the cache adds further complexity
* need to make application changes such as adding Redis or memcached

## Asynchronism

* asynchronous workflows help reduce request times for expensive operations that would otherwise be performed in-line
* help by doing time-consuming work in advance

### Message Queues

* receive, hold, and deliver messages
* if an operation is too slow to perform inline, you can use a message queue with the following workflow:
    - an application publishes a job to the queue, then notifies the user of job status
    - a worker picks up the job from the queue, processes it, then signals the job is complete
* user is not blocked and the job is processed in the background
* during this time, the client might optionally do a small amount of processing to make it seem like the task has completed
    - e.g. posting a tweet, tweet can be seen on your timeline but it could take some time before your tweet is actually delivered to all of your followers
* Redis is useful as a simple message broker but messages can be lost
* RabbitMQ is popular but requires you to adapt to the AMQP protocol and manage your own nodes
* Amazon SQS is hosted but can have high latency and has the possibility of messages being delivered twice
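
The publish/worker workflow above can be sketched in a single process with Python's standard `queue` and `threading` modules. This stands in for a real broker such as Redis, RabbitMQ, or SQS and does not use their APIs; `publish`, `worker`, and the `status` dictionary are names assumed for illustration.

```python
import queue
import threading
from typing import Dict

jobs: queue.Queue = queue.Queue()   # in-process stand-in for a message broker
status: Dict[int, str] = {}         # job id -> status shown to the user


def publish(job_id: int, payload: str) -> str:
    """Application side: enqueue the job and return immediately."""
    status[job_id] = "queued"
    jobs.put((job_id, payload))
    return "queued"


def worker() -> None:
    """Worker side: pick up jobs, process them, then signal completion."""
    while True:
        job_id, payload = jobs.get()
        status[job_id] = f"done: {payload.upper()}"  # placeholder for slow work
        jobs.task_done()


threading.Thread(target=worker, daemon=True).start()
print(publish(1, "resize image"))  # the caller is not blocked by the work
jobs.join()                        # only here so the demo can show the final status
print(status[1])
```

In a real deployment the application and the worker would run in separate processes, and the application would poll or be notified rather than call `join()`.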

### Task Queues

* receive tasks and their related data, run them, then deliver their results
* can support scheduling and can be used to run computationally-intensive jobs in the background
* Celery supports scheduling and primarily targets Python
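
As a rough illustration of the "receive a task and its data, run it, deliver the result" cycle, the sketch below uses the standard library's `ThreadPoolExecutor` rather than Celery. The `word_count` task and its inputs are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text: str) -> int:
    """Hypothetical background task: count the words in a document."""
    return len(text.split())


# Each entry is the data that travels with one task.
tasks = [("the quick brown fox",), ("hello world",)]

with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() hands the task and its arguments to the pool and returns a future.
    futures = [pool.submit(word_count, *args) for args in tasks]
    # result() delivers each task's return value back to the caller.
    results = [future.result() for future in futures]

print(results)
```

A dedicated task queue adds what this sketch lacks: persistence, retries, scheduling, and workers on other machines.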

### Back Pressure

* if queues start to grow significantly, the queue size can become larger than memory resulting in:
    - cache misses
    - disk reads
    - even slower performance
* Back pressure can help by limiting queue size, thereby maintaining a high throughput rate and good response times for jobs already in the queue
    - once the queue fills up, clients get a server busy or HTTP 503 status code to try again later
    - clients can retry the request at a later time, perhaps with exponential backoff
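
A small sketch of both halves of back pressure: a bounded queue that rejects new work with a 503 when full, and a client that retries with exponential backoff. The queue bound, the 202 success code, and all function names are assumptions chosen for the example.

```python
import queue
from typing import Callable, List

jobs: queue.Queue = queue.Queue(maxsize=2)  # small bound chosen for illustration


def submit(job: str) -> int:
    """Server side: accept the job, or shed load with a 503 when the queue is full."""
    try:
        jobs.put_nowait(job)
        return 202
    except queue.Full:
        return 503


def backoff_delays(base: float, attempts: int, cap: float) -> List[float]:
    """Delays between retries: base * 2**n, never longer than cap."""
    return [min(cap, base * 2 ** n) for n in range(attempts)]


def submit_with_retry(job: str, delays: List[float],
                      sleep: Callable[[float], None]) -> int:
    """Client side: retry on 503, waiting longer before each new attempt."""
    code = submit(job)
    for delay in delays:
        if code != 503:
            break
        sleep(delay)          # pass time.sleep in real use
        code = submit(job)
    return code
```

Production clients usually add random jitter to the delays so that many rejected clients do not retry at the same instant.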

### Disadvantages: asynchronism

* use cases such as inexpensive calculations and realtime workflows might be better suited for synchronous operations, as introducing queues can add delays and complexity

## Communication

* Open Systems Interconnection (OSI) 7 Layer Model
    - Application (7): End User
        * program that opens what was sent or creates what is to be sent
        * user applications/ SMTP
    - Presentation (6): Syntax
        * encrypt/decrypt (if needed)
        * JPEG/ASCII/GIF
    - Session (5): Sync and send to ports
        * logical ports
    - Transport (4): TCP
        * host to host, flow control
        * TCP/SPX/UDP
    - Network (3): Packets
        * "letter", contains IP address
        * routers, IP/IPX/ICMP
    - Data Link (2): Frames
        * "envelopes", contains MAC address
        * switches, bridges
    - Physical (1): Physical Structure
        * cables, hubs, etc

### Hypertext Transfer Protocol (HTTP)

* HTTP is a method for encoding/transporting data between a client/server
* request/response protocol:
    - clients issue requests
    - servers issue responses with relevant content and completion status info about the request
* HTTP is self-contained, allowing requests/responses to flow through many intermediate routers and servers that perform load balancing, caching, encryption, and compression
* basic HTTP request consists of a verb (method) and a resource (endpoint)

    | Verb | Description | Idempotent | Safe | Cacheable |
    |---|---|---|---|---|
    | GET | reads a resource | yes | yes | yes |
    | POST | creates a resource or triggers a process that handles data | no | no | yes, if the response contains freshness info |
    | PUT | creates or replaces a resource | yes | no | no |
    | PATCH | partially updates a resource | no | no | yes, if the response contains freshness info |
    | DELETE | deletes a resource | yes | no | no |
* Idempotent = can be called many times without different outcomes
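
To make idempotency concrete, here is a toy in-memory store, with no real HTTP involved, where `put` replaces the resource at a known path and `post` creates a new resource on every call. The function names, paths, and status codes returned are illustrative assumptions.

```python
from typing import Dict, Tuple

store: Dict[str, str] = {}
_next_id = 1


def put(path: str, body: str) -> int:
    """PUT-like: create or replace the resource at a client-chosen path."""
    created = path not in store
    store[path] = body
    return 201 if created else 200


def post(collection: str, body: str) -> Tuple[int, str]:
    """POST-like: create a new resource under the collection on every call."""
    global _next_id
    path = f"{collection}/{_next_id}"
    _next_id += 1
    store[path] = body
    return 201, path


put("/notes/welcome", "hello")
put("/notes/welcome", "hello")   # repeating the call leaves the store unchanged
post("/notes", "hello")
post("/notes", "hello")          # repeating the call adds another resource
print(sorted(store))
```

This is why a client can safely retry a PUT after a timeout, while retrying a POST may create duplicates.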

### Transmission Control Protocol (TCP)

* TCP is a connection-oriented protocol over an IP network
* connection is established and terminated using a __handshake__
* all packets sent are guaranteed to reach the destination in the original order and without corruption through:
    - sequence numbers and checksum fields for each packet
    - acknowledgement packets and automatic retransmission
* if sender does not receive a correct response, it will resend the packets
* if there are multiple timeouts, the connection is dropped
* TCP also implements flow control and congestion control
    - these guarantees cause delays and generally result in less efficient transmission than UDP
* to ensure high throughput, web servers can keep a large number of TCP connections open, resulting in high memory usage
    - can be expensive to have large number of open connections between web server threads and say, a memcached server
    - connection pooling can help in addition to switching to UDP where applicable
* useful for apps that require high reliability but are less time critical
    - web servers
    - database info
    - SMTP (Simple Mail Transfer Protocol)
    - FTP (File Transfer Protocol)
    - SSH (Secure Shell Protocol)
* use TCP over UDP when:
    - you need all of the data to arrive intact
    - you want to automatically make a best estimate use of the network throughput
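
The role of sequence numbers, checksums, and retransmission can be simulated without a network. The sketch below is a simplification: it uses `zlib.crc32` as a stand-in for TCP's actual checksum, byte offsets as sequence numbers, and hypothetical helper names.

```python
import zlib
from typing import Dict, List, Tuple

Segment = Tuple[int, bytes, int]  # (sequence number, payload, checksum)


def make_segments(data: bytes, size: int) -> List[Segment]:
    """Sender side: split data into segments tagged with offset and checksum."""
    return [(seq, data[seq:seq + size], zlib.crc32(data[seq:seq + size]))
            for seq in range(0, len(data), size)]


def reassemble(received: List[Segment], total_len: int,
               size: int) -> Tuple[bytes, List[int]]:
    """Receiver side: drop corrupt segments, reorder the rest, report gaps.

    Returns the bytes received so far in sequence order and the sequence
    numbers that still need to be retransmitted.
    """
    good: Dict[int, bytes] = {}
    for seq, payload, checksum in received:
        if zlib.crc32(payload) == checksum:   # corrupt segments are discarded
            good[seq] = payload               # duplicates simply overwrite
    missing = [seq for seq in range(0, total_len, size) if seq not in good]
    data = b"".join(good[seq] for seq in sorted(good))
    return data, missing
```

The sender would resend every sequence number reported as missing, which is the part that real TCP drives with acknowledgements and timeouts.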

### User Datagram Protocol (UDP)

* UDP is connectionless
* datagrams (analogous to packets) are guaranteed only at the datagram level
    - datagrams might reach their destination out of order or not at all
* UDP does not support congestion control
* without the guarantees that TCP supports, UDP is generally more efficient
* UDP can broadcast, sending datagrams to all devices on the subnet
    - this is useful with DHCP because the client has not yet received an IP address, and TCP cannot establish a connection without one
* UDP is less reliable but works well in real time use cases such as VoIP, video chat, streaming, and realtime multiplayer games
* use UDP over TCP when:
    - you need the lowest latency
    - late data is worse than loss of data
    - you want to implement your own error correction
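
A minimal loopback sketch of connectionless delivery using Python's standard `socket` module. The payloads and the one-second timeout are arbitrary choices for the example; on a real network either datagram could be lost or arrive out of order, which is why the receiver never assumes it will get both.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
receiver.settimeout(1.0)          # a datagram may never arrive, so do not block forever
address = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"frame-1", address)  # no connect() and no handshake
sender.sendto(b"frame-2", address)

received = []
try:
    for _ in range(2):
        payload, _peer = receiver.recvfrom(1024)  # each call returns one whole datagram
        received.append(payload)
except socket.timeout:
    pass                           # treat a missing datagram as lost and move on
finally:
    sender.close()
    receiver.close()

print(received)
```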

### Remote Procedure Call (RPC)

* in an RPC, a client causes a procedure to execute on a different address space, usually a remote server
    - procedure is coded as if it were a local procedure call, abstracting away details of how to communicate with the server from the client program
* remote calls are usually slower and less reliable than local calls so it is helpful to distinguish RPC calls from local calls
* RPC is a request-response protocol
    - Client program: calls the client stub procedure
        * parameters are pushed onto the stack like a local procedure call
    - Client stub procedure: marshals (packs) procedure id and arguments into a request message
    - Client communication module: OS sends the message from the client to the server
    - Server communication module: OS passes the incoming packets to the server stub procedure
    - Server stub procedure: unmarshals the request, calls the server procedure matching the procedure id and passes the given arguments
    - the server response repeats the steps above in reverse order
* RPC is focused on exposing behaviors
    - often used for performance reasons with internal communications as you can hand-craft native calls to better fit your use cases
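
The stub steps above can be sketched in one process, with JSON standing in for the marshalling format and a direct function call standing in for the network. This does not follow any specific RPC framework; `PROCEDURES`, `transport`, and the stub names are hypothetical.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical procedures that live on the server.
PROCEDURES: Dict[str, Callable[..., Any]] = {"add": lambda a, b: a + b}


def server_stub(request: bytes) -> bytes:
    """Unmarshal the request, call the matching procedure, marshal the result."""
    message = json.loads(request)
    procedure = PROCEDURES[message["procedure"]]
    result = procedure(*message["args"])
    return json.dumps({"result": result}).encode()


def transport(request: bytes) -> bytes:
    """Stand-in for the communication modules; a real one would use the network."""
    return server_stub(request)


def client_stub(procedure: str, *args: Any) -> Any:
    """Marshal the procedure id and arguments, send them, unmarshal the reply."""
    request = json.dumps({"procedure": procedure, "args": list(args)}).encode()
    response = transport(request)
    return json.loads(response)["result"]


def add(a: int, b: int) -> int:
    """What the client program calls; it looks like a local function."""
    return client_stub("add", a, b)


print(add(2, 3))
```

Because `add` looks local, nothing at the call site hints that it can be slow or fail, which is the reason for distinguishing remote calls from local ones.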

#### Disadvantages: RPC

* RPC clients become tightly coupled to the service implementation
* a new API must be defined for every new operation or use case
* can be difficult to debug RPC
* might not be able to leverage existing technologies out of the box
    - e.g. it might require additional effort to ensure RPC calls are properly cached on caching servers such as Squid

### Representational State Transfer (REST)

* architectural style enforcing a client/server model where client acts on a set of resources managed by the server
* server provides a representation of resources and actions that can either manipulate or get a new representation of resources
* all communication must be __stateless and cacheable__
* 4 qualities of a RESTful interface:
    - Identify resources (URI in HTTP): use the same URI regardless of any operation
    - Change with representations (Verbs in HTTP): use verbs, headers, and body
    - Self-descriptive error message (Status response in HTTP): use status codes, don't reinvent the wheel
    - HATEOAS (HTML interface for HTTP): your web service should be fully accessible in a browser
* REST is focused on exposing data
    - minimizes coupling between client/server
    - often used for public HTTP APIs
* REST uses a more generic and uniform method of exposing resources through:
    - URIs
    - representation through headers
    - actions through verbs such as GET, POST, PUT, DELETE, and PATCH
* being stateless, REST is great for horizontal scaling and partitioning

#### Disadvantages: REST

* with REST being focused on exposing data, it might not be a good fit if resources are not naturally organized or accessed in a simple hierarchy
    - e.g. returning all updated records from the past hour matching a particular set of events is not easily expressed as a path
    - with REST, it is likely to be implemented with a combination of URI path, query parameters, and possibly the request body
* REST typically relies on a few verbs (GET, POST, PUT, DELETE, PATCH), which sometimes don't fit your use case
    - e.g. moving expired documents to the archive folder does not fit cleanly within these verbs
* fetching complicated resources with nested hierarchies requires multiple round trips between the client/server to render single views
    - e.g. fetching content of a blog entry and the comments on that entry
    - for mobile apps operating in variable network conditions, these multiple roundtrips are undesirable
* over time, more fields might be added to an API response and older clients will receive all new data fields, even those they don't need
    - bloats the payload size and leads to larger latencies
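
The "updated records from the past hour" case above usually ends up as a path plus query parameters. A sketch using the standard `urllib.parse` module follows; the host, the `/records` path, and the parameter names `event` and `updated_since` are invented for illustration.

```python
from typing import List
from urllib.parse import urlencode


def updated_records_url(base: str, events: List[str], since: str) -> str:
    """Build a URL whose filter lives in the query string, not the path."""
    # doseq=True repeats the key once per list item: event=a&event=b
    query = urlencode({"event": events, "updated_since": since}, doseq=True)
    return f"{base}/records?{query}"


url = updated_records_url(
    "https://api.example.com",          # hypothetical API host
    ["login", "purchase"],
    "2024-01-01T10:00:00Z",
)
print(url)
```

The path identifies only the collection; everything that makes the request specific is carried by parameters the client and server must agree on separately.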

## Security

* encrypt in transit and at rest
* sanitize all user inputs or any input parameters exposed to user to prevent XSS and SQL injection
* use parameterized queries to prevent SQL injection
* use the principle of least privilege
    - every module in a computing environment (process, user, or a program) must be able to access only the information and resources that __are necessary__ for its legitimate purpose
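
A short sketch of two of the points above using only the standard library: a parameterized query with `sqlite3` and output escaping with `html.escape`. The `users` table and the function names are made up for the example, and `find_user_unsafe` is included only to show what the parameterized version prevents.

```python
import html
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")


def find_user_unsafe(name: str) -> List[Tuple[str]]:
    """Do not do this: user input is spliced into the SQL text."""
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()


def find_user(name: str) -> List[Tuple[str]]:
    """Parameterized: the driver passes the value separately from the SQL."""
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "x' OR '1'='1"
print(find_user_unsafe(malicious))  # the injected OR clause matches every row
print(find_user(malicious))         # treated as a literal name, so no match

# Escape user-supplied text before placing it in HTML to blunt XSS.
print(html.escape("<script>alert(1)</script>"))
```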