# Networking Essentials

* __packet__: basic unit of data sent over a network that can be part of a larger message

## OSI Model

* only a few layers matter for system design (3, 4, and 7)
* each layer is an abstraction that builds on the one below it
* Network Layer (3): IP
    - most will use IP
* Transport Layer (4): TCP/UDP
    - built on top of IP
* Application Layer (7): HTTP/WebSockets
* for an HTTP request, these layers work together
* latency: every back and forth (round trip) between client and server adds delay
* state -> a connection is established/terminated
    - how do you manage state?

## Example: A Simple Web Request (HTTP)

1. DNS Resolution: client uses DNS to resolve the domain name of the destination into an IP address
2. TCP Handshake: client uses a 3-way handshake to establish a TCP connection with the server
    - SYN: client sends a SYN (synchronize) packet to the server to request a connection
    - SYN-ACK: server sends a SYN-ACK (synchronize-acknowledge) packet that acknowledges the request
    - ACK: the client sends an ACK (acknowledge) packet to establish the connection
3. HTTP Request: when the TCP connection is established, the client sends an HTTP GET request to the server to request the web page
4. Server Processing: the server prepares the response (the web page in this case) to send back to the client
    - this is the part of the latency that SWEs most directly care about and control
5. HTTP Response: the server sends back an HTTP response with the corresponding web page content
6. TCP Teardown: after all the data is sent back with the HTTP response, the client and server must now close the TCP connection using a 4-way handshake
    - FIN: client sends a FIN (finish) packet to the server to close the connection
    - ACK: server acknowledges the FIN packet with an ACK packet
    - FIN: server sends its own FIN packet to the client to close the connection from its side
    - ACK: client acknowledges the server's FIN packet with its own ACK packet
***
* the connection between the client and server is __state__ that both the client and server must maintain
    - therefore, every new connection from client to server needs all of these packet transfers to make it happen (unless an existing connection is reused, e.g. HTTP keep-alive)
    - this can be somewhat ignored...until it can't anymore, since the handshakes add latency
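
The steps above can be sketched on a single machine. This is a minimal sketch, assuming a throwaway local server on `127.0.0.1`; `HelloHandler`, `fetch`, and `demo` are made-up names for illustration, and the handshake/teardown packets themselves are sent by the OS, not by this code.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Step 4 (server processing): build the 'web page' for any GET."""

    def do_GET(self):
        body = b"<h1>hello</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(host: str, port: int, path: str = "/") -> bytes:
    # Step 1: name -> address. With a real hostname this is the DNS lookup;
    # with an IP literal (as in demo) it just parses the address.
    family, _, _, _, addr = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)[0]
    with socket.socket(family, socket.SOCK_STREAM) as sock:
        sock.settimeout(5)
        # Step 2: connect() makes the OS perform SYN / SYN-ACK / ACK.
        sock.connect(addr)
        # Step 3: the HTTP request is just text sent over the TCP stream.
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        # Step 5: read the response until the server closes its side.
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    # Step 6: leaving the `with` block closes the socket (FIN/ACK teardown).
    return b"".join(chunks)


def demo() -> bytes:
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the raw response bytes: status line, headers, a blank line, then the HTML body.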

## Layer 3 (Network)

* IP = Internet Protocol
    - gives nodes addresses and handles routing between them
* IPv4 = 4 bytes (32 bits)
    - most of the internet still uses this
* IPv6 = 16 bytes (128 bits, written as eight 2-byte groups)
    - increasingly used for external (public) addresses
    - we basically ran out of IPv4 addresses
* inside a network, IPs are usually assigned by a DHCP (Dynamic Host Configuration Protocol) server, but they don't mean much outside that network because nobody else knows about them
    - these IPs are private
* if you want your network to be accessible from anywhere on the internet, it needs a public IP address allocated by an RIR (Regional Internet Registry), usually via an ISP or cloud provider
* Public IP: routers across the internet are aware of them
    - known to the world
* Private IP: assign your nodes any address you like
    - only your own network has to remember where they are
* for system design:
    - public: for external components
        * API gateway, load balancer
    - private: for everything else
        * e.g. microservices
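
To make the sizes and the public/private split concrete, here is a small sketch using Python's standard `ipaddress` module. The helper name `describe` and the sample addresses are just illustrations.

```python
import ipaddress


def describe(address: str) -> dict:
    """Summarize an IP address: version, size in bytes, and whether it is private."""
    ip = ipaddress.ip_address(address)
    return {
        "version": ip.version,
        "size_bytes": len(ip.packed),  # 4 for IPv4, 16 for IPv6
        "private": ip.is_private,      # True for ranges like 10.0.0.0/8
    }
```

For example, `describe("10.0.0.5")` reports a private 4-byte IPv4 address, while `describe("8.8.8.8")` reports a public one.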

## Layer 4 (Transport)

* w/ IP, we can send __packets__ of data to a host but we are missing 2 things:
    1. context: where data goes to/comes from
        - ports tell us which application on the host the data belongs to, but that alone might not be enough
    2. ordering of packets/delivery success
        - not provided by IP itself but by the transport protocols built on top of it
* 3 protocols:
    1. TCP (default)
    2. UDP
    3. QUIC (modern; TCP-like reliability built on top of UDP)

### TCP (Transmission Control Protocol): guaranteed delivery / ordering but with overhead

* establishes a connection, called a __stream__, through a __3-way handshake__
* __stream__: stateful connection between client and server
* numbers the data it sends (sequence numbers)
* if packets arrive out of order, the sequence numbers show that something went wrong and let the receiver reorder them
* a gap in the sequence numbers identifies packet loss
* TCP mitigates some network failures
* __Key Characteristics of TCP:__
    1. connection-oriented: establishes a dedicated connection before data transfer
    2. reliable delivery: guarantees that data arrives in order and without errors
    3. flow control: prevents overwhelming receivers with too much data
    4. congestion control: adapts to network congestion to prevent collapse
* __costs__: throughput/latency
    - TCP needs to retransmit lost packets
    - the handshake and retransmissions take time
* __USE TCP IF DATA INTEGRITY (GUARANTEE DELIVERY/ORDERING) IS CRITICAL, I.E. WHERE UDP IS NOT A GOOD FIT__
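
A toy sketch of how sequence numbers give ordering, duplicate protection, and loss detection. This is not real TCP (which numbers bytes, not segments, and does much more); `reassemble` and its arguments are made up for illustration.

```python
def reassemble(segments, expected_count):
    """Rebuild a stream from (seq, payload) pairs that arrive in any order.

    Returns (delivered_bytes, missing_seqs). Only the in-order prefix is
    delivered; anything after a gap waits for a retransmission.
    """
    received = {}
    for seq, payload in segments:
        received.setdefault(seq, payload)  # a duplicate segment is dropped

    delivered = []
    next_seq = 0
    while next_seq in received:  # stop at the first gap
        delivered.append(received[next_seq])
        next_seq += 1

    missing = [seq for seq in range(expected_count) if seq not in received]
    return b"".join(delivered), missing
```

If segment 2 of 4 is lost, the receiver can only hand over segments 0 and 1 and knows to wait for (or request) segment 2, which is where the latency cost comes from.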

### UDP (User Datagram Protocol): higher performance / spray + pray

* cannot guarantee delivery, ordering, or duplicate protection
    - e.g. zoom call: if a few packets are dropped, it doesn't matter much
* datagrams contain info on where they came from (source IP address and port) and where they're going (destination IP address and port) but that's it
* __Key Characteristics of UDP__:
    1. connectionless: no handshake or connection setup
    2. no guarantee of delivery: packets may be lost without notification
    3. no ordering: packets may arrive in a different order than sent
    4. lower latency: less overhead means faster transmission (recall all the packet transfers when establishing a TCP connection?)
* __NEED FOR SPEED: USE UDP WHEN SPEED IS MORE IMPORTANT THAN BEING RELIABLE!__
    - for real-time apps, e.g. MMOs or other online games
    - the application can handle packet loss or out-of-order packets
    - __browsers don't have widespread support for UDP outside of WebRTC__
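
A minimal sketch of the connectionless model using two UDP sockets on the loopback interface. The function name `udp_roundtrip` is made up; on loopback the datagram practically always arrives, but nothing in UDP guarantees it, which is why the receiver has a timeout.

```python
import socket


def udp_roundtrip(message: bytes):
    """Send one datagram to a local receiver and return (data, source_address)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0 = any free port
        receiver.settimeout(2)           # a lost datagram would raise a timeout here
        # No connect() and no handshake: the datagram simply carries
        # the destination address and is sent immediately.
        sender.sendto(message, receiver.getsockname())
        # recvfrom() also reports where the datagram came from (source IP + port).
        return receiver.recvfrom(2048)
```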

### TCP or UDP?

* TCP by default unless __latency__ is very important
* or you can handle packets missing/out of order
* UDP __NOT__ supported by browsers natively
* __you might choose UDP when:__
    - low latency is critical (real-time applications, gaming)
    - some data loss is acceptable (media streaming)
    - handling high-volume telemetry (data sent from network devices that help in gauging its health/performance) or logs where occasional loss is acceptable
    - don't need to support web browsers (or you have an alternative for that client)
* modern apps often use both protocols for handling different things:
    - web-based video conferencing app could use TCP/HTTP for signaling and authentication but UDP/WebRTC for the actual audio/video streams

### TCP Vs. UDP Comparison

| Feature | UDP | TCP |
| :----- | :----- | :-----|
| Connection | Connectionless | Connection Oriented |
| Reliability | best-effort delivery | guaranteed delivery |
| Ordering | no ordering guarantees | maintains order |
| Flow Control | no | yes |
| Congestion Control | no | yes |
| Header Size | 8 bytes | 20 - 60 bytes |
| Speed | Faster | Slower due to overhead |
| Use Cases | streaming, gaming, VoIP | everything else |

## Layer 7 (Application)

* the application layer is processed in "User Space" (where user applications run), whereas the lower layers are processed by the OS kernel in "Kernel Space"
    - therefore, the application layer is more flexible and more easily modified than the lower layers
    - the lower layers are difficult to change but can be very efficient

### HTTP/HTTPS: The Web's Foundation

- de-facto standard for data communication on the web
- __stateless protocol__: every request is independent and server does not maintain any data about previous requests or any data that could help with future requests
- request/response
    - request: HTTP verb determines intent of request
        - headers = any info about request
        - e.g. content-type or your own
    - response: containing data, status code, and headers
- __Key concepts of HTTP:__
    1. Request methods: GET, POST, PUT, DELETE, etc
    2. Status Codes: 200 OK, 404 Not Found, 500 Server Error, etc.
    3. Headers: Metadata about the request or response
    4. Body: the actual content being transferred
- Common Request Methods:
    - GET: requests data from the server
        - should be __idempotent__ and should not have a body
    - POST: send data to the server
    - PUT: update (replace) data on the server
    - PATCH: update a resource partially
    - DELETE: delete data from the server
        - DELETE requests should be __idempotent__
- Common Status Codes:
    - Success (2xx):
        - 200 OK: request was successful
        - 201 Created: request was successful and a new resource was created
    - Moved (3xx):
        - 302 Found: requested resource has been moved temporarily
        - 301 Moved Permanently: requested resource has been moved permanently
    - Client Error (4xx):
        - 404 Not Found: requested resource was not found
        - 401 Unauthorized: request requires authentication
        - 403 Forbidden: server understood the request but refuses to authorize
        - 429 Too Many Requests: client has sent too many requests in a given amount of time
    - Server Error (5xx):
        - 500 Internal Server Error: server encountered an error
        - 502 Bad Gateway: server received an invalid response from an upstream server
- can think of HTTP headers like key-value pairs
    - e.g. the HTTP header _Accept-Encoding_ gives clients a way to say which content encodings they can handle
        - servers can then respond with the most efficient encoding that client supports and label it with _Content-Encoding: X_
        - this provides backward compatibility and graceful degradation
- __content negotiation__: allows HTTP to be backwards/forwards compatible
    - a request might ask for JSON (_Accept_ header), but if the server doesn't have it, the response's _Content-Type_ header will indicate that it sent back plain text instead
- __HTTPS__ adds a security layer (TLS/SSL) to encrypt communications and protect against eavesdropping and man-in-the-middle attacks
    - this should be the default for public websites
    - __word of warning for API creation__: never trust any request you receive without validating it first
        - just because a request is encrypted doesn't mean that the request body itself doesn't contain information that could be malicious
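
The encoding negotiation described above can be sketched as follows. This is a simplified illustration, not a complete implementation of the HTTP rules (it ignores `*` and assumes the server only supports gzip); `choose_encoding` and `encode_body` are made-up names.

```python
import gzip


def choose_encoding(accept_encoding: str, supported=("gzip",)) -> str:
    """Return the supported encoding the client rates highest, else 'identity'."""
    best, best_q = "identity", 0.0
    for item in accept_encoding.split(","):
        name, _, params = item.strip().partition(";")
        name = name.strip().lower()
        q = 1.0  # default weight when no q= parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        if name in supported and q > best_q:
            best, best_q = name, q
    return best


def encode_body(body: bytes, accept_encoding: str):
    """Return (response_headers, encoded_body) for the client's Accept-Encoding."""
    headers = {"Content-Type": "text/plain"}
    if choose_encoding(accept_encoding) == "gzip":
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    # Otherwise fall back to the unencoded body (graceful degradation).
    return headers, body
```

A client that only understands an encoding the server lacks (e.g. `br` here) still gets a usable, uncompressed response.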
    
#### REST API: representational state transfer
- most common way to build APIs on top of HTTP
- allows use of HTTP verbs to describe wanted operation/intent
- resources => each resource gets its own URL
- organizing APIs around URLs and verbs
- pretty much the default
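
A rough sketch of the verbs-plus-URLs idea, using an in-memory store instead of a real web framework. The `NotesApi` class and the `/notes` resource are hypothetical; the point is how the (method, path) pair selects the operation and which operations are idempotent.

```python
from typing import Optional


class NotesApi:
    """In-memory stand-in for a REST resource at /notes and /notes/{id}."""

    def __init__(self):
        self._notes = {}
        self._next_id = 1

    def handle(self, method: str, path: str, body: Optional[dict] = None):
        """Return (status_code, response_body) for a request."""
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "notes" or len(parts) > 2:
            return 404, None

        if len(parts) == 1:  # the collection: /notes
            if method == "GET":
                return 200, list(self._notes.values())
            if method == "POST":  # not idempotent: each call creates a new note
                note = {**(body or {}), "id": self._next_id}
                self._notes[note["id"]] = note
                self._next_id += 1
                return 201, note
            return 405, None

        if not parts[1].isdigit():
            return 404, None
        note_id = int(parts[1])  # a single resource: /notes/{id}
        if method == "GET":
            note = self._notes.get(note_id)
            return (200, note) if note is not None else (404, None)
        if method == "PUT":  # idempotent: repeating it leaves the same state
            self._notes[note_id] = {**(body or {}), "id": note_id}
            return 200, self._notes[note_id]
        if method == "DELETE":  # idempotent: deleting twice is harmless
            self._notes.pop(note_id, None)
            return 204, None
        return 405, None
```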

#### GraphQL: (REST alternative)
- tries to solve the issues of __under-fetching__ and __over-fetching__
    - under-fetching: having to make multiple API calls to get everything needed
    - over-fetching: changing the API to send all possibly necessary data (too slow)
- GraphQL tells backend exactly what the front-end needs and no more
- useful where the frontend changes often or when there are lots of backend teams that the frontend needs to call
- allows negotiation between frontend and backend
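
The "exactly what the front-end needs" idea can be illustrated with a toy field selector. This is not GraphQL syntax or a GraphQL library, just a sketch of the concept; `select` and the shape of the `selection` dict are assumptions made for this example.

```python
def select(data, selection):
    """Keep only the requested fields.

    `selection` maps a field name to True (take the value as-is)
    or to a nested selection dict (recurse into that field).
    """
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub in selection.items():
        value = data[field]
        result[field] = value if sub is True else select(value, sub)
    return result
```

One call with `{"name": True, "posts": {"title": True}}` returns the user's name and post titles in a single response, without the email or post bodies.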

#### gRPC: protobuf + services:
- protobufs = provide a schema that allows serializing of objects into binary representation
- protobufs allow you to save space
- gRPC builds services on top of protobufs
- gRPC makes serializing/de-serializing efficient
    - REST sends data as JSON blobs that need to be parsed
- gRPC can have much higher throughput than REST (up to 10x is often quoted, but it depends on the workload)
- __problems with gRPC__:
    1. external clients and web browsers don't support gRPC natively
    2. while working w/ binary data is efficient for servers, it makes it harder for developers to view data and debug it
- __gRPC is used for internal services b/c it's not widely supported by external clients__
    - wouldn't bring up gRPC unless we care a lot about performance
    - but using REST for client-server and gRPC for internal services allows an optimal hybrid approach
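
To see why a schema-based binary format saves space, here is a sketch that uses Python's `struct` module as a stand-in. This is not protobuf or gRPC (no protobuf library is used); the record layout `RECORD` and the field names are assumptions for illustration.

```python
import json
import struct

# Assumed fixed schema: uint32 user_id, float64 score, uint16 level (little-endian).
RECORD = struct.Struct("<IdH")


def to_json(user_id: int, score: float, level: int) -> bytes:
    """Self-describing text: field names travel with every message."""
    return json.dumps({"user_id": user_id, "score": score, "level": level}).encode("utf-8")


def to_binary(user_id: int, score: float, level: int) -> bytes:
    """Schema-based binary: only the values travel; both sides know the layout."""
    return RECORD.pack(user_id, score, level)


def from_binary(blob: bytes):
    return RECORD.unpack(blob)
```

The binary form is a fixed 14 bytes and needs no text parsing, but it is unreadable without the schema, which is the debugging trade-off noted above.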

#### Server Sent Events:

- push data to users as it's happening
- extension on HTTP
- response includes headers, and the body of the response uses blank lines (double newlines) to separate events
- b/c of headers, client can immediately parse response
- also unidirectional flow from server to client
- no extra infrastructure needed for SSE since they are basically HTTP requests
- SSE connections are often short-lived in practice (30s - 60s)
- the client will automatically retry with a new SSE connection
- basically SSE built on HTTP requests allows for longer running requests that server can push to client (push notifications)
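
A sketch of what the event stream body looks like on the wire: `field: value` lines, with a blank line ending each event. `format_event` and `parse_stream` are made-up helpers and handle only the `data`, `event`, and `id` fields.

```python
def format_event(data: str, event=None, event_id=None) -> str:
    """Serialize one event; a blank line marks where the event ends."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    for line in data.split("\n"):  # multi-line data becomes several data: lines
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"


def parse_stream(stream: str) -> list:
    """Split a stream on blank lines and collect the fields of each event."""
    events = []
    for block in stream.split("\n\n"):
        if not block.strip():
            continue
        fields = {}
        data_lines = []
        for line in block.split("\n"):
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space after the colon is dropped
            if name == "data":
                data_lines.append(value)
            elif name in ("event", "id"):
                fields[name] = value
        fields["data"] = "\n".join(data_lines)
        events.append(fields)
    return events
```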

#### Web Sockets:

- useful for bidirectional communication and high frequency updates
- very powerful but require a lot of infrastructure
    - think of polling or SSE solutions first before web sockets
- WebSockets give browsers/other clients something that behaves like a TCP connection
    - basically an exchange of binary blobs __in order__ and __reliably delivered__
- involves a lot of __state__, but we want to avoid statefulness in System Design interviews
- so to handle this, have an edge service that handles WebSockets
    - all users connect to that service w/ WebSockets; the service makes requests to internal services, and their messages go back to users via the WebSockets

### WebRTC: Real-time Communications (niche)

- runs on UDP
- used for collaborative editors or audio + video communication between clients
- it's a Peer-to-Peer connection: allows clients to connect directly to each other
- __avoid this in SD interviews unless used for audio + video calling or collaborative editors like google docs__

## Vertical Scaling vs. Horizontal Scaling

* vertical scaling: better hardware
* horizontal scaling: more servers (expected in interviews)

## Load Balancers (problem introduced by horizontal scaling)

1. spreads load across servers so the system can handle more traffic
2. allows for high availability -> redirects traffic from a failed server to an online one

### Client-side Load Balancing:

- client aware of all servers via registry
- or ask server1 about other servers (redis clusters)
- effective b/c no middleman
- useful when:
    1. very few clients (internal microservices)
    2. or have lots of clients but update delays are tolerated
        - e.g. DNS b/c DNS is heavily cached (can take up to a day to propagate changes)
- gRPC handles client side load balancing natively

### Dedicated Load Balancers:

- useful for interacting w/ external clients that need quick updates
- layer between client and server
- can be made of software or hardware
- a server tells the load balancer that it's available; the load balancer runs health checks and, if the server is healthy, directs traffic to it
- if a server becomes unhealthy, the load balancer stops sending traffic to it
- load balancing algorithms:
    - round robin (fine for __stateless__ or simple request/response traffic)
    - random allocation (same)
    - least connections -> allocate new connections to the server w/ the fewest active connections
        - good for getting a new server up to speed
        - good for long-running/stateful connections b/c connections could last an hour+ and this gives a more even distribution
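
The three algorithms can be sketched in a few lines each. The class names and server labels are hypothetical, and health checks are left out to keep the sketch short.

```python
import itertools
import random


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class RandomChoice:
    """Pick any server uniformly at random."""

    def __init__(self, servers, seed=None):
        self._servers = list(servers)
        self._rng = random.Random(seed)

    def pick(self):
        return self._rng.choice(self._servers)


class LeastConnections:
    """Send each new connection to the server with the fewest active ones."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)  # ties go to the first listed
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a connection closes."""
        self.active[server] -= 1
```

With least connections, a server added later starts at zero active connections, so new connections flow to it until it catches up.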

### What Layer of OSI Model does your Load Balancer Operate at?

- layer 4 (transport) vs layer 7 (application) load balancer
- a layer 7 load balancer works at the HTTP level and a layer 4 load balancer at the TCP level
- __layer 4__: the client creates a TCP connection to the load balancer and the load balancer creates a parallel TCP connection to the chosen server
    - we can pretend like this layer 4 load balancer doesn't exist
    - almost like the client has a direct connection to the server
    - layer 4 load balancers are really high performing
        - they don't inspect the contents of packets
- __layer 7__: accepts HTTP requests and chooses a server to send each request to
    - more expensive
    - default for most cases except for WebSockets or other stateful connections, where a layer 4 load balancer is a better fit

## Regionalization

- the distance a request has to travel adds latency
- global scale system = global scale traffic
- to solve these issues:
    1. co-locate data: keep core of data + processing close together
    2. get services as close to the user as possible
        - have some cases like youtube/reddit where users from one region can access data of others so having data be closest to user is not always possible
        - but you have some options:
            1. have that data be local to where it is used the most
            2. save data locally but replicate that data to other regions
            3. use a CDN -> used as a cache to serve data quickly to users that are closest to it
                - helps reduce load on backend
                - not all data can be cached but can call the origin server (actual backend) to get that data

## Timeouts, Backoffs, Retries

- i.e. how you handle failures in your system
- timeouts let a request give up and return an error instead of waiting forever
    - have to be long enough for the request to be fulfilled but not so long that the client is left waiting
- when there's a failure/timeout the obvious thing to do is retry
    - naive approach: retry every x seconds
        - naive because the server's situation probably hasn't changed much in that time
        - also has a "bunching" behavior
        - https://encore.dev/blog/thundering-herd-problem
    - backoff approach (often exponential backoff):
        - each subsequent retry waits longer before starting
        - first retry = quick, later retries leave more time in between
        - but that "bunching" behavior still happens: clients that failed at the same time retry at the same time, just with longer gaps in between
    - jitter approach (randomness):
        - randomize retry delays which solves the "bunching" issue since multiple clients have less of a chance to call the server at the same time
        - helps distribute load
        - https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
- __gold standard for interviews__: timeouts and retries with exponential backoff and jitter
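
A minimal sketch of retries with exponential backoff and "full jitter" (each delay is drawn uniformly between 0 and the capped exponential value). `call_with_retries`, `backoff_delays`, and the default numbers are assumptions; the operation is expected to enforce its own timeout and raise when it fails.

```python
import random
import time


def backoff_delays(count, base=0.1, cap=5.0, rng=random):
    """Yield `count` delays, each uniform in [0, min(cap, base * 2**n)]."""
    for n in range(count):
        yield rng.uniform(0, min(cap, base * 2 ** n))


def call_with_retries(operation, attempts=5, base=0.1, cap=5.0,
                      sleep=time.sleep, rng=random):
    """Call operation() up to `attempts` times, sleeping a jittered delay between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(attempts - 1, base, cap, rng)
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as error:  # real code should catch only retryable errors
            last_error = error
            delay = next(delays, None)
            if delay is None:  # no retries left
                break
            sleep(delay)
    raise last_error
```

The `sleep` and `rng` parameters exist only so the behavior can be exercised without real waiting or real randomness.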

## Cascading Failures:

- mostly in senior/staff-level interviews
- a failure, or a component running at limited capacity, causes a domino effect of failures across your system
- to deal with this, you use __circuit breakers__, like the ones in a house
- in software, a circuit breaker trips when failures exceed a certain threshold
    - halts calls to the failing component but resets after a time, giving downstream components a chance to recover
    - allows failing in an "obvious" way and signals that something is failing downstream
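
A minimal circuit breaker sketch, assuming a simple consecutive-failure threshold and a fixed reset window. `CircuitBreaker`, `CircuitOpenError`, and the default values are made up for illustration; production libraries track more state than this.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call the downstream component."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Open: fail fast and obviously instead of piling on load.
                raise CircuitOpenError("circuit open: failing fast")
            # Reset window elapsed: let one trial call through ("half-open").
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # trip (or re-trip) the breaker
            raise
        # Success: close the circuit and forget past failures.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers get `CircuitOpenError` immediately, which gives the struggling component time to recover.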