Notes on "High Performance Browser Networking" (second reading, this time trying out stuff on the Raspberry Pi and/or my MacBook in the process), the good parts, with additional material from other sources.

### `traceroute`

* `traceroute` traces the route to a web address. The mechanism by which it does this is intriguing.

  The IP protocol has a TTL field, whose maximum value is 255 (one octet). The TTL is reduced by one by every router along the packet's route. When the TTL reaches 0, the router discards the packet and returns an ICMP message (Internet Control Message Protocol; an Internet standard carried inside IP that handles error and control signalling) reporting `TIME_EXCEEDED`.
  
  Network time-to-live was implemented to prevent "immortal packets" cycling between different hosts forever.
  
  `traceroute` cleverly sends probe packets with incrementally larger TTLs: 1, 2, ..., N. Since each ICMP message contains identifying information about the router that sent it (in particular, its IP address), this will, assuming a stable network topology, allow you to "illuminate" the route to the target.
  
  `traceroute` sends either an empty UDP packet or an ICMP echo request. The former is addressed to a port that is unlikely to be in use, so the intended target answers with an ICMP "port unreachable" message; the latter results in an echo reply from the intended target. Once `traceroute` gets either result, its work is done and it exits.
  
  `traceroute` output looks like so:
  
    ```
    Alexs-MacBook:browser-networking-notes alex$ traceroute google.com
    traceroute to google.com (172.217.0.46), 64 hops max, 52 byte packets
     1  www.routerlogin.com (192.168.7.1)  2.072 ms  0.987 ms  1.029 ms
     2  192.168.0.1 (192.168.0.1)  2.724 ms  3.380 ms  2.855 ms
     3  96.120.90.145 (96.120.90.145)  16.982 ms  15.098 ms  12.252 ms
     4  po-302-1209-rur01.oakland.ca.sfba.comcast.net (68.86.249.113)  13.496 ms  104.963 ms  26.555 ms
     5  be-214-rar01.santaclara.ca.sfba.comcast.net (162.151.78.93)  56.329 ms  27.718 ms  16.772 ms
     6  be-299-ar01.santaclara.ca.sfba.comcast.net (68.86.143.93)  20.916 ms  25.936 ms  22.927 ms
     7  96.112.146.18 (96.112.146.18)  27.572 ms  18.707 ms  30.109 ms
     8  * * *
     9  209.85.252.250 (209.85.252.250)  59.955 ms
        209.85.248.34 (209.85.248.34)  24.171 ms  40.714 ms
    10  108.170.243.13 (108.170.243.13)  23.903 ms  29.063 ms  30.077 ms
    11  74.125.253.151 (74.125.253.151)  22.753 ms
        74.125.253.190 (74.125.253.190)  30.559 ms
        lga15s43-in-f14.1e100.net (172.217.0.46)  25.599 ms
    ```

  Note that for certain hops, the result includes domain names in addition to IP addresses. `traceroute` performs a lightweight reverse DNS lookup on the IP addresses it finds. Specifically, it asks DNS for a PTR (pointer) record for the server. This pointer record will obviously only exist if the server owner has published such a record. Intermediate "switchboard" routers will do so; hence the three Comcast regional service providers that have a well-known name. Wide-area network and local-area network routers will not have such records associated with them.
  
  This is a "lightweight" reverse DNS lookup because a more thorough method for reverse DNS exists: traversing the entire DNS service tree in a top-down manner, from the ISP lead network center all the way down, to get to the server which is responsible for this IP address, which can then report its A record. This is an expensive operation. Described in detail in [this StackOverflow post](https://stackoverflow.com/questions/23981098/how-forward-and-reverse-dns-works).
  
  The first hop is to the local router, which injects its login page as the network address (I think this is a NETGEAR router because mine has the same login page).
  
  Note that one of the hops (hop 8, shown as `* * *`) never answered. That appears to be due to a router silently dropping the probes on the floor, or declining to send the ICMP `TIME_EXCEEDED` reply it's supposed to. It's also possible that this is occurring due to network firewall rules blocking this type of traffic.
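
The TTL probing trick described above can be sketched as a small simulation. This is only a toy model, not how the real tool is written: the `send_probe` and `trace` names are made up for illustration, no packets are sent, and the "route" is just a list of addresses borrowed from the sample output.

```python
from typing import List, Tuple


def send_probe(path: List[str], ttl: int) -> Tuple[str, str]:
    """Simulate one probe travelling along `path` with the given starting TTL.

    Every router before the destination decrements the TTL; the router that
    sees it reach 0 discards the probe and answers with TIME_EXCEEDED.
    """
    for router in path[:-1]:
        ttl -= 1
        if ttl == 0:
            return router, "TIME_EXCEEDED"
    # The probe survived every router, so the destination itself answers.
    return path[-1], "DESTINATION_REACHED"


def trace(path: List[str], max_hops: int = 64) -> List[str]:
    """Send probes with TTL 1, 2, 3, ... and record who answers each one."""
    discovered = []
    for ttl in range(1, max_hops + 1):
        responder, kind = send_probe(path, ttl)
        discovered.append(responder)
        if kind == "DESTINATION_REACHED":
            break
    return discovered


# Hypothetical three-hop route: two routers, then the target.
route = ["192.168.7.1", "192.168.0.1", "172.217.0.46"]
print(trace(route))
```

Each probe only ever reveals one hop, which is why the real tool needs one round of probes per hop.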

## TCP flow control and congestion control

* There is an initial three-way handshake:
  1. `SYN` message, in which the current machine picks a random sequence number.
  2. `SYN ACK` message from the server, which increments the random number by one, then appends its own random number.
  3. `ACK` message from your machine, which acknowledges the server's number by incrementing it by one, and uses its own incremented number as its sequence number.
  
  Only upon completing this handshake will the target server begin to return data. The implication is that every TCP connection requires a full roundtrip of latency before any data transfer can occur. This is a big part of the reason why TCP connection reuse is a thing.
  
  From our reading of "Data Intensive Applications" we know that three-way handshakes do not provide strong guarantees against data loss, but this is a simple protocol and it makes sense for an unreliable transport medium like the Internet, on which reconnect and packet resend is absolutely a thing.
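
To make the number juggling in the handshake concrete, here is a minimal sketch that builds the three messages as plain dictionaries. The `three_way_handshake` function and the dictionary layout are invented for illustration; real TCP stacks do this in the kernel, and the arithmetic wraps around at 32 bits.

```python
import random

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(rng: random.Random):
    """Return the (SYN, SYN-ACK, ACK) messages of one simulated handshake."""
    x = rng.randrange(SEQ_SPACE)  # client's random initial sequence number
    syn = {"flags": "SYN", "seq": x}

    y = rng.randrange(SEQ_SPACE)  # server's random initial sequence number
    syn_ack = {"flags": "SYN-ACK", "seq": y, "ack": (x + 1) % SEQ_SPACE}

    # The client acknowledges y + 1 and continues from its own x + 1.
    ack = {"flags": "ACK", "seq": (x + 1) % SEQ_SPACE, "ack": (y + 1) % SEQ_SPACE}
    return syn, syn_ack, ack


for message in three_way_handshake(random.Random(0)):
    print(message)
```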


* Network traffic throughput on a TCP connection is controlled by two windows: the congestion window and the receive window.

* The **receive window** provides **flow control** for the data sender and receiver, i.e. a way for these two entities to reconcile incoming traffic with their processing load. TCP packets sent from A to B are "cleared" when A receives an `ACK` message from B stating that B has received the messages. Once the sender gets this `ACK`, it is allowed to send the next `rwnd` bytes' worth of payload.
  
  To reduce its receive window, the recipient may send an `ACK` with a smaller `rwnd` set.
  
  The current maximum `rwnd` value is 1 GB. The current minimum `rwnd` value is 0 bytes, which serves as a signal to stop all traffic until the receiver sends a new `ACK` packet with a non-zero `rwnd` value.

* The **congestion window** provides **congestion control** for the underlying transport network. This is a separate concern from flow control because neither the sender nor the receiver is entirely aware of the transport capacity of the underlying network.

  The congestion window is a `cwnd` value that is only known by the data sender. Interestingly, whilst the receive window is stated in terms of bytes of data, the congestion window is stated in terms of number of packets.
  
  Congestion control is performed using exponential growth, multiplicative backoff. When the sender receives an `ACK` from the receiver stating that all `N` packets in the current window were received, the `cwnd` value is doubled. Thus the value goes from 4 packets in-flight, to 8 packets in-flight, to 16 packets in-flight, and so on. As soon as a packet loss occurs and a retransmit request is made, the `cwnd` value is halved; and the cycle begins anew.
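
The doubling and halving pattern can be sketched in a few lines. This follows the simplified description in these notes rather than any real TCP implementation (real stacks switch to slower, additive growth after a loss), and the `cwnd_trace` name and its parameters are made up for the example.

```python
from typing import List, Set


def cwnd_trace(initial_cwnd: int, loss_rounds: Set[int], rounds: int) -> List[int]:
    """Return the congestion window (in packets) used in each round trip.

    The window doubles after a fully acknowledged round and is halved after
    a round in which a loss was detected.
    """
    cwnd = initial_cwnd
    history = []
    for round_index in range(rounds):
        history.append(cwnd)
        if round_index in loss_rounds:
            cwnd = max(1, cwnd // 2)  # multiplicative backoff, never below 1
        else:
            cwnd *= 2  # exponential growth
    return history


# Start at 4 packets and pretend a loss is detected in the fourth round trip.
print(cwnd_trace(initial_cwnd=4, loss_rounds={3}, rounds=6))
```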

* The maximum number of bytes/packets in flight is always the minimum of the `rwnd` and `cwnd` values.
* Of course, true bandwidth depends on both window size and network delay: at most one window's worth of data can be delivered per round trip. If a network is slow but very reliable, very large segment sizes are better because `ACK` message transmit time is proportionately important and message retransmit is proportionally unimportant. If the opposite is true, small segment sizes are better for the reverse reasons.
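
As a back-of-the-envelope sketch of how the two windows and the round-trip time bound throughput: the function names below are invented, and the 1460-byte segment size is an assumption (a common value on Ethernet paths), used only to convert `cwnd` from packets to bytes.

```python
def max_in_flight_bytes(rwnd_bytes: int, cwnd_segments: int, mss_bytes: int = 1460) -> int:
    """Data allowed in flight: the smaller of the receive and congestion windows."""
    return min(rwnd_bytes, cwnd_segments * mss_bytes)


def max_throughput_bps(rwnd_bytes: int, cwnd_segments: int, rtt_seconds: float,
                       mss_bytes: int = 1460) -> float:
    """Upper bound on throughput: one window's worth of bits per round trip."""
    window = max_in_flight_bytes(rwnd_bytes, cwnd_segments, mss_bytes)
    return window * 8 / rtt_seconds


# A 64 KB receive window over a 125 ms round trip, with a generous cwnd.
print(max_throughput_bps(rwnd_bytes=65535, cwnd_segments=100, rtt_seconds=0.125))
```

Halving the round-trip time doubles the bound, which is why latency matters as much as window size.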

* TCP is a well-ordered protocol. TCP sockets will only serve data to applications in packet order. If there is a delay in the arrival of an early packet, the remaining packets in the segment will be blocked in a queue until the late-arriving packet arrives; only then will more data be readable from the socket. This behavior is known as **head-of-line blocking**, and it's responsible for burst randomness in reads from TCP connections that are known as "jitter".
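
Head-of-line blocking is easy to see with a toy reorder buffer. This is only a sketch of the idea; the `ReorderBuffer` class is hypothetical and numbers whole packets rather than bytes, unlike real TCP.

```python
from typing import Dict, List


class ReorderBuffer:
    """Hold out-of-order packets until every earlier packet has arrived."""

    def __init__(self) -> None:
        self.next_seq = 0  # sequence number the application is waiting for
        self.pending: Dict[int, str] = {}

    def receive(self, seq: int, payload: str) -> List[str]:
        """Accept one packet and return whatever is now readable, in order."""
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


buffer = ReorderBuffer()
print(buffer.receive(1, "b"))  # nothing readable yet: packet 0 is missing
print(buffer.receive(2, "c"))  # still blocked behind packet 0
print(buffer.receive(0, "a"))  # the late packet arrives and unblocks everything
```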


## NAT

* NAT stands for **network address translation**. A NAT box performs **IP masquerading**: it intercepts messages bound for certain public IP addresses, maps those to a corresponding private IP address, and forwards the message to that address. On the return trip, it ejects the target private IP address and re-injects the public IP address.

  NAT exists because it deals with **IPv4 exhaustion**. Without NAT, an endpoint must hold a public IP address to be able to communicate with other endpoints on the Internet, because that address must appear in its IP packet headers. With NAT, an endpoint may stay on a private IP address (i.e. not take up a public IP address); the NAT will substitute its own public IP address for the private IP address, and inject a new port number into the packet that is unique to the given endpoint, before passing the message upstream. On the return trip, the NAT will resolve the port number to the corresponding endpoint, perform the necessary header substitutions, and route the traffic accordingly.
  
  NAT allows endpoints that would otherwise need a unique public IP address to communicate with the public Internet to use non-unique private IP addresses instead. These private IP addresses come from one of three address ranges reserved specifically for this purpose, and are reused across many different endpoints on many different private networks. NAT boxes can be stacked hierarchically to obviate the need for huge blocks of IP assignments.
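
The port-mapping bookkeeping can be sketched as a pair of dictionaries. Everything here is illustrative: the `Nat` class is made up, `203.0.113.5` is a documentation-only address standing in for the NAT's public IP, and the starting port of 40000 is arbitrary.

```python
import itertools
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class Nat:
    """Toy NAT table mapping private endpoints to ports on one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self.outbound: Dict[Endpoint, int] = {}  # private endpoint -> public port
        self.inbound: Dict[int, Endpoint] = {}   # public port -> private endpoint

    def translate_outbound(self, src_ip: str, src_port: int) -> Endpoint:
        """Rewrite a private source address to the NAT's public address."""
        key = (src_ip, src_port)
        if key not in self.outbound:
            public_port = next(self._ports)
            self.outbound[key] = public_port
            self.inbound[public_port] = key
        return self.public_ip, self.outbound[key]

    def translate_inbound(self, dst_port: int) -> Optional[Endpoint]:
        """Find the private endpoint for a reply, or None if no mapping exists."""
        return self.inbound.get(dst_port)


nat = Nat("203.0.113.5")
print(nat.translate_outbound("192.168.7.20", 51000))
print(nat.translate_inbound(40000))
```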


## UDP

* UDP is an unreliable message delivery protocol. It omits connection handshakes, congestion control, flow control, receipt acknowledgement, packet retransmit, and well-ordered packet delivery. This allows for a simple, performant, lossy protocol. UDP is rarely the first choice for network engineering because the guarantees provided by TCP are useful. If you want the things that TCP offers, you can implement them on top of UDP...or you can just use TCP to begin with.

  UDP sees the most use in contexts where partial delivery as fast as possible is important. For example, it's the protocol of choice for video game multiplayer. It is left up to the server software system and the local copy of the game to reconstruct state transitions from partial information in the event of packet loss, but doing this as-soon-as-possible is important for minimizing lag.
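
For a feel of how little ceremony UDP involves, here is a small sketch that sends one datagram over the loopback interface using the standard `socket` module. The `udp_roundtrip` helper is made up for the example; over a real network the datagram could simply be lost, which is why the receiver sets a timeout.

```python
import socket


def udp_roundtrip(message: bytes) -> bytes:
    """Send one datagram to a local socket and return what was received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # UDP gives no delivery guarantee
        # No handshake: the very first packet on the wire carries the payload.
        sender.sendto(message, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(65535)
        return data
    finally:
        sender.close()
        receiver.close()


print(udp_roundtrip(b"player moved to (3, 4)"))
```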
  
* The big problem with UDP on the open Internet is NAT. The management of cache entries in NAT requires a connection state machine. TCP provides such a machine (connection handshake and connection termination), so NAT boxes know exactly when to create and remove cache entries. UDP, meanwhile, is stateless. A NAT box knows when to create a cache entry, but doesn't know when to remove it. If a NAT box removes a map entry before the data is finished being communicated, routing from the recipient to the sender will fail.
* Another major issue is that some servers may choose to block UDP traffic outright.
* There is an RFC called ICE (Interactive Connectivity Establishment) that specifies a well-known methodology for working around these issues. The protocol is:
  1. Attempt to connect point-to-point using UDP directly.
  2. If this fails, fall back to using specially-designed STUN servers for difficult hops. A STUN server sits on the public Internet and tells each peer which public IP address and port its NAT has mapped it to; the peers exchange these addresses and connect to each other through the NAT mappings, sending periodic keepalive packets so that the entries are not cleaned up while the peers are still communicating.
  3. If this fails, usually due to firewall rules that block UDP traffic completely, fall back to using TURN servers. TURN servers act as a relay; both peers connect to the TURN server, which forwards the traffic between them, tunnelling it over TCP if necessary.
  
  Note that STUN servers are preferable to TURN servers: they require an additional handshake, but the connection is still peer-to-peer. TURN servers require an additional handshake *and* must receive and rebroadcast the traffic. This increases latency, as the route followed is now indirect and is no longer peer-to-peer. It also requires the TURN server to have enough inbound and outbound network bandwidth to handle the traffic being routed.

## WSGI

* Web services consist of a **server process** (Apache, Nginx, Lighttpd) that handles the transport layer (server tier), and a **web application process** (Django, Flask, Tornado, Pyramid) that handles the logic layer (web tier). These two types of entities need some way to talk to one another: the server process needs to know which callable in the web application to hand incoming requests to, and the web application needs to know how to register its callables with the server process so that they do, indeed, get called.

  This is where **WSGI** comes in. WSGI is just a standard for providing and consuming these hooks. WSGI is a Python-specific specification, originally drawn up in PEP 333 and updated in PEP 3333. Similar standards include CGI and the Java servlet API.
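
To show how small the hook actually is, here is a framework-free sketch using only the standard library. The `application` callable follows the shape PEP 3333 describes; the `call_app` helper is made up for the example and plays the role of the server by building a fake request with `wsgiref.util.setup_testing_defaults`.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A complete WSGI application: take the request dict, start a response."""
    body = b"Hello World!"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]  # an iterable of byte strings


def call_app(app, path="/"):
    """Stand in for the server: call the app once and collect its output."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal fake request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(application))
```

Anything that can call `application(environ, start_response)` can serve the app, which is what lets the same code run under different server processes.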

* `werkzeug` is a WSGI framework, commonly paired with `flask`, though it came into being before `flask` did (the latter was supposed to be a demo of `werkzeug` actually).

* At this point, it's helpful to leaf through the [Werkzeug tutorial](https://werkzeug.palletsprojects.com/en/0.15.x/tutorial/), which builds a small URL shortener application using Redis. I didn't actually bother with writing the application myself; instead, let's look at the code snippets.

  First of all, here's how to return a basic response. The `Response` object is a callable which handles constructing the response and handing it off to the server software. Support for query string parameters in the URL is built in:
  
  ```python
    from werkzeug.wrappers import Request, Response

    def application(environ, start_response):
        request = Request(environ)
        text = 'Hello %s!' % request.args.get('name', 'World')
        response = Response(text, mimetype='text/plain')
        return response(environ, start_response)
  ```
  
* A longer code snippet:

```python
    import os
    import redis
    from urllib.parse import urlparse  # Python 3 location of the old urlparse module
    from werkzeug.wrappers import Request, Response
    from werkzeug.routing import Map, Rule
    from werkzeug.exceptions import HTTPException, NotFound
    from werkzeug.middleware.shared_data import SharedDataMiddleware  # its home since Werkzeug 0.15
    from werkzeug.utils import redirect
    from jinja2 import Environment, FileSystemLoader

    class Shortly(object):

        def __init__(self, config):
            self.redis = redis.Redis(config['redis_host'], config['redis_port'])

        def dispatch_request(self, request):
            return Response('Hello World!')

        def wsgi_app(self, environ, start_response):
            request = Request(environ)
            response = self.dispatch_request(request)
            return response(environ, start_response)

        def __call__(self, environ, start_response):
            return self.wsgi_app(environ, start_response)


    def create_app(redis_host='localhost', redis_port=6379, with_static=True):
        app = Shortly({
            'redis_host':       redis_host,
            'redis_port':       redis_port
        })
        if with_static:
            app.wsgi_app = SharedDataMiddleware(app.wsgi_app, {
                '/static':  os.path.join(os.path.dirname(__file__), 'static')
            })
        return app
```

* The application is contained in a `Shortly` instance, which initializes by creating a connection to a `Redis` instance. `Shortly` is a callable object that accepts `environ` and `start_response` inputs. `environ` is a dictionary holding the request data (query parameters and the like), and `start_response` is the callback the server supplies for beginning the response. Based on these inputs we construct and return a `Response` object (which is itself a callable which passes the information on to the server software). And bang, you have an app.

  What makes a framework (and what makes a *good* framework) is taking these basics and building nice tools on top of them, like routing maps and the like.
  
  Werkzeug seems to emphasize minimum configuration required. It doesn't require you to actually set up any servers, or even to actually run one, because it has a simple built-in development server that can be launched with `run_simple`. But to tell it to connect to a running Apache process, you need only add some configuration bits telling it where that process is, and it'll handle the rest.

## Apache

* Apache is one of the common pieces of server software, and it runs something like a third of all websites on the Internet. Let's look at how it's configured.
* First thing first, `apache2` can be installed with `apt-get` or `brew` or whatever. Once you do that, you will find a hierarchy of files laid out in the `/etc/apache2` folder on your machine. This folder contains a number of files:

    > apache2.conf: This is the main configuration file for the server. Almost all configuration can be done from within this file, although it is recommended to use separate, designated files for simplicity. This file will configure defaults and be the central point of access for the server to read configuration details.
    >
    > ports.conf: This file is used to specify the ports that virtual hosts should listen on. Be sure to check that this file is correct if you are configuring SSL.
    >
    > conf.d/: This directory is used for controlling specific aspects of the Apache configuration. For example, it is often used to define SSL configuration and default security choices.
    >
    > sites-available/: This directory contains all of the virtual host files that define different web sites. These will establish which content gets served for which requests. These are available configurations, not active configurations.
    >
    > sites-enabled/: This directory establishes which virtual host definitions are actually being used. Usually, this directory consists of symbolic links to files defined in the "sites-available" directory.
    >
    > mods-[enabled,available]/: These directories are similar in function to the sites directories, but they define modules that can be optionally loaded instead.

   The functionality of `apache2.conf` and `ports.conf` is relatively obvious (we'll get to the lexicon in just a bit). The interesting bit is `sites-available` and `sites-enabled`. First of all, notice the use of symlinks; an interesting design choice, but in keeping with the Unix philosophy I suppose.
   
   The actual **virtual host files** contained in the `sites-*` directories are specified using a configuration language that is Apache-specific, e.g. it's not XML or YAML or anything. It looks like this:
   
    ```
   <VirtualHost *:80>
    ServerAdmin admin@example.com
    ServerName example.com
    ServerAlias www.example.com
    DocumentRoot /var/www/example.com/public_html
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    </VirtualHost>
    ```
    
    `DocumentRoot` points to the directory holding the HTML (`.html`) files that will actually be served. Of course, this design limits you to static HTML files (or HTML files you edit by hand on disk). How do we get from this to the dynamic webpages of the modern Internet?
    
    Well, take CGI, a historical WSGI competitor, as an example. Apache may be told that a particular directory is set aside for CGI programs by setting a `ScriptAlias` directive in the main configuration file, then pointing to a file within that folder in the virtual host file. Apache will execute that file when it is told to return it, instead of just returning it.
    
    For WSGI, something similar but different is done. Apache's `mod_wsgi` module provides a `WSGIDaemonProcess` directive which may be plugged into the main configuration definition to configure Apache to hand off requests for certain files to a daemon process, which runs the WSGI application. This contrasts with the CGI approach in that the Apache process itself doesn't actually do any work: the WSGI daemon does. Note however that it is possible to spin off a CGI daemon, so the actual difference at the end of the day is small. Here is the code you add to your `.conf` to make this happen:
    
  ```
  WSGIDaemonProcess myproject python-path=/home/user/myproject python-home=/home/user/myproject/myprojectenv
  WSGIProcessGroup myproject
  WSGIScriptAlias / /home/user/myproject/myproject/wsgi.py
  ```
  
  Here's a handful of tutorials describing how all this works: [1](https://www.digitalocean.com/community/tutorials/how-to-configure-the-apache-web-server-on-an-ubuntu-or-debian-vps), [2](https://www.digitalocean.com/community/tutorials/how-to-set-up-apache-virtual-hosts-on-ubuntu-14-04-lts), [3](https://www.digitalocean.com/community/tutorials/how-to-serve-django-applications-with-apache-and-mod_wsgi-on-ubuntu-14-04).
  
  Apache handles running the daemon, but how do you create the necessary file? The `.wsgi` file is just a small Python script that exposes a module-level callable named `application`. Assuming you've created a valid application factory function in the previous steps, the whole file is as simple as:
  
  ```
  from yourapplication import make_app
  application = make_app()
  ```
  
  That's all there is to the file. Save it somewhere, then create an Apache configuration file for this service, and you're done. That'll look something like this:
  
  ```
      <VirtualHost *>
        ServerName example.com

        WSGIDaemonProcess yourapplication user=user1 group=group1 processes=2 threads=5
        WSGIScriptAlias / /var/www/yourapplication/yourapplication.wsgi

        <Directory /var/www/yourapplication>
            WSGIProcessGroup yourapplication
            WSGIApplicationGroup %{GLOBAL}
            Order deny,allow
            Allow from all
        </Directory>
    </VirtualHost>
  ```
  
* A few minor notes on other subjects:
  * Caching. Apache can be configured to cache pages to disk (which doesn't do anything for you for static content) or to memory (via `memcached`). This obviously speeds up serving.
  * Content negotiation. There is extremely complex content negotiation built into requests now. It is possible to request specific language versions of a page, or to request specific content types, using HTTP headers and two different content resolution algorithms (one controlled by the server, the other, by the browser). I don't know anything that uses any of this stuff, but Apache supports it.
  * Logging. Apache has a log format which I am very familiar with from working with it on PythonAnywhere. The logs may be written by Apache itself (which requires performing graceful restarts to handle log rotation) or to a pipe, which can then do whatever is necessary with the logs.
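
Since the virtual host example above logs with the `combined` format, here is a rough sketch of pulling the fields out of one such line with a regular expression. The `parse_line` helper and the field names are made up, the pattern is a simplification that does not handle escaped quotes, and the sample line is illustrative rather than taken from a real server.

```python
import re
from typing import Dict, Optional

# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Split one combined-format access log line into named fields."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


sample = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)
print(parse_line(sample))
```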

## Summary (Part 1)

The important bits so far:

* All packets have a time-to-live, which sets the maximum number of hops a packet will take.
* `traceroute` allows you to see which hosts lie along a path by experimentally probing with packets with incrementally larger time-to-live values.
* TCP connections start with a three-way handshake.
* TCP connection bandwidth is controlled by a combination of a **congestion window**, which uses an exponential growth, multiplicative backoff algorithm, and a receive window, which is set by the data receiver.
* TCP will buffer packets until late-arriving packets arrive so that they are always read out in exact order. This behavior is called **head-of-line blocking**.
* NAT combats **IPv4 exhaustion** by mapping public IP address and port numbers to private ones behind the NAT in-flight.

## TLS

* TLS stands for "transport layer security", and it is the primary form of encryption used on the Internet. The name is sometimes used interchangeably with SSL, a close predecessor that predated standardization of the spec, but which is technically not interoperable.
* TLS is implemented at the application layer: i.e. it is encapsulated in the TCP/IP or UDP/IP message itself. The transport layer has no knowledge of whether the payload it is carrying is encrypted or not.

  TLS negotiates encryption after the TCP link is established. The handshake proceeds as follows:
  
  1. The client sends a plaintext message stating which version of TLS it is using and which cipher suites it accepts.
  2. This kicks off an exchange in which the server and the client send messages back and forth negotiating the connection details.
  3. Once the server is happy with the protocol, it sends the client its certificate.
  4. Assuming the client is happy with the authenticity of the certificate (for a primer on public key infrastructure see this other notebook, ["Notes on public key infrastructure"](https://www.kaggle.com/residentmario/notes-on-public-key-infrastructure)) it generates a new symmetric (two-way) key, encrypts it using the server's public key (from the cert), and sends that payload to the server. This message constitutes encrypted traffic: as per the principles of asymmetric cryptography, only the holder of the private key (the server) may decrypt the message.
  5. The server decrypts the message and verifies the integrity of the message by checking its MAC (see next bullet point). It sends a proceed message to the client, this time one encrypted using the symmetric key.
  6. The client receives the message, decrypts it using its symmetric key, verifies its MAC, and begins encrypted symmetric-key communications.

* There are a few different things from this process worth highlighting.

  The process starts off with the use of public key infrastructure to securely transmit a symmetric key. Asymmetric cryptography, it turns out, is much slower computationally than symmetric cryptography.
  
  This process establishes a trust relationship: the client believes the server is who it says it is because the cert it has says so. No such trust relationship is established the other way, i.e. the server has no proof that the client is who the client says it is (or even any awareness of who the client is at all). If verification of the identity of the client is important, regular TLS is insufficient; you need two-way TLS.
  
  The process of downgrading from a public key cipher to a symmetric key cipher is an ingenious trick that comes up in other contexts as well. The magic of public key cryptography is that it allows establishing a secret relationship of any kind without side channels.

  One of the steps in this process is verifying MACs. A MAC is a **message authentication code**: a tag computed from the message text and a secret key which, when checked by the recipient, verifies the content of the message. There are various ways of constructing a MAC. The simplest way is to perform a cryptographic hash on the data payload, then encrypt it with the symmetric key. The message recipient can decrypt the tag, then compare it against a hash of the data payload (i.e. rehash it) to determine whether or not it is genuine. The message authentication code protects the data payload from being tampered with in-flight; i.e. it ensures that the data payload received is the same as the data payload sent. TLS is not the only system to do this; many systems do.
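
As a concrete sketch of tagging and verifying a message, here is one common MAC construction, HMAC, using Python's standard `hmac` and `hashlib` modules. Note that this swaps in a keyed-hash construction rather than the hash-then-encrypt scheme described above; the `make_tag` and `verify` helpers and the example key are made up for illustration.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message using the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = b"example symmetric key"  # placeholder; real keys come from the handshake
payload = b"GET /index.html"
tag = make_tag(shared_key, payload)

print(verify(shared_key, payload, tag))         # untouched message
print(verify(shared_key, payload + b"?", tag))  # message altered in flight
```

Without the shared key an attacker cannot produce a matching tag for a modified payload, which is what makes tampering detectable.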


* There are basically two ports that you can rely on being open on machines you are connecting to on the Internet: 80, which is reserved for HTTP traffic, and 443, which is specifically reserved for TLS (HTTPS) traffic. Attempts to use other ports will likely be filtered out by firewall rules.

  Using 443 is preferable. Using port 80 will require using the HTTP Upgrade mechanism in HTTP, which intermediaries may fail to do correctly. By contrast, traffic to port 443 is known to be eventually encrypted (after the initial handshake), so intermediaries will know to simply forward the message and not attempt to do anything else with it.
* To speed up the process of establishing a connection, TLS has an inbuilt ability to upgrade an HTTP payload to HTTPS, using TLS header fields instead of typical HTTP upgrade request header fields. This allows for simultaneous TLS negotiation and HTTPS upgrade negotiation.
* Again because performing reconnects is expensive, TLS has the option to resume past connection contexts. There are two such mechanisms: session identifiers and session tickets. The former uses a cache of session information on both the server and the client end. The latter has the client store the session credentials, encrypted with an additional secret key known only to the server; when the client presents them again, the server decrypts and verifies the session credentials, then resumes the session if they're good.

  These mechanisms equivalently allow skipping the public key phase of the negotiation, eliminating a full network round trip in the process.
  
  These mechanisms have some interesting implications. One feature of the modern web browser is that it will typically create multiple connections in order to parallelize the work of downloading the necessary resources. To avoid negotiating TLS a bunch of extra times, the browser will wait on additional connections until the first negotiation is complete and the session information cache is primed. *Then* it will open the additional connections, and leverage the expedited process to do so.
  
  The downside of the session identifier mechanism is that it requires maintaining and managing a cache.
  
  The session ticket mechanism, meanwhile, requires careful work when the server is a distributed system, since every server must hold the same ticket key. You also need to have some mechanism in place for rotating that key every so often for security purposes.
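
The cache-based flavor of resumption boils down to a lookup table on the server. The sketch below is a toy model only: the `SessionCache` class is invented, there is no eviction or expiry, and the stored "secret" is just a byte string standing in for whatever the full handshake negotiated.

```python
import secrets
from typing import Dict, Optional


class SessionCache:
    """Server-side table of previously negotiated sessions, keyed by an ID."""

    def __init__(self) -> None:
        self._sessions: Dict[str, bytes] = {}

    def full_handshake(self, negotiated_secret: bytes) -> str:
        """Remember the result of a full handshake and hand back its ID."""
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = negotiated_secret
        return session_id

    def resume(self, session_id: str) -> Optional[bytes]:
        """Return the stored secret if the ID is known, else None.

        None means the client must fall back to a full handshake.
        """
        return self._sessions.get(session_id)


cache = SessionCache()
session_id = cache.full_handshake(b"negotiated symmetric key")
print(cache.resume(session_id) is not None)  # known ID: skip the public key phase
print(cache.resume("no-such-session"))       # unknown ID: full handshake needed
```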
  
## CDNs

* CDNs, or **content distribution networks**, provide a variety of services in the intermediary network connection space, which help to make connections faster and more secure.
* They precompute an efficient distribution network on the Internet backbones they have access to and rent space on, so the routes that packets take on CDNs are known efficient routes that are likely to be faster than those taken on the open Internet.
* They rent servers in ISP data centers worldwide, and cache content on those servers, so that requests for specific data from the endpoints they have coverage of will be served with fewer hops, from a server that is much closer to you than the true origin server.
* They provide DDoS protection and other similar web management services.
* They act as endpoints for protocol negotiation. The CDN server may have permission (e.g. a certificate the user is directed to inspect) to perform e.g. a TLS handshake from its own nearby server, instead of having to route requests all the way to the origin server. This greatly reduces round trip time, and hence, delay.

## Summary (Part 2)

* TLS provides security at the application layer, i.e. above TCP/IP but below HTTP. Web traffic that is protected by TLS is known as HTTPS.
* The TLS handshake uses public key encryption (e.g. certs) to share a symmetric key between server and client, after which all messages are symmetrically encrypted. Symmetric encryption is used instead of public key encryption because it is faster.
* A key component of TLS is the **message authentication code**, or MAC, which is a secret-key encrypted data payload tag used to verify that the data payload has not been tampered with in flight.
* TLS is one-way authentication. It proves that the server is who it says it is, but does not prove that the client is who it says it is. For that you need two-way TLS.
* TLS (HTTPS) traffic should be transmitted on port 443, a port dedicated to it, instead of port 80, which is dedicated to HTTP traffic.
* TLS has mechanisms for resuming a prior connection in an expedited manner by taking advantage of encrypted client-side and/or server-side key caching.

## HTTP 1.x

* The original HTTP 0.9 spec by Tim Berners-Lee was literally small enough to fit on a napkin. Improvements to this became HTTP 1.0, which became HTTP 1.1, which after a long pause saw a raft of further improvements that make up HTTP/2. HTTP/2 solved many of the problems in HTTP 1.1, became standard and supported in 2015, and is pretty much the only connection type of relevance at the present time.

  The nice thing about being out of the transition period of HTTP 1.1 into HTTP/2 is that the optimizations necessitated by the two technologies are different. It's insightful to look at the performance hacks that were necessary in HTTP 1.1, but they're not very relevant anymore.
  
* First perf note. HTTP 1.0 required a new TCP connection, and hence a new TCP handshake, for every single resource downloaded. HTTP 1.1 introduced keep-alive by default: the socket would stay open until such time as you explicitly moved to close it.
* Second perf note. HTTP 1.x headers are plain-text encoded, which, for small data payloads, results in a *lot* of proportional message overhead.
* Third perf note. Content delivery via HTTP 1.x suffers from head-of-line blocking. Responses on a connection must be returned in the order they were requested, so one can only send packets for a resource that is later in the load order after all resources earlier in the load order have already been sent.
  
  The client *can* issue concatenated requests (e.g. multiple GETs in the same message), and the server can process those requests in parallel and buffer the resulting resources. This is known as [buffer] **pipelining**. It makes processing efficient, but the responses still have to go out in order, so it doesn't solve streaming.
  
  Solving streaming would require **multiplexing**: interleaving chunks of several responses on the same connection as they become available. The capacity for this doesn't exist in HTTP 1.x.
* Fourth perf note. To get around download time limitations created by the lack of multiplexing support, web browsers open several TCP connections per host in parallel (6 at the time of the writing of the book, in 2015), and spread their downloads across them.
* Fifth perf note. Some sites encourage even greater parallelism by spreading their resources across subdomains, e.g. `shard0.x.com`, `shard1.x.com`, etcetera. Since the connection limit is based on the fully qualified domain, this multiplies the limit by as many shards as you've got. But it obviously also creates a headache in terms of asset management.
* Sixth perf note. Asset concatenation and spriting (bundling many small files or images into one) were used to get around all this by cutting down the number of requests.
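
The head-of-line blocking problem from the third perf note can be sketched with a toy model. The numbers are hypothetical, and the model ignores transfer time and bandwidth entirely: it only tracks when each response is allowed to leave the server.

```javascript
// Time (ms) at which the server has each response ready, in request order.
// Hypothetical numbers: one slow resource followed by two fast ones.
const readyAt = [300, 10, 10];

// HTTP 1.x pipelining: responses must leave in request order, so each one
// waits for every response ahead of it in the queue.
function pipelinedDelivery(ready) {
  const delivered = [];
  let blockedUntil = 0;
  for (const t of ready) {
    blockedUntil = Math.max(blockedUntil, t);
    delivered.push(blockedUntil);
  }
  return delivered;
}

// Multiplexing: each response can be sent as soon as it is ready.
function multiplexedDelivery(ready) {
  return ready.slice();
}

console.log(pipelinedDelivery(readyAt));   // [ 300, 300, 300 ]
console.log(multiplexedDelivery(readyAt)); // [ 300, 10, 10 ]
```

The two fast resources are stuck behind the slow one in the pipelined case, which is exactly the behavior that connection pooling, sharding, and concatenation were working around.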

## HTTP/2

* HTTP/2 is radically different. To even begin to grok it we have to understand three new terms:
  * A **stream** is a bidirectional flow of bytes within an established connection.
  * A **message** is a complete sequence of frames that maps to a logical message (a request or a response).
  * A **frame** is the atomic unit of HTTP/2. It consists of a data payload and a tiny identifier header.

  HTTP/2 divides the socket connection into a bidirectional set of streams 1, 2, ..., N. Each time a stream has finished building a data payload, it ensconces that payload in a frame and adds the frame to an interleaved queue. The server sends the frames in the order in which they arrived in the queue.
  
  Header information gets its own frame, which always precedes the data frames in send order.
   
  As a result of these changes, there is no more head-of-line blocking (on the HTTP level, at least).
  
* HTTP/2 introduces a new mechanism for controlling the order of delivery: **stream priority**. Higher-priority assets will be enqueued ahead of lower-priority assets (priority queueing). This allows the browser to dictate that the most important parts of the page, e.g. layout, get rendered ahead of the least important parts of the page, e.g. ads.
* HTTP/2 introduces a new mechanism for asset handling: **server push**. The client typically determines what resources it needs by inspecting the page, but the server already knows (roughly) what assets the client will ask for. So why not push those ahead-of-time? Server push is this feature. For an example implementation, take Apache: it looks for `X-Associated-Content` headers in the response sent by the application, and acts on those assets, but also does some inference of common complementary requests automatically.
* HTTP/2 introduces a header index table, which records the headers that were set for each request. When an additional request is made on the connection, instead of rebuilding the header, the client sends the server a delta between the most recent header used and the one it needs now. Because a lot of header information is static, e.g. `User-Agent`, this saves a lot of bytes in overhead.

* You can attempt to initialize an HTTP/2 connection right from the get-go, but if the server doesn't support HTTP/2 traffic, an error will be returned. In that case you might want to fall back to HTTP 1.1.
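
The header index table idea can be sketched as a simple delta between two header sets. To be clear, this is not the actual HTTP/2 header compression format, just the underlying idea; the header values are made up, and the sketch does not handle headers that are removed between requests.

```javascript
// Sender side: keep only the fields that changed or were added since the
// previous request on this connection.
function headerDelta(previous, next) {
  const delta = {};
  for (const [name, value] of Object.entries(next)) {
    if (previous[name] !== value) delta[name] = value;
  }
  return delta;
}

// Receiver side: apply the delta on top of the headers it already has.
function applyDelta(previous, delta) {
  return { ...previous, ...delta };
}

const first = { ":method": "GET", ":path": "/index.html", "user-agent": "Mozilla/5.0" };
const second = { ":method": "GET", ":path": "/style.css", "user-agent": "Mozilla/5.0" };

const delta = headerDelta(first, second);
console.log(delta);                    // only ":path" differs
console.log(applyDelta(first, delta)); // reconstructs the second header set
```

Static fields like `user-agent` never appear in the delta, which is where the byte savings come from.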

## Summary (Part 3)

* HTTP/2 is fully multiplexed, unlike HTTP 1.1, and transfers data using interleaved multichannel buffers and a priority queue.
* HTTP/2 is almost always paired with TLS to form HTTPS connections.
* HTTP/2 includes a **stream priority** feature, which allows clients to control the priority of transferred assets.
* HTTP/2 includes a **server push** feature, which allows servers to push resources to the client in anticipation of follow-on requests.
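
Multiplexing via interleaved frames can be sketched as follows. The frame shape (`streamId` plus `data`) and the round-robin scheduling are simplifying assumptions for illustration; real HTTP/2 frames are binary and scheduling takes priority into account.

```javascript
// Split a payload into frames tagged with a stream id (hypothetical frame shape).
function toFrames(streamId, payload, frameSize) {
  const frames = [];
  for (let i = 0; i < payload.length; i += frameSize) {
    frames.push({ streamId, data: payload.slice(i, i + frameSize) });
  }
  return frames;
}

// Sender: interleave frames from several streams round-robin onto one connection.
function interleave(streams) {
  const queues = streams.map((s) => s.slice());
  const wire = [];
  while (queues.some((q) => q.length > 0)) {
    for (const q of queues) {
      if (q.length > 0) wire.push(q.shift());
    }
  }
  return wire;
}

// Receiver: use the stream id in each frame to put every stream back together.
function reassemble(wire) {
  const out = {};
  for (const frame of wire) {
    out[frame.streamId] = (out[frame.streamId] || "") + frame.data;
  }
  return out;
}

const wire = interleave([
  toFrames(1, "<html>...</html>", 4),
  toFrames(3, "body{}", 4),
]);

console.log(wire.map((f) => f.streamId).join(",")); // 1,3,1,3,1,1
console.log(reassemble(wire));
```

The small stylesheet on stream 3 finishes after its second frame instead of waiting for all of stream 1, which is the whole point.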

## XHR

* XHR, short for **XMLHttpRequest**, is the mechanism behind AJAX, and with it, the dynamic web. The XML bit was just added to the name to get it into the release of Internet Explorer that the feature was introduced in. This was codified as an expanded standard in 2008 called XHR2.

  The XHR standard defines an API that automatically handles low-level details like authentication, redirects, and caching. It also in turn allows browsers to control the security and policy constraints on the application code making the XHR request. This set of restrictions is known as the **web security model**. Its fundamental basis is the same-origin policy, along with the opt-in mechanism for relaxing it: **cross-origin resource sharing**, or CORS.

* CORS is implemented at the browser level. It relies on a restricted set of headers that can only be set by the browser, not by the application code.
* CORS divides requests into same-origin requests and cross-origin requests. The **origin** of a request is the tuple of (protocol, hostname, port). For example, `(https, residentmar.io, 443)`. Each subdomain is considered a separate origin, so e.g. `https://a.x.com` is a separate origin `(https, a.x.com, 443)` from `https://b.x.com`, which is `(https, b.x.com, 443)`.

  There are no restrictions on the elements of a same-origin request.
  
  Cross-origin requests are a different story. The server must set an `Access-Control-Allow-Origin` header in the response that includes your requesting origin protocol and hostname. I implemented a version of this in my Google Maps proxy service in the Transit Explorer project. The server can use the wildcard `*` to indicate "any".
  
  Even if the server sets the access control header, (1) user auth information, specifically HTTP authentication and cookies, is still stripped out of the request and (2) the list of methods allowed is restricted to GET, POST, and HEAD.
  
  Cookies and HTTP authentication can be re-enabled if the client sets `withCredentials` on the request and the server responds with `Access-Control-Allow-Credentials: true`.
  
  Custom headers and restricted request types require a **preflight request**. An `OPTIONS` request must be sent to the server containing `Access-Control-Request-Method` and/or `Access-Control-Request-Headers`. The server must respond with a result that contains `Access-Control-Allow-Methods` and `Access-Control-Allow-Headers` headers matching the ones requested. Then, and only then, can you continue with custom headers and request types set.
  
  Preflighting adds a roundtrip of latency to the request time, but the permissions granted are cached by the browser (for as long as the server's `Access-Control-Max-Age` header allows), so the preflight need only be performed once per route.


* The actual XHR API is split into two components: a send side, and a receive side. There is no streaming support. 
  
  Payload types for receive are `ArrayBuffer`, `Blob`, `Document`, `JSON`, and `Text`. Payload types for send are different: `DOMString`, `FormData`, and `File` are added, and there is no `Text` type.

  Progress can be monitored using asynchronous event listeners (callbacks) against `loadstart`, `error`, `abort`, `loadend` (transfer done, either successful or failed), `load` (transfer successful), and `progress` (transfer is in progress). `loadstart` and `loadend` are each fired exactly once; `load`, `error`, and `abort` are fired at most once; `progress` is potentially fired many times. The `progress` event in particular is a convenient way of tracking the progress of a send or receive: it has both a `loaded` value for bytes transferred and a `total` value for bytes total. The latter is populated from the content length header, so make sure to set that on your responses!
  
  Since there is no streaming support, data transfers are all-or-nothing. You can build a sort of interrupt-restart support on top of XHR by chunking the data yourself, but it's unreliable and tedious. If you need streaming support, use WebSockets instead.
  
* The XHR protocol design is a natural fit for client pull, but it's less of a fit for server push. In an HTTP connection, every data payload that the server sends must be initiated by a client request. Thus if there is updated data on the server side that the server wants the client to have, there is no immediate way for the server to push it across. Instead, one of two different techniques must be used.

  The first technique is **short polling**. Polling has the client send requests to the server at regular intervals. The server may respond to the requests with information on whether or not new resources are available. There is an obvious trade-off with this technique between the delay between resource availability and discovery, and the number of empty requests (ones that transfer no new data) sent.
  
  The second technique is **long polling**. Long polling has the client initiate a new long-lived request with the server (keep-alive). The server holds off on responding to the request until it has new data to send (or the request times out), at which point the client immediately opens the next request.
  
  Long polling has a shorter delay period, but can also result in bursty traffic patterns if a resource is heavily subscribed, and induces the additional load on the server of managing all of the open connections. However, long polling is to a large extent what enabled the second stage of dynamic content on the web (after simple XHR); it underpins a loose set of techniques known as [Comet](https://stackoverflow.com/questions/15966813/node-js-and-comet). It was how Facebook Chat was originally implemented in 2008, for example.
  
  Short polling is still used for certain appropriate workloads, but today long polling has been replaced by the server-sent events and WebSocket APIs.
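
The origin tuple and the preflight rule from the CORS notes earlier in this section can be sketched as below. This is a simplification under stated assumptions: `originOf`, `sameOrigin`, and `needsPreflight` are hypothetical helper names, and real browsers also look at header values such as `Content-Type` when deciding whether to preflight.

```javascript
const DEFAULT_PORTS = { "http:": "80", "https:": "443" };

// Reduce a URL to its origin tuple: (protocol, hostname, port).
function originOf(url) {
  const u = new URL(url);
  return [u.protocol.replace(":", ""), u.hostname, u.port || DEFAULT_PORTS[u.protocol]];
}

function sameOrigin(a, b) {
  return originOf(a).join("|") === originOf(b).join("|");
}

// Methods that do not trigger a preflight, per the notes above.
const SIMPLE_METHODS = ["GET", "POST", "HEAD"];

// Simplified rule: cross-origin requests need a preflight when they use a
// restricted method or carry custom headers.
function needsPreflight(pageUrl, requestUrl, method, customHeaders = []) {
  if (sameOrigin(pageUrl, requestUrl)) return false;
  return !SIMPLE_METHODS.includes(method) || customHeaders.length > 0;
}

console.log(originOf("https://a.x.com/page"));
console.log(sameOrigin("https://a.x.com/page", "https://b.x.com/page"));
console.log(needsPreflight("https://a.x.com/", "https://b.x.com/api", "PUT"));
```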

## Server-sent events

* Server-sent events (SSE) are a browser-optimized API that improves upon XHR long-polling.
* It's a very very simple API. You pass the URL of the stream to a new `EventSource` object, which opens the connection and allows you to set callbacks for all events or for events with a specific event type. The callbacks are run as the events arrive. The browser reconnects automatically when the connection drops, so ending the stream for good requires the client to call `close()`; one way to signal this is for the server to send a final event with an `id` of `"CLOSE"`.
* Data received in XHR is buffered in memory and only returned on complete success. Data in SSE is streamed to the listener as it comes in, so there is minimal buffering. This greatly reduces the client-side memory footprint.
* The web security model is enforced using the same CORS policy controls as in XHR.


* The wire format is similarly easy to understand. A server-sent event stream is initiated by a client request which has an `Accept: text/event-stream` header, to which the server responds with a `Content-Type: text/event-stream` header. The server then sets a client reconnect interval, before serving the data. Example stream:

    ```
    HTTP/1.1 200 OK
    Connection: keep-alive
    Content-Type: text/event-stream
    Transfer-Encoding: chunked

    retry: 15000

    data: Simple string

    data: {"message": "JSON payload"}

    event: foo
    data: Message of type "foo"

    id: 42
    event: bar
    data: Multiline message of
    data: type "bar" and id "42"

    id: 43
    data: Last message
    ```

  Notice that messages need not have IDs or event types. Line breaks are significant: a blank line ends a message, and consecutive `data:` lines form a single multiline message. The individual lines are buffered by the data layer until the full message is available, and only then is it handed to the listener.
  
  
* All that taken into account, SSE has its limitations.

  One is that it is exclusively a server push protocol, i.e. the stream is one-way. Two-way communication requires e.g. applying XHR in the reverse direction.
  
  Another important limitation is that SSE streams text types, not bytes. You can hack bytes into it but it's inefficient at this task. However, SSE can be used to set up alerts for the availability of byte-based assets, and then the client can use regular XHR to pull those bytes down locally. This *does* add an extra round trip of latency, however; so if fetching raw byte data is a priority in your use case, the more complex but more powerful web socket API is preferable.
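
The wire format from the example stream above is simple enough to parse by hand. Here is a minimal sketch of such a parser; `parseEventStream` is a hypothetical helper, it operates on a complete string rather than a live stream, and it skips corner cases that a browser's `EventSource` handles (comment lines and fields without a value are simply ignored).

```javascript
// Parse the body of a text/event-stream response into a list of events.
function parseEventStream(text) {
  const events = [];
  let retry = null;
  let current = { id: null, event: "message", data: [] };

  for (const line of text.split("\n")) {
    if (line === "") {
      // A blank line ends the message; dispatch it only if it carries data.
      if (current.data.length > 0) {
        events.push({ id: current.id, event: current.event, data: current.data.join("\n") });
      }
      current = { id: null, event: "message", data: [] };
      continue;
    }
    const colon = line.indexOf(":");
    if (colon <= 0) continue; // comment or valueless field: ignored in this sketch
    const field = line.slice(0, colon);
    const value = line.slice(colon + 1).replace(/^ /, "");
    if (field === "data") current.data.push(value);
    else if (field === "event") current.event = value;
    else if (field === "id") current.id = value;
    else if (field === "retry") retry = Number(value);
  }
  return { retry, events };
}

const stream = [
  "retry: 15000",
  "",
  "data: Simple string",
  "",
  "event: foo",
  'data: Message of type "foo"',
  "",
  "id: 42",
  "event: bar",
  "data: Multiline message of",
  'data: type "bar" and id "42"',
  "",
].join("\n");

console.log(parseEventStream(stream));
```

Untyped messages get the default event type `message`, which is why they end up in the catch-all `onmessage` callback on the client.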

## WebSockets

* WebSockets is a new standard for bidirectional communication of both byte and text data. WebSockets isn't meant to replace the other protocols previously covered, but it does have the most advanced capacities of the lot, and seems to be the newest standard. Web sockets are designed to be close to the socket interface available in programming languages like Python (sockets, by the way, are covered in some detail in a section above).

  One important aspect of WebSockets is that it doesn't use HTTP or HTTPS as its Internet protocol (at least, not beyond the initial HTTP/S Upgrade handshake). Instead, WebSocket connections occur over `ws:` or `wss:` protocols (the latter is the secure version of WebSockets; which means, yes, TLS): e.g. `ws://foo.com/sock`.
  
  This has the important implication that [WebSockets traffic is not subject to CORS](https://blog.securityevaluators.com/websockets-not-bound-by-cors-does-this-mean-2e7819374acc). The initial HTTP Upgrade workflow is not covered by CORS&mdash;CORS only applies to data payloads, and switching protocols, an interaction which cannot contain any user data, is not covered by CORS. Then, once the protocol is switched, CORS isn't applicable because CORS is only applied to HTTP/S traffic, and the traffic is now WebSocket instead.
  
  It's thus mandatory to move private data protections up to the application layer, as it's not possible to get away with just using CORS, as is true of certain HTTP/S workflows. The simplest way to achieve this is to check the user-immutable `Origin` header that is set by the browser on the outgoing traffic.


* The API itself is intentionally designed to be very, very close to the API used by server-sent events. A primary difference is that there are just two types: `Text` and (binary) `Blob`. Binary data can be received as an `ArrayBuffer` instead of a `Blob` by setting `binaryType = "arraybuffer"` on the socket.

  There is now an XHR-like asynchronous `send` operation which can similarly accept any of the three-ish data types and send them to the server. Since this method is asynchronous, it's important to keep the amount of buffered memory in mind; this can be accessed at `WebSocket.bufferedAmount`.
  
  There is a concept of a subprotocol. The client can advertise a set of named subprotocols it accepts (e.g. protobuf or whatever), which the server can pick from. The server's choice is exposed to the client in the `WebSocket.protocol` attribute. The client can then check this value and decide what to do at `onopen` time.


* The wire format is binary (unlike, say, SSE). It suffers from a couple of limitations: no built in compression, and no multiplexing (the latter meaning that WebSocket has the same head-of-line blocking problems that HTTP 1.1 had that HTTP/2 solved). There are extensions addressing both these problems, but it's not clear to me on immediate search what the standard of support for these extensions is at the present time (update this section later).

  The other technical element of WebSockets is the negotiation process. Currently, there is no way of negotiating a WebSockets connection except by using the HTTP upgrade workflow. Specific headers are used:
  
  ```
  GET /socket HTTP/1.1
  Host: thirdparty.com
  Origin: https://example.com
  Connection: Upgrade
  Upgrade: websocket
  Sec-WebSocket-Version: 13
  Sec-WebSocket-Key: BUNCHA_HEXADECIMAL_VALUES
  Sec-WebSocket-Protocol: appProtocol, appProtocol-v2
  Sec-WebSocket-Extensions: x-webkit-deflate-message
  ```

  `Sec-WebSocket-Protocol` is what is used for subprotocol negotiation. The server provides the subprotocol chosen in the `Sec-WebSocket-Protocol` field of the response.


* The primary advantage of WebSockets is flexibility and bidirectionality. The primary disadvantage is complexity: you have to invest in your own efficient compressed send format, there's no multiplexing so you need to manage large blobs carefully lest they block the rest of your stream, and there's no resource prioritization built-in so you have to handle that concern yourself as well.

  In general, you shouldn't reach for WebSockets unless you need (1) bidirectional traffic or (2) one-way blob traffic that is latency-critical and doesn't have a predictable time pattern.
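
The server's half of the upgrade handshake can be sketched as below. The accept-key derivation (SHA-1 over the client key plus a fixed GUID, base64 encoded) and the sample key come from RFC 6455 rather than from the notes above; `chooseSubprotocol` and `upgradeResponse` are hypothetical helper names, and the sketch builds header lines only, without touching a socket.

```javascript
const crypto = require("crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key.
function acceptKey(secWebSocketKey) {
  return crypto.createHash("sha1").update(secWebSocketKey + WS_GUID).digest("base64");
}

// Pick the first subprotocol offered by the client that the server supports.
function chooseSubprotocol(offeredHeader, supported) {
  const offered = offeredHeader.split(",").map((s) => s.trim());
  return offered.find((p) => supported.includes(p)) || null;
}

// Build the response header lines for a successful upgrade.
function upgradeResponse(requestHeaders, supported) {
  const lines = [
    "HTTP/1.1 101 Switching Protocols",
    "Upgrade: websocket",
    "Connection: Upgrade",
    "Sec-WebSocket-Accept: " + acceptKey(requestHeaders["sec-websocket-key"]),
  ];
  const chosen = chooseSubprotocol(requestHeaders["sec-websocket-protocol"] || "", supported);
  if (chosen) lines.push("Sec-WebSocket-Protocol: " + chosen);
  return lines;
}

const response = upgradeResponse(
  {
    "sec-websocket-key": "dGhlIHNhbXBsZSBub25jZQ==", // sample key from RFC 6455
    "sec-websocket-protocol": "appProtocol, appProtocol-v2",
  },
  ["appProtocol-v2"]
);
console.log(response.join("\n"));
```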

## Summary (Part 4)

* XHR, SSE, and WebSockets are the three web standards for communicating text and/or blob data between a server and a client.
* XHR, short for **XMLHttpRequest**, is the oldest, and the bedrock on which the early AJAX-based dynamic web was built. XHR allows a client to fetch data in text or blob formats from a web endpoint, using an event-driven callback-based API.
* XHR buffers the entire response received in-memory. This (1) increases the memory footprint of the transfer and (2) requires a redo if there is a failure.
* XHR solves client pull well, but requires either **short polling** (polling at intervals) or **long polling** (polling on a persistent connection) for server push.
* **Server-sent events** is an API specifically designed for server push, with a simple interface.
* Server-sent events allow pushing text, but are not designed for (and not efficient at) pushing bytes.
* **WebSockets** is an API for bidirectional communication that supports both push and pull and both text and binary data.
* WebSockets uses its own Internet protocol, `ws` or `wss` (with TLS), which is not HTTP. Nevertheless, the HTTP upgrade handshake is the only way to enter a WebSockets connection.
* The WebSockets wire format is binary.
* In summary: use XHR for client pull. Use XHR with periodic short polling for server push when latency doesn't matter and object size is small. Use server-sent events for server push when latency matters and/or object size is potentially non-small, and data type is text or small binary. Use WebSockets for bidirectional push/pull, or when data type is nontrivial binary.
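
The trade-off between short and long polling summarized above can be put into numbers with a toy model. Everything here is an assumption made for illustration: a single update becomes available at `updateAt` milliseconds, network latency is ignored, and a long-poll request that times out is immediately re-issued.

```javascript
// Short polling: the client asks every `interval` ms, starting at t = 0.
// Returns how many requests were sent and how stale the update was on discovery.
function shortPoll(updateAt, interval) {
  let requests = 0;
  let t = 0;
  while (true) {
    requests += 1;
    if (t >= updateAt) return { requests, delay: t - updateAt };
    t += interval;
  }
}

// Long polling: one request is held open until the update arrives, and is
// re-issued each time it hits the server-side `timeout`.
function longPoll(updateAt, timeout) {
  const requests = Math.floor(updateAt / timeout) + 1;
  return { requests, delay: 0 };
}

console.log(shortPoll(2500, 1000)); // { requests: 4, delay: 500 }
console.log(longPoll(2500, 30000)); // { requests: 1, delay: 0 }
```

Shrinking the short-polling interval cuts the delay but multiplies the empty requests; long polling avoids both at the cost of the server holding a connection open per client.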

## WebRTC

* The last chapter of the book discusses WebRTC, which is the current best format for transferring real-time communications data like audio, video, VoIP, etcetera. This is less relevant to my work, and hugely complex, so I skipped this bit.

## Web workers

* Web workers allow running code in background threads that don't interfere with user-facing rendering. They are designed for long-running, compute-heavy workloads that you don't want to contest for resources with the user-facing workload, and provide significant isolation for such tasks.


* Basic workers are spawned by, and dedicated to, the specific page (and page access) that created them (see the later section on shared workers). The actual worker spin-up process is unusual in that it requires you to specify the URL of the script defining the worker process:

    ```
    var myWorker = new Worker('worker.js');
    ```

  This feels weird, but makes a lot of sense...if you are running compute jobs in a background thread, it doesn't make sense to load the current thread down with the additional code and assets necessary for that job but not for the caller job.
  
  Note that the data communicated via messages is copied on push; there is no concept of shared memory.
  
  Each worker is a dedicated thread on the OS level.

* Communication is handled via `postMessage`, which posts a message from the main thread to the worker (or vice versa). The worker has a well-defined event handler: `onmessage`; this function is the entrypoint to worker processing of the messages that hit the worker. In the main thread, these methods need to be hung off of the responsible worker: e.g. `Worker1.postMessage`. In the worker thread they are global methods because the worker itself is the global context.

* Workers have access to a globally-scoped `importScripts('foo.js');` function, which imports and executes other scripts. Globally exported values within those scripts are then available in the global scope in the current script.

* Workers can spawn subworkers.

* Shared workers (`SharedWorker`), which can be accessed from several parent threads, are also possible. The primary API difference between a shared worker and dedicated worker is that a shared worker has to be accessed via its `port` object (a message port) rather than directly.



* In general the [section of the MDN reference on web workers](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers) is a great reference on this subject.
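
The copy-on-send behavior of `postMessage` noted above can be demonstrated without a browser. This sketch uses a hypothetical in-process stand-in (`makeEndpoint`), not the real `Worker` API: it delivers synchronously, whereas a real worker delivers asynchronously on another thread. It assumes a runtime with a global `structuredClone` (recent browsers and Node versions), which is the same copying algorithm `postMessage` uses.

```javascript
// Hypothetical stand-in for a worker endpoint. A real worker would be created
// with `new Worker('worker.js')` and would receive messages in its `onmessage`.
function makeEndpoint(onmessage) {
  return {
    postMessage(data) {
      // Messages are copied (structured clone), never shared by reference.
      onmessage({ data: structuredClone(data) });
    },
  };
}

const received = [];
const worker = makeEndpoint((event) => received.push(event.data));

const job = { numbers: [1, 2, 3] };
worker.postMessage(job);
job.numbers.push(4); // mutate the sender's object after posting

console.log(job.numbers);         // sender sees four numbers
console.log(received[0].numbers); // the receiver's copy still has three
```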

## Practical support of HTTP/2

* Web browsers will not connect to HTTP/2 endpoints unless the traffic is encrypted via TLS. I.e. your options are plain HTTP 1.1, or HTTPS (over which HTTP/2 can be used).