# Communication Protocols
Sending data between computers over the internet is a complex process, particularly because of the wide variety of operating systems and hardware involved. To ensure reliable data exchange, computers follow established rules known as *protocols* that govern how they communicate with one another.

Protocols can be classified in one of two ways:

**Stateful** protocols track and retain information about the state of a connection. They are a bit like a telephone call: a constant connection between the two parties must be established before communication can occur. In stateful protocols it is the server's job to retain the session information.

**Stateless** protocols conversely do not retain information about the state of a connection. Consider an SMS text message: the receiver's availability is not confirmed before the message is sent. There is no confirmation from the receiving device to the sending device that the message has been received.
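
To make the distinction concrete, here is a minimal sketch in Python. The names `stateless_handler` and `StatefulServer` are hypothetical and exist only for illustration; they do not correspond to any real protocol implementation.

```python
def stateless_handler(request: dict) -> str:
    """Stateless: everything needed to answer is carried in the request itself."""
    return f"Hello, {request['user']}! You asked for {request['item']}."


class StatefulServer:
    """Stateful: the server remembers each client between messages."""

    def __init__(self) -> None:
        self.sessions: dict[str, dict] = {}

    def connect(self, client_id: str, user: str) -> None:
        # Session information is stored on the server when the connection opens
        self.sessions[client_id] = {"user": user}

    def handle(self, client_id: str, item: str) -> str:
        # Raises KeyError if the client never connected
        session = self.sessions[client_id]
        return f"Hello, {session['user']}! You asked for {item}."


stateless_reply = stateless_handler({"user": "Ada", "item": "index.html"})

server = StatefulServer()
server.connect("client-1", "Ada")
stateful_reply = server.handle("client-1", "index.html")
print(stateless_reply == stateful_reply)
```

Both calls produce the same reply, but the stateless handler needs the user's name in every request, whereas the stateful server looks it up from the session it stored earlier.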


#### Packets
Before exploring communication protocols we first need to revisit *packets*. Packets are small segments of a larger message which are sent over the internet separately and then reassembled at the destination to display the data. Individual packets may take different routes, so that many packets can be transmitted over the same networking equipment simultaneously.
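
The idea of splitting and reassembling a message can be sketched in a few lines of Python. This is a simplified illustration, not how any real protocol formats its packets: the helper names `split_into_packets` and `reassemble` are hypothetical, and the sequence numbers stand in for the header information that real packets carry.

```python
import random


def split_into_packets(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split a message into (sequence_number, chunk) pairs."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Sort packets by sequence number and join their payloads."""
    return b"".join(chunk for _, chunk in sorted(packets))


message = b"Packets may arrive out of order."
packets = split_into_packets(message, size=8)
random.shuffle(packets)  # simulate packets taking different routes
restored = reassemble(packets)
print(restored == message)
```

Because every packet carries its own sequence number, the destination can rebuild the original message regardless of the order in which the packets arrive.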



Protocols standardise the steps of the packet transfer process to ensure these packets are disassembled and reassembled properly at either end of the transfer. These are the main protocols you need to know about:

### HTTP (HyperText Transfer Protocol)
HTTP is a protocol used by web browsers to ask for the information they need to load a website. It involves a *request* from the browser and a *response* from the server hosting the website. HTTP is located at layer 7 (the application layer) of the OSI model. Revisit [this](LINK TO NETWORKING OVERVIEW NOTEBOOK) notebook to remind yourself about the OSI model.


HTTP is a **stateless** protocol because both the request and response contain all of the information they need to fulfill their purpose; there is no need for the server to retain any state information about the client.

The request is formed of:

- **The method:** `GET` and `POST` are the two most common HTTP methods; they retrieve data from a website and send data to a website, respectively
- **The version:** HTTP/3 is the most recent version
- **A request header:** Metadata of the request such as authentication details
- **A request body:** The data being sent to the server, if any. A `GET` request typically has no body

The response from the server then consists of:

- **A status code:** A three-digit number that tells us whether the request succeeded, for example `200` (OK) or `404` (Not Found)
- **A response header:** Core information such as the language and format of the data being sent
- **A response body:** The actual requested information. In most web requests, this is HTML data that a web browser will translate into a webpage.
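
The parts listed above can be seen in the raw text of an HTTP/1.1 exchange. The sketch below builds a request and parses a hand-written sample response without touching the network. The helpers `build_request` and `parse_response`, the host `example.com`, and the sample response are all assumptions made for illustration. HTTP/1.1 is used because it is plain text; later versions carry the same information in a binary format.

```python
def build_request(method: str, path: str, host: str, body: str = "") -> str:
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Length: {len(body.encode())}")
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_response(raw: str) -> tuple[int, dict[str, str], str]:
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = dict(line.split(": ", 1) for line in header_lines)
    return status_code, headers, body


request = build_request("GET", "/index.html", "example.com")

# A hand-written sample response, not output from a real server
sample_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Language: en\r\n"
    "\r\n"
    "<html><body>Hello</body></html>"
)
status, headers, body = parse_response(sample_response)
print(status, headers["Content-Type"])
```

In practice a library such as Python's built-in `http.client` handles this formatting and parsing for you; the sketch only exposes the structure underneath.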

The S in HTTPS stands for secure: it means the transferred data is protected by a layer of encryption. If you want to learn more about HTTP and HTTPS, as well as how they relate to APIs, please refer to [this](https://colab.research.google.com/github/AI-Core/Content-Public/blob/main/Content/units/Cloud-and-DevOps/0.%20APIs%20%26%20Requests/0.%20Basics%20of%20APIs%20and%20Communication%20Protocols/Notebook.ipynb) notebook.

### TCP (Transmission Control Protocol)
TCP is a **stateful** protocol which operates on layer 4 (the transport layer) of the OSI model. It uses a three-step process, known as the *three-way handshake*, to establish a connection between the source and the destination before any data is transferred:

1. Firstly, the source sends an initial request to start the conversation, called a *SYN* (synchronisation) packet
2. Then the target responds with a *SYN/ACK* (synchronisation/acknowledgment) packet to agree to the process
3. Finally, the source sends an *ACK* packet to confirm. Then the actual data can be sent.

Together with the acknowledgements and retransmissions TCP uses during the transfer, this process ensures reliable, ordered delivery of data. The trade-off is extra latency, since the handshake must complete before any data is sent. TCP is well-suited to the following use cases:

- Email delivery
- Database access: TCP provides reliable, ordered transmission of queries and database responses
- Web browsing: A TCP connection must be established between the client and server before the HTTP request and response process can begin. 
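
As a rough illustration, the sketch below opens a TCP connection between a client and a small echo server on the same machine using Python's `socket` module. The handshake described above is carried out by the operating system when the client connects, so it does not appear explicitly in the code. The echo server, the loopback address, and the message are assumptions chosen for the example.

```python
import socket
import threading


def echo_once(server: socket.socket) -> None:
    """Accept one connection and echo back whatever arrives."""
    conn, _addr = server.accept()  # returns an already-established connection
    with conn:
        conn.sendall(conn.recv(1024))


# SOCK_STREAM selects TCP; port 0 asks the OS for any free port
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

thread = threading.Thread(target=echo_once, args=(server,))
thread.start()

# create_connection returns only after the connection has been established
with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
    client.sendall(b"hello over TCP")
    reply = client.recv(1024)

thread.join()
server.close()
print(reply)
```

Note that the client cannot send anything until the connection exists, which is the "phone call" behaviour of a stateful protocol.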

### UDP (User Datagram Protocol)
UDP is a simpler protocol in which no formal connection is established: the sender transmits packets (called *datagrams*) without a handshake. It is a **stateless** protocol, and is also sometimes referred to as being *connectionless*. This means that data can be transferred very quickly, but this comes at the cost of reliability, because lost packets are not detected or resent.

UDP is commonly used in time-sensitive communications where occasionally dropping packets is better than waiting. The following cases all prioritise speed of data transfer, but are designed to handle some level of packet loss:

- Online gaming
- Live video streaming
- Voice and video calls
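
For comparison with TCP, here is a minimal sketch of sending a single UDP datagram between two sockets on the same machine. The loopback address and the payload `b"frame 1"` are assumptions for illustration. On a real network the datagram could be lost, which is why the receiver sets a timeout rather than waiting indefinitely.

```python
import socket

# SOCK_DGRAM selects UDP; port 0 asks the OS for any free port
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(5)  # give up if nothing arrives
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# No connection is set up first: the datagram is simply sent to the address
sender.sendto(b"frame 1", ("127.0.0.1", port))

data, sender_address = receiver.recvfrom(1024)
sender.close()
receiver.close()
print(data)
```

There is no `listen`, `accept`, or `connect` step here, and the sender receives no confirmation that the datagram arrived.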

## TCP/IP Model
The TCP/IP model is a more condensed, practical counterpart to the OSI model. The OSI model was developed as a protocol-independent framework for understanding the steps of the connection process between two devices. The TCP/IP model, by contrast, is built around the specific protocols used on the internet, namely TCP and IP, and addresses practical communication challenges.

The TCP/IP model groups together the top three layers of the OSI model into *Application,* and the bottom two layers into *Network Access.*

<p align=center> <img src=images/TCP-IP-model.png width=400 height=300> </p>

**1. Network Access:** This layer handles the physical equipment and links that connect the source device to the network.

**2. Internet:** The internet layer controls network traffic flow and facilitates routing, to ensure data reaches its intended destination. This is the IP component of the TCP/IP model.
    
**3. Transport:** The transport layer creates a connection between the devices. It handles network error checking and is responsible for choosing how the message is divided into packets before transfer. This layer is the TCP component of the TCP/IP model; UDP also operates on this layer.

**4. Application:** Finally, the application layer consists of the actual group of applications that let the user access the network. HTTP operates on this layer.
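
The relationship between the two models can be summarised as a simple lookup table. The sketch below is only a study aid: the dictionary names `OSI_TO_TCP_IP` and `PROTOCOL_LAYER` and the helper `tcp_ip_layer` are hypothetical, and the mapping follows the grouping described in this section.

```python
# The seven OSI layers (top to bottom) mapped onto the four TCP/IP layers
OSI_TO_TCP_IP = {
    "Application": "Application",
    "Presentation": "Application",
    "Session": "Application",
    "Transport": "Transport",
    "Network": "Internet",
    "Data Link": "Network Access",
    "Physical": "Network Access",
}

# Where the protocols covered in this notebook sit in the TCP/IP model
PROTOCOL_LAYER = {
    "HTTP": "Application",
    "TCP": "Transport",
    "UDP": "Transport",
    "IP": "Internet",
}


def tcp_ip_layer(osi_layer: str) -> str:
    """Return the TCP/IP layer that contains the given OSI layer."""
    return OSI_TO_TCP_IP[osi_layer]


print(tcp_ip_layer("Session"))
print(sorted(set(OSI_TO_TCP_IP.values())))
```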

## Key Takeaways

- Protocols can be stateful or stateless. Stateful protocols track and retain information about the connection; stateless protocols do not.

- HTTP is a stateless protocol used by web browsers to load websites; it consists of a request and response process between the client and server
- TCP is a stateful protocol which is used for reliable data delivery because of its robust three-way handshake process
- UDP is a connectionless, stateless protocol perfect for instances where speed of data transfer is prioritised over reliable packet delivery
- The TCP/IP model is a practical framework for implementation of these protocols and troubleshooting of network connection issues