# The TCP/IP Stack and Transport-Layer Protocols
## Overview
### What You'll Learn
In this section, you'll learn:
1. The infrastructure of the Internet
1. How it pertains to socket programming
1. The differences between the two most popular socket types -- TCP and UDP

### Prerequisites
Before starting this section, you should have an understanding of:
1. [Socket Programming](https://github.com/HackBinghamton/PythonWorkshop)

### Introduction
The Internet is a remarkably complex beast, but thankfully it's been designed in a way that we can abstract it out into just a few moving parts.

## The TCP/IP Stack

The main moving parts of the Internet can be broken up into just five "layers" that we have to keep in mind. Each layer is responsible for a different aspect of getting data from point A to point B, from deciding how to put binary data on an Ethernet cable to deciding how to send image data for a website.

![TCP/IP Stack Diagram](tcp-ip-stack.png)

The TCP/IP stack is made up of the following layers, from closest to the hardware to farthest:
1. **Physical Layer** - How to physically put bits onto a wire (or radio signal), and how to read them back off (e.g. 802.11)
2. **Link Layer** - How to choose which device on a local area network to send data to (e.g. MAC)
3. **Network Layer** - How to route data between devices that may be on opposite sides of the world (e.g. IP)
4. **Transport Layer** - How to get data from a program on one machine to a program on another, optionally making sure it arrives in one piece (e.g. UDP/TCP)
5. **Application Layer** - How our programs format the data sent from point A to point B, specific to the application being run (e.g. HTTP when using Firefox)

Generally, there's no need to worry about the intricacies of the physical and link layers -- they're taken care of for you and you never need to interface with them when writing networked programs.

However, the other layers are of utmost importance. The network layer has important implications for how we host servers and how visible computers are to one another. The transport layer can have massive impacts on how reliable connections are, and how fast we can send data over them. Finally, the application layer is where we decide how we want our programs to interact.

*Note:* There's a more "complete" model of the Internet called the OSI Model -- it has 7 layers, but several of them are rarely relevant. Here's [an article from Cloudflare](https://www.cloudflare.com/learning/ddos/glossary/open-systems-interconnection-model-osi/) that explains it in detail.

## Transport-Layer Protocols (TCP and UDP)

**Don't worry, we'll be checking out the network and application layers in the next two sections!**

When creating a Python socket in the previous section, we used the `socket.SOCK_STREAM` constant:

```python
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
```

This argument sets the socket type, which for `AF_INET` sockets determines the transport-layer protocol we'd like to use. In this case, `socket.SOCK_STREAM` gives us TCP, but there are others that we can choose from, as listed in [the `socket` documentation](https://docs.python.org/3/library/socket.html#constants). The other one we'll commonly use is `socket.SOCK_DGRAM` for UDP.
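As a quick illustration of the two constants, here is a minimal sketch that creates one socket of each type and checks what it got. The variable names are just for this example.

```python
import socket

# SOCK_STREAM on an AF_INET socket selects TCP.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# SOCK_DGRAM on an AF_INET socket selects UDP.
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Each socket remembers the type it was created with.
tcp_type = tcp_sock.type
udp_type = udp_sock.type
print("TCP socket type:", tcp_type)
print("UDP socket type:", udp_type)

# Always close sockets when you're done with them.
tcp_sock.close()
udp_sock.close()
```

Nothing is sent over the network here; the sockets are only created and closed.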

Let's dive into what these weird "UDP" and "TCP" acronyms mean!

### User Datagram Protocol (UDP)

UDP is the simplest transport-layer protocol, and it works a lot like postal mail!

With postal mail, we write our letter, and then send it in an envelope marked with our address, as well as the address of the recipient. We drop it off for shipping, and it gets sent off to the person we specified. However, we don't *really* know if our letter ever made it to the recipient (unless we pay for tracking and confirmation) -- it could get lost in the mail, fall off the truck, or get stolen and we'd never know the difference.

All of this applies to UDP as well. We take our data, provide an IP address and port, and then send it. We get no confirmation of whether the message was ever received. If we want some sort of confirmation, we have to implement it ourselves.
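To give a feel for what "implementing it ourselves" might look like, here is a rough sketch of a send-and-wait-for-acknowledgement helper running over the loopback address. The function names `send_with_ack` and `ack_once`, the `b"ACK"` reply, and the retry and timeout values are all assumptions made up for this example, not part of any standard. It uses the UDP socket methods `.sendto()` and `.recvfrom()`.

```python
import socket
import threading

def send_with_ack(sock, message, addr, retries=3, timeout=0.5):
    """Send message, then wait for b"ACK". Resend on timeout.

    Returns True if an acknowledgement arrived, False otherwise.
    """
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(message, addr)
        try:
            reply, _ = sock.recvfrom(1024)
        except socket.timeout:
            continue  # no reply in time: assume the message was lost, resend
        if reply == b"ACK":
            return True
    return False

def ack_once(sock):
    """Hypothetical receiver: wait for one datagram and acknowledge it."""
    _, sender_addr = sock.recvfrom(1024)
    sock.sendto(b"ACK", sender_addr)

# Receiver bound to a free port on this machine (port 0 = let the OS pick).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(5.0)
worker = threading.Thread(target=ack_once, args=(receiver,))
worker.start()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
delivered = send_with_ack(sender, b"important", receiver.getsockname())
worker.join()
print("delivered:", delivered)

sender.close()
receiver.close()
```

A real protocol would also need to handle duplicates (the receiver may see the same message twice if only the acknowledgement was lost), which is part of the work TCP does for you.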

You might wonder why someone would ever use UDP if there's a risk of loss. However, there are some pros to keep in mind. Some applications don't need every single piece of data to make it through, like **video streaming**. Also, some applications need to get updates *fast* and don't have time to correct for loss, like **multiplayer gaming**.

**Important Note:** There are subtle differences to keep in mind when working with UDP sockets. Because there's no connection, you typically use the `.sendto()` and `.recvfrom()` methods, which take or return the other side's address with every message, instead of `.send()` and `.recv()`. There's no need to call `.connect()` (it's optional for UDP), and a UDP server only needs `.bind()` -- it never calls `.listen()` or `.accept()`.
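Here is a minimal sketch of those UDP calls in action, with both sides running in the same script over the loopback address. The names `receiver` and `sender` and the message contents are just for illustration.

```python
import socket

# "Server" side: bind() gives the socket a known address,
# but there is no listen() or accept().
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
receiver.settimeout(2.0)
receiver_addr = receiver.getsockname()

# "Client" side: no connect(); the destination is passed with each message.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.settimeout(2.0)
sender.sendto(b"hello over UDP", receiver_addr)

# recvfrom() returns the data AND the address it came from.
data, sender_addr = receiver.recvfrom(1024)
print(data, "from", sender_addr)

# The receiver can reply using the address it just learned.
receiver.sendto(data.upper(), sender_addr)
reply, _ = sender.recvfrom(1024)
print(reply)

sender.close()
receiver.close()
```

On the loopback interface the message will practically always arrive, so the timeouts here are only a safety net. Over a real network, either `recvfrom()` call could time out if a datagram is lost.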


### Transmission Control Protocol (TCP)

TCP aims to correct the problems with UDP by creating a connection between two hosts, ensuring that:

1. Data arrives in its entirety
2. Data doesn't arrive out of order
3. Data isn't corrupted in transit

Thus, TCP generates lots of extra traffic to detect and correct data loss that UDP doesn't have. Otherwise, the two are closely related: both are built directly on top of IP, and both use port numbers to identify which program should receive the data.

Thankfully, **your operating system and socket library take care of these extra interactions for you under the hood**, so you don't need to worry about them!
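To see those guarantees from the programmer's side, here is a small sketch that opens a TCP connection over the loopback address within a single script. The variable names and messages are made up for this example. Notice that the code never deals with acknowledgements or retransmission; it just sends bytes and reads them back in order.

```python
import socket

# Server side: bind, listen, and (below) accept a connection.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)

# Client side: connect() sets up the TCP connection.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())
conn, client_addr = server.accept()
conn.settimeout(2.0)

# Two separate sends...
client.sendall(b"first ")
client.sendall(b"second")
client.close()  # closing tells the other side there is no more data

# ...arrive as one ordered stream of bytes.
received = b""
while True:
    chunk = conn.recv(1024)
    if not chunk:  # empty bytes means the sender closed the connection
        break
    received += chunk
print(received)  # b'first second'

conn.close()
server.close()
```

One thing this shows is that TCP delivers a stream rather than individual messages: the two `sendall()` calls are not guaranteed to match up with two `recv()` calls, so the receiver loops until the stream ends.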

**Question:** If you're writing a program like Snapchat, what protocol should you use?

**Answer:** It depends. For chat messages, TCP is probably a good idea, since it'd be bad if messages often got lost. For video calls or streaming stories, UDP could be a good choice for the video, while sending the audio over TCP could make sense, since garbled voice is more noticeable than garbled video.

# Next Section: [IP Addresses and DNS]()