<a href="https://colab.research.google.com/github/brendanpshea/intro_to_networks/blob/main/Networking_01_Fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Chapter 1: Understanding the OSI Network Model
## Breaking Down Network Communications

Have you ever wondered how your message travels from your smartphone to a friend halfway across the world? Or how millions of devices can communicate simultaneously across the internet without descending into chaos? Computer networking may seem like magic, but it's actually a carefully orchestrated symphony of protocols and processes, each playing its part in the grand performance of digital communication.

In this chapter, we'll demystify the fundamental framework that makes modern networking possible: the **Open Systems Interconnection (OSI)** model. Think of it as the blueprint that network engineers use to understand, troubleshoot, and build the networks we rely on every day. By breaking down network communications into seven distinct layers, the OSI model helps us manage the complexity of modern networking one piece at a time.

## Chapter Case Study: The Scooby-Doo Detective Agency Goes Digital

Follow along with the Scooby Gang as they modernize their detective agency with a new networked case management system. When mysterious network issues start affecting their investigations, they'll need to understand each layer of the OSI model to solve these digital mysteries. From Velma troubleshooting physical layer interference to Fred analyzing TCP connection attempts, each member of the team discovers how networking knowledge directly impacts their ability to solve cases efficiently.

Throughout the chapter, we'll see how the gang applies networking concepts to real-world challenges:
- Setting up secure connections between their main office and new satellite location
- Ensuring reliable transmission of sensitive case files
- Troubleshooting video conferencing issues with remote witnesses
- Protecting their network from potential security threats

## Learning Outcomes

After completing this chapter, you will be able to:

1. Explain the purpose and structure of the OSI model in network communications
2. Identify and describe the function of each OSI layer from Physical to Application
3. Analyze how data is encapsulated and decapsulated as it moves through the network layers
4. Use common networking tools (ip link, tcpdump, traceroute) to examine network behavior
5. Troubleshoot common networking issues by identifying which OSI layer is affected
6. Compare and contrast the characteristics of TCP and UDP protocols
7. Explain how MAC addresses, IP addresses, and ports work together in network communication
8. Demonstrate understanding of fundamental networking concepts such as MTU, checksums, and packet structure
9. Interpret network packet captures to understand protocol behavior
10. Apply OSI model concepts to solve real-world networking challenges

## Keywords

OSI Model, encapsulation, TCP/IP, MAC address, IP address, ports, packets, frames, protocol, network layer, transport layer, physical layer, data link layer, presentation layer, session layer, application layer, UDP, TCP, MTU, checksum, routing, switching, network interface, traceroute, tcpdump, handshake, payload, headers, flags, SYN, ACK, FIN, RST.

## Layer 1: The Physical Layer

At its most fundamental level, all network communication consists of electrical signals, light pulses, or radio waves. The **Physical Layer** serves as the foundation of network communications, dealing with these raw physical elements of data transmission. Think of it as the actual road system in a delivery network - the concrete, asphalt, and physical infrastructure that makes movement possible.

Understanding the Physical Layer is crucial because it defines the basic building blocks that make all higher-level network functions possible. When network engineers talk about "signal degradation," "noise interference," or "bandwidth limitations," they're discussing Physical Layer concerns. This layer handles questions like: How do we represent a binary 1 or 0 in an electrical signal? How do we ensure a signal can travel 100 meters without degrading? How do we handle interference from other nearby cables?
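
One way to make the "binary 1 or 0 as a signal" question concrete is Manchester encoding, the line code used by classic 10 Mbps Ethernet. The sketch below is a simplified illustration: the `HIGH` and `LOW` values are placeholder numbers rather than real voltages, and the function names are invented for this example. It follows the IEEE 802.3 convention, in which a 0 is a high-to-low transition and a 1 is a low-to-high transition.

```python
# Placeholder signal levels (assumption: real hardware uses specific voltages).
HIGH, LOW = 1, -1

def manchester_encode(bits):
    """Encode each bit as two half-bit signal levels (IEEE 802.3 convention)."""
    signal = []
    for bit in bits:
        # 1 -> low-to-high transition, 0 -> high-to-low transition
        signal.extend([LOW, HIGH] if bit else [HIGH, LOW])
    return signal

def manchester_decode(signal):
    """Recover bits by looking at the transition inside each bit period."""
    bits = []
    for first, second in zip(signal[0::2], signal[1::2]):
        bits.append(1 if (first, second) == (LOW, HIGH) else 0)
    return bits

data = [1, 0, 1, 1, 0]
signal = manchester_encode(data)
print("Bits:   ", data)
print("Signal: ", signal)
print("Round trip OK:", manchester_decode(signal) == data)
```

Because every bit contains a transition, the receiver can recover the sender's clock from the signal itself, at the cost of using two signal changes per bit.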

> **Real-World Example**: The Scooby Team is setting up their new detective agency office network, and they've encountered their first mystery: intermittent network connections.
>
> "Zoinks! Like, the internet keeps cutting out whenever someone uses the microwave!" Shaggy complains, trying to upload evidence photos.
>
> Velma, already investigating the wiring closet, emerges with a dusty cable. "Just as I suspected. Look at how this network cable runs right alongside the electrical conduit for the break room. That's a Physical Layer problem - electromagnetic interference."
>
> "But like, we used the expensive cables you recommended!" Shaggy protests.
>
> "Yes, but even Cat 6 cables have limitations," Velma explains, sketching a quick diagram. "See, at the Physical Layer, we're dealing with actual electrical signals. When we run network cables too close to power lines, the electrical interference can corrupt our data transmission. Think of it like trying to hear someone whisper in a noisy room. We need to either move the cable or install shielded cable instead."
>
> Fred nods thoughtfully. "So that's why the networking standards specify minimum distances from electrical sources?"
>
> "Exactly! Physical Layer specifications help us avoid these kinds of problems."

Common Physical Layer technologies include:

| Technology | Maximum Speed | Maximum Distance | Common Use Case |
|------------|---------------|------------------|-----------------|
| Cat 6 UTP | 10 Gbps | 55 meters | Office networks |
| Fiber Optic | 100+ Gbps | Several kilometers | Backbone connections |
| Wi-Fi 6 | 9.6 Gbps | ~30 meters | Wireless networks |

Let's use `ip link show` to list the network interfaces available on our computer (if you are running this notebook in Colab, that computer is a virtual machine in Google's cloud).


In [20]:
%%bash
ip link show

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
9: eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default 
    link/ether 02:42:ac:1c:00:0c brd ff:ff:ff:ff:ff:ff link-netnsid 0


This shows two interfaces. The first is a **loopback (lo)** interface, which is a computer's way of communicating with itself. The second is an Ethernet interface (starting with **eth0**) that connects the computer to the rest of the network.
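
If you want to work with this output programmatically, a short parser is enough. The sketch below is only an illustration: it parses the sample text copied from the output above rather than running the command, and the `parse_ip_link` helper is a made-up name, not part of any library.

```python
import re

# Sample text copied from the `ip link show` output above.
sample = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
9: eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:ac:1c:00:0c brd ff:ff:ff:ff:ff:ff link-netnsid 0
"""

def parse_ip_link(text):
    """Return one dict per interface with its name, flags, MTU, and MAC address."""
    interfaces = []
    for line in text.splitlines():
        # Header line, e.g. "9: eth0@if10: <FLAGS> mtu 1500 ..."
        header = re.match(r"^\d+: ([^:]+): <([^>]*)> mtu (\d+)", line)
        if header:
            interfaces.append({
                "name": header.group(1),
                "flags": header.group(2).split(","),
                "mtu": int(header.group(3)),
            })
            continue
        # Indented detail line, e.g. "    link/ether 02:42:ac:1c:00:0c ..."
        link = re.match(r"^\s+link/(\w+) ([0-9a-f:]{17})", line)
        if link and interfaces:
            interfaces[-1]["type"] = link.group(1)
            interfaces[-1]["mac"] = link.group(2)
    return interfaces

for iface in parse_ip_link(sample):
    print(iface["name"], iface["type"], iface["mac"], "MTU", iface["mtu"])
```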

In [21]:
import base64
from IPython.display import Image, display
import matplotlib.pyplot as plt

def mm(graph, width=800, height=600):  # Add default dimensions
    graphbytes = graph.encode("utf8")
    base64_bytes = base64.urlsafe_b64encode(graphbytes)
    base64_string = base64_bytes.decode("ascii")
    # Add width and height parameters to the URL
    url = f"https://mermaid.ink/img/{base64_string}?width={width}&height={height}"
    display(Image(url=url))

mm("""
graph TD
    subgraph "Physical Layer"
        Data[Digital Data<br/>1s and 0s] -->|Convert| Signal[Electrical/Light Signals]
        Signal -->|Through| Medium[Transmission Medium]

        subgraph Medium
            C1[Copper Cable<br/>Electrical]
            C2[Fiber Optic<br/>Light]
            C3[WiFi<br/>Radio Waves]
        end
    end

    note["Like turning your voice<br/>into sound waves that<br/>travel through air!"]
"""
)

## Layer 2: The Data Link Layer

If the Physical Layer is like a road system, the **Data Link Layer** is like the basic rules of driving - staying in your lane, following traffic signals, and avoiding collisions. This layer transforms the raw bit transmission of the Physical Layer into something more reliable and organized. It's the first layer that actually brings some intelligence to data transmission, organizing raw bits into structured "frames" of data and implementing basic protocols for controlling access to the physical medium.

The Data Link Layer solves fundamental problems like: How do devices share a single network cable without interfering with each other? How can we detect if data was corrupted during transmission? How do we identify different devices on the same physical network? These capabilities make it possible for multiple devices to share network resources efficiently and reliably.

> **Real-World Example**: Back at the detective agency, the gang is setting up a new network switch for their growing team.
>
> "Check this out, gang," Fred says, pointing to the switch's interface. "Every device connected to our network has what's called a MAC address. It's like a license plate for network devices."
>
> Daphne peers at her laptop's network settings. "So that's why my laptop's MAC address is different from Velma's, even though we have the same model?"
>
> "Exactly!" Velma jumps in. "Every network interface gets a unique MAC address during manufacturing. When I send case files to our network printer, the Data Link Layer uses these addresses like addresses on an envelope, making sure the data frames get to the right device on our local network."
>
> "Like, wow," Shaggy adds, watching the switch's LED lights blink. "So it's kind of like each computer has its own mailbox?"
>
> "That's a great analogy, Shaggy! And just like a careful mail clerk, the Data Link Layer checks whether our data was damaged during delivery. Each frame carries a checksum, and if a frame arrives corrupted, Ethernet simply discards it and leaves retransmission to higher layers such as TCP."

Important concepts at this level include:

- **MAC Addresses**: Unique physical addresses assigned to network interfaces
- **Frame Structure**: Organizing data into **frames** with **headers** and error-checking
- **Error Detection**: Identifying and potentially correcting transmission errors
- **Flow Control**: Managing the rate of data transmission between nodes
- **Media Access Control**: Determining when devices can transmit on shared media

The Data Link Layer is often subdivided into two sublayers:
1. **Logical Link Control (LLC)**: Handles flow control and error checking
2. **Media Access Control (MAC)**: Manages access to the shared physical medium
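
To make frame structure and error detection more concrete, here is a minimal sketch of an Ethernet-style frame: destination MAC, source MAC, a two-byte type field, the payload, and a trailing CRC-32 checksum. It is a simplification under stated assumptions: the helper names are invented, the payload is made up, and details such as the preamble and minimum-size padding are left out. Python's `zlib.crc32` uses the same CRC-32 polynomial as the Ethernet frame check sequence.

```python
import struct
import zlib

def mac_to_bytes(mac):
    """Convert '00:1a:2b:3c:4d:5e' into its 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))

def build_frame(dst_mac, src_mac, ethertype, payload):
    """Header (dst + src + type) + payload + 4-byte CRC-32 trailer."""
    header = mac_to_bytes(dst_mac) + mac_to_bytes(src_mac) + struct.pack("!H", ethertype)
    body = header + payload
    fcs = struct.pack("<I", zlib.crc32(body))  # checksum over header and payload
    return body + fcs

def frame_is_intact(frame):
    """Recompute the checksum and compare it with the trailer."""
    body, fcs = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == fcs

# 0x0800 is the type value for IPv4; the payload text is a placeholder.
frame = build_frame("ff:ff:ff:ff:ff:ff", "00:1a:2b:3c:4d:5e", 0x0800, b"Case file #42")
print("Frame length:", len(frame), "bytes")
print("Intact frame passes check:", frame_is_intact(frame))

# Simulate interference flipping a single bit in transit.
damaged = bytearray(frame)
damaged[20] ^= 0x01
print("Damaged frame passes check:", frame_is_intact(bytes(damaged)))
```

A receiver that sees a failed check simply drops the frame; nothing at this layer asks the sender to try again.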


In [22]:
mm("""
graph TD
    subgraph "MAC Address System - Like Name Tags for Computers"
        M1[Computer A<br/>MAC: 00:1A:2B:3C]
        M2[Computer B<br/>MAC: 00:4D:5E:6F]

        M1 -->|Frame 1| SW{Switch}
        M2 -->|Frame 2| SW

        SW -->|Check MAC| D1[Delivery]
        D1 -->|Error Check| D2[Confirm]
    end

    note["Think of it like putting<br/>letters in envelopes with<br/>exact apartment numbers!"]
    """)

## Layer 3: The Network Layer

While the Data Link Layer handles communication between directly connected devices, the **Network Layer** extends this connectivity across different networks, potentially spanning the globe. Think of it as adding the postal system to our previous road analogy - now we need to worry about not just local delivery, but routing packages between cities and countries.

The Network Layer introduces the concept of logical addressing (most commonly IP addresses) and routing. Unlike MAC addresses, which are tied to specific hardware, IP addresses can be assigned and changed as needed to create logical network organizations. This flexibility makes it possible to build large, complex networks and handle the dynamic nature of internet routing.

This layer's key responsibilities include:

- **IP Addressing**: Assigning and managing logical addresses (IPv4/IPv6)
- **Routing**: Determining the best path for packets across networks
- **Packet Forwarding**: Moving packets between different networks
- **Fragmentation**: Breaking large packets into smaller ones when necessary
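
Fragmentation is easiest to understand through the arithmetic. The sketch below assumes an IPv4 packet with a plain 20-byte header (no options) and a hypothetical 4000-byte packet crossing a link with a 1500-byte MTU. The `fragment` function is illustrative only. IPv4 expresses fragment offsets in units of 8 bytes, so every fragment except the last must carry a multiple of 8 payload bytes.

```python
def fragment(total_length, mtu, header_len=20):
    """Split an IPv4 packet of total_length bytes into fragments that fit the MTU."""
    payload = total_length - header_len
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_chunk = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_chunk, payload - offset)
        fragments.append({
            "offset_units": offset // 8,          # value stored in the IP header
            "payload_bytes": size,
            "total_length": size + header_len,    # each fragment gets its own header
            "more_fragments": offset + size < payload,
        })
        offset += size
    return fragments

for frag in fragment(total_length=4000, mtu=1500):
    print(frag)
```

With these numbers the 3980 bytes of payload travel as three fragments of 1480, 1480, and 1020 bytes.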

The most well-known Network Layer protocol is the **Internet Protocol (IP)**, which comes in two main versions:
- **IPv4**: Uses 32-bit addresses (e.g., 192.168.1.1)
- **IPv6**: Uses 128-bit addresses (e.g., 2001:0db8:85a3:0000:0000:8a2e:0370:7334)

> **Real-World Example**: The Scooby Team's detective agency is expanding, opening a new satellite office across town.
>
> "Like, Velma, I'm confused," Shaggy says, staring at the network diagram. "How will the computers in our new office find our database server here in the main office?"
>
> "Excellent question!" Velma pulls out her tablet and opens a network visualization tool. "Remember how we talked about MAC addresses before? Well, those only work for devices on the same local network. For communication between offices, we need Layer 3 addressing - specifically, IP addresses."
>
> She draws a quick sketch showing the two offices. "See, we'll assign each office its own range of IP addresses. Our main office will use 192.168.1.x, and the new office will use 192.168.2.x. The routers will use these addresses like a map, figuring out the best path for data to travel between offices."
>
> Fred scratches his head. "But what if we open more offices? Or need to work from home?"
>
> "That's the beauty of the Network Layer!" Velma enthuses. "IP addressing is hierarchical and flexible. We can keep adding new networks and the routers will automatically learn how to reach them. It's like having a GPS system for our data!"
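
Velma's two address ranges can be explored with Python's standard `ipaddress` module. This is a sketch under assumptions: it treats "192.168.1.x" and "192.168.2.x" as /24 networks, and the specific host addresses and the `delivery_path` helper are made up for illustration.

```python
import ipaddress

# Assumption: each office range is a /24 network (256 addresses).
main_office = ipaddress.ip_network("192.168.1.0/24")
satellite_office = ipaddress.ip_network("192.168.2.0/24")

def delivery_path(source, destination, networks):
    """Decide whether two hosts share a network or need a router between them."""
    src = ipaddress.ip_address(source)
    dst = ipaddress.ip_address(destination)
    for net in networks:
        if src in net and dst in net:
            return "same network: deliver directly using Layer 2"
    return "different networks: forward to a router"

offices = [main_office, satellite_office]
print(delivery_path("192.168.1.10", "192.168.1.50", offices))
print(delivery_path("192.168.2.25", "192.168.1.50", offices))

# Address sizes for the two IP versions discussed above.
print(ipaddress.ip_address("192.168.1.1").max_prefixlen, "bits in an IPv4 address")
print(ipaddress.ip_address("2001:db8::1").max_prefixlen, "bits in an IPv6 address")
```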

To see how all this works, let's try running `traceroute`, which shows the "route" our packets take: from the origin, from router to router, and (eventually) to our destination.

In [23]:
%%bash
apt-get install -y traceroute > /dev/null
traceroute -n "www.google.com"

traceroute to www.google.com (172.253.115.105), 30 hops max, 60 byte packets
 1  172.28.0.1  0.044 ms  0.018 ms  0.010 ms
 2  142.250.209.94  2.627 ms * *
 3  172.253.67.67  2.081 ms 172.253.67.63  0.818 ms 172.253.66.147  0.656 ms
 4  142.250.209.39  1.605 ms 142.251.244.159  0.789 ms 142.250.209.97  0.756 ms
 5  142.250.210.0  3.048 ms 172.253.72.198  0.795 ms 142.250.209.244  0.856 ms
 6  172.253.66.155  1.692 ms * 172.253.66.201  0.927 ms
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * * *
13  * 172.253.115.105  0.795 ms *






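Here, we see a number of **hops** through the network. (Note that our initial, "local" hop is very fast because it occurs within our local area network.) An asterisk (`*`) means no reply arrived for that probe before the timeout, typically because the router in question is configured not to respond, so its details are hidden from us.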
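
A few lines of Python can summarize output like this. The sketch below works on a handful of lines copied from the output above (not a live trace), and `parse_hops` is an invented helper name. It records which router addresses answered at each hop, so silent hops show up as empty lists.

```python
import re

# A few lines copied from the traceroute output above.
sample = """\
 1  172.28.0.1  0.044 ms  0.018 ms  0.010 ms
 2  142.250.209.94  2.627 ms * *
 6  172.253.66.155  1.692 ms * 172.253.66.201  0.927 ms
 7  * * *
13  * 172.253.115.105  0.795 ms *
"""

# Four dot-separated number groups, so timings like "0.044" do not match.
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

def parse_hops(text):
    """Map each hop number to the sorted list of addresses that replied."""
    hops = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue
        hops[int(fields[0])] = sorted(set(IPV4.findall(line)))
    return hops

for number, routers in parse_hops(sample).items():
    print(number, routers if routers else "no reply")
```

Hop 6 lists two addresses because each of the three probes can take a different path through the network.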

In [24]:
## Network Layer
mm("""
flowchart LR
    start[Mystery Van] -->|IP Address| router1{Router 1}
    router1 -->|Path A| router2{Router 2}
    router1 -->|Path B| router3{Router 3}
    router2 -->destination[HQ Computer]
    router3 -->destination

    note1[Like Scooby following<br/>clues to find the<br/>shortest path home!]

    style start fill:#f96
    style destination fill:#9f6
    style router1 fill:#69f
    style router2 fill:#69f
    style router3 fill:#69f""")

## Layer 4: The Transport Layer

Imagine you're sending a large package across the country, but it's too big to ship as a single unit. You'd need to break it into smaller boxes, make sure each one arrives safely, and ensure they're reassembled in the correct order at the destination. This is essentially what the Transport Layer does with network communications. The **Transport Layer** serves as the crucial bridge between the basic network services and the higher-level application functions, ensuring that data reaches its destination accurately and efficiently.

This layer introduces one of the most fundamental concepts in networking: the distinction between connection-oriented and connectionless communication. Think of it like the difference between a phone call and sending a letter. A phone call (like TCP) establishes a dedicated connection and ensures both parties are ready to communicate, while a letter (like UDP) is simply sent out with hope it reaches its destination. Both methods have their place in modern networking, and understanding when to use each is crucial for building efficient network applications.

- **Transmission Control Protocol (TCP)**: Provides reliable, ordered, and error-checked delivery
- **User Datagram Protocol (UDP)**: Offers fast, connectionless delivery without guaranteed reliability

Key Transport Layer functions include:
- **Port Numbers**: Identifying specific applications and services
- **Segmentation**: Breaking data into manageable chunks
- **Flow Control**: Preventing sender from overwhelming receiver
- **Error Recovery**: Detecting and retransmitting lost segments
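
Segmentation and error recovery can be sketched in a few lines. This is a toy model, not real TCP: real TCP numbers bytes rather than whole segments, and the function names here are made up. The idea is the same, though. Each chunk gets a sequence number, so the receiver can put chunks back in order and notice which ones never arrived.

```python
import random

def segment(data, size):
    """Split data into (sequence_number, chunk) pairs."""
    return [(seq, data[i:i + size]) for seq, i in enumerate(range(0, len(data), size))]

def find_missing(received, total):
    """Sequence numbers the receiver has not seen yet."""
    return sorted(set(range(total)) - {seq for seq, _ in received})

def reassemble(received):
    """Sort by sequence number and join the chunks."""
    return b"".join(chunk for _, chunk in sorted(received))

message = b"SCOOBY"
originals = segment(message, size=1)

# Simulate the network: segments arrive out of order and one is lost.
in_transit = list(originals)
random.Random(7).shuffle(in_transit)
arrived = in_transit[:-1]

missing = find_missing(arrived, total=len(originals))
print("Missing segments:", missing)

# Error recovery: the sender retransmits only what was lost.
arrived += [originals[seq] for seq in missing]
print("Reassembled:", reassemble(arrived))
```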


> **Real-World Example**: At the Scooby Detective Agency, Velma is setting up their new case management system. The gang has been struggling with lost case files and mixed-up evidence photos in their database.
>
> "Like, Velma, I don't get it," Shaggy scratches his head, looking at the network diagram. "Sometimes our uploads work perfectly, and other times they're all scrambled up!"
>
> "That's because we're using the wrong transport protocol," Velma explains, adjusting her glasses. "Think of it like organizing our evidence boxes. When we ship regular mail to clients, a few delayed or out-of-order letters aren't a huge problem. That's like UDP - fast but not guaranteed. But with case files, we need everything in perfect order, like having a dedicated courier who confirms delivery of each piece of evidence. That's what TCP does for us."
>
> Fred nods in understanding. "So for our database connections, we'll use TCP to ensure all case details are transmitted reliably and in order. But for our security camera feeds, we can use UDP since it's faster and a few dropped frames won't matter much."
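
The difference Fred describes is visible in Python's standard `socket` module. The sketch below sends one message each way over the loopback interface, so it never leaves your machine. The function names and message contents are placeholders. Notice that the UDP sender just fires off a datagram, while the TCP client must first connect to a listening server.

```python
import socket

LOOPBACK = "127.0.0.1"  # port 0 below asks the OS to pick any free port

def udp_demo(message):
    """Connectionless: no handshake, no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind((LOOPBACK, 0))
        receiver.settimeout(2)
        sender.sendto(message, receiver.getsockname())
        data, _ = receiver.recvfrom(1024)
        return data

def tcp_demo(message):
    """Connection-oriented: connect (handshake) first, then send a byte stream."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((LOOPBACK, 0))
        server.listen(1)
        server.settimeout(2)
        with socket.create_connection(server.getsockname(), timeout=2) as client:
            conn, _ = server.accept()
            with conn:
                conn.settimeout(2)
                client.sendall(message)
                return conn.recv(1024)

print("UDP received:", udp_demo(b"camera frame 1"))
print("TCP received:", tcp_demo(b"case file 42"))
```

On loopback both messages arrive, so the contrast here is in the setup steps rather than the outcome; on a lossy network only TCP would notice and resend a lost message.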


In [25]:
## Transport Layer
mm("""
graph TD
    subgraph "TCP: When Everything Must Arrive"
        T1[Message: 'SCOOBY'] --> |Split| T2[S C O O B Y]
        T2 --> |Number| T3["1:S 2:C 3:O 4:O 5:B 6:Y"]
        T3 --> |Send| T4["Confirm Each Piece ✓"]
        Note1[Like sending a pizza<br/>and making sure<br/>all slices arrive!]
    end

    subgraph "UDP: When Speed Matters Most"
        U1[Live Video] --> |Stream| U2[Quick Delivery]
        U2 --> |No Waiting| U3[Accept Some Loss]
        Note2[Like talking on<br/>phone - some static<br/>is OK!]
    end
    """)

## Layer 5: The Session Layer

Communication in modern networks is rarely as simple as sending a single message. Instead, applications often need to maintain ongoing conversations, tracking multiple exchanges of information over time. The **Session Layer** acts like a sophisticated conversation coordinator, managing these complex dialogues between applications.

Consider what happens when you're conducting a video call while simultaneously sharing files and using a chat window. Each of these activities represents a different session, and they all need to be managed independently yet cohesively. The Session Layer handles this orchestration, ensuring that if your file transfer is interrupted, it doesn't affect your video call, and you can resume the transfer from where it left off rather than starting over.

This layer introduces the concept of **dialog control**, determining which party can transmit at any given time - similar to how a moderator might manage speakers in a debate. This becomes particularly important in situations where both parties shouldn't transmit simultaneously to avoid confusion or data corruption.


| Session Layer Function | Purpose | Example |
|----------------------|----------|----------|
| Authentication | Verify identity | Login sessions |
| Authorization | Control access | Permission checks |
| Session Restoration | Resume interrupted sessions | Recovery points |

> **Real-World Example**: The Scooby gang is implementing a new system for retrieving high-quality video interviews of suspects from a remote server. Daphne takes the lead on setting up the software.
>
> "This is giving me a headache," Fred complains after another failed download. "We lost the connection halfway through, and now we have to start the download over!"
>
> "Not necessarily," Daphne responds, pulling up the session settings. "See, our software uses session layer checkpointing. Think of it like placing bookmarks in a mystery novel - if we get interrupted, we can pick up right where we left off."
>
> She demonstrates by configuring the checkpointing interval. "Now it'll save our progress every five minutes. If the connection drops, we can resume from the last checkpoint instead of starting over. It's like having a save point in a video game!"
>
> Shaggy brightens up. "Like, wow! That would have saved us so much time with the Miner Forty-Niner case last week!"
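
Daphne's "bookmark" idea can be simulated without any network at all. The sketch below is a toy model with invented names (`download`, `ConnectionLost`) and made-up data. For simplicity it saves a checkpoint every few chunks rather than every few minutes. When the simulated connection drops, the second attempt resumes from the last checkpoint instead of from byte zero.

```python
class ConnectionLost(Exception):
    """Raised by the simulation when the connection drops mid-transfer."""

def download(source, state, chunk_size=4, checkpoint_every=2, drop_after=None):
    """Copy source in chunks, resuming from state['checkpoint'].

    state holds the last saved offset and the bytes saved so far.
    drop_after simulates a failure after that many chunks in this attempt.
    """
    buffer = bytearray(state["saved"])
    offset = state["checkpoint"]
    chunks = 0
    while offset < len(source):
        if drop_after is not None and chunks == drop_after:
            raise ConnectionLost(f"dropped at byte {offset}")
        chunk = source[offset:offset + chunk_size]
        buffer += chunk
        offset += len(chunk)
        chunks += 1
        if chunks % checkpoint_every == 0:
            # Save progress so a later attempt can resume from here.
            state["checkpoint"], state["saved"] = offset, bytes(buffer)
    state["checkpoint"], state["saved"] = offset, bytes(buffer)
    return state["saved"]

video = b"interview-video-bytes-0123456789"  # placeholder for the real file
state = {"checkpoint": 0, "saved": b""}

try:
    download(video, state, drop_after=5)
except ConnectionLost as error:
    print("Connection lost:", error)
    print("Last checkpoint at byte", state["checkpoint"])

result = download(video, state)  # resumes from the checkpoint
print("Transfer complete:", result == video)
```

Work done after the last checkpoint (bytes 16 to 20 here) is repeated, which is the trade-off behind choosing a checkpoint interval.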

In [26]:
## Session Layer Processes
mm("""
stateDiagram-v2

    [*] --> StartSession: Begin Investigation
    StartSession --> ActiveSession: Connection Established

    state ActiveSession {
        [*] --> Talking: Data Flowing
        Talking --> Checkpointing: Save Progress
        Checkpointing --> Talking: Resume
        Talking --> Paused: Temporary Stop
        Paused --> Talking: Resume
    }

    ActiveSession --> EndSession: Close Investigation
    EndSession --> [*]

    note right of ActiveSession
        Like keeping track of your place
        in a conversation:
        - Who's talking
        - Taking turns
        - Remembering where you left off
    end note""")

## Layer 6: The Presentation Layer

The **Presentation Layer** serves as the network's universal translator, ensuring that information remains meaningful as it moves between different systems. In our increasingly interconnected world, this layer's importance cannot be overstated. Consider trying to read a document written in a different language - not only do you need to understand the individual words, but you also need to know the character encoding, formatting, and cultural context. The Presentation Layer handles similar challenges in the digital realm.

Key responsibilities include:

- **Data Translation**: Converting between different formats
- **Encryption/Decryption**: Protecting data confidentiality
- **Compression**: Reducing data size for transmission
- **Character Code Translation**: Converting between **character encoding systems**

Common Presentation Layer standards include:
- **ASCII/Unicode**: Character encoding
- **JPEG/GIF/PNG**: Image formats
- **SSL/TLS**: Encryption protocols
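
Compression is one of the easier responsibilities to try out. The sketch below uses Python's standard `zlib` module on a made-up, deliberately repetitive report; real-world savings depend heavily on the data.

```python
import zlib

# Placeholder text: repetitive data compresses especially well.
report = ("Suspect seen near the old amusement park. " * 20).encode("utf-8")

compressed = zlib.compress(report)       # sender side
restored = zlib.decompress(compressed)   # receiver side

print("Original size:  ", len(report), "bytes")
print("Compressed size:", len(compressed), "bytes")
print("Restored exactly:", restored == report)
```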

This layer becomes particularly crucial when dealing with international communications or when connecting different types of systems. For example, when your modern web browser connects to a legacy banking system, the Presentation Layer handles the critical task of translating data formats between the two systems, ensuring that your account balance appears as actual numbers rather than raw binary data.

> **Real-World Example**: At the Scooby Detective Agency, the team is struggling with a puzzling case involving international clients and evidence files from different systems.
>
> "I don't understand," Fred frowns at his screen. "The surveillance photos from our client in Japan are just showing up as gibberish."
>
> "Like, man, and the measurements in their report are all wrong!" Shaggy adds. "It says the suspect is 170 units tall. What does that even mean?"
>
> Velma adjusts her glasses with a knowing smile. "Ah, we're dealing with classic Presentation Layer issues. First, their system is using a different character encoding for the image metadata - probably Shift-JIS instead of UTF-8. And those measurements? Their system is recording in centimeters while ours expects inches."
>
> She begins configuring their case management software. "The Presentation Layer handles all these conversions automatically when set up correctly. Think of it like having a universal translator. It can handle differences in:
> - Character encodings
> - Data formats
> - Encryption systems
> - Image and media formats
>
> "Watch this," she demonstrates, adjusting the settings. Suddenly, the photos display correctly, complete with readable Japanese metadata, and the measurements automatically convert to inches. "Now our systems can talk to each other seamlessly!"
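
The gibberish Fred saw is what happens when bytes written in one character encoding are read using another. The sketch below reproduces the effect with Python's built-in codecs. The caption text is a made-up example (it means "evidence photo"), and the unit conversion simply applies the definition of 2.54 centimeters per inch.

```python
caption = "証拠写真"  # hypothetical photo caption, "evidence photo"

# The sending system stores the caption as Shift-JIS bytes.
raw = caption.encode("shift_jis")

# Reading those bytes as UTF-8 produces replacement characters (gibberish).
misread = raw.decode("utf-8", errors="replace")

# Reading them with the correct encoding recovers the text,
# which can then be re-encoded as UTF-8 for the local system.
correct = raw.decode("shift_jis")
as_utf8 = correct.encode("utf-8")

print("Misread as UTF-8: ", misread)
print("Decoded correctly:", correct)
print("Shift-JIS bytes:", len(raw), "| UTF-8 bytes:", len(as_utf8))

# The measurement mix-up: 170 centimeters expressed in inches.
height_cm = 170
print(f"{height_cm} cm = {height_cm / 2.54:.1f} in")
```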

The need for the Presentation Layer has become increasingly crucial in our interconnected world. Consider how many different types of devices and systems you interact with daily - Windows PCs, Macs, smartphones, IoT devices - each potentially using different internal data formats. The Presentation Layer's ability to handle these translations transparently is what makes this diverse ecosystem work together seamlessly.

Here's a simple example of data encoding:


In [27]:
# Example of presentation layer encoding
message = "Scooby Doo"

# Convert to UTF-8 bytes and show them as a hexadecimal string
encoded = message.encode('utf-8').hex()
print(f"Original: {message}")
print(f"Encoded: {encoded}")

Original: Scooby Doo
Encoded: 53636f6f627920446f6f


In [28]:
mm("""
flowchart TD
    subgraph "Converting Data So Everyone Understands"
        Input[Raw Data from Different Devices] --> Convert

        subgraph Convert["Translation Process"]
            E["Encryption 🔒<br/>Making data secret"] -->
            C["Compression 📦<br/>Making data smaller"] -->
            F["Format Conversion 🔄<br/>Making data readable"]
        end

        Convert --> Output[Data Everyone Can Read]

        note1["JPEG Photos<br/>from Crime Scene"]
        note2["MP4 Videos<br/>of Evidence"]
        note3["PDF Reports"]

        Input --> note1 & note2 & note3
    end""")

## Layer 7: The Application Layer

At last, we reach the layer that users actually interact with - the **Application Layer**. This topmost layer is where all the familiar networking applications and protocols operate: web browsers, email clients, file transfer tools, and messaging apps. While end users never need to think about the lower layers (just as you don't think about the engine when steering a car), everything they do at the Application Layer depends on the services provided by the layers below.

The Application Layer provides protocols and services that directly serve end-user applications, including:
- **HTTP/HTTPS**: Web browsing
- **FTP/SFTP**: File transfer
- **SMTP/IMAP/POP3**: Email
- **DNS**: Domain name resolution
- **SSH**: Secure remote access
- **DHCP**: Automatic network configuration
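
Many Application Layer protocols are, at heart, structured text handed to the Transport Layer. The sketch below builds a minimal HTTP/1.1 request and parses the first line of a response. Nothing is sent over the network: the host name `cases.example.com`, the path, and the canned response are all invented for illustration, as are the helper function names.

```python
def build_get_request(host, path):
    """Format a minimal HTTP/1.1 GET request as bytes ready for a TCP socket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",   # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_status_line(response):
    """Extract the protocol version, status code, and reason from a response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason

request = build_get_request("cases.example.com", "/cases/42")
print(request.decode("ascii"))

# A canned response standing in for what a server might send back.
canned = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
print(parse_status_line(canned))
```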

> **Real-World Example**: The Scooby Gang is modernizing their detective agency with a new web-based case management system.
>
> "Okay, gang," Fred announces during a team meeting. "Our new system will let us access case files from anywhere, but we need to make sure it's secure. Velma, can you explain how this works?"
>
> Velma brings up a diagram on the projector. "Of course! Let's follow what happens when Daphne logs into our case management system from her laptop at a client site."
>
> She draws a series of steps on the whiteboard:
> 1. "First, Daphne's laptop uses DNS (Domain Name System) to find our server's IP address - like looking up our address in a phone book."
> 2. "Then her browser establishes a secure HTTPS connection - think of it as a private, encrypted tunnel to our server."
> 3. "The server uses SMTP to send her a login verification code via email."
> 4. "Once she's in, the system uses HTTPS to serve web pages and FTP to handle file uploads."
>
> "Like, that's a lot of different protocols!" Shaggy observes.
>
> "Exactly!" Velma agrees. "The Application Layer is where all these high-level protocols work together to create the seamless experience users expect. Each protocol is specialized for a specific task, but they all rely on the lower layers to handle the actual data transmission."




In [29]:
## Application Layer
mm("""
sequenceDiagram
    title Layer 7 - Application Layer: Like Using Mystery Inc's Software
    participant User as Mystery Gang
    participant Browser as Web Browser
    participant Email as Email Client
    participant Files as File Transfer

    Note over User,Files: "Think of this layer as the part we actually use!"
    User->>Browser: Click to view clue database
    Browser->>Email: Share evidence with police
    User->>Files: Upload crime scene photos
    Note over Browser,Files: Common Layer 7 Programs:<br/>- Web Browsers<br/>- Email Clients<br/>- File Transfer Apps<br/>- Chat Programs
    """)

## Pulling It All Together: The OSI Model in Action

Now that we've explored all seven layers, let's follow a typical network interaction through the entire stack. Imagine Daphne is uploading evidence photos to the case management system:

1. **Application Layer**: The web browser prepares the HTTP upload request
2. **Presentation Layer**: The images are compressed and encrypted
3. **Session Layer**: Maintains the authenticated session with the server
4. **Transport Layer**: TCP ensures reliable delivery of the files
5. **Network Layer**: IP routing gets the data to the server
6. **Data Link Layer**: Ethernet handles local network transmission
7. **Physical Layer**: The actual signals travel across network cables
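
The lower part of this journey is called encapsulation: each layer wraps the data from the layer above in its own header, and the receiver removes those headers in reverse order. The sketch below uses short placeholder strings instead of real binary headers, and the function names are invented, so treat it as a picture of the nesting rather than a real protocol stack.

```python
# Placeholder headers (assumption: real headers are binary and much richer).
LAYERS = [
    ("Transport", b"[TCP hdr]"),
    ("Network", b"[IP hdr]"),
    ("Data Link", b"[ETH hdr]"),
]

def encapsulate(data):
    """Sender side: each layer prepends its header on the way down."""
    for name, header in LAYERS:
        data = header + data
        print(f"{name:<10} -> {data!r}")
    return data

def decapsulate(frame):
    """Receiver side: each layer strips its header on the way up."""
    for name, header in reversed(LAYERS):
        if not frame.startswith(header):
            raise ValueError(f"missing {name} header")
        frame = frame[len(header):]
        print(f"{name:<10} <- {frame!r}")
    return frame

frame = encapsulate(b"photo.jpg")
original = decapsulate(frame)
print("Recovered original data:", original == b"photo.jpg")
```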

Understanding this layered approach helps tremendously with network troubleshooting. By identifying which layer is causing an issue, we can focus our efforts appropriately:
- Can't reach any websites? Check Network Layer (IP configuration)
- Files corrupted during transfer? Check Presentation Layer settings
- Connection keeps dropping? Might be a Session Layer problem
- Can't connect to the network at all? Start with the Physical Layer

Remember, while the OSI model may seem abstract, its principles are reflected in every network interaction you make. Whether you're browsing the web, sending an email, or video chatting, all seven layers are working together to make it happen.

In [30]:
mm("""
sequenceDiagram
    title How Your Message Actually Travels: A Complete Journey

    participant A as Application<br/>(Write Email)
    participant P as Presentation<br/>(Format Data)
    participant S as Session<br/>(Start Connection)
    participant T as Transport<br/>(Split Message)
    participant N as Network<br/>(Find Route)
    participant D as Data Link<br/>(Frame Data)
    participant Ph as Physical<br/>(Send Signals)

    Note over A,Ph: Sending "Meet at HQ" to the gang
    A->>P: Raw text
    P->>S: Formatted data
    S->>T: Connection ready
    T->>N: Data chunks
    N->>D: Routed packets
    D->>Ph: Frames
    Note over Ph: Electrical signals sent!""")

## Introduction to tcpdump: A Network Detective's Tool
tcpdump is like a detective's magnifying glass for network traffic. Just as detectives examine evidence at a crime scene, tcpdump lets us capture and examine packets (pieces of data) traveling across a network. It's a powerful command-line tool that helps us understand network communication.

The basic structure of a tcpdump command is:
```bash
tcpdump [options] [filters]
```

When working with a capture file (like our detective agency example):
```bash
# Read from our captured file instead of live network
tcpdump -r detective_agency.pcap

# Make it more readable by not converting addresses to names
tcpdump -r detective_agency.pcap -n

# Limit output to 5 packets to avoid overwhelming output
tcpdump -r detective_agency.pcap -n -c 5
```

Let's explore our detective agency's network traffic layer by layer. Our PCAP file contains traffic from detectives' workstations, their database access, and external connections.


In [31]:
# You must run this cell to install tcpdump
# and to download sample "packet capture file"
# Get ready for network analysis
!apt-get install -y tcpdump > /dev/null
!wget "https://github.com/brendanpshea/intro_to_networks/raw/main/pcaps/detective_agency.pcap" -q -nc





In [32]:
%%bash
# Read from our captured file instead of live network
# tcpdump -r detective_agency.pcap

# Make it more readable by not converting addresses to names
# tcpdump -r detective_agency.pcap -n

# Limit output to the first 20 packets to avoid overwhelming output
tcpdump -r detective_agency.pcap  -c 20

17:25:47.560786 ARP, Request who-has 192.168.1.1 tell 192.168.1.10, length 28
17:25:47.660786 ARP, Reply 192.168.1.1 is-at 00:1a:2b:3c:4d:5e (oui Unknown), length 28
17:25:47.760786 ARP, Request who-has 192.168.1.1 tell 192.168.1.11, length 28
17:25:47.860786 ARP, Reply 192.168.1.1 is-at 00:1a:2b:3c:4d:5e (oui Unknown), length 28
17:25:47.960786 ARP, Request who-has 192.168.1.1 tell 192.168.1.12, length 28
17:25:48.060786 ARP, Reply 192.168.1.1 is-at 00:1a:2b:3c:4d:5e (oui Unknown), length 28
17:25:48.160786 ARP, Request who-has 192.168.1.1 tell 192.168.1.13, length 28
17:25:48.260786 ARP, Reply 192.168.1.1 is-at 00:1a:2b:3c:4d:5e (oui Unknown), length 28
17:25:48.360786 ARP, Request who-has 192.168.1.1 tell 192.168.1.14, length 28
17:25:48.460786 ARP, Reply 192.168.1.1 is-at 00:1a:2b:3c:4d:5e (oui Unknown), length 28
17:25:48.560786 ARP, Request who-has 192.168.1.1 tell 192.168.1.15, length 28
17:25:48.660786 ARP, Reply 192.168.1.1 is-at 00:1a:2b:3c:4d:5e (oui Unknown), length 28
17:2

reading from file detective_agency.pcap, link-type EN10MB (Ethernet), snapshot length 65535


tcpdump lets us look at records of the different data **packets** that have been sent through our network. In general, we can see things like:

| **Parameter** | **Description** | **Example** |
| --- | --- | --- |
| **Time** | Timestamp showing when the packet was captured. | `17:25:49.860786` |
| **Source (MAC/IP)** | Originating address of the packet, either a MAC or IP address, depending on layer. | `192.168.1.11` (IP), `00:1b:2c:3d:4e:5f` (MAC) |
| **Destination (MAC/IP)** | Destination address of the packet, again either a MAC or IP, depending on layer. | `203.0.113.10` (IP), `ff:ff:ff:ff:ff:ff` (MAC) |
| **Protocol** | Specifies the protocol type (e.g., ARP, IP, BOOTP, TCP, HTTP). | `TCP` or `ARP` |
| **Flags** | TCP-specific markers indicating packet behavior, such as SYN (connection initiation) or ACK (acknowledgment). | `[S]` (SYN) or `[S.]` (SYN-ACK) |
| **Sequence/ACK Number** | Sequence or acknowledgment number to help track and synchronize packet delivery. | `seq 1000`, `ack 1001` |
| **Window Size** | Amount of data (in bytes) that the sender is willing to accept in one window of transmission. | `win 8192` |
| **Packet Length** | Total length of the packet, often including headers and content. | `length 42` |
| **Message Content** | Occasionally displayed content of application-layer messages (like HTTP requests). | `HTTP: GET / HTTP/1.1` |
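
To make the table concrete, here is a minimal Python sketch that pulls those fields out of one TCP summary line. The regular expression and the `parse_tcp_line` helper are illustrative assumptions, not part of tcpdump; they only cover IPv4 TCP lines shaped like the ones in this chapter.

```python
import re

# Pattern for a TCP summary line as printed by `tcpdump -n` (IPv4 only).
# This is a hand-written sketch, not an official tcpdump grammar.
TCP_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(?P<src_ip>\d+\.\d+\.\d+\.\d+)\.(?P<src_port>\d+) > "
    r"(?P<dst_ip>\d+\.\d+\.\d+\.\d+)\.(?P<dst_port>\d+): "
    r"Flags \[(?P<flags>[^\]]+)\]"
    r".*?length (?P<length>\d+)"
)

def parse_tcp_line(line):
    """Return a dict of fields from one TCP line, or None if it doesn't match."""
    match = TCP_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("src_port", "dst_port", "length"):
        fields[key] = int(fields[key])
    return fields

sample = ("17:25:49.960786 IP 203.0.113.10.443 > 192.168.1.11.63896: "
          "Flags [S.], seq 2000, ack 1001, win 8192, length 0")
parsed = parse_tcp_line(sample)
print(parsed)
```

Lines for other protocols (such as ARP) simply return `None`, so the helper can be used to filter a mixed capture.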

In the sections that follow, we'll use tcpdump to help decipher what happens at different layers.

## Encapsulation and Decapsulation Through the Network Layers

**Data encapsulation** is the process of wrapping data with header information as it moves down the OSI layers, preparing it for transmission across the network. Conversely, **decapsulation** is the process of removing these headers as the data moves up the OSI layers at the receiving end.

Think of each header as a specialized envelope containing specific instructions for handling that layer's responsibilities. Each layer adds its own header (and sometimes a trailer) to the data, treating everything it receives from the layer above as pure payload, without concerning itself with the contents.

> **Real-World Example**: At the Scooby Detective Agency, Velma is explaining their new secure file transfer system to the team.
>
> "Like, I don't get it," Shaggy scratches his head, looking at a network packet capture. "Why does our evidence photo have all this extra stuff around it?"
>
> Velma pulls out a set of Russian nesting dolls from her desk. "Let me demonstrate with these. Imagine our photo is the smallest doll. When we send it across the network, each layer adds its own 'doll' around it, each with specific information."
>
> She begins nesting the dolls: "The Application Layer wraps the photo with information about its format. The Transport Layer adds port numbers in its doll. The Network Layer adds IP addresses in an even bigger doll. And finally, the Data Link Layer adds MAC addresses in the largest doll."
>
> "So when it reaches the other end..." Fred begins.
>
> "Exactly!" Velma finishes. "The recipient unpacks each doll in reverse order, using the information in each layer to properly handle the data, until they finally get to our original photo."
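
Velma's nesting dolls can be sketched in a few lines of Python. The bracketed text "headers" below are made-up stand-ins for illustration only (real headers are binary structures), and `encapsulate` and `decapsulate` are hypothetical helper names.

```python
def encapsulate(payload: bytes) -> bytes:
    """Wrap the payload in simplified, text-only stand-ins for each header."""
    segment = b"[TCP dst_port=443]" + payload          # Transport Layer
    packet = b"[IP dst=203.0.113.10]" + segment        # Network Layer
    frame = b"[ETH dst=00:1a:2b:3c:4d:5e]" + packet    # Data Link Layer
    return frame

def decapsulate(frame: bytes) -> bytes:
    """Strip the headers in reverse order, outermost first."""
    data = frame
    for prefix in (b"[ETH", b"[IP", b"[TCP"):
        if not data.startswith(prefix):
            raise ValueError(f"expected a header starting with {prefix!r}")
        data = data[data.index(b"]") + 1:]   # drop everything up to the closing bracket
    return data

photo = b"evidence-photo-bytes"
frame = encapsulate(photo)
print(frame)
print(decapsulate(frame) == photo)
```

Note that each function treats whatever it received from the layer above as opaque payload, which is the key idea behind encapsulation.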


## Ethernet Header: The First Wrapper

The **Ethernet header** is added by the Data Link Layer and serves as the outermost wrapper for data traveling across local networks. Think of it as the detailed addressing and handling instructions on a postal package.

An Ethernet header contains several crucial pieces of information:
- **Destination MAC Address** (6 bytes): The physical address of the intended recipient
- **Source MAC Address** (6 bytes): The physical address of the sender
- **EtherType** (2 bytes): Indicates what type of data follows (usually IPv4 or IPv6)
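
As a rough illustration, the three fields above can be unpacked from the first 14 bytes of a frame with Python's `struct` module. The `parse_ethernet_header` helper is a hypothetical name, and the sample bytes are hand-built to resemble a broadcast ARP request rather than read from the capture file.

```python
import struct

def parse_ethernet_header(frame: bytes):
    """Split the first 14 bytes of a frame into destination, source, and EtherType."""
    if len(frame) < 14:
        raise ValueError("an Ethernet header needs at least 14 bytes")
    # "!" = network byte order, 6s = 6 raw bytes, H = unsigned 16-bit integer
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])

    def as_mac(raw):
        return ":".join(f"{byte:02x}" for byte in raw)

    return as_mac(dst), as_mac(src), ethertype

# Hand-built header: broadcast destination, one workstation as source, EtherType ARP
header = bytes.fromhex("ffffffffffff" "001b2c3d4e5f" "0806")
dst, src, ethertype = parse_ethernet_header(header)
print(dst, src, hex(ethertype))
```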


### tcpdump (Data Link Layer - The Physical Address Layer)
Let's use tcpdump to see what's happening at the data link layer (with our Ethernet headers):

In [33]:
%%bash
# See conversations at the ethernet (MAC address) level
# -r: read from our capture file
# -e: show ethernet header info (MAC addresses)
# -n: don't convert addresses to names (show raw data)
# -c 10: only show 10 packets
tcpdump -r detective_agency.pcap -e -n -c 10

17:25:47.560786 00:1b:2c:3d:4e:5f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.1.1 tell 192.168.1.10, length 28
17:25:47.660786 00:1a:2b:3c:4d:5e > 00:1b:2c:3d:4e:5f, ethertype ARP (0x0806), length 42: Reply 192.168.1.1 is-at 00:1a:2b:3c:4d:5e, length 28
17:25:47.760786 00:1c:2d:3e:4f:6a > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.1.1 tell 192.168.1.11, length 28
17:25:47.860786 00:1a:2b:3c:4d:5e > 00:1c:2d:3e:4f:6a, ethertype ARP (0x0806), length 42: Reply 192.168.1.1 is-at 00:1a:2b:3c:4d:5e, length 28
17:25:47.960786 00:1d:2e:3f:4f:6b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.1.1 tell 192.168.1.12, length 28
17:25:48.060786 00:1a:2b:3c:4d:5e > 00:1d:2e:3f:4f:6b, ethertype ARP (0x0806), length 42: Reply 192.168.1.1 is-at 00:1a:2b:3c:4d:5e, length 28
17:25:48.160786 00:1e:2f:3f:4f:6c > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.1.1 tell 192.168

reading from file detective_agency.pcap, link-type EN10MB (Ethernet), snapshot length 65535


This sample output shows ARP (Address Resolution Protocol) conversations. We see things like:

| **Parameter** | **Description** | **Example** |
| --- | --- | --- |
| **Time** | Timestamp showing when the packet was captured. | `17:25:47.560786` |
| **Source MAC** | Originating MAC address of the device sending the packet. | `00:1b:2c:3d:4e:5f` |
| **Destination MAC** | MAC address of the device intended to receive the packet. Broadcast is shown as `ff:ff:ff:ff:ff:ff`. | `ff:ff:ff:ff:ff:ff` |
| **Protocol** | The specific protocol in use at this layer (e.g., ARP). | `ARP (0x0806)` |
| **Message** | ARP-specific message content, often indicating which device is being queried. | `Request who-has 192.168.1.1 tell 192.168.1.10` |
| **Packet Length** | Total length of the packet in bytes. | `42 bytes` |

> **Real-World Example**: Back at the detective agency, Daphne notices something odd about their network traffic.
>
> "Velma, come look at this," she calls. "Every device on our network is getting flooded with traffic meant for other devices."
>
> Velma examines their network switch configuration. "Aha! Our switch is operating in 'hub mode' instead of properly using the Ethernet headers to direct traffic. See, normally the switch reads the destination MAC address in the Ethernet header and sends the frame only to that specific port. But right now, it's broadcasting everything everywhere!"
>
> "Like, isn't that bad for performance?" Shaggy asks.
>
> "Exactly right, Shaggy. The Ethernet header is designed to enable efficient delivery using MAC addresses, but only if our network equipment is configured to use them properly."
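
The difference Velma describes can be modeled with a toy "learning switch". This is a simplified sketch under assumed behavior (no aging of entries, no VLANs); the `LearningSwitch` class is hypothetical and not taken from any real switch software.

```python
class LearningSwitch:
    """Toy model of a switch that learns which port each MAC address lives on."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))
        self.mac_table = {}  # MAC address -> port number

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port            # learn the sender's port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:                         # unknown or broadcast: flood
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:                      # already on the right segment
            return []
        return [out_port]

switch = LearningSwitch(port_count=4)
# Broadcast ARP request from a workstation on port 0: flooded to every other port
print(switch.forward(0, "00:1b:2c:3d:4e:5f", "ff:ff:ff:ff:ff:ff"))
# Reply from the gateway on port 1: delivered only to port 0, which was just learned
print(switch.forward(1, "00:1a:2b:3c:4d:5e", "00:1b:2c:3d:4e:5f"))
```

A hub, by contrast, would behave like the "flood" branch for every single frame.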

In [54]:
mm("""
---
title: Ethernet Frame (Layer 2)
---
packet-beta
  0-47: "Destination MAC Address (e.g., 00:1A:2B:3C:4D:5E)"
  48-95: "Source MAC Address (e.g., 00:2B:3C:4D:5E:6F)"
  96-111: "EtherType (0x0800=IPv4, 0x86DD=IPv6)"
  112-255: "Data (Encapsulated IPv4 Packet)"
""")

## Internet Protocol (IP) Header: Enabling Internet Routing

While the Ethernet header handles local delivery, the **IP header** enables routing across the vast internet. Added by the Network Layer, it contains the information needed to get data from its source to its destination across multiple networks.

The IPv4 header (the most common version) includes:
- **Version** (4 bits): Indicates IPv4 or IPv6
- **Header Length** (4 bits): Size of the IP header
- **Type of Service** (8 bits): Specifies delivery priority
- **Total Length** (16 bits): Size of the entire packet
- **Source IP Address** (32 bits): Sender's IP address
- **Destination IP Address** (32 bits): Recipient's IP address
- Several other fields for fragmentation, time-to-live, etc.
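
Here is a minimal sketch of reading those fields from the fixed 20-byte part of an IPv4 header. The `parse_ipv4_header` helper is a hypothetical name, the sample header is hand-built, and its checksum is left at 0 rather than computed.

```python
import socket
import struct

def parse_ipv4_header(raw: bytes):
    """Decode the fixed 20-byte portion of an IPv4 header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("an IPv4 header needs at least 20 bytes")
    (version_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": version_ihl >> 4,                  # upper 4 bits
        "header_bytes": (version_ihl & 0x0F) * 4,     # IHL counts 32-bit words
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,                         # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample: IPv4, IHL 5, 40 bytes total, Don't Fragment set, TTL 64, TCP
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 1, 0x4000, 64, 6, 0,   # checksum left as 0 (not computed here)
    socket.inet_aton("192.168.1.11"),
    socket.inet_aton("203.0.113.10"),
)
fields = parse_ipv4_header(sample)
print(fields)
```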


> **Real-World Example**: The gang is troubleshooting connectivity to their new branch office.
>
> "The network cable test passed," Fred reports, "but we still can't reach the server at the other office."
>
> "Let's look at the IP headers," Velma suggests, running a packet capture. "See here? The TTL (Time To Live) field in the IP header is reaching zero before our packets get to the destination. That means we have a routing loop somewhere!"
>
> She draws a diagram showing how the IP header's TTL field prevents packets from circulating endlessly when routing problems occur. "Every router decrements the TTL by 1. When it hits zero, the packet is discarded. It's like a game of 'hot potato' with a timer - preventing network gridlock when routing goes wrong."
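
Velma's "hot potato" explanation can be simulated with a short sketch. The router names, the `next_hop` tables, and the `trace` function are all invented for illustration; real routers make forwarding decisions from routing tables, not from a single dictionary.

```python
def trace(ttl, start, next_hop, destination):
    """Follow a packet hop by hop until it is delivered or its TTL runs out."""
    router, hops = start, 0
    while router != destination:
        ttl -= 1                     # every router decrements the TTL by 1
        hops += 1
        if ttl == 0:
            return ("dropped", router, hops)
        router = next_hop[router]    # hand the packet to the next hop
    return ("delivered", router, hops)

# A misconfigured network where three routers pass the packet in a circle
loop = {"A": "B", "B": "C", "C": "A"}
print(trace(64, "A", loop, "branch-server"))

# A healthy path that reaches the destination after two routers
healthy = {"A": "B", "B": "branch-server"}
print(trace(64, "A", healthy, "branch-server"))
```

With a starting TTL of 64, the looping packet is discarded after 64 hops instead of circulating forever.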

The IP header exemplifies the layered nature of network protocols. While the Ethernet header handles local delivery between directly connected devices, the IP header enables global routing. A packet might have its Ethernet header rewritten many times as it travels across the internet, but its IP header remains largely unchanged, ensuring it eventually reaches its final destination.

### tcpdump (Network Layer - The IP Address Layer)


In [None]:
%%bash
# Watch HTTP/HTTPS IP addresses communicate
# -n: don't convert addresses to names
# 'tcp port 80 or tcp port 443': only show HTTP and HTTPS packets
# -c 15: limit to 15 packets
tcpdump -r detective_agency.pcap -n 'tcp port 80 or tcp port 443' -c 15


17:25:49.860786 IP 192.168.1.11.63896 > 203.0.113.10.443: Flags [S], seq 1000, win 8192, length 0
17:25:49.960786 IP 203.0.113.10.443 > 192.168.1.11.63896: Flags [S.], seq 2000, ack 1001, win 8192, length 0
17:25:50.060786 IP 192.168.1.11.63896 > 203.0.113.10.443: Flags [.], ack 1, win 8192, length 0
17:25:50.160786 IP 192.168.1.11.63896 > 203.0.113.10.443: Flags [P.], seq 1:54, ack 1, win 8192, length 53
17:25:50.260786 IP 192.168.1.12.51966 > 203.0.113.10.443: Flags [S], seq 1000, win 8192, length 0
17:25:50.360786 IP 203.0.113.10.443 > 192.168.1.12.51966: Flags [S.], seq 2000, ack 1001, win 8192, length 0
17:25:50.460786 IP 192.168.1.12.51966 > 203.0.113.10.443: Flags [.], ack 1, win 8192, length 0
17:25:50.560786 IP 192.168.1.12.51966 > 203.0.113.10.443: Flags [P.], seq 1:54, ack 1, win 8192, length 53
17:25:52.460786 IP 192.168.1.11.55538 > 203.0.113.10.80: Flags [S], seq 0:46, win 8192, length 46: HTTP: GET / HTTP/1.1
17:25:52.560786 IP 192.168.1.12.50820 > 203.0.113.10.80: Flags

reading from file detective_agency.pcap, link-type EN10MB (Ethernet), snapshot length 65535


Here we see records of various HTTP and HTTPS (web) connections to an external server. The parameters are:

| **Parameter** | **Description** | **Example** |
| --- | --- | --- |
| **Time** | Timestamp showing when the packet was captured. | `17:25:49.860786` |
| **Source IP** | IP address of the device that sent the packet. | `192.168.1.11` |
| **Destination IP** | IP address of the device intended to receive the packet. | `203.0.113.10` |
| **Protocol** | Network protocol in use, typically IP for this layer. | `IP` |
| **Flags** | TCP flags used to indicate control information, like `[S]` (SYN) for starting a connection. | `[S]`, `[S.]`, `[.]` |
| **Packet Content** | Content of the packet, often including sequence and acknowledgment numbers for TCP packets. | `seq 1000, ack 1001` |
| **Packet Length** | Number of TCP payload bytes carried by the packet (the IP and TCP headers are not counted). | `length 53` |

In [47]:
mm("""
---
title: IPv4 Packet (Layer 3)
---
packet-beta
  0-3: "Version (4)"
  4-7: "IHL (5-15, typically 5)"
  8-15: "ToS/DSCP (e.g., 0x00=Best Effort)"
  16-31: "Total Length (20-65535 bytes)"
  32-47: "Identification (Unique ID)"
  48-50: "Flags (Don't Fragment, More Fragments)"
  51-63: "Fragment Offset (0-8191)"
  64-71: "TTL (Typically 64 or 128)"
  72-79: "Protocol (6=TCP, 17=UDP)"
  80-95: "Header Checksum"
  96-127: "Source IP (e.g., 192.168.1.1)"
  128-159: "Destination IP (e.g., 10.0.0.1)"
  160-191: "Options (if IHL > 5)"
  192-255: "Data (Encapsulated TCP/UDP)"
  """)

## Transport Layer Headers: TCP and UDP

The Transport Layer adds yet another crucial wrapper to our data, using either TCP or UDP headers depending on the needs of the application. Think of TCP as certified mail with delivery confirmation, while UDP is more like regular mail - faster but without guarantees.

### TCP Header Structure

The **TCP header** is more complex than UDP because it contains all the information needed for reliable, ordered delivery:
- **Source Port** (16 bits): Identifies the sending application
- **Destination Port** (16 bits): Identifies the receiving application
- **Sequence Number** (32 bits): Orders the segments
- **Acknowledgment Number** (32 bits): Confirms received segments
- **Window Size** (16 bits): Flow control information
- **Checksum** (16 bits): Error detection
- Various control bits and flags


> **Real-World Example**: The Scooby gang is investigating why their case management system is running slowly.
>
> "Like, Velma, why does uploading evidence photos take forever sometimes?" Shaggy asks, watching a progress bar crawl across his screen.
>
> Velma opens Wireshark and shows him the TCP headers. "See these sequence numbers? They help ensure all pieces of your photo arrive in order. And look at the window size - it's telling the sender to slow down because our network is congested."
>
> "So it's like having a traffic cop managing the flow of data?" Daphne suggests.
>
> "Exactly! And see these acknowledgment numbers? Every time a piece arrives successfully, the receiver sends back a confirmation. If anything gets lost, TCP automatically requests retransmission."

### Understanding TCP Flags

TCP flags are like signal flags on a ship, each conveying specific information about the state of the connection. The main TCP flags are:

- **SYN (Synchronize)**: Initiates a connection
- **ACK (Acknowledgment)**: Confirms received data
- **PSH (Push)**: Pushes buffered data to the application
- **RST (Reset)**: Abruptly terminates a connection
- **FIN (Finish)**: Gracefully closes a connection
- **URG (Urgent)**: Marks urgent data
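
Each of these flags is a single bit in the TCP header, so a flag byte can be decoded with simple bit masks. The sketch below also maps each flag to the one-character symbol tcpdump prints (`.` stands for ACK); the `decode_flags` helper is a hypothetical name, and the symbol ordering is an assumption chosen to match the outputs shown in this chapter.

```python
TCP_FLAGS = [  # (bit mask, name, tcpdump symbol)
    (0x01, "FIN", "F"),
    (0x02, "SYN", "S"),
    (0x04, "RST", "R"),
    (0x08, "PSH", "P"),
    (0x10, "ACK", "."),
    (0x20, "URG", "U"),
]

def decode_flags(flag_byte):
    """Return the flag names that are set and a tcpdump-style summary string."""
    names = [name for mask, name, _ in TCP_FLAGS if flag_byte & mask]
    symbols = "".join(symbol for mask, _, symbol in TCP_FLAGS if flag_byte & mask)
    return names, f"[{symbols}]"

print(decode_flags(0x02))   # SYN only
print(decode_flags(0x12))   # SYN + ACK
print(decode_flags(0x18))   # PSH + ACK
```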

> **Real-World Example**: Fred is analyzing their network security logs.
>
> "Jeepers! Look at all these failed connection attempts," he points to a stream of logs.
>
> Velma examines the TCP flags. "These are SYN flood attacks - someone's sending thousands of connection requests without completing the handshake. See the SYN flags without corresponding ACK flags? It's like someone repeatedly ringing our doorbell and running away!"
>
> She draws a diagram of normal TCP connection establishment:
> 1. Client sends SYN: "Can we talk?"
> 2. Server responds SYN-ACK: "Yes, let's talk"
> 3. Client sends ACK: "Great, connection established!"
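
The three steps above suggest a simple way to spot the "doorbell ringers": track each client's progress through the handshake and report the ones that never finish. This is a deliberately simplified sketch; the `half_open_connections` function and the `198.51.100.7` client entries are made up for illustration, and a real detector would also consider timing and volume.

```python
def half_open_connections(events):
    """Return clients that sent a SYN but never completed the handshake.

    `events` is an iterable of (client, flags) pairs, with flags written in
    tcpdump notation: "S" (SYN), "S." (SYN-ACK), "." (ACK).
    """
    state = {}
    for client, flags in events:
        if flags == "S":
            state[client] = "syn_sent"
        elif flags == "S." and state.get(client) == "syn_sent":
            state[client] = "syn_ack_sent"
        elif flags == "." and state.get(client) == "syn_ack_sent":
            state[client] = "established"
    return sorted(c for c, s in state.items() if s != "established")

events = [
    ("192.168.1.11:63896", "S"),    # normal client: SYN, SYN-ACK, ACK
    ("192.168.1.11:63896", "S."),
    ("192.168.1.11:63896", "."),
    ("198.51.100.7:40001", "S"),    # hypothetical attacker: SYN, then silence
    ("198.51.100.7:40002", "S"),
    ("198.51.100.7:40002", "S."),   # server answered, client never sent the ACK
]
print(half_open_connections(events))
```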


### tcpdump: Looking at TCP Headers and Connections
Let's use tcpdump to look at some TCP headers and connections.

In [None]:
%%bash
tcpdump -r detective_agency.pcap -n 'tcp'

17:25:49.860786 IP 192.168.1.11.63896 > 203.0.113.10.443: Flags [S], seq 1000, win 8192, length 0
17:25:49.960786 IP 203.0.113.10.443 > 192.168.1.11.63896: Flags [S.], seq 2000, ack 1001, win 8192, length 0
17:25:50.060786 IP 192.168.1.11.63896 > 203.0.113.10.443: Flags [.], ack 1, win 8192, length 0
17:25:50.160786 IP 192.168.1.11.63896 > 203.0.113.10.443: Flags [P.], seq 1:54, ack 1, win 8192, length 53
17:25:50.260786 IP 192.168.1.12.51966 > 203.0.113.10.443: Flags [S], seq 1000, win 8192, length 0
17:25:50.360786 IP 203.0.113.10.443 > 192.168.1.12.51966: Flags [S.], seq 2000, ack 1001, win 8192, length 0
17:25:50.460786 IP 192.168.1.12.51966 > 203.0.113.10.443: Flags [.], ack 1, win 8192, length 0
17:25:50.560786 IP 192.168.1.12.51966 > 203.0.113.10.443: Flags [P.], seq 1:54, ack 1, win 8192, length 53
17:25:52.460786 IP 192.168.1.11.55538 > 203.0.113.10.80: Flags [S], seq 0:46, win 8192, length 46: HTTP: GET / HTTP/1.1
17:25:52.560786 IP 192.168.1.12.50820 > 203.0.113.10.80: Flags

reading from file detective_agency.pcap, link-type EN10MB (Ethernet), snapshot length 65535


While we can see many of the same fields we discussed earlier, the following ones are specific to TCP and worth a closer look:

| **Field** | **Explanation** | **Example from Output** |
| --- | --- | --- |
| **Flags** | TCP flags provide control information for connections. In tcpdump's notation: `S` (SYN - start connection), `F` (FIN - finish connection), `R` (RST - reset), `P` (PSH - push), and `.` (ACK - acknowledgment). These flags guide connection states, such as the three-way handshake. | `[S]`, `[S.]`, `[P.]`, `[.]` |
| **Sequence Number (seq)** | A 32-bit field that orders the segments. The number is assigned to the first byte in the current packet to ensure segments arrive in sequence. | `seq 1000` |
| **Acknowledgment Number (ack)** | Confirms receipt of data by acknowledging the next expected sequence number. This is critical for reliable data transfer in TCP. | `ack 1001` |
| **Window Size (win)** | A field for flow control, specifying how many bytes the sender of this segment is currently ready to receive. A larger window lets the other side send more data before it has to wait for an acknowledgment. | `win 8192` |
| **Packet Length (length)** | Number of TCP payload bytes in the segment; the IP and TCP headers are not counted. Segments that carry no data (e.g., SYN or bare ACK segments) show `0`; segments with data show non-zero values. | `length 0`, `length 53` |
| **Payload** | The actual data being transmitted. For example, in HTTP packets, the payload might contain a request like `GET / HTTP/1.1`. If the packet has no payload, `tcpdump` just shows `length 0`. | `HTTP: GET / HTTP/1.1` or `length 0` |
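
The relationship between sequence and acknowledgment numbers in the output above comes down to a little arithmetic: the receiver acknowledges the next byte it expects, and a SYN or FIN counts as one byte. The `expected_ack` helper below is a hypothetical name used only for this sketch.

```python
def expected_ack(seq, payload_len, syn=False, fin=False):
    """Acknowledgment number the receiver should send back for this segment."""
    # SYN and FIN each consume one sequence number; values wrap at 2**32.
    return (seq + payload_len + int(syn) + int(fin)) % 2**32

print(expected_ack(1000, 0, syn=True))   # client SYN with seq 1000
print(expected_ack(1, 53))               # data segment shown as seq 1:54
```

The first call matches the `ack 1001` in the SYN-ACK line, and the second matches the end of the `seq 1:54` range for the 53-byte segment.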

In [49]:
mm("""
---
title: TCP Segment (Layer 4)
---
packet-beta
  0-15: "Source Port (e.g., 80=HTTP, 443=HTTPS)"
  16-31: "Destination Port (e.g., 1024-65535)"
  32-63: "Sequence Number (Random Initial)"
  64-95: "Acknowledgment Number (Next Expected)"
  96-99: "Data Offset (5-15)"
  100-105: "Reserved (000000)"
  106: "URG"
  107: "ACK "
  108: "PSH"
  109: "RST"
  110: "SYN"
  111: "FIN"
  112-127: "Window Size (Flow Control)"
  128-143: "Checksum"
  144-159: "Urgent Pointer (if URG=1)"
  160-191: "Options (e.g., MSS, Window Scale)"
  192-255: "Data (Application Payload)"
  """)


### UDP Header: The Lightweight Alternative

The **UDP header** is much simpler, containing only four fields:
- **Source Port** (16 bits)
- **Destination Port** (16 bits)
- **Length** (16 bits)
- **Checksum** (16 bits)
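
Because the UDP header is a fixed 8 bytes (the fields listed here plus the 16-bit checksum shown in the diagram later in this section), decoding it takes a single `struct.unpack` call. The `parse_udp_header` helper is a hypothetical name, and the sample datagram is hand-built with a zero-filled payload and an unset checksum.

```python
import struct

def parse_udp_header(datagram: bytes):
    """Decode the 8-byte UDP header and return it with the payload."""
    if len(datagram) < 8:
        raise ValueError("a UDP header needs at least 8 bytes")
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,                  # header + payload, in bytes
        "checksum": checksum,
        "payload": datagram[8:length],
    }

# Hand-built sample: a 52-byte message sent to port 53 (DNS), checksum left as 0
sample = struct.pack("!HHHH", 61163, 53, 8 + 52, 0) + bytes(52)
udp = parse_udp_header(sample)
print(udp["src_port"], udp["dst_port"], udp["length"], len(udp["payload"]))
```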


Let's take a look at some UDP traffic using tcpdump.


In [None]:
%%bash
tcpdump -r detective_agency.pcap -n 'udp'

17:25:48.760786 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 30:30:31:42:32:43, length 244
17:25:48.860786 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 30:30:31:43:32:44, length 244
17:25:48.960786 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 30:30:31:44:32:45, length 244
17:25:49.060786 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 30:30:31:45:32:46, length 244
17:25:49.160786 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 30:30:31:46:32:46, length 244
17:25:49.260786 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 30:30:32:41:33:42, length 244
17:25:49.360786 IP 192.168.1.10.61163 > 8.8.8.8.53: 0+ A? client_portal.detective-agency.com. (52)
17:25:49.460786 IP 192.168.1.10.60903 > 8.8.8.8.53: 0+ A? email_server.detective-agency.com. (51)
17:25:49.560786 IP 192.168.1.10.59360 > 8.8.8.8.53: 0+ A? cloud_storage.detective-agency.com. (52)
17:25:49.660786 IP 192.168.1.10.54440 > 8.8.8.8.53: 0+ A? case

reading from file detective_agency.pcap, link-type EN10MB (Ethernet), snapshot length 65535


In the above `tcpdump` output, you're seeing **UDP traffic**, which includes **BOOTP/DHCP requests** (used for assigning IP addresses to devices on the network), **DNS queries** (translating domain names to IP addresses), and **NTP requests** (synchronizing time across devices). These examples show common UDP applications where fast transmission is prioritized over strict reliability.

You might note the following fields (along with the usual ones we've seen):

| **Field** | **Explanation** | **Example from Output** |
| --- | --- | --- |
| **Protocol** | High-level protocol, usually specifying BOOTP/DHCP, DNS, or NTP, given their reliance on UDP for fast, connectionless delivery. | `BOOTP/DHCP`, `DNS`, `NTPv3` |
| **Additional Info** | Protocol-specific information about the request, such as the type of DNS query or DHCP request. | `Request from 30:30:31:42:32:43`, `0+ A? client_portal.detective-agency.com` |
| **Packet Length (length)** | Size in bytes of the application message carried in the datagram (the 8-byte UDP header is not counted). | `length 244`, `length 52`, `length 48` |

UDP's header has far fewer fields than TCP's because UDP does not track packet order, acknowledge delivery, or retransmit lost packets.

In [50]:
mm("""
---
title: UDP Datagram (Layer 4)
---
packet-beta
  0-15: "Source Port (e.g., 53=DNS)"
  16-31: "Destination Port (e.g., 1024-65535)"
  32-47: "Length (8-65535 bytes)"
  48-63: "Checksum (Optional in IPv4)"
  64-255: "Data (Application Payload)"
""")


## Understanding Payload

The **payload** is the actual data being transmitted - the letter inside all those envelopes we've been adding. Everything else we've discussed (headers, flags, etc.) exists solely to ensure the payload reaches its destination correctly.

> **Real-World Example**: "Like, I get all the headers and stuff now," Shaggy says, "but where's our actual evidence photo in all this?"
>
> Velma pulls up a packet capture. "See this part after all the headers? That's our payload - the actual photo data. Everything else is just wrapping paper and delivery instructions to get it where it needs to go."

In [None]:
%%bash
tcpdump -r detective_agency.pcap -n -A 'tcp port 80 or tcp port 443' -c 5

17:25:49.860786 IP 192.168.1.11.63896 > 203.0.113.10.443: Flags [S], seq 1000, win 8192, length 0
E..(....@.}.......q
............P. .....
17:25:49.960786 IP 203.0.113.10.443 > 192.168.1.11.63896: Flags [S.], seq 2000, ack 1001, win 8192, length 0
E..(....@.}...q
................P. .....
17:25:50.060786 IP 192.168.1.11.63896 > 203.0.113.10.443: Flags [.], ack 1, win 8192, length 0
E..(....@.}.......q
............P. .....
17:25:50.160786 IP 192.168.1.11.63896 > 203.0.113.10.443: Flags [P.], seq 1:54, ack 1, win 8192, length 53
E..]....@.|.......q
............P. .q........................................................
17:25:50.260786 IP 192.168.1.12.51966 > 203.0.113.10.443: Flags [S], seq 1000, win 8192, length 0
E..(....@.}.......q
............P. .....


reading from file detective_agency.pcap, link-type EN10MB (Ethernet), snapshot length 65535


In the above results, most of the packets have length 0 (no payload!), since they are only being used to establish a TCP connection between devices. The fourth packet DOES have a payload, but we can't read it, either here or in real life, because HTTPS encrypts it.
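
The raw `-A` dump also lets us check the payload sizes by hand. The fourth character of each dump line (`(` or `]`) is the low byte of the IP header's Total Length field, and subtracting both headers leaves the payload. The sketch below assumes 20-byte IP and TCP headers with no options, which is consistent with this capture but not guaranteed in general; `tcp_payload_length` is a hypothetical helper name.

```python
def tcp_payload_length(ip_total_length, ihl_words=5, data_offset_words=5):
    """Payload bytes = IP total length minus the IP header and the TCP header."""
    # Both header-length fields count 32-bit words, so multiply by 4 for bytes.
    return ip_total_length - ihl_words * 4 - data_offset_words * 4

# "E..(" -> total length byte "(" is 40; "E..]" -> total length byte "]" is 93
print(tcp_payload_length(ord("(")))   # handshake packets
print(tcp_payload_length(ord("]")))   # the fourth packet
```

The results (0 and 53) agree with the `length 0` and `length 53` values tcpdump printed for those packets.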

## Maximum Transmission Unit (MTU)

The **MTU** defines the largest "package" that can be sent across a network link. Think of it like weight limits on bridges - exceed them, and you'll need to split your cargo into smaller loads.

Common MTU values:
- Ethernet: 1500 bytes
- PPPoE: 1492 bytes
- Wi-Fi (802.11): up to 2304 bytes, although most Wi-Fi networks are configured for 1500 to match Ethernet

> **Real-World Example**: The gang is trying to upload a large video file from a crime scene.
>
> "Why is the file transfer failing only for this huge video?" Daphne wonders.
>
> "MTU issues," Velma explains. "Our VPN connection has a smaller MTU than our local network. When we try to send packets that are too large, they need to be fragmented. Let me adjust the MTU size..."
>
> She demonstrates with a simple test:


In [None]:
%%bash
apt install net-tools > /dev/null
# look for "mtu" to see the current mtu for eth0
ifconfig eth0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.28.0.12  netmask 255.255.0.0  broadcast 172.28.255.255
        ether 02:42:ac:1c:00:0c  txqueuelen 0  (Ethernet)
        RX packets 6189  bytes 1882418 (1.8 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5319  bytes 1555395 (1.5 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0







If you look closely at the above, you'll see that eth0 has an MTU of 1500 (which is standard for Ethernet).
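
To see why a smaller MTU matters, here is a rough sketch of how an IPv4 payload would be split into fragments. It assumes a 20-byte IP header without options, and the 1400-byte VPN MTU is an invented example value; `fragment_sizes` is a hypothetical helper name.

```python
def fragment_sizes(payload_len, mtu, ip_header_len=20):
    """Return the payload size of each IPv4 fragment needed for `payload_len` bytes."""
    # Fragment offsets are counted in 8-byte units, so every fragment except
    # the last must carry a multiple of 8 payload bytes.
    max_data = (mtu - ip_header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU is too small to carry any data")
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(max_data, remaining)
        sizes.append(chunk)
        remaining -= chunk
    return sizes

print(fragment_sizes(4000, mtu=1500))   # local Ethernet link
print(fragment_sizes(4000, mtu=1400))   # hypothetical VPN tunnel with a smaller MTU
```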


## Putting It All Together

When data travels across a network, it gets wrapped in multiple layers of headers:
1. Application data becomes the payload
2. Transport Layer adds TCP/UDP header
3. Network Layer adds IP header
4. Data Link Layer adds Ethernet header

At each hop along the way:
- Ethernet headers may change (like updating a shipping label at each post office)
- IP headers guide overall routing
- TCP/UDP headers ensure proper delivery
- The payload remains unchanged until it reaches its final destination

Understanding this encapsulation process is crucial for:
- Network troubleshooting
- Security analysis
- Performance optimization
- Protocol development

Remember: Each layer adds its own header because it can't make assumptions about the layers above or below it. This modularity is what makes networks so flexible and reliable, allowing different protocols and technologies to work together seamlessly.

# Daphne's Guide to Debugging Network Problems Using the OSI Model

Hey there! Daphne Blake here. After years of solving mysteries with the gang, I've learned that debugging network problems is a lot like solving a good mystery - you need a systematic approach and the right tools for the job. I'm going to share my method for tracking down network issues using the OSI model as our detective's framework.

## The Detective's Approach: Start at the Bottom

Just like when we investigate a haunted house, we start with the basics and work our way up. With network problems, that means starting at Layer 1 (Physical) and working up to Layer 7 (Application). Here are some real cases we've solved at the detective agency.

### Case #1: The Vanishing Connection
**Reported Problem:** "Like, Daphne, help! My computer won't connect to anything!"

**Investigation Process:**
1. **Physical Layer (Layer 1)**
   - First check: Is everything plugged in?
   - Found: Network cable was chewed by Scooby! Classic Layer 1 problem.
   - Tool used: Visual inspection
   - Solution: Replace the damaged cable

**Lesson:** Always start with the physical layer. You'd be surprised how many "ghosts in the machine" are actually just unplugged cables!

### Case #2: The Mysterious Address Mix-up
**Reported Problem:** "I can see other computers on the network, but can't connect to them!"

**Investigation Process:**
1. **Physical Layer:** Cables connected? ✓
2. **Data Link Layer (Layer 2)**
   - Used `tcpdump -e` to check ethernet frames
   - Found: Multiple devices had the same MAC address!
   - Tool used: `tcpdump -e -n` to view ethernet headers
   - Solution: Found a misconfigured network card clone, fixed the MAC address

**Lesson:** When devices can see each other but not communicate properly, check Layer 2 addressing.

### Case #3: The Routing Phantom
**Reported Problem:** "We can reach local computers but not the internet!"

**Investigation Process:**
1. **Physical Layer:** Connections good ✓
2. **Data Link Layer:** MAC addresses unique ✓
3. **Network Layer (Layer 3)**
   - Used `traceroute 8.8.8.8` to check routing
   - Found: Default gateway was set incorrectly
   - Tool used: `traceroute`, `ip route show`
   - Solution: Fixed the default gateway configuration

**Lesson:** Can't reach outside networks? Layer 3 is your likely culprit!

### Case #4: The Case of the Slow Evidence Upload
**Reported Problem:** "Evidence photos take forever to upload!"

**Investigation Process:**
1. **Physical Layer:** Good connectivity ✓
2. **Data Link Layer:** No collisions ✓
3. **Network Layer:** Routing correct ✓
4. **Transport Layer (Layer 4)**
   - Used `tcpdump -n 'tcp'` to analyze connections
   - Found: Lots of retransmissions due to wrong TCP window size
   - Tool used: `tcpdump` with TCP filters
   - Solution: Adjusted TCP window size for better performance

**Lesson:** Performance issues often hide in Layer 4. Look for retransmissions and window sizes!

### Case #5: The Encrypted Evidence Enigma
**Reported Problem:** "Can't access the secure evidence server!"

**Investigation Process:**
1. Layers 1-4: All good ✓
2. **Session Layer (Layer 5)**
   - Checked session logs
   - Found: Expired security certificate
   - Tool used: Browser's security information
   - Solution: Updated the server's SSL certificate

**Lesson:** Security and certificate issues usually appear at Layer 5 or 6.

## Daphne's Top Tips

1. **Start at the Bottom:** Always begin with Layer 1 and work up. No use checking application settings if the cable is unplugged!

2. **Document Everything:** Keep notes of what you've checked. Network problems can have multiple issues, and you don't want to repeat steps.

3. **Use the Right Tool:** Each layer has specific tools. Learning to use `tcpdump` is especially important - it's like having a magnifying glass for your network!

4. **Look for Patterns:** Is the problem intermittent? Affecting all users or just one? These clues help identify the layer to investigate.

5. **Keep It Simple:** Start with simple tests before diving into complex analysis. Sometimes a quick `ping` tells you everything you need to know.
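
The bottom-up method can also be written down as a small checklist runner. This is only a sketch: `first_failing_layer` is a hypothetical helper, and the lambda checks stand in for real tests such as inspecting a cable or running `ping`.

```python
OSI_LAYERS = ["Physical", "Data Link", "Network", "Transport",
              "Session", "Presentation", "Application"]

def first_failing_layer(checks):
    """Run checks from Layer 1 upward and return the first layer that fails.

    `checks` maps a layer name to a zero-argument function that returns True
    when that layer looks healthy. Layers without a check are skipped.
    """
    for number, layer in enumerate(OSI_LAYERS, start=1):
        check = checks.get(layer)
        if check is not None and not check():
            return number, layer
    return None

# Stand-in checks modeled on Case #3: everything is fine until the Network Layer
case_3 = {
    "Physical": lambda: True,     # cables connected
    "Data Link": lambda: True,    # MAC addresses unique
    "Network": lambda: False,     # default gateway set incorrectly
    "Transport": lambda: True,
}
print(first_failing_layer(case_3))
```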

Remember, network debugging is just like solving any other mystery - methodical investigation and the right tools will help you crack the case every time!

Jinkies! I hope this guide helps you solve your own network mysteries. Remember, every great detective started as a beginner. Keep practicing, and you'll be solving network mysteries in no time!

-- Daphne

## Glossary
### OSI Model Concepts

| Term | Definition |
|------|------------|
| OSI Network Model | A seven-layer abstraction framework defining how data moves through a network, with each layer providing specific services to the layer above while using services from the layer below |
| Physical Layer (OSI) | Manages the physical connection and electrical/optical signals between systems, including specifications for voltage levels, timing, physical data rates, physical connectors, and media types |
| Data Link Layer (OSI) | Ensures reliable point-to-point data delivery over the physical layer, handling addressing, framing, error detection, and media access control for directly connected nodes |
| Network Layer (OSI) | Performs routing between different networks, handles logical addressing and path determination, manages traffic control, and fragments/reassembles data packets as needed |
| Transport Layer (OSI) | Provides end-to-end communication control, managing segmentation, flow control, error recovery, and optional connection-oriented services |
| Session Layer (OSI) | Controls the dialogues between computers, establishing, managing, and terminating connections between local and remote applications, including synchronization points for long data transfers |
| Presentation Layer (OSI) | Handles data format translation, character code conversion, encryption/decryption, and data compression/decompression between different systems |
| Application Layer (OSI) | Provides high-level APIs for application access to the network, including protocols for file transfer, email, remote access, and other end-user services |

### Data Link Layer Concepts

| Term | Definition |
|------|------------|
| Media Access Control (MAC) | Sublayer protocol determining how devices share the physical medium, including collision detection, avoidance mechanisms, and transmission timing |
| MAC Address | 48-bit hardware address assigned to a network interface by its manufacturer, intended to be unique (though it can usually be overridden in software), and written as six pairs of hexadecimal digits (e.g., 00:1A:2B:3C:4D:5E) |
| Frame | Data unit at the Data Link layer containing source/destination MAC addresses, payload, error checking information, and control fields for reliable delivery |
| Ethernet header | 14-byte structure containing 6-byte destination MAC, 6-byte source MAC, and 2-byte type/length field, followed by payload and frame check sequence |
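
To make the 14-byte Ethernet header layout concrete, here is a minimal Python sketch that splits a frame into destination MAC, source MAC, and type/length fields. The function name `parse_ethernet_header` and the sample frame (including its source MAC) are made up for illustration; real captures would come from a tool such as `tcpdump`.

```python
import struct

def parse_ethernet_header(frame: bytes) -> dict:
    """Split the first 14 bytes of an Ethernet II frame into its three fields."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than the 14-byte Ethernet header")
    # "!" means network (big-endian) byte order.
    # 6s = 6-byte destination MAC, 6s = 6-byte source MAC, H = 2-byte type/length.
    dest, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dest_mac": dest.hex(":"),
        "src_mac": src.hex(":"),
        "ethertype": f"0x{ethertype:04x}",
    }

# Hypothetical frame: broadcast destination, invented source MAC,
# EtherType 0x0800 (IPv4), followed by placeholder payload bytes.
sample_frame = bytes.fromhex("ffffffffffff" "001a2b3c4d5e" "0800") + b"payload"
header = parse_ethernet_header(sample_frame)
print(header)
```

The frame check sequence is not parsed here because it trails the payload rather than sitting in the header.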

### Network Layer Concepts

| Term | Definition |
|------|------------|
| Routing | Complex process of determining optimal paths for data through multiple networks using metrics like hop count, bandwidth, delay, and load balancing considerations |
| Packet | Self-contained unit of data carrying routing information and payload, which can be transmitted independently through networks regardless of content |
| Header (Packet) | Metadata structure prepended to packets containing crucial protocol-specific control information for proper delivery and processing |
| Payload (Packet) | Actual user data being transported, encapsulated within the packet structure and isolated from the surrounding delivery mechanism |
| IP Address | Hierarchical numerical identifier specifying both network and host portions, used for logical addressing and routing between different networks |
| IPv4 | 32-bit addressing scheme allowing approximately 4.3 billion unique addresses, written in four octets separated by periods (e.g., 192.168.1.1) |
| IPv6 | 128-bit addressing scheme supporting approximately 3.4×10^38 unique addresses, written in eight groups of four hexadecimal digits (e.g., 2001:0db8:85a3:0000:0000:8a2e:0370:7334) |
| IP header | For IPv4, a structure of at least 20 bytes containing version, header length, service type, total length, identification, flags, fragment offset, TTL, protocol, header checksum, and source/destination addresses |
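
The address sizes and notations above can be explored with Python's standard `ipaddress` module. This is only a small sketch: the `/24` prefix used to split the network and host portions is an assumption chosen for the example, not something the glossary specifies.

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")

# max_prefixlen is the address width in bits: 32 for IPv4, 128 for IPv6.
print(v4.version, v4.max_prefixlen)
print(v6.version, v6.max_prefixlen)

# Size of each address space.
ipv4_space = 2 ** v4.max_prefixlen
ipv6_space = 2 ** v6.max_prefixlen
print(ipv4_space)            # 4294967296, roughly 4.3 billion
print(f"{ipv6_space:.1e}")   # 3.4e+38

# IPv6 addresses are usually shown in compressed form.
print(v6.compressed)

# Assumed /24 prefix: the first 24 bits name the network, the last 8 the host.
iface = ipaddress.ip_interface("192.168.1.1/24")
network = iface.network
host_part = int(iface.ip) & int(network.hostmask)
print(network, host_part)
```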

### Transport Layer Concepts

| Term | Definition |
|------|------------|
| Transmission Control Protocol | Reliable, connection-oriented protocol that establishes a connection with a three-way handshake and then provides ordered delivery, error checking, flow control, and congestion control |
| User Datagram Protocol | Lightweight, connectionless protocol offering basic transport services without guarantees of delivery, ordering, or duplicate protection (only a simple checksum guards against corruption), for applications prioritizing speed over reliability |
| Port | 16-bit numerical identifier for specific network services or processes, allowing multiple simultaneous connections to/from the same IP address |
| TCP header | Structure of at least 20 bytes (up to 60 with options) containing source/destination ports, sequence number, acknowledgment number, data offset, flags, window size, checksum, urgent pointer, and optional fields |
| UDP header | 8-byte minimal structure containing only source/destination ports, length, and checksum fields for basic multiplexing |
| Window size | Dynamic value indicating buffer space available at the receiver, used for flow control to prevent overwhelming slower receivers |
| SYN | TCP control flag initiating connection establishment, carrying sequence number for synchronization of both ends |
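| ACK | TCP control flag indicating that the acknowledgment number field is valid; that field carries the next sequence number the receiver expects, which is how TCP confirms received data |
| Maximum Transmission Unit | Largest protocol data unit that a given link can carry in a single frame; 1,500 bytes on standard Ethernet and up to about 9,000 bytes with jumbo frames. The MTU limits how much data the transport layer can place in each segment |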
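
As a rough illustration of how the TCP header fields fit together, the sketch below packs a hypothetical SYN segment header and parses it back. The port numbers, sequence number, and window value are invented, and the checksum is left at zero rather than computed.

```python
import struct

TCP_FLAG_SYN = 0x02
TCP_FLAG_ACK = 0x10

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte portion of a TCP header."""
    if len(segment) < 20:
        raise ValueError("segment is shorter than the minimum TCP header")
    # Ports (2 bytes each), sequence and acknowledgment numbers (4 bytes each),
    # data offset plus flags, window size, checksum, urgent pointer (2 bytes each).
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    flags = offset_flags & 0x01FF
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # The top 4 bits count 32-bit words, so multiply by 4 for bytes.
        "header_len": (offset_flags >> 12) * 4,
        "syn": bool(flags & TCP_FLAG_SYN),
        "ack_flag": bool(flags & TCP_FLAG_ACK),
        "window": window,
    }

# Hypothetical first packet of a handshake: SYN set, ACK clear, no options.
syn_header = struct.pack(
    "!HHIIHHHH",
    49152,                      # source port (invented ephemeral port)
    443,                        # destination port
    1000,                       # initial sequence number (invented)
    0,                          # acknowledgment number, unused until ACK is set
    (5 << 12) | TCP_FLAG_SYN,   # data offset of 5 words = 20 bytes, SYN flag
    65535,                      # advertised window size
    0,                          # checksum placeholder, not computed here
    0,                          # urgent pointer
)
parsed = parse_tcp_header(syn_header)
print(parsed)
```

A UDP header could be decoded the same way with the format string `"!HHHH"`, which covers its four 2-byte fields.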

### Session & Presentation Concepts

| Term | Definition |
|------|------------|
| Session (network) | Sustained connection between applications with defined beginning and end, maintaining state and allowing for checkpointing and recovery |
| Dialog control | Coordination mechanism determining which party can transmit at what time, including turn-taking and synchronization markers |
| Character encoding | Standardized system for representing text characters as binary data, including schemes like ASCII, Unicode, UTF-8, and others |
| Encryption | Mathematical transformation of data to protect confidentiality, using algorithms and keys to make information unreadable without proper decryption |
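
A short example may help show what character encoding means in practice: the same text becomes different byte sequences under different schemes. The sample string is made up, and the sketch only illustrates encoding, not encryption.

```python
# "\u00e9" is the single code point for an accented e.
text = "Zoinks! caf\u00e9"

utf8_bytes = text.encode("utf-8")       # ASCII characters use 1 byte, the accented e uses 2
utf16_bytes = text.encode("utf-16-le")  # every character in this string uses 2 bytes

print(len(text), len(utf8_bytes), len(utf16_bytes))  # 12 13 24

# ASCII covers only 128 characters, so the accented letter cannot be encoded.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False

# Decoding with the matching scheme recovers the original text.
round_trip = utf8_bytes.decode("utf-8")
print(round_trip == text)  # True
```

Decoding with the wrong scheme is a common source of garbled text, which is the kind of mismatch the Presentation layer is meant to resolve.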

### Application Layer & Services

| Term | Definition |
|------|------------|
| Domain Name System | Hierarchical, decentralized naming system translating human-readable domain names into IP addresses, backed by a database distributed across servers worldwide |
| Hypertext Transfer Protocol | Application protocol for distributed, collaborative, hypermedia information systems, using request-response patterns between clients and servers |
| Hypertext Transfer Protocol Secure | Security-enhanced HTTP using TLS encryption to protect against eavesdropping, tampering, and message forgery |
| Dynamic Host Configuration Protocol | Network management protocol automating IP address assignment and providing configuration information to network devices |
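
The request-response pattern behind HTTP can be sketched without touching the network. The helper names, the `/cases` path, and the canned response below are all hypothetical; a real client would normally use a library such as `http.client` instead of building messages by hand.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Format a minimal HTTP/1.1 GET request as the bytes a client would send."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",   # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_status_line(response: bytes) -> tuple:
    """Extract (version, status code, reason) from the first line of a response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason

request = build_get_request("example.com", "/cases")

# Canned response standing in for what a server might send back.
canned_response = b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\ncase file"
status = parse_status_line(canned_response)

print(request.decode("ascii"))
print(status)  # ('HTTP/1.1', 200, 'OK')
```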

### Networking Operations & Tools

| Term | Definition |
|------|------------|
| Traceroute | Network diagnostic tool displaying the route and measuring transit delays between local host and destination, showing each intermediate router hop |
| ip link show | Linux networking command displaying detailed status of all network interfaces, including state, MTU, MAC address, and operational parameters |
| tcpdump | Powerful packet analyzer allowing capture and analysis of network traffic in real-time, with extensive filtering and display options |

### Data Processing Concepts

| Term | Definition |
|------|------------|
| Data encapsulation | Progressive wrapping of data with protocol headers/trailers as it moves down the OSI stack, with each layer adding its own control information |
| Data decapsulation | Systematic removal of protocol headers/trailers as data moves up the OSI stack, with each layer processing and stripping its relevant control information |
| Checksum | Mathematical value calculated from packet contents to detect corruption, using various algorithms depending on the protocol level |
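
As one concrete instance of a checksum, the sketch below implements the 16-bit ones' complement sum used by the IPv4 header (and, over different fields, by TCP and UDP). The sample header bytes are a commonly used textbook example rather than traffic from this chapter, and other layers use different algorithms (Ethernet's frame check sequence is a CRC, for instance).

```python
def internet_checksum(data: bytes) -> int:
    """Compute the 16-bit ones' complement checksum over data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carry bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Example 20-byte IPv4 header with its checksum field (bytes 10-11) zeroed.
header = bytearray.fromhex("4500 0073 0000 4000 4011 0000 c0a8 0001 c0a8 00c7")
checksum = internet_checksum(bytes(header))
print(hex(checksum))  # 0xb861

# The sender stores the checksum in the header; a receiver that recomputes
# over the full header gets 0 when nothing was corrupted.
header[10:12] = checksum.to_bytes(2, "big")
verify_ok = internet_checksum(bytes(header))
print(verify_ok)  # 0

# Simulate corruption in transit by changing the TTL byte from 0x40 to 0x3f.
header[8] = 0x3F
verify_corrupt = internet_checksum(bytes(header))
print(verify_corrupt != 0)  # True
```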