Stop reporting numbers. Start proving where the bottleneck lives — NIC, MTU, CPU/cipher, or firewall.
Almost everyone can run iperf3 and read a number. Almost no one can answer the next question: is that number good, and if not, what is the ceiling? This course teaches network performance testing as a diagnostic discipline, not a single command — the gap between a link's theoretical capacity and the goodput an application actually gets, and how to prove where the bottleneck lives.
The whole course is built around one real investigation: two machines on a 10 GbE LAN that iperf3 clocked at ~9.9 Gbit/s, yet scp/rsync between them topped out at ~620 MB/s. Every module traces the reasoning that closes that gap. The link is fine and the cipher is the bottleneck: switching SSH from software chacha20-poly1305 to AES-NI aes256-gcm reaches ~1.1 GB/s (~1.8x), while raw nc hits ~1.2 GB/s, close to wire speed. Along the way, a ufw default-DROP policy makes a test silently hang instead of failing fast.
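The arithmetic behind these figures can be sketched in a few lines of Rust. This is an illustrative helper, not code from the course's netprobe crate: the function names are hypothetical, and it assumes decimal units (1 MB = 10^6 bytes, 1 Gbit = 10^9 bits).

```rust
/// Convert a rate in megabytes per second (decimal, 10^6 bytes)
/// to gigabits per second (10^9 bits).
fn mbytes_per_sec_to_gbit(mb_per_s: f64) -> f64 {
    mb_per_s * 8.0 / 1000.0
}

/// Convert a rate in gigabits per second to megabytes per second (decimal).
fn gbit_to_mbytes_per_sec(gbit_per_s: f64) -> f64 {
    gbit_per_s * 1000.0 / 8.0
}

/// Speedup factor of `after` relative to `before` (same unit for both).
fn speedup(before: f64, after: f64) -> f64 {
    after / before
}
```

Under those assumptions, `mbytes_per_sec_to_gbit(620.0)` gives 4.96 Gbit/s, about half of the ~9.9 Gbit/s that iperf3 measured, and `speedup(620.0, 1100.0)` gives roughly 1.77, which rounds to the ~1.8x quoted above.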
After this course you can:
- Distinguish bits vs bytes and line rate vs goodput, and convert fluently (Gbit/s to MB/s and back).
- Measure latency and loss with `ping` and read the RTT distribution (min/avg/max/mdev).
- Measure throughput with `iperf3` (single stream, reverse, and parallel streams) and interpret retransmits.
- Read the hardware ceiling: NIC link speed (`/sys/class/net`, `ethtool`), MTU, jumbo frames, and same-switch topology.
- Identify the binding constraint when goodput trails capacity: NIC, MTU, CPU/cipher, or firewall.
- Quantify encryption overhead and tune it (AES-NI `aes-gcm` vs software `chacha20`; pinning a cipher in `~/.ssh/config`).
- Choose the right transport: encrypted (`scp`/`rsync`) vs raw (`nc` + `tar`) on a trusted LAN, and justify the trade-off.
- Diagnose a hung test (firewall default-DROP, wrong port, MTU blackhole) systematically rather than by guessing.
- (Build track) Implement a small TCP throughput + latency probe in Rust and validate it against `iperf3`.
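Diagnosing a hung test starts with telling a hang apart from a refusal. The following is a minimal sketch using only the Rust standard library; `ConnectOutcome` and `classify_connect` are hypothetical names chosen for illustration, and the mapping from outcome to cause is a heuristic, not proof.

```rust
use std::io::ErrorKind;
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Outcome of a single TCP connect attempt, as seen by the client.
#[derive(Debug, PartialEq)]
enum ConnectOutcome {
    /// Handshake completed: something is listening and the path is open.
    Open,
    /// The peer answered with a reset: the host is reachable but nothing
    /// listens on the port (or a firewall REJECT rule answered).
    Refused,
    /// No answer before the deadline: consistent with a DROP rule,
    /// a dead host, or a broken path.
    TimedOut,
    /// Any other error, reported by kind.
    Other(ErrorKind),
}

/// Attempt one connect with a hard deadline and classify the result.
/// `timeout` must be non-zero, as required by `connect_timeout`.
fn classify_connect(addr: SocketAddr, timeout: Duration) -> ConnectOutcome {
    match TcpStream::connect_timeout(&addr, timeout) {
        Ok(_) => ConnectOutcome::Open,
        Err(e) => match e.kind() {
            ErrorKind::ConnectionRefused => ConnectOutcome::Refused,
            ErrorKind::TimedOut => ConnectOutcome::TimedOut,
            kind => ConnectOutcome::Other(kind),
        },
    }
}
```

A closed port on an unfiltered host typically yields `Refused` almost immediately, whereas a port behind a default-DROP policy would be expected to yield `TimedOut` only after the full deadline. The explicit deadline is the design point: it turns a silent hang into a fast, classifiable failure.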
The course runs four weeks and reproduces a real bottleneck investigation end to end. Each lesson ships a concept video, a copy-paste lab for two hosts, and a one-page "what the numbers mean" reference. Every command in the labs was actually run, and the numbers shown are the ones observed. The full outline is in outline.md; runnable commands are in labs.md.
| Module | Lesson | Core Skill | Grounded In |
|---|---|---|---|
| 1. Measuring the Network | 1.1 Bits, Bytes, and the Ceiling | Gbit/s vs MB/s; line rate vs goodput; the NIC to MTU to CPU/cipher to firewall triage framework | 620 MB/s ≈ 5.0 Gbit/s — "why only half the link?" |
| 1. Measuring the Network | 1.2 Latency and Loss with ping | Reading rtt min/avg/max/mdev; sub-millisecond same-switch baselines; when latency, not bandwidth, is the problem | 0.3 ms avg, 0% loss on a LAN |
| 1. Measuring the Network | 1.3 Throughput with iperf3 | Client/server model; single stream, -R reverse, -P parallel; interpreting retransmits and sender vs receiver | ~9.9 Gbit/s; one stream already saturates |
| 2. Finding the Bottleneck | 2.1 Reading the Hardware Ceiling | NIC speed via /sys/class/net/*/speed and ethtool; MTU and jumbo frames (9000); computing theoretical max goodput | 10 GbE + MTU 9000 → ~9.9 Gbit/s ceiling |
| 2. Finding the Bottleneck | 2.2 The CPU and Crypto Ceiling | Single-core cipher cost; chacha20-poly1305 vs AES-NI aes128-gcm vs aes256-gcm; isolating crypto with dd-over-ssh | 620 MB/s → 966 MB/s → 1.1 GB/s |
| 2. Finding the Bottleneck | 2.3 Encrypted vs Raw Transport | nc + tar at wire speed on a trusted LAN; the throughput vs security trade-off; rsync daemon mode | Raw nc ≈ 1.2 GB/s (~9.6 Gbit/s) |
| 3. Diagnosis and Building Tools | 3.1 When Tests Lie or Hang | Firewall default-DROP; hang vs refuse; listener checks; MTU blackholes; an "is it me, the host, or the path?" checklist | ufw default-DROP; only 22 + 8081-8083 open |
| 3. Diagnosis and Building Tools | 3.2 Build a Probe in Rust | A minimal TCP throughput sender/receiver plus a TCP-connect RTT prober, built contract-first | Reproducing an iperf3 figure in a few hundred lines |
| 3. Diagnosis and Building Tools | 3.3 Validating Against iperf3 | Matching netprobe to iperf3 within a stated tolerance; reading the gap when it does not | netprobe vs iperf3 within ±10% |
| 4. Capstone Project | 4.1 Diagnose and Tune a Real Link | Baseline, ceiling math, a proven bottleneck, one tuning change, and before/after re-measurement | The full discipline, end to end |
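The "theoretical max goodput" step in lesson 2.1 comes down to per-frame overhead arithmetic. The sketch below is one way to write it, under stated assumptions: untagged Ethernet (no VLAN), IPv4 and TCP headers without options other than an optional 12-byte TCP timestamp, and full-size segments. The function and constant names are hypothetical.

```rust
/// Bytes per frame that occupy the wire but lie outside the IP packet:
/// preamble + start-of-frame delimiter (8), Ethernet header (14),
/// frame check sequence (4), inter-frame gap (12).
const ETHERNET_OVERHEAD_BYTES: f64 = 38.0;
/// IPv4 header without options.
const IPV4_HEADER_BYTES: f64 = 20.0;
/// TCP header without options.
const TCP_HEADER_BYTES: f64 = 20.0;
/// TCP timestamp option (10 bytes, padded to 12).
const TCP_TIMESTAMP_BYTES: f64 = 12.0;

/// Theoretical maximum TCP goodput in Gbit/s for a line rate and MTU,
/// assuming every segment is full-size.
fn max_goodput_gbit(line_rate_gbit: f64, mtu: f64, tcp_timestamps: bool) -> f64 {
    let options = if tcp_timestamps { TCP_TIMESTAMP_BYTES } else { 0.0 };
    // Application payload carried by one full-size packet.
    let payload = mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES - options;
    // Total wire time consumed by that packet, expressed in bytes.
    let on_wire = mtu + ETHERNET_OVERHEAD_BYTES;
    line_rate_gbit * payload / on_wire
}
```

With these assumptions, `max_goodput_gbit(10.0, 9000.0, true)` comes to about 9.90 Gbit/s, in line with the ~9.9 Gbit/s ceiling in the table, while the same link at MTU 1500 comes to about 9.41 Gbit/s.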
The build track ships netprobe/ — a minimal, zero-dependency Rust crate that measures TCP throughput and TCP-connect RTT latency (no raw sockets or root required, the same technique tcping-style tools use).
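The TCP-connect technique can be illustrated with the standard library alone. This is a minimal sketch, not the crate's implementation: `connect_rtt` and `sample_rtts` are hypothetical names, and the timing includes local socket setup in addition to the handshake itself.

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::time::{Duration, Instant};

/// Time one TCP handshake to `addr`. No payload is sent, so no raw
/// sockets or elevated privileges are needed.
fn connect_rtt(addr: SocketAddr, timeout: Duration) -> io::Result<Duration> {
    let start = Instant::now();
    let stream = TcpStream::connect_timeout(&addr, timeout)?;
    let elapsed = start.elapsed();
    drop(stream); // close immediately; only the handshake time matters
    Ok(elapsed)
}

/// Take `count` samples. A failed or timed-out connect is recorded as
/// `None`, which a caller can treat as a lost probe.
fn sample_rtts(addr: SocketAddr, count: u32, timeout: Duration) -> Vec<Option<Duration>> {
    (0..count).map(|_| connect_rtt(addr, timeout).ok()).collect()
}
```

Because `connect` returns once the SYN-ACK arrives, the elapsed time approximates one round trip to a listening port. The target must accept TCP connections, so a probe like this measures a specific service port, not the host in general.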
Like every PAIML Rust artifact, netprobe is built contract-first under provable contracts (pv). Its kernel contract — contracts/netprobe-v1.yaml — specifies the throughput, latency, and bits/bytes invariants before any code, and contracts/netprobe-binding.yaml maps each equation to the function that satisfies it. The three kernel functions are:
- `bytes_per_sec_to_gbit(bps: f64) -> f64`: the Module 1 bits-vs-bytes conversion, total over every finite non-negative input.
- `measure_throughput(stream: &mut TcpStream, bytes: u64) -> Result<Throughput, ProbeError>`: a short write is an `Err`, never a truncated success.
- `measure_latency(addr: SocketAddr, count: u32) -> Result<LatencyStats, ProbeError>`: guarantees `min <= avg <= max` and `loss_pct` in `[0, 100]`, with no division by zero on total loss.
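To make the latency invariants concrete, here is one way a summary step could uphold them. It is a sketch under assumptions, not the crate's code: `RttSummary`, `SummaryError`, and `summarize` are hypothetical names, and treating total loss as an error is only one possible design.

```rust
/// Hypothetical summary type; the field names are assumptions.
#[derive(Debug, PartialEq)]
struct RttSummary {
    min_ms: f64,
    avg_ms: f64,
    max_ms: f64,
    loss_pct: f64,
}

/// Hypothetical error type for the two cases with no usable samples.
#[derive(Debug, PartialEq)]
enum SummaryError {
    NoProbes,
    AllLost,
}

/// Summarize probe results, where `None` marks a lost probe.
/// Non-finite samples are treated as lost.
fn summarize(samples_ms: &[Option<f64>]) -> Result<RttSummary, SummaryError> {
    if samples_ms.is_empty() {
        return Err(SummaryError::NoProbes);
    }
    let received: Vec<f64> = samples_ms
        .iter()
        .flatten()
        .copied()
        .filter(|v| v.is_finite())
        .collect();
    if received.is_empty() {
        // Total loss: return early so the mean is never computed as 0/0.
        return Err(SummaryError::AllLost);
    }
    let min_ms = received.iter().copied().fold(f64::INFINITY, f64::min);
    let max_ms = received.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let mean = received.iter().sum::<f64>() / received.len() as f64;
    // Floating-point rounding can push the mean just outside [min, max];
    // clamping keeps min <= avg <= max true by construction.
    let avg_ms = mean.clamp(min_ms, max_ms);
    let lost = samples_ms.len() - received.len();
    // lost <= total and total > 0, so the result stays within [0, 100].
    let loss_pct = 100.0 * lost as f64 / samples_ms.len() as f64;
    Ok(RttSummary { min_ms, avg_ms, max_ms, loss_pct })
}
```

The clamp matters in practice: averaging three samples of 0.1 ms can round to a value slightly above 0.1, which would break `avg <= max` without it.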
The contract's falsification tests (FALSIFY-NP-001/002/003) ship as cargo test cases, so the invariants are checked on every build. The shell labs in Modules 1-3 are not contract-governed — pv validates Rust functions, not shell one-liners.
Run it:
make rust # fmt + clippy + cargo test (includes the falsify_np_* tests)
make lint-contracts # pv validate over contracts/*-v*.yaml
make check # full quality gate: markdown + contracts + rust + structure

To start your own probe instead of reading the reference, scaffold trait and test stubs straight from the contract:

pv scaffold contracts/netprobe-v1.yaml

network-performance-testing/
├── outline.md # Coursera-style course outline (4 weeks, 10 lessons)
├── labs.md # Runnable two-host labs — every command actually run
├── capstone.md # Capstone brief: diagnose and tune a real link
├── netprobe/ # The contract-governed Rust crate (TCP throughput + RTT)
│ ├── src/lib.rs # Kernel functions + contract falsification tests
│ └── Cargo.toml
├── contracts/
│ ├── netprobe-v1.yaml # Provable contract (pv-valid): invariants + obligations
│ └── netprobe-binding.yaml # Maps each equation → implementing function
├── coursera-assets/ # Key terms, reflections, role-plays, course page, banners
├── slides/ # Title-slide SVG animations (<lesson>-title.svg + Frame-0..2)
├── assets/hero.svg # Course hero image
├── Makefile # Quality gates (make rust / lint-contracts / check)
├── .github/workflows/ci.yml # CI: the same gates, enforced on every push
└── LICENSE # MIT
Check-in policy: SVG, Markdown, and Lua sources are committed; rendered PNG and MP4 are not — regenerate them with rmedia.
Diagnose and tune a real link. Given two hosts (a home lab or two cloud VMs in the same placement group), produce a one-page performance report that demonstrates the full discipline: baseline latency and throughput, read the NIC and MTU to compute the theoretical ceiling, prove which constraint binds (controlling for disk and the SSH window with dd-over-ssh), apply exactly one tuning change, and re-measure to quantify the gain.
The build-track variant ships netprobe as a portfolio artifact: a real diagnostic tool that reproduces an iperf3 figure within a stated tolerance. Full brief, deliverables, and the Advanced / Proficient / Developing rubric are in capstone.md.
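A "stated tolerance" check can be as small as the sketch below. The function name is hypothetical, and the choice to compare relative to the reference figure (rather than the measured one) is an assumption.

```rust
/// True when `measured` lies within `tolerance_pct` percent of `reference`.
/// Returns false for non-finite inputs or a non-positive reference,
/// so a bad reading can never pass by accident.
fn within_tolerance(measured: f64, reference: f64, tolerance_pct: f64) -> bool {
    if !measured.is_finite() || !reference.is_finite() || reference <= 0.0 {
        return false;
    }
    let deviation_pct = (measured - reference).abs() / reference * 100.0;
    deviation_pct <= tolerance_pct
}
```

For example, a hypothetical probe reading of 9.2 Gbit/s against a reference of 9.9 Gbit/s is about 7% low and passes a ±10% check, while 8.5 Gbit/s is about 14% low and fails.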
- Noah Gift — Founder, Pragmatic AI Labs · Duke University faculty
Course content © Pragmatic AI Labs. Code examples (including the netprobe crate) are released under the MIT License.