diff --git a/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/_index.md b/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/_index.md index 38398a2904..933c5aea16 100644 --- a/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/_index.md +++ b/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/_index.md @@ -1,20 +1,17 @@ --- -title: Get started with network microbenchmarking and tuning with iperf3 - -draft: true -cascade: - draft: true +title: Microbenchmark and tune network performance with iPerf3 and Linux traffic control minutes_to_complete: 30 -who_is_this_for: This is an introductory topic for performance engineers, Linux system administrators, or application developers who want to microbenchmark, simulate, or tune the networking performance of distributed systems. +who_is_this_for: This is an introductory topic for performance engineers, Linux system administrators, and application developers who want to microbenchmark, simulate, or tune the networking performance of distributed systems. learning_objectives: - - Understand how to use iperf3 and tc for network performance testing and traffic control to microbenchmark different network conditions. - - Identify and apply basic runtime parameters to tune application performance. + - Run accurate network microbenchmark tests using iPerf3. + - Simulate real-world network conditions using Linux Traffic Control (tc). + - Tune basic Linux kernel parameters to improve network performance. prerequisites: - - Foundational understanding of networking principles such as TCP/IP and UDP. + - Basic understanding of networking principles such as Transmission Control Protocol/Internet Protocol (TCP/IP) and User Datagram Protocol (UDP). - Access to two [Arm-based cloud instances](https://learn.arm.com/learning-paths/servers-and-cloud-computing/csp/). author: Kieran Hejmadi @@ -25,13 +22,13 @@ subjects: Performance and Architecture armips: - Neoverse tools_software_languages: - - iperf3 + - iPerf3 operatingsystems: - Linux further_reading: - resource: - title: iperf3 user manual + title: iPerf3 user manual link: https://iperf.fr/iperf-doc.php type: documentation diff --git a/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/basic-microbenchmarking.md b/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/basic-microbenchmarking.md index 6cf4f0aeef..56ecf0e796 100644 --- a/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/basic-microbenchmarking.md +++ b/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/basic-microbenchmarking.md @@ -6,17 +6,19 @@ weight: 3 layout: learningpathall --- -## Microbenchmark the TCP connection +With your systems configured and reachable, you can now use iPerf3 to microbenchmark TCP and UDP performance between your Arm-based systems. -You can microbenchmark the bandwidth between the client and server. +## Microbenchmark the TCP connection -First, start `iperf` in server mode on the server system with the following command: +Start by running `iperf` in server mode on the `SERVER` system: ```bash iperf3 -s ``` -You see the output, indicating the server is ready: +This starts the server on the default TCP port 5201. 
+
+You should see:
 
 ```output
 -----------------------------------------------------------
@@ -25,20 +27,23 @@
 Server listening on 5201 (test #1)
 
 ```
 
-The default server port is 5201. Use the `-p` flag to specify another port if it is in use.
+The default server port is 5201. If it is already in use, use the `-p` flag to specify another.
 
 {{% notice Tip %}}
-If you already have an `iperf3` server running, you can manually kill the process with the following command.
+If you already have an `iperf3` server running, terminate it with:
 
 ```bash
 sudo kill $(pgrep iperf3)
 ```
 {{% /notice %}}
 
-Next, on the client node, run the following command to run a simple 10-second microbenchmark using the TCP protocol.
+## Run a TCP test from the client
+
+On the client node, run the following command to run a simple 10-second microbenchmark using the TCP protocol:
 
 ```bash
-iperf3 -c SERVER -V
+iperf3 -c SERVER -V
 ```
 
+Here, `SERVER` is the hostname you added to `/etc/hosts` during setup; you can also use the server's private IP address directly. The `-V` flag enables verbose output.
+
 The output is similar to:
 
@@ -68,28 +73,47 @@ rcv_tcp_congestion cubic
 
 iperf Done.
 ```
 
+## TCP result highlights
+
+- The `Cwnd` column shows the congestion window size, which corresponds to the amount of data that can be in flight before the sender must wait for an acknowledgment (`ACK`) from the server. This value grows as the connection stabilizes and adapts to link quality.
+
+- The `CPU Utilization` row shows both the usage on the sender and receiver. If you are migrating your workload to a different platform, such as from x86 to Arm, this is a useful metric.
+
+- The `snd_tcp_congestion cubic` and `rcv_tcp_congestion cubic` variables show the congestion control algorithm used.
+
+- `Bitrate` shows the throughput achieved. In this example, the `t4g.xlarge` AWS instance saturates its available 5 Gbps of network bandwidth.
 
-- The`Cwnd` column prints the control window size and corresponds to the allowed number of TCP transactions in flight before receiving an acknowledgment `ACK` from the server. This adjusts dynamically to not overwhelm the receiver and adjust for variable link connection strengths.
+![instance-network-size#center](./instance-network-size.png "Instance network size")
 
-- The `CPU Utilization` row shows both the usage on the sender and receiver. If you are migrating your workload to a different platform, such as from x86 to Arm, there may be variations.
+## UDP result highlights
 
-- The `snd_tcp_congestion cubic` abd `rcv_tcp_congestion cubic` variables show the congestion control algorithm used.
+You can also microbenchmark the `UDP` protocol using the `-u` flag with iPerf3. Unlike TCP, UDP does not guarantee packet delivery, which means some packets might be lost in transit.
 
-- This `bitrate` shows the throughput achieved under this microbenchmark. As you can see, the 5 Gbps bandwidth available to the `t4g.xlarge` AWS instance is saturated.
+To evaluate UDP performance, focus on the server-side statistics, particularly:
 
-![instance-network-size](./instance-network-size.png)
+* Packet loss percentage
 
-### Microbenchmark UDP connection
+* Jitter (variation in packet arrival time)
 
-You can also microbenchmark the `UDP` protocol with the `-u` flag. As a reminder, UDP does not guarantee packet delivery with some packets being lost. As such you need to observe the statistics on the server side to see the percent of packets lost and the variation in packet arrival time (jitter). The UDP protocol is widely used in applications that need timely packet delivery, such as online gaming and video calls. 
+These metrics help assess reliability and responsiveness under real-time conditions.
 
-Run the following command from the client to send 2 parallel UDP streams with the `-P 2` option.
+UDP is commonly used in latency-sensitive applications such as:
+
+* Online gaming
+
+* Voice over IP (VoIP)
+
+* Video conferencing and streaming
+
+Because it avoids the overhead of retransmission and ordering, UDP is ideal for scenarios where timely delivery matters more than perfect accuracy.
+
+Run the following command from the client to send two parallel UDP streams with the `-P 2` option:
 
 ```bash
-iperf3 -c SERVER -V -u -P 2
+iperf3 -c SERVER -V -u -P 2
 ```
 
-Looking at the server output you observe 0% of packets where lost for the short test.
+Look at the server output and you can see that no packets (0%) were lost during the short test:
 
 ```output
 [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
@@ -98,8 +122,10 @@ Looking at the server output you observe 0% of packets where lost for the short
 [SUM]  0.00-10.00  sec  2.51 MBytes  2.10 Mbits/sec  0.015 ms  0/294 (0%)  receiver
 ```
 
-Additionally on the client side, the 2 streams saturated 2 of the 4 cores in the system.
+Additionally, on the client side, the two streams saturated two of the four cores in the system:
 
 ```output
 CPU Utilization: local/sender 200.3% (200.3%u/0.0%s), remote/receiver 0.2% (0.0%u/0.2%s)
-```
\ No newline at end of file
+```
+
+This shows that driving multiple UDP streams can be CPU-intensive on the sender, even at modest bitrates.
\ No newline at end of file
diff --git a/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/setup.md b/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/setup.md
index 29a34aa5c7..3495acf973 100644
--- a/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/setup.md
+++ b/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/setup.md
@@ -1,26 +1,36 @@
 ---
-title: Prepare for network performance testing
+title: Set up Arm-based Linux systems for network performance testing with iPerf3
 
 weight: 2
 
 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---
-## Configure two Arm-based Linux computers
+## Environment setup and Learning Path focus
 
-To perform network performance testing you need two Linux computers. You can use AWS EC2 instances with Graviton processors or any other Linux virtual machines from another cloud service provider.
+To benchmark bandwidth and latency between Arm-based systems, you'll need to configure two Linux machines running on Arm.
 
-You will also experiment with a local computer and a cloud instance to learn the networking performance differences compared to two cloud instances.
+You can use AWS EC2 instances with Graviton processors, or Linux virtual machines from any other cloud service provider.
 
-The instructions below use EC2 instances from AWS connected in a virtual private cloud (VPC).
+This Learning Path walks you through tests that compare performance between:
 
-To get started, create two Arm-based Linux instances, one system to act as the server and the other to act as the client. The instructions below use two `t4g.xlarge` instances running Ubuntu 24.04 LTS.
+* Two cloud-based instances
+* One local system and one cloud instance
 
-### Install software dependencies
+The setup instructions below use AWS EC2 instances connected within a Virtual Private Cloud (VPC). 
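+
+Once both instances are running (creating them is covered next), you can optionally confirm that each one is Arm-based by checking the machine architecture it reports:
+
+```bash
+uname -m
+```
+
+On a Graviton instance, or any other 64-bit Arm system, this prints `aarch64`.
+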
-Use the commands below to install `iperf3`, a powerful and flexible open-source command-line tool used for network performance measurement and tuning. It allows network administrators and engineers to actively measure the maximum achievable bandwidth on IP networks.
+To get started, create two Arm-based Linux instances, with each instance serving a distinct role:
 
-Run the following on both systems:
+* One acting as a client
+* One acting as a server
+
+The instructions below use two `t4g.xlarge` instances running Ubuntu 24.04 LTS.
+
+## Install software dependencies
+
+Use the commands below to install iPerf3, which is a powerful open-source CLI tool for measuring maximum achievable network bandwidth.
+
+Begin by installing iPerf3 on both the client and server systems:
 
 ```bash
 sudo apt update
@@ -28,45 +38,81 @@ sudo apt install iperf3 -y
 ```
 
 {{% notice Note %}}
-If you are prompted to start `iperf3` as a daemon you can answer no.
+If you're prompted to run `iperf3` as a daemon, answer "no".
 {{% /notice %}}
 
-## Update Security Rules
+## Update security rules
 
-If you are working in a cloud environment like AWS, you need to update the default security rules to enable specific inbound and outbound protocols.
+If you're working in a cloud environment like AWS, you must update the default security rules to enable specific inbound and outbound protocols.
 
-From the AWS console, navigate to the security tab. Edit the inbound rules to enable `ICMP`, `UDP` and `TCP` traffic to enable communication between the client and server systems.
+To do this, follow the instructions below in the AWS console:
 
-![example_traffic](./example_traffic_rules.png)
+* Navigate to the **Security** tab for each instance.
+* Configure the **Inbound rules** to allow the following protocols:
+    * `ICMP` (for ping)
+    * All UDP ports (for UDP tests)
+    * TCP port 5201 (for iPerf3 traffic between the client and server systems)
 
-{{% notice Note %}}
-For additional security set the source and port ranges to the values being used. A good solution is to open TCP port 5201 and all UDP ports and use your security group as the source. This doesn't open any traffic from outside AWS.
+![example_traffic#center](./example_traffic_rules.png "AWS console view")
+
+{{% notice Warning %}}
+For secure internal communication, set the source to your instance’s security group. This avoids exposing traffic to the internet while allowing traffic between your systems.
+
+You can restrict the range further by:
+
+* Opening only TCP port 5201
+
+* Allowing all UDP ports (or a specific range)
 {{% /notice %}}
 
 ## Update the local DNS
 
-To avoid using IP addresses directly, add the IP address of the other system to the `/etc/hosts` file.
+To avoid using IP addresses directly, add the other system's IP address to the `/etc/hosts` file.
 
-The local IP address of the server and client can be found in the AWS dashboard. You can also use commands like `ifconfig`, `hostname -I`, or `ip address` to find your local IP address.
+You can find private IPs in the AWS dashboard, or by running:
+
+```bash
+hostname -I
+ip address
+ifconfig
+```
+
+## On the client
 
-On the client, add the IP address of the server to the `/etc/hosts` file with name `SERVER`.
+Add the server's IP address, and assign it the name `SERVER`:
 
 ```output
 127.0.0.1 localhost
 10.248.213.104 SERVER
 ```
 
-Repeat the same thing on the server and add the IP address of the client to the `/etc/hosts` file with the name `CLIENT`. 
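+
+To confirm that the new entry resolves on the client, you can optionally run:
+
+```bash
+getent hosts SERVER
+```
+
+If the `/etc/hosts` entry is correct, this prints the server's private IP address followed by the name `SERVER`.
+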
+## On the server
+
+Add the client's IP address, and assign it the name `CLIENT`:
+
+```output
+127.0.0.1 localhost
+10.248.213.105 CLIENT
+```
 
-## Confirm server is reachable
+| Instance Name | Role   | Description                  |
+|---------------|--------|------------------------------|
+| SERVER        | Server | Runs `iperf3` in listen mode |
+| CLIENT        | Client | Initiates performance tests  |
 
-Finally, confirm the client can reach the server with the ping command below. As a reference you can also ping the localhost.
+
+## Confirm the server is reachable
+
+Finally, confirm the client can reach the server by using the ping command below. For comparison, you can also ping the localhost:
 
 ```bash
 ping SERVER -c 3 && ping 127.0.0.1 -c 3
 ```
 
-The output below shows that both SERVER and localhost (127.0.0.1) are reachable. Naturally, the local host response time is ~10x faster than the server. Your results will vary depending on geographic location of the systems and other networking factors.
+The output below shows that both SERVER and localhost (127.0.0.1) are reachable.
+
+Localhost response times are typically ~10× faster than remote systems, though actual values vary based on system location and network conditions.
 
 ```output
 PING SERVER (10.248.213.104) 56(84) bytes of data.
@@ -87,4 +133,4 @@ PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
 rtt min/avg/max/mdev = 0.022/0.027/0.032/0.004 ms
 ```
 
-Continue to the next section to learn how to measure the network bandwidth between the systems.
\ No newline at end of file
+Now that your systems are configured, the next step is to measure the available network bandwidth between them.
\ No newline at end of file
diff --git a/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/simulating-network-conditions.md b/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/simulating-network-conditions.md
index 03e747efcf..590c7997be 100644
--- a/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/simulating-network-conditions.md
+++ b/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/simulating-network-conditions.md
@@ -1,22 +1,24 @@
 ---
-title: Simulating different network conditions
+title: Simulate different network conditions
 
 weight: 4
 
 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---
-## Add a delay to the TCP connection
+You can simulate latency and packet loss to test how your application performs under adverse network conditions. This is especially useful when evaluating the impact of congestion, jitter, or unreliable connections in distributed systems.
 
-The Linux `tc` utility can be used to manipulate traffic control settings.
+## Add delay to the TCP connection
 
-First, on the client system, find the name of network interface with the following command:
+The Linux `tc` (traffic control) utility lets you manipulate network interface behavior such as delay, loss, and packet reordering.
+
+First, on the client system, identify the name of your network interface:
 
 ```bash
 ip addr show
 ```
 
-The output below shows the `ens5` network interface device (NIC) is the device we want to manipulate.
+The output below shows that `ens5` is the network interface (NIC) that you want to manipulate. 
 ```output
 1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
@@ -34,7 +36,7 @@
 ```
 
-Run the following command on the client system to add an emulated delay of 10ms on `ens5`.
+Run the following command on the client system to add an emulated delay of 10ms on `ens5`:
 
 ```bash
 sudo tc qdisc add dev ens5 root netem delay 10ms
@@ -43,13 +45,9 @@ sudo tc qdisc add dev ens5 root netem delay 10ms
 Rerun the basic TCP test as before on the client:
 
 ```bash
-iperf3 -c SERVER -V
+iperf3 -c SERVER -V
 ```
 
-Observe that the `Cwnd` size has grew larger to compensate for the longer response time.
-
-Additionally, the bitrate has dropped from ~4.9 to ~2.3 `Gbit/sec`.
-
 ```output
 [ 5] local 10.248.213.97 port 43170 connected to 10.248.213.104 port 5201
 Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test, tos 0
@@ -75,22 +73,29 @@ rcv_tcp_congestion cubic
 
 iperf Done.
 ```
 
+## Observations
+
+* The `Cwnd` size has grown larger to compensate for the longer response time.
+
+* The bitrate has dropped from ~4.9 to ~2.3 `Gbit/sec`, demonstrating how even modest latency impacts throughput.
 
-### Simulate packet loss
+## Simulate packet loss
 
-To test the resiliency of a distributed application you can add a simulated packet loss of 1%. As opposed to a 10ms delay, this will result in no acknowledgment being received for 1% of packets. Given TCP is a lossless protocol a retry must be sent.
+To test the resiliency of a distributed application, you can add a simulated packet loss of 1%. As opposed to a 10ms delay, this will result in no acknowledgment being received for 1% of packets.
+
+Because TCP is a lossless protocol, lost packets must be retransmitted.
 
-Run these commands on the client system:
+Run these commands on the client system. The first removes the delay configuration, and the second command introduces a 1% packet loss:
 
 ```bash
 sudo tc qdisc del dev ens5 root
 sudo tc qdisc add dev ens5 root netem loss 1%
 ```
 
-Rerunning the basic TCP test you see an increased number of retries (`Retr`) and a corresponding drop in bitrate.
+Now rerun the basic TCP test, and you will see an increased number of retries (`Retr`) and a corresponding drop in bitrate:
 
 ```bash
-iperf3 -c SERVER -V
+iperf3 -c SERVER -V
 ```
 
 The output is now:
@@ -102,4 +107,12 @@ Test Complete. Summary Results:
 [  5]   0.00-10.00  sec  4.40 GBytes  3.78 Gbits/sec                  receiver
 ```
 
-Refer to the `tc` [user documentation](https://man7.org/linux/man-pages/man8/tc.8.html) for the different ways to simulate perturbation and check resiliency.
+## Explore further with tc
+
+The `tc` tool can also simulate:
+
+* Variable latency and jitter
+* Packet duplication or reordering
+* Bandwidth throttling
+
+For advanced options, refer to the [tc man page](https://man7.org/linux/man-pages/man8/tc.8.html).
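+
+As a starting point, here is a short sketch that combines several impairments at once; the interface name `ens5` and the exact values are examples only, so adjust them for your own environment:
+
+```bash
+# Emulate 20ms delay with 5ms jitter and 0.5% random packet loss on ens5
+sudo tc qdisc add dev ens5 root netem delay 20ms 5ms loss 0.5%
+
+# Swap the emulation for a ~100 Mbit/s bandwidth cap instead
+sudo tc qdisc replace dev ens5 root netem rate 100mbit
+
+# Remove all emulation when you are finished testing
+sudo tc qdisc del dev ens5 root
+```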
diff --git a/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/tuning.md b/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/tuning.md
index dd6ddb40ec..c58b6aec3a 100644
--- a/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/tuning.md
+++ b/content/learning-paths/servers-and-cloud-computing/microbenchmark-network-iperf3/tuning.md
@@ -1,28 +1,33 @@
 ---
-title: Tuning kernel parameters
+title: Tune kernel parameters
 
 weight: 5
 
 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---
-### Connect from a local machine
+You can further optimize network performance by adjusting Linux kernel parameters and testing across different environments, including local-to-cloud scenarios.
+
+## Connect from a local machine
 
 You can look at ways to mitigate performance degradation due to events such as packet loss.
 
-In this example, you will connect to the server node a local machine to demonstrate a longer response time. Check the `iperf3` [installation guide](https://iperf.fr/iperf-download.php) to install `iperf3` on other operating systems.
+In this example, you will connect to the server node from a local machine to demonstrate a longer response time. Check the iPerf3 [installation guide](https://iperf.fr/iperf-download.php) to install iPerf3 on other operating systems.
+
+Before starting the test:
 
-Make sure to set the server security group to accept the TCP connection from your local computer IP address. You will also need to use the public IP for the cloud instance.
+- Update your cloud server’s **security group** to allow incoming TCP connections from your local machine’s public IP.
+- Use the **public IP address** of the cloud instance when connecting.
 
-Running `iperf3` on the local machine and connecting to the cloud server shows a longer round trip time, in this example more than 40ms.
+Running iPerf3 on the local machine and connecting to the cloud server shows a longer round trip time, in this example more than 40ms.
 
-On your local computer run:
+Run this command on your local computer:
 
 ```bash
-iperf3 -c -V
+iperf3 -c -V
 ```
 
-Running a standard TCP client connection with `iperf3` shows an average bitrate of 157 Mbps compared to over 2 Gbps when the client and server are both in AWS.
+Compared to over 2 Gbit/sec within AWS, this test shows a reduced bitrate (~157 Mbit/sec) due to longer round-trip times (for example, >40ms).
 
 ```output
 Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test, tos 0
@@ -33,9 +38,9 @@ Test Complete. Summary Results:
 [  8]   0.00-10.03  sec   187 MBytes   156 Mbits/sec                  receiver
 ```
 
-### Modify kernel parameters
+## Modify kernel parameters on the server
 
-On the server, your can configure Linux kernel runtime parameters with the `sysctl` command.
+On the server, you can configure Linux kernel runtime parameters with the `sysctl` command.
 
 There are a plethora of values to tune that relate to performance and security. The following command can be used to list all available options. The [Linux kernel documentation](https://docs.kernel.org/networking/ip-sysctl.html#ip-sysctl) provides a more detailed description of each parameter.
 
 ```bash
 sysctl -a | grep tcp
 ```
 
 {{% notice Note %}}
-Depending on your operating system, some parameters may not be available. For example on AWS Ubuntu 22.04 LTS only the `cubic` and `reno` congestion control algorithms are available.
+Depending on your operating system, some parameters might not be available. For example, on AWS Ubuntu 22.04 LTS, only the `cubic` and `reno` congestion control algorithms are supported:
 ```bash
 net.ipv4.tcp_available_congestion_control = reno cubic
 ```
 {{% /notice %}}
 
-You can increase the read and write max buffer sizes of the kernel on the server to enable more data to be held. This tradeoff results in increased memory utilization.
+## Increase TCP buffer sizes
+
+You can increase the kernel's read and write buffer sizes on the server to improve throughput on high-latency connections. This consumes more system memory but allows more in-flight data.
 
 To try it, run the following commands on the server:
 
@@ -59,19 +66,19 @@ sudo sysctl net.core.rmem_max=134217728 # default = 212992
 sudo sysctl net.core.wmem_max=134217728 # default = 212992
 ```
 
-Restart the `iperf3` server.
+Then, restart the iPerf3 server:
 
 ```bash
 iperf3 -s
 ```
 
-Run `iperf3` again on the local machine.
+Now run iPerf3 again on your local machine:
 
 ```bash
-iperf3 -c -V
+iperf3 -c -V
 ```
 
-You see a significantly improved bitrate with no modification on the client side.
+Without changing anything on the client, the throughput improved by over 60%.
 
 ```output
 Test Complete. Summary Results:
@@ -81,4 +88,10 @@ Test Complete. Summary Results:
 ```
 
-You now have an introduction to networking microbenchmarking and performance tuning.
\ No newline at end of file
+You’ve now completed a guided introduction to:
+
+* Network performance microbenchmarking
+* Simulating real-world network conditions
+* Tuning kernel parameters for high-latency links
+
+You can now explore this area further by testing other parameters, tuning for specific congestion control algorithms, or integrating these benchmarks into CI pipelines for continuous performance evaluation.
\ No newline at end of file
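+
+As one further example, here is a minimal sketch of how you could make the buffer-size change persistent across reboots; the drop-in file name is only an illustration:
+
+```bash
+# Persist the larger socket buffer limits in a sysctl drop-in file
+echo 'net.core.rmem_max=134217728' | sudo tee /etc/sysctl.d/99-network-tuning.conf
+echo 'net.core.wmem_max=134217728' | sudo tee -a /etc/sysctl.d/99-network-tuning.conf
+
+# Reload all sysctl configuration files and apply the settings
+sudo sysctl --system
+```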