Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: ability to print all ucx_perftest inputs and outputs in a structured, machine-readable format (like JSON) #9587

Open
ikatz-nvidia opened this issue Jan 10, 2024 · 8 comments

Comments

@ikatz-nvidia
Copy link

ikatz-nvidia commented Jan 10, 2024

TL;DR

Please include functionality in ucx_perftest that prints the value of all configuration parameters that were used to perform a test alongside the results of the test itself. Without this, it's not straightforward to compare test results taken on 2 different systems because any number of other parameters might be different; without reporting them as part of test output, it is difficult to be 100% sure what the test is actually doing.

I would prefer that this output be provided in a machine-readable format, for ease of ingesting into other systems.

Background

ucx_perftest seems to determine its test parameters from at least these sources:

  • environment variables, all of them optional
  • command-line arguments, many of them optional
  • defaults for all those variables and arguments
  • dynamically-chosen options at runtime

This can cause significant issues when trying to collect data in bulk, or longitudinally. Here are some examples:

  1. I run a test without specifying a CPU (the -c 0 flag). Then I run the test again. The results are different. Is it because a different CPU was used? This is not possible to tell, because the exact value used is not included in the program output.
  2. I run a test without specifying UCX_RNDV_THRESH. Then I decide I want to run multiple tests specifying a different value of UCX_RNDV_THRESH for each; but which of these new results is comparable to the original result? This is not possible to tell, because the exact value used is not included in the program output.
  3. I run a test without specifying UCX_NET_DEVICES. Then I change my network settings, and run another test without specifying UCX_NET_DEVICES. The results are different. Is it because different interfaces were used, or something else entirely? This is not possible to tell...
  4. A test negotiates protocols and options for those protocols. Which options did it choose? This information can be printed to the screen, but not in a way that is practical to parse by the test harness. I

Desired Behavior

Whether by outputting to STDOUT or to a specified filename, I would like the output to be in JSON format, containing:

  • all configuration options for the test, both explicit and implicit
  • all dynamically-determined parameters
  • all results of the test, in native data types (integer, float, string, boolean, etc)
@brminich
Copy link
Contributor

  • currently you can set UCX_LOG_LEVEL=info, which would output all env variables being used at this run
  • command line parameters can be captured from the command itself. Do you propose to additionaly print what command line params were used? I could not recall an existing tool which is doing that

@ikatz-nvidia
Copy link
Author

I see that by setting UCX_LOG_LEVEL=info in one of my test runs, I receive the following line in the output:

my-host:~$ UCX_LOG_LEVEL=info  ucx_perftest -c 0 -p 11005
Waiting for connection...
Accepted connection from 11.22.33.44:36064
+----------------------------------------------------------------------------------------------------------+
| API:          protocol layer                                                                             |
| Test:         am bandwidth / message rate                                                                |
| Data layout:  (automatic)                                                                                |
| Send memory:  host                                                                                       |
| Recv memory:  host                                                                                       |
| Message size: 7990                                                                                       |
| Window size:  32                                                                                         |
| AM header size: 256                                                                                      |
+----------------------------------------------------------------------------------------------------------+
[1704914499.020174] [my-host:509265:0]     ucp_context.c:2137 UCX  INFO  Version 1.16.0 (loaded from /lib/libucp.so.0)
[1704914499.150230] [my-host:509265:0]          parser.c:2042 UCX  INFO  UCX_* env variable: UCX_LOG_LEVEL=info
[1704914499.161951] [my-host:509265:0]      ucp_worker.c:1887 UCX  INFO    perftest inter-node cfg#0 am(rc_mlx5/mlx5_1:1)  ka(ud_mlx5/mlx5_1:1)

Notably, INFO UCX_* env variable: UCX_LOG_LEVEL=info.

The problem is that many parameters used to perform the test (such as those I named above, UCX_NET_DEVICES and UCX_RNDV_THRESH) are not printed. While UCX_RNDV_THRESH has a default value that I might choose to set, UCX_NET_DEVICES has no default value because it is determined at runtime.

TL;DR

I'm not looking for an echo of the input arguments and parameters, I'm looking for a dump of all relevant internal variables.

@yosefe
Copy link
Contributor

yosefe commented Jan 11, 2024

I'm not looking for an echo of the input arguments and parameters, I'm looking for a dump of all relevant internal variables.

Can you try UCX_PROTO_INFO=y ?

@ikatz-nvidia
Copy link
Author

ikatz-nvidia commented Jan 11, 2024

I have tried that, full output is below.

Here again, there is no mention of the network interfaces used (eth0, eth1, etc) nor the "rendezvous threshold". I keep using those 2 examples, because while I don't know what the full list of relevant variables looks like, I know that it must include at least those.

my-host$ UCX_LOG_LEVEL=info UCX_PROTO_INFO=y  ucx_perftest -t ucp_am_bw -m host,host -c 0 -s 7990 -w 100 -n 10000 -H 256 -p 11005 -e 11.22.33.44
+--------------+--------------+------------------------------+---------------------+-----------------------+
|              |              |       overhead (usec)        |   bandwidth (MB/s)  |  message rate (msg/s) |
+--------------+--------------+----------+---------+---------+----------+----------+-----------+-----------+
|    Stage     | # iterations | 50.0%ile | average | overall |  average |  overall |  average  |  overall  |
+--------------+--------------+----------+---------+---------+----------+----------+-----------+-----------+
[1704982673.833459] [my-host:2320454:0]     ucp_context.c:2137 UCX  INFO  Version 1.16.0 (loaded from /lib/libucp.so.0)
[1704982673.927110] [my-host:2320454:0]          parser.c:2042 UCX  INFO  UCX_* env variables: UCX_PROTO_INFO=y UCX_LOG_LEVEL=info
[1704982674.011965] [my-host:2320454:0]   +---------------------------+--------------------------------------------------------------+
[1704982674.011989] [my-host:2320454:0]   | perftest inter-node cfg#0 | active message by ucp_am_send* from host memory              |
[1704982674.011993] [my-host:2320454:0]   +---------------------------+-------------------------------------------+------------------+
[1704982674.011995] [my-host:2320454:0]   |                   0..2038 | short                                     | rc_mlx5/mlx5_0:1 |
[1704982674.011997] [my-host:2320454:0]   |                2039..8246 | zero-copy                                 | rc_mlx5/mlx5_0:1 |
[1704982674.011999] [my-host:2320454:0]   |               8247..46719 | multi-frag zero-copy                      | rc_mlx5/mlx5_0:1 |
[1704982674.012001] [my-host:2320454:0]   |                46720..inf | (?) rendezvous zero-copy read from remote | rc_mlx5/mlx5_0:1 |
[1704982674.012003] [my-host:2320454:0]   +---------------------------+-------------------------------------------+------------------+
[1704982674.012551] [my-host:2320454:0]   +---------------------------+---------------------------------------------------------------------+
[1704982674.012561] [my-host:2320454:0]   | perftest inter-node cfg#0 | active message by ucp_am_send*(fast-completion) from host memory    |
[1704982674.012563] [my-host:2320454:0]   +---------------------------+--------------------------------------------------+------------------+
[1704982674.012566] [my-host:2320454:0]   |                   0..2038 | short                                            | rc_mlx5/mlx5_0:1 |
[1704982674.012568] [my-host:2320454:0]   |                2039..8246 | copy-in                                          | rc_mlx5/mlx5_0:1 |
[1704982674.012573] [my-host:2320454:0]   |               8247..46222 | multi-frag copy-in                               | rc_mlx5/mlx5_0:1 |
[1704982674.012576] [my-host:2320454:0]   |             46223..262143 | multi-frag zero-copy                             | rc_mlx5/mlx5_0:1 |
[1704982674.012578] [my-host:2320454:0]   |                 256K..inf | (?) rendezvous zero-copy read from remote        | rc_mlx5/mlx5_0:1 |
[1704982674.012580] [my-host:2320454:0]   +---------------------------+--------------------------------------------------+------------------+
[1704982674.013727] [my-host:2320454:0]   +---------------------------+--------------------------------------------------------------+
[1704982674.013742] [my-host:2320454:0]   | perftest inter-node cfg#0 | active message by ucp_am_send*(multi) from host memory       |
[1704982674.013745] [my-host:2320454:0]   +---------------------------+-------------------------------------------+------------------+
[1704982674.013749] [my-host:2320454:0]   |                    0..514 | short                                     | rc_mlx5/mlx5_0:1 |
[1704982674.013751] [my-host:2320454:0]   |                 515..8246 | zero-copy                                 | rc_mlx5/mlx5_0:1 |
[1704982674.013753] [my-host:2320454:0]   |               8247..16074 | multi-frag zero-copy                      | rc_mlx5/mlx5_0:1 |
[1704982674.013755] [my-host:2320454:0]   |                16075..inf | (?) rendezvous zero-copy read from remote | rc_mlx5/mlx5_0:1 |
[1704982674.013761] [my-host:2320454:0]   +---------------------------+-------------------------------------------+------------------+
[1704982674.013785] [my-host:2320454:0]      ucp_worker.c:1887 UCX  INFO    perftest inter-node cfg#0 am(rc_mlx5/mlx5_0:1)  ka(ud_mlx5/mlx5_0:1)

@rakhmets
Copy link
Collaborator

Different network devices can be used for different data transfer operations.
mlx5_0 is a name of IB network device. This device is used in your case. This information can be found in the tables.

Threshold for switching from eager to rendezvous protocol is set to auto by default. This means that it is calculated UCP layer of UCX. You have three threshold values in your output depending on the flags used in the sending operation. The information can be found in the tables. E.g. UCX starts to use rendezvous protocol from message size 16075 bytes, if UCP_OP_ATTR_FLAG_MULTI_SEND is used.

@ikatz-nvidia
Copy link
Author

I don't see mlx5_0 listed in my network devices:

$ ifconfig -a | grep -Ev '^($| )' | cut -d: -f1
can0
can1
docker0
eth0
eth1
eth2
eth3
l4tbr0
lo
usb0
usb1
wlan0

This is why it's frustrating to not be able to see detailed output of the ucx_perftest command -- I had no way to know that the list of network devices I was considering isn't even complete as far as the command is concerned.

Threshold for switching from eager to rendezvous protocol is set to auto by default

Here again, this information is not evident to me by the program output as it exists today.

  • I thought that the switching threshold was defaulting to a specific number value, even though I couldn't see what that number was
  • I didn't know that there were 2 protocols called eager and rendezvous that were being interchanged as I changed the message size
  • I didn't know that auto was an option, nor how to correlate that setting with the output to understand what the automatically-chosen value was

I'm apparently expected to have a high degree of familiarity with the hardware, the networking protocol, and the tool itself in order to understand what the test is doing. But I don't even know what questions to ask to get up to speed on it. It's very frustrating.

@rakhmets
Copy link
Collaborator

You can use ucx_info tool. E.g. ucx_info -d | grep Device | sort --unique will show you all devices that can be used by UCX.
You can also use ibstat -l command to see the list of IB devices.

@ikatz-nvidia
Copy link
Author

I appreciate that, and I can see many uses for the ucx_info tool. ucx_info -d seems to be in a format that would very directly map to a JSON structure.

It sounds like we are in agreement that the ucx_perftest tool is not outputting information that is relevant to the testing it is performing. I am updating the bug description to include the fact that if 2 people run this tool on 2 different systems, they will not be able to account for the differences in performance by looking at the tool's output alone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants