batchdav is a Rust program for traversing a WebDAV file hierarchy using a
user-specified number of concurrent worker tasks and timing how long the
traversal takes. It was written as part of investigating dandi/dandidav#54
(primarily to double-check that the results seen there weren't due to some
quirk of rsync).
The traversal handles non-collection resources by simply making HEAD requests
to them without following any redirects; if the server responds with a
redirect, both the original and target URLs are printed when running
batchdav run without the --quiet option.
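To illustrate what "not following redirects" means in practice, here is a minimal std-only sketch that inspects the head of a raw HTTP response and returns the redirect target, if any, instead of requesting it. The function name redirect_target and the example URL are hypothetical, and this is not taken from batchdav's source, which presumably relies on an HTTP client library rather than hand parsing.

```rust
/// Return the `Location` target if the response head describes a redirect.
/// Hypothetical helper for illustration only; not part of batchdav.
fn redirect_target(response_head: &str) -> Option<String> {
    let mut lines = response_head.lines();
    // The status line looks like "HTTP/1.1 302 Found".
    let status: u16 = lines.next()?.split_whitespace().nth(1)?.parse().ok()?;
    if !matches!(status, 301 | 302 | 303 | 307 | 308) {
        return None;
    }
    // Scan the header lines (up to the blank line) for a Location header.
    lines.take_while(|line| !line.is_empty()).find_map(|line| {
        let (name, value) = line.split_once(':')?;
        name.trim()
            .eq_ignore_ascii_case("location")
            .then(|| value.trim().to_string())
    })
}

fn main() {
    let head = "HTTP/1.1 302 Found\r\nLocation: https://example.org/blob\r\n\r\n";
    match redirect_target(head) {
        Some(target) => println!("redirects to {target}"),
        None => println!("no redirect"),
    }
}
```

The point of the sketch is that the redirect is only recorded: no second request is issued for the target URL.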
batchdav <command> [<args>]
batchdav has two subcommands: run, for performing a single traversal; and
batch, for performing multiple traversals with different numbers of workers
and summarizing the results.
Worker tasks are executed on a multithreaded asynchronous executor. By
default, the executor uses as many threads as your machine has CPUs; a
different number can be specified via the TOKIO_WORKER_THREADS environment
variable.
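The selection rule above can be sketched with the standard library alone. The helper resolve_worker_threads is hypothetical and only models the behaviour described here; in particular, falling back to the CPU count when the variable holds an invalid value is a simplifying assumption and may not match how the executor itself treats such values.

```rust
use std::env;
use std::thread;

/// Pick the executor thread count: a valid positive override wins,
/// otherwise fall back to the given CPU count.
/// Hypothetical helper; the invalid-value fallback is an assumption.
fn resolve_worker_threads(override_value: Option<&str>, cpu_count: usize) -> usize {
    override_value
        .and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(cpu_count)
}

fn main() {
    // Number of CPUs available to this process (1 if it cannot be determined).
    let cpus = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let raw = env::var("TOKIO_WORKER_THREADS").ok();
    let threads = resolve_worker_threads(raw.as_deref(), cpus);
    println!("executor threads: {threads}");
}
```

Note that the executor thread count is separate from the <workers> argument: the former bounds how many OS threads run tasks, the latter how many traversal tasks exist.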
batchdav run [-q|--quiet] <url> <workers>
Traverse the WebDAV hierarchy at the given URL using the given number of
concurrent workers. The elapsed time and number of requests made are printed
at the end.
If the -q/--quiet option is not given, then as each request is completed,
the URL requested is printed out along with the type of resource at that URL
(DIR or FILE) and, for non-collection resources, the URL (if any) that the
resource's URL redirects to.
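As a rough model of a traversal with a fixed number of workers, the sketch below walks an in-memory tree using OS threads and a shared work queue, then reports the request count and elapsed time. Everything in it is an assumption made for illustration: batchdav itself runs asynchronous tasks against a real server, while Node, State, sample_tree, and traverse are hypothetical names, and each visited resource is counted as exactly one request.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical in-memory stand-in for a WebDAV resource.
enum Node {
    Dir(Vec<String>), // collection: paths of its children
    File,             // non-collection resource
}

/// Work shared between workers, guarded by one mutex.
struct State {
    queue: VecDeque<String>, // paths still to be requested
    in_flight: usize,        // requests currently being processed
    requests: usize,         // requests completed so far
}

fn sample_tree() -> HashMap<String, Node> {
    let mut tree = HashMap::new();
    tree.insert("/".to_string(), Node::Dir(vec!["/a/".to_string(), "/b.txt".to_string()]));
    tree.insert("/a/".to_string(), Node::Dir(vec!["/a/c.txt".to_string()]));
    tree.insert("/b.txt".to_string(), Node::File);
    tree.insert("/a/c.txt".to_string(), Node::File);
    tree
}

/// Traverse `tree` from `root` with `workers` threads; returns (requests, elapsed).
fn traverse(tree: HashMap<String, Node>, root: &str, workers: usize) -> (usize, Duration) {
    let tree = Arc::new(tree);
    let state = State { queue: VecDeque::from([root.to_string()]), in_flight: 0, requests: 0 };
    let shared = Arc::new((Mutex::new(state), Condvar::new()));
    let start = Instant::now();
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let (tree, shared) = (Arc::clone(&tree), Arc::clone(&shared));
            thread::spawn(move || {
                let (lock, cvar) = &*shared;
                loop {
                    // Take the next path, or exit once no work is queued or in flight.
                    let path = {
                        let mut st = lock.lock().unwrap();
                        loop {
                            if let Some(p) = st.queue.pop_front() {
                                st.in_flight += 1;
                                break p;
                            }
                            if st.in_flight == 0 {
                                cvar.notify_all();
                                return;
                            }
                            st = cvar.wait(st).unwrap();
                        }
                    };
                    // "Request" the resource: collections yield children, files yield none.
                    let children = match tree.get(&path) {
                        Some(Node::Dir(c)) => c.clone(),
                        Some(Node::File) | None => Vec::new(),
                    };
                    let mut st = lock.lock().unwrap();
                    st.requests += 1;
                    st.queue.extend(children);
                    st.in_flight -= 1;
                    cvar.notify_all();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let requests = shared.0.lock().unwrap().requests;
    (requests, start.elapsed())
}

fn main() {
    for workers in [1, 2, 4] {
        let (requests, elapsed) = traverse(sample_tree(), "/", workers);
        println!("{workers},{requests},{:.6}", elapsed.as_secs_f64());
    }
}
```

The in_flight counter is what lets idle workers tell "the queue is momentarily empty" apart from "the traversal is finished": a worker only exits when nothing is queued and no other worker could still enqueue children.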
batchdav batch [<options>] <url> <workers> ...
Traverse the WebDAV hierarchy at the given URL repeatedly and summarize the
elapsed times. For each number of workers listed on the command line, a
traversal is performed a number of times given by the -s/--samples option
(default: 10).
By default, upon completion, a CSV document listing the mean and standard
deviation of the traversal times for each number of workers is output. If the
-T/--per-traversal-stats option is given, then the command's output will
instead be a CSV with one line for each traversal, giving the number of
workers, number of requests made, and elapsed time in seconds. If the
-J/--json-file option is given with a filepath argument, then the command
will instead output a JSON document to the given path listing the elapsed time
for each request made in each traversal, along with the overall elapsed time of
each traversal. The -T and -J options are mutually exclusive.
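The default summary reduces each group of samples to a mean and a standard deviation. The sketch below shows one way to compute such a row; the timings are made up, the function names are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say whether batchdav uses the sample or the population form.

```rust
/// Arithmetic mean, or None for an empty slice.
fn mean(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    Some(samples.iter().sum::<f64>() / samples.len() as f64)
}

/// Sample standard deviation (divides by n - 1); None for fewer than two samples.
/// Assumption: batchdav may use the population form instead.
fn sample_stddev(samples: &[f64]) -> Option<f64> {
    if samples.len() < 2 {
        return None;
    }
    let m = mean(samples)?;
    let sum_sq: f64 = samples.iter().map(|x| (x - m).powi(2)).sum();
    Some((sum_sq / (samples.len() - 1) as f64).sqrt())
}

fn main() {
    // Made-up elapsed times (seconds) for one worker count.
    let workers = 5;
    let times = [1.61, 1.72, 1.58, 1.80, 1.66];
    if let (Some(m), Some(sd)) = (mean(&times), sample_stddev(&times)) {
        println!("workers,time_mean,time_stddev");
        println!("{workers},{m},{sd}");
    }
}
```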
Sample batch output from traversing [1] on an 8-CPU 2020 Intel MacBook Pro:
workers,time_mean,time_stddev
1,8.399036695100001,0.36142910510463516
5,1.6700318244,0.12919592271200123
10,1.0409548316000001,0.10855610294283857
15,0.7129774931999999,0.06181837739373458
20,0.750514105,0.10966455557731183
30,0.7945123642999999,0.10238084442203854
40,0.7258895968,0.08116879741778966
50,0.7132875974999999,0.07944869527032605
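One way to read the sample output is as speedup relative to the single-worker run. The sketch below parses the CSV shown above and divides the one-worker mean by each row's mean; parse_means and speedups are hypothetical helpers written for this illustration and are not part of batchdav.

```rust
/// The sample output above, embedded verbatim.
const SAMPLE: &str = "workers,time_mean,time_stddev
1,8.399036695100001,0.36142910510463516
5,1.6700318244,0.12919592271200123
10,1.0409548316000001,0.10855610294283857
15,0.7129774931999999,0.06181837739373458
20,0.750514105,0.10966455557731183
30,0.7945123642999999,0.10238084442203854
40,0.7258895968,0.08116879741778966
50,0.7132875974999999,0.07944869527032605";

/// Extract (workers, time_mean) pairs; lines that fail to parse
/// (such as the header) are skipped.
fn parse_means(csv: &str) -> Vec<(u32, f64)> {
    csv.lines()
        .filter_map(|line| {
            let mut fields = line.trim().split(',');
            let workers = fields.next()?.parse::<u32>().ok()?;
            let mean = fields.next()?.parse::<f64>().ok()?;
            Some((workers, mean))
        })
        .collect()
}

/// Speedup of each row relative to the first row's mean time.
fn speedups(rows: &[(u32, f64)]) -> Vec<(u32, f64)> {
    match rows.first() {
        Some(&(_, base)) => rows.iter().map(|&(w, m)| (w, base / m)).collect(),
        None => Vec::new(),
    }
}

fn main() {
    for (workers, speedup) in speedups(&parse_means(SAMPLE)) {
        println!("{workers} workers: {speedup:.2}x");
    }
}
```

On this data the ratio is roughly 5x at 5 workers and about 11.8x at 15 workers, after which additional workers bring no further improvement in the mean time.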