v3.0.0
Summary
The dev team had accumulated a large set of breaking changes that would require a major version bump. In this release, we have focused on clearing our breaking changes queue and merging those improvements. Because these are breaking changes, this release has bumped our major version from 2 to 3. This release also significantly improves the runtime performance compared to Shadow 2.5.0.
Configuration format
- Shadow no longer implicitly searches its working directory for executables to be run under the simulation. If you wish to specify a process path relative to Shadow's working directory, prefix that path with `./`.
- Shadow now supports YAML merge keys and extension fields. This allows you to combine YAML maps using the `<<` key.

  Example:

      # an "extension field" that we use to store common host options
      x-host-client: &host-client
        bandwidth_up: 10Mbps
        bandwidth_down: 10Mbps

      hosts:
        client1:
          # merge the fields from the extension field above
          <<: *host-client
          processes: ...
        client2:
          <<: *host-client
          processes: ...
- Removed the quantity options for hosts and processes. It's now recommended to use YAML anchors and merge keys instead.

  Shadow 2.x:

      hosts:
        client:
          quantity: 3
          processes: ...

  Shadow 3.x:

      hosts:
        client1: &client
          processes: ...
        # copy all fields from 'client1'
        client2: *client
        # copy all fields from 'client1' and add additional fields
        client3:
          <<: *client
          ip_addr: 152.21.4.24
- Renamed the `host_defaults` field to `host_option_defaults` and renamed the host's `options` field to `host_options`.

  Shadow 2.x:

      host_defaults: ...
      hosts:
        client:
          options: ...

  Shadow 3.x:

      host_option_defaults: ...
      hosts:
        client:
          host_options: ...
- Removed the host `pcap_directory` configuration option and replaced it with a new `pcap_enabled` option.

  Shadow 2.x:

      hosts:
        client:
          options:
            pcap_directory: ./

  Shadow 3.x:

      hosts:
        client:
          host_options:
            pcap_enabled: true
- Host names are restricted to the patterns documented in hostname(7).
- The process `environment` configuration option now takes a map instead of a semicolon-delimited string.

  Shadow 2.x:

      hosts:
        client:
          processes:
          - path: curl
            environment: ENV_A=1;ENV_B=foo

  Shadow 3.x:

      hosts:
        client:
          processes:
          - path: curl
            environment:
              ENV_A: "1"
              ENV_B: foo
- The per-process option `stop_time` has been replaced with `shutdown_time`. When set, the signal specified by `shutdown_signal` (a new option) will be sent to the process at the specified time. While Shadow previously sent `SIGKILL` at a process's `stop_time`, the default `shutdown_signal` is `SIGTERM` to better support graceful shutdown. To keep the 2.x behaviour, set `shutdown_signal` to `SIGKILL` explicitly, as in the 3.x example.

  Shadow 2.x:

      hosts:
        client:
          processes:
          - path: curl
            stop_time: 10s

  Shadow 3.x:

      hosts:
        client:
          processes:
          - path: curl
            shutdown_time: 10s
            shutdown_signal: SIGKILL
- A new `expected_final_state` option allows you to specify the expected state of the process at the end of the simulation. The supported states are `exited`, `signaled`, or `running`. If any process is not in the correct state at the end of the simulation, Shadow will return a non-zero exit code. The default `expected_final_state` is exited with code 0.

  In Shadow 2.x the behaviour was to consider any processes which exited with code 0, OR which were still running at the end of the simulation, as a success. Shadow 3.x does not support this specific behaviour, and you must choose a single state.

  Example:

      hosts:
        server:
          processes:
          - path: nginx
            # we expect nginx to run until the end of the simulation
            expected_final_state: running
- Added support for a `parallelism` value of 0, which allows Shadow to choose a reasonable parallelism (we currently use the number of physical cores in Shadow's affinity/cgroup). The default value for `parallelism` has also been changed from 1 to 0.
- It is now an error to set a process's `shutdown_time` or `start_time` to be after the simulation's `stop_time`.
- Sub-second configuration values are now allowed for all time-related options, including `start_time`, `stop_time`, etc.
- Removed and updated various experimental options including `use_shim_syscall_handler`, `interface_qdisc`, and `use_extended_yaml`.
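Several of the changes above can be seen together in one small 3.x-style configuration fragment. This is a sketch under stated assumptions, not a tested config: the placement of `stop_time` and `parallelism` under a `general` section, the `500ms` unit spelling, the `{signaled: SIGTERM}` form of `expected_final_state`, and the `./my-client` binary are assumptions or hypothetical names that do not come from these release notes. Check them against the configuration specification before relying on them.

```yaml
general:
  # assumed location of the simulation-wide stop time
  stop_time: 30s
  # 0 lets Shadow choose the parallelism; this is also the new default
  parallelism: 0

hosts:
  server:
    processes:
    - path: nginx
      # sub-second values are now accepted (the unit spelling is an assumption)
      start_time: 500ms
      # nginx should still be running when the simulation ends
      expected_final_state: running
  client:
    processes:
    # hypothetical binary in Shadow's working directory; note the "./" prefix
    - path: ./my-client
      start_time: 1s
      # must not be later than the simulation's stop_time (30s here)
      shutdown_time: 20s
      # the default signal, written out for clarity
      shutdown_signal: SIGTERM
      # assumed map form; only correct if the client does not handle SIGTERM
      expected_final_state: {signaled: SIGTERM}
```

A client that catches `SIGTERM` and exits cleanly would instead end in the exited state, so the expected final state has to match how the program actually reacts to the signal.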
File structure
- A host's data files (files in `<data-dir>/hosts/<hostname>/`) are no longer prefixed with the hostname. For example, a file that was previously named `shadow.data/hosts/server/server.curl.1000.stdout` is now named `shadow.data/hosts/server/curl.1000.stdout`.
- The per-process `.exitcode` file has been removed because of its confusing semantics and because the new `expected_final_state` attribute replaces its primary use-case.
- Generated pcap files are now named using their interface name instead of their IP address. For example "lo.pcap" and "eth0.pcap" instead of "127.0.0.1.pcap" and "11.0.0.1.pcap".
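To illustrate the new naming, the fragment below enables pcap capture for a host named `server` and lists, in comments, the file names that the bullets above imply. It is only a sketch: it assumes the pcap files are written into the host's data directory, and it reuses the `curl` process and the PID `1000` from the example above. Actual PIDs and the exact set of files may differ.

```yaml
hosts:
  server:
    host_options:
      pcap_enabled: true
    processes:
    - path: curl

# Files expected under shadow.data/hosts/server/ (assumed layout):
#   curl.1000.stdout   # previously server.curl.1000.stdout
#   lo.pcap            # previously 127.0.0.1.pcap
#   eth0.pcap          # previously 11.0.0.1.pcap
#
# No per-process .exitcode file is written any more.
```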
Performance
Shadow's scheduler is very performance-sensitive and needs to run tasks on worker threads with low latency. We added a spinloop in the scheduler that significantly improves Shadow's runtime performance. Some simulations see more than a 2x runtime performance improvement (for example 160 minutes to 47 minutes in a 5% Tor network simulation).
Supported platforms
We have removed several of our supported platforms. Specifically, we've dropped support for Ubuntu 18.04, Fedora 34/35/36, and CentOS Stream 8. We've also dropped support for Clang, and set a minimum-supported Linux kernel version of 5.4, which requires installing a backports kernel on Debian 10.
Stability guarantees
We've updated our "stability guarantees" document with the following changes:
- Updated the filenames in Shadow's host-data directories to reflect the removal of the hostname prefix.
- Added the ability to drop supported platforms in minor releases if the platforms no longer receive free updates and support from the distribution's developer.
- Shadow no longer guarantees the order in which simulated process IDs (PIDs) are assigned.
- Shadow will not change the criteria for the minimum supported Linux kernel version as documented in our supported platforms. This still allows us to increase the minimum kernel version as a result of dropping support for a platform.
Additional changes
Minor changes
- Support the `MSG_TRUNC` flag for unix sockets. #2841
- Support the `TIMER_ABSTIME` flag for `clock_nanosleep`. #2854
- Removed the `--profile`, `--include`, and `--library` setup script options.
- Added partial support for the `epoll_pwait2` syscall.
- Implemented the `clone3` syscall. Thread libraries we're aware of that use `clone3` were gracefully falling back to `clone`, but eventually they may not do so. This also reduces noise in Shadow's log about an unimplemented syscall being attempted.
- Shadow no longer requires `/dev/shm` to be executable.
Bug fixes
- Fixed a memory leak of about 16 bytes per thread due to failing to unregister exited threads with a watchdog thread. This is unlikely to have had a noticeable effect in typical simulations. In particular, the per-thread data was already getting freed when the whole process exited, so it would only affect a process that created and terminated many threads over its lifetime.
- Simulated processes are now reaped and deallocated after they exit, reducing run-time memory usage when processes exit over the course of the simulation. This was unlikely to have affected most users, since Shadow currently doesn't support `fork`, so any simulation has a fixed number of processes, all of which are explicitly specified in Shadow's config.
- Fixed a potential race condition when exiting managed threads that did not have the `clear_child_tid` attribute set. This is unlikely to have affected most software running under Shadow, since most thread APIs use this attribute.
- Changed an error value in `clock_nanosleep` and `nanosleep` from `ENOSYS` to `ENOTSUP`.
- A managed process that tries to call the `execve` syscall will now get an error instead of escaping the Shadow simulation. #2718
- Stopped overriding libc's `getcwd` with an incorrect wrapper that was returning `-1` instead of `NULL` on errors.
- A call to `epoll_ctl` with an unknown operation will return `EINVAL`.
- Fixed a bug that caused Shadow to panic in some cases when a simulated thread exits. #2913
- Fixed a bug causing `host_options` to undo any changes made to `host_option_defaults`.
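The last fix is easiest to see with a small configuration. The sketch below follows one reading of that bug, namely that setting any per-host `host_options` used to reset values changed in `host_option_defaults`. The `log_level` option is assumed to be a valid host option and is used only for illustration.

```yaml
host_option_defaults:
  # changed default that should apply to every host
  pcap_enabled: true

hosts:
  client:
    host_options:
      # per-host override of a different option (assumed option name)
      log_level: debug
    processes: ...

# With the fix, 'client' is expected to keep pcap_enabled: true from the
# defaults while also using its own log_level.
```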
Full changelog
Thanks to contributions from @robgjansen, @stevenengler, @sporksmith, @jtracey, @dependabot