These Python scripts provide a common way for creating, running, parsing, and plotting experiments using LITMUS^RT. These scripts are:
- gen_exps.py: for creating sets of experiments
- run_exps.py: for running and tracing experiments
- parse_exps.py: for parsing LITMUS^RT trace data
- plot_exps.py: for plotting directories of csv data
They are designed with the following principles in mind:
- Little or no configuration: all scripts use certain parameters to configure behavior. However, if the user does not give these parameters, the scripts will examine the properties of the user's system to pick a suitable default. Requiring user input is a last resort.
- Interruptibility: the scripts save their work as they evaluate multiple directories. When the scripts are interrupted, or if new data is added to those directories, the scripts can be re-run and they will resume where they left off. This greatly decreases turnaround time for testing new features.
- Maximum safety: where possible, scripts save metadata in their output directories about the data contained. This metadata can be used by the other scripts to safely use the data later.
- Independence / legacy support: none of these scripts assume their input was generated by another of these scripts. Three are designed to recognize generic input formats inspired by past LITMUS^RT experimental setups. (The exception is gen_exps.py, which takes only user input and creates output only for run_exps.py.)
- Save everything: all output and parameters (even from subprocesses) are saved for debugging / reproducibility. This data is saved in tmp/ directories while scripts are running in case scripts fail.
These scripts were tested using Python 2.7.2. They have not been tested using Python 3. The Matplotlib Python library is needed for plotting.
The run_exps.py script should almost always be run using a LITMUS^RT kernel. In addition to the kernel, the following LITMUS-related repos must be in the user's PATH:
- liblitmus: for real-time executable simulation and task set release
- feather-trace-tools: for recording and parsing overheads and scheduling events
Additional features will be enabled if these repos are present in the PATH:
- rt-kernelshark: to record ftrace events for kernelshark visualization
- sched_trace (UNC internal): to output a file containing scheduling events as strings
Each of these scripts is designed to operate independently of the others. For example, parse_exps.py will find any feather trace files resembling ft-xyz.bin or xyz.ft and print out overhead statistics for the records inside. However, the scripts provide the most features (especially safety) when their results are chained together, like so:
gen_exps.py --> [exps/*] --> run_exps.py --> [run-data/*] --.
.------------------------------------------------------------'
'--> parse_exps.py --> [parse-data/*] --> plot_exps.py --> [plot-data/*.pdf]
- Create experiments with gen_exps.py or some other script.
- Run experiments using run_exps.py, generating binary files in run-data/.
- Parse binary data in run-data/ using parse_exps.py, generating csv files in parse-data/.
- Plot parse-data/ using plot_exps.py, generating pdfs in plot-data/.
Each of these scripts is described below. The run_exps.py script comes first because gen_exps.py creates schedule files that depend on run_exps.py.
Usage: run_exps.py [OPTIONS] [SCHED_FILE]... [SCHED_DIR]...
where a SCHED_DIR resembles:
SCHED_DIR/
    SCHED_FILE
    PARAM_FILE
Output: OUT_DIR/[files] or OUT_DIR/SCHED_DIR/[files] or OUT_DIR/SCHED_FILE/[files], depending on input
If all features are enabled, these files are:
OUT_DIR/[SCHED_(FILE|DIR)/]
    trace.slog     # LITMUS logging
    st-[1..m].bin  # sched_trace data
    ft.bin         # feather-trace overhead data
    trace.dat      # ftrace data for kernelshark
    params.py      # Schedule parameters
    exec-out.txt   # Standard out from schedule processes
    exec-err.txt   # Standard err
Defaults: SCHED_FILE = sched.py, PARAM_FILE = params.py, DURATION = 30, OUT_DIR = run-data/
This script reads schedule files (described below) and executes real-time task systems, recording all overhead, logging, and trace data that is enabled in the system (unless a specific set of tracers is specified in the parameter file, see below). For example, if trace logging is enabled, rt-kernelshark is found in the path, but feather-trace is disabled (the devices are not present), only trace logs and rt-kernelshark logs will be recorded.
When run_exps.py is running a schedule file, temporary data is saved in a tmp directory in the same directory as the schedule file. When execution completes, this data is moved into a directory under the run_exps.py output directory (default: run-data/, can be changed with the -o option). When multiple schedules are run, each schedule's data is saved in a unique directory under the output directory.
If a schedule has been run and its data is in the output directory, run_exps.py will not re-run the schedule unless the -f option is specified. This is useful if your system crashes midway through a set of experiments.
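The skip-unless-forced behavior described above can be pictured with a short sketch. This is not the actual run_exps.py code: the function name should_run and the rule "an existing, non-empty output directory means done" are assumptions made for illustration (written for Python 3).

```python
import os

def should_run(out_dir, sched_name, force=False):
    """Return True if the schedule still needs to be run.

    Assumption: a schedule counts as done when its output
    directory exists and contains at least one file.
    """
    exp_dir = os.path.join(out_dir, sched_name)
    already_done = os.path.isdir(exp_dir) and bool(os.listdir(exp_dir))
    # force plays the role of the -f option
    return force or not already_done
```

After a crash, only the schedules for which should_run returns True would need to be executed again.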
You can use the -j option to send a jabber instant message every time an experiment completes. Running the script with -j will print out more details about this option.
Schedule files have one of the following two formats:
- simple format
path/to/proc{proc_value}
...
path/to/proc{proc_value}
[real_time_task: default rtspin] task_arguments...
...
[real_time_task] task_arguments...
- python format
{'proc':[
('path/to/proc','proc_value'),
...,
('path/to/proc','proc_value')
],
'task':[
('real_time_task', 'task_arguments'),
...
('real_time_task', 'task_arguments')
]
}
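To make the relationship between the two formats concrete, here is a minimal sketch that converts simple-format text into the python-format dictionary. It is not the parser used by run_exps.py. The function names, the regular expression, and the rule used to decide whether a line names its own program (a line that starts with a digit or a dash is treated as bare arguments for rtspin) are all assumptions.

```python
import ast
import re

# A proc line looks like: path/to/proc{proc_value}
PROC_LINE = re.compile(r'^(?P<path>\S+)\{(?P<value>.*)\}$')

def parse_simple(text, default_task='rtspin'):
    """Convert simple-format text into the python-format dictionary."""
    sched = {'proc': [], 'task': []}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = PROC_LINE.match(line)
        if match:
            sched['proc'].append((match.group('path'), match.group('value')))
            continue
        first = line.split()[0]
        # Assumption: a line starting with a number or an option flag
        # holds only arguments, so the default task is used.
        if first[0].isdigit() or first.startswith('-'):
            sched['task'].append((default_task, line))
        else:
            program, _, args = line.partition(' ')
            sched['task'].append((program, args.strip()))
    return sched

def parse_python(text):
    """Read the python format without executing arbitrary code."""
    return ast.literal_eval(text)
```

Using ast.literal_eval rather than eval is a design choice of this sketch: it accepts the dictionary literal shown above but refuses anything that is not a plain literal.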
The following creates a simple 3-task system with a utilization of just under 2.0 (10/20 + 30/40 + 60/90 is about 1.92), which is then run under the GSN-EDF plugin:
$ echo "10 20
30 40
60 90" > test.sched
$ run_exps.py -s GSN-EDF test.sched
[Exp test/test.sched]: Enabling sched_trace
...
[Exp test/test.sched]: Switching to GSN-EDF
[Exp test/test.sched]: Starting 3 regular tracers
[Exp test/test.sched]: Starting the programs
[Exp test/test.sched]: Sleeping until tasks are ready for release...
[Exp test/test.sched]: Releasing 3 tasks
[Exp test/test.sched]: Waiting for program to finish...
[Exp test/test.sched]: Saving results in /root/schedules/test/run-data/test.sched
[Exp test/test.sched]: Stopping regular tracers
[Exp test/test.sched]: Switching to Linux scheduler
[Exp test/test.sched]: Experiment done!
Experiments run: 1
Successful: 1
Failed: 0
Already Done: 0
Invalid environment: 0
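The utilization of the three-task example above can be checked with a few lines of Python. This sketch assumes each schedule line is an "execution-cost period" pair, which is how the default rtspin arguments are read here. The helper name utilization is hypothetical.

```python
def utilization(sched_text):
    """Sum cost/period over lines of the form 'COST PERIOD'."""
    total = 0.0
    for line in sched_text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip anything that is not a bare 'COST PERIOD' pair
        cost, period = (float(f) for f in fields)
        total += cost / period
    return total

# 10/20 + 30/40 + 60/90
print(round(utilization("10 20\n30 40\n60 90"), 3))  # prints 1.917
```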
The following sets the release master by writing 2 to /proc/litmus/release_master:
$ echo "release_master{2}
10 20" > test.sched && run_exps.py -s GSN-EDF test.sched
A longer form can be used for proc entries not under /proc/litmus:
$ echo "/proc/sys/something{hello}
10 20" > test.sched
You can also run your own spin programs instead of rtspin by putting their name at the beginning of the line. This example also shows how you can reference files in the same directory as the schedule file on the command line.
$ echo "colorspin -f color1.csv 10 20" > test.sched
You can specify parameters for an experiment in a file instead of on the command line using params.py:
$ echo "{'scheduler':'GSN-EDF', 'duration':10}" > params.py
$ run_exps.py test.sched
You can also run multiple experiments with a single command, provided a directory with a schedule file exists for each. You can also include extra parameters in params.py that run_exps.py does not understand. These parameters will be saved with the data output by run_exps.py, which is useful for tracking variations in system parameters versus experimental results. In the following example, multiple experiments are run and an extra parameter test-param is included:
$ mkdir test1
# The duration will default to 30 and need not be specified
$ echo "{'scheduler':'C-EDF', 'test-param':1}" > test1/params.py
$ echo "-p 1 10 20" > test1/sched.py
$ cp -r test1 test2
$ echo "{'scheduler':'GSN-EDF', 'test-param':2}" > test2/params.py
$ run_exps.py test*
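The way defaults and extra parameters interact in the example above can be sketched as follows. This is an illustration only: load_params is a hypothetical helper, and the only default it knows is the duration of 30 stated in this document.

```python
import ast

# Only the default named in this document; the real script may know more.
DEFAULTS = {'duration': 30}

def load_params(text, defaults=DEFAULTS):
    """Parse params.py text and fill in defaults, keeping unknown keys."""
    params = dict(defaults)
    # Unknown keys such as 'test-param' are carried along untouched.
    params.update(ast.literal_eval(text))
    return params
```

For the test1 directory above, load_params would return the scheduler, the extra test-param, and a duration of 30.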
You can specify commands to run before and after each experiment is run using 'pre-experiment' and 'post-experiment'. This is useful for complicated system setup such as managing shared resources. The following example prints out a message before and after an experiment is run (note that command line arguments can be specified using arrays):
$ echo "10 20" > sched.py
$ echo "{'scheduler':'GSN-EDF',
'pre-experiment' : 'script1.sh',
'post-experiment' : ['echo', 'Experiment ends!']}" > params.py
$ echo '#!/bin/bash
echo "Experiment begins!"' > script1.sh
$ run_exps.py
$ cat pre-out.txt
Experiment begins!
$ cat post-out.txt
Experiment ends!
Finally, you can specify system properties in params.py, which the environment must match for the experiment to run. These are useful if you have a large batch of experiments that must be run under different kernels or kernel configurations. The first property is a regular expression for the name of the kernel:
$ uname -r
3.0.0-litmus
$ echo "{'uname': r'.*linux.*'}" > params.py
$ run_exps.py -s GSN-EDF test.sched
Invalid environment for experiment 'test.sched'
Kernel name does not match '.*linux.*'.
Experiments run: 1
Successful: 0
Failed: 0
Already Done: 0
Invalid Environment: 1
$ echo "{'uname': r'.*litmus.*'}" > params.py
# run_exps.py will now succeed
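The kernel-name check shown above amounts to matching a regular expression against the output of uname -r. A minimal sketch, assuming the pattern is anchored at the start of the string as re.match does (whether run_exps.py uses match or search is not stated here), with kernel_matches as a hypothetical name:

```python
import platform
import re

def kernel_matches(pattern, release=None):
    """Return True if the kernel release matches the regular expression."""
    if release is None:
        release = platform.release()  # same string as `uname -r` on Linux
    return re.match(pattern, release) is not None
```

With the kernel from the example, kernel_matches(r'.*linux.*', '3.0.0-litmus') is False and kernel_matches(r'.*litmus.*', '3.0.0-litmus') is True.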
The second property is kernel configuration options. These assume the configuration is stored at /boot/config-$(uname -r). You can specify these in params.py. In the following example, the experiment will only run on an ARM system with the release master enabled:
{'config-options':{
'RELEASE_MASTER' : 'y',
'ARM' : 'y'}
}
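A check like the one above can be sketched by parsing the kernel configuration text and comparing it with the required options. The helper names are hypothetical, and the sketch assumes that the names in params.py are the kernel option names without their CONFIG_ prefix.

```python
def read_config(text):
    """Parse 'CONFIG_NAME=value' lines into a {NAME: value} dictionary."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # unset options appear as comments in kernel configs
        name, sep, value = line.partition('=')
        if sep and name.startswith('CONFIG_'):
            options[name[len('CONFIG_'):]] = value
    return options

def config_mismatches(required, options):
    """Return the required options whose value differs from the kernel's."""
    return {k: v for k, v in required.items() if options.get(k) != v}
```

An experiment would be considered runnable only when config_mismatches returns an empty dictionary.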
The third property is required tracers. The tracers property lets the user specify only those tracers they want to run with an experiment, as opposed to starting every available tracer (the default). If any of these specified tracers cannot be enabled, e.g. the kernel was not compiled with feather-trace support, the experiment will not run. The following example gives an experiment that will not run unless all four tracers are enabled:
{'tracers':['kernelshark', 'log', 'sched', 'overhead']}
Usage: gen_exps.py [options] [files...] [generators...] [param=val[,val]...]
Output: OUT_DIR/EXP_DIRS that each contain sched.py and params.py
Defaults: generators = G-EDF P-EDF C-EDF, OUT_DIR = exps/
This script uses generators, one for each LITMUS scheduler supported, which each have different properties that can be varied to generate different types of schedules. Each of these properties has a default value that can be modified on the command line for quick and easy experiment generation.
This script as written should be used to create debugging task sets, but not for creating task sets for experiments shown in papers. That is because the safety features of run_exps.py described above (uname, config-options) are not used here. If you are creating experiments for a paper, you should create your own generator that outputs values for the config-options required for your plugin so that you cannot ruin your experiments at run time. Trust me, you will.
The -l option lists the supported generators that can be specified:
$ gen_exps.py -l
G-EDF, P-EDF, C-EDF
The -d option will describe the properties of a generator or generators and their default values. Note that some of these defaults will vary depending on the system the script is run on. For example, the cpus parameter defaults to the number of cpus on the current system, in this example 24.
$ gen_exps.py -d G-EDF,P-EDF
Generator GSN-EDF:
tasks -- Number of tasks per experiment.
Default: [24, 48, 72, 96]
Allowed: <type 'int'>
....
Generator PSN-EDF:
tasks -- Number of tasks per experiment.
Default: [24, 48, 72, 96]
Allowed: <type 'int'>
cpus -- Number of processors on target system.
Default: [24]
Allowed: <type 'int'>
....
You create experiments by specifying a generator. The following will create 4 schedules with 24, 48, 72, and 96 tasks, because the default value of tasks is an array of these values (see above).
$ gen_exps.py P-EDF
$ ls exps/
sched=PSN-EDF_num-tasks=24/ sched=PSN-EDF_num-tasks=48/
sched=PSN-EDF_num-tasks=72/ sched=PSN-EDF_num-tasks=96/
You can modify the default using a single value (the -f option deletes previous experiments in the output directory, which defaults to exps/ and can be changed with -o):
$ gen_exps.py -f P-EDF tasks=24
$ ls exps/
sched=PSN-EDF_num-tasks=24/
Or with an array of values, specified as a comma-separated list:
$ gen_exps.py -f tasks=`seq -s, 24 2 30` P-EDF
$ ls exps/
sched=PSN-EDF_num-tasks=24/ sched=PSN-EDF_num-tasks=26/
sched=PSN-EDF_num-tasks=28/ sched=PSN-EDF_num-tasks=30/
The generator will create a different directory for each possible configuration of the parameters. Each parameter that is varied is included in the name of the schedule directory. For example, to vary the number of CPUs but not the number of tasks:
$ gen_exps.py -f tasks=24 cpus=3,6 P-EDF
$ ls exps
sched=PSN-EDF_cpus=3/ sched=PSN-EDF_cpus=6/
The values of non-varying parameters are still saved in params.py. Continuing the example above:
$ cat exps/sched\=PSN-EDF_cpus\=3/params.py
{'periods': 'harmonic', 'release_master': False, 'duration': 30,
'utils': 'uni-medium', 'scheduler': 'PSN-EDF', 'cpus': 3}
You can also have multiple schedules generated with the same configuration using the -n option:
$ gen_exps.py -f tasks=24 -n 5 P-EDF
$ ls exps/
sched=PSN-EDF_trial=0/ sched=PSN-EDF_trial=1/ sched=PSN-EDF_trial=2/
sched=PSN-EDF_trial=3/ sched=PSN-EDF_trial=4/
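The directory naming shown in the examples above follows from taking every combination of parameter values and naming each directory after the parameters that vary. The sketch below illustrates that idea with itertools.product. It is not the gen_exps.py implementation: exp_dirs is a hypothetical name, and parameter names are used verbatim in the directory name, whereas the real script may rename them (for example num-tasks).

```python
import itertools

def exp_dirs(scheduler, params, trials=1):
    """Yield (dir_name, config) for every combination of parameter values.

    params maps names to lists of values. Only parameters with more than
    one value, plus 'trial' when trials > 1, appear in the directory name.
    """
    names = sorted(params)
    varying = [n for n in names if len(params[n]) > 1]
    for values in itertools.product(*(params[n] for n in names)):
        config = dict(zip(names, values))
        config['scheduler'] = scheduler
        for trial in range(trials):
            parts = ['sched=%s' % scheduler]
            parts += ['%s=%s' % (n, config[n]) for n in varying]
            if trials > 1:
                parts.append('trial=%d' % trial)
            # Non-varying values stay in config even though they are
            # left out of the directory name.
            yield '_'.join(parts), dict(config)
```

For instance, list(exp_dirs('PSN-EDF', {'tasks': [24], 'cpus': [3, 6]})) yields two entries named sched=PSN-EDF_cpus=3 and sched=PSN-EDF_cpus=6, each of which still records tasks as 24 in its configuration.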
Usage: parse_exps.py [options] [data_dir1] [data_dir2]...
where the data_dirx contain feather-trace and sched-trace data, e.g. ft.bin, mysched.ft, or st-*.bin.
Output: all parsed data printed to the console, or OUT_FILE, where OUT_FILE is a python map of the data, or OUT_DIR/[FIELD]*/[PARAM]/[TYPE]/[TYPE]/[LINE].csv, depending on input.
The goal is to create csv files that record how varying PARAM changes the value of FIELD. Only PARAMs that vary are considered.
FIELD is a parsed value, e.g. 'RELEASE' overhead or 'miss-ratio'. PARAM is a parameter that we are going to vary, e.g. 'tasks'. A single LINE is created for every configuration of parameters other than PARAM.
TYPE is the statistic of the measurement, i.e. Max, Min, Avg, or Var[iance]. The two types are used to differentiate between measurements across tasks in a single taskset, and measurements across all tasksets. E.g. miss-ratio/*/Max/Avg is the maximum of all the average miss ratios for each task set, while miss-ratio/*/Avg/Max is the average of the maximum miss ratios for each task set.
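The difference between the two orderings is easiest to see with numbers. The following sketch uses made-up per-task miss ratios for two task sets (the data is hypothetical, chosen so the arithmetic is exact) and computes both statistics as described above.

```python
def avg(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)

# Hypothetical per-task miss ratios for two task sets.
tasksets = [
    [0.0, 0.25, 0.5],       # task set A: Avg 0.25,  Max 0.5
    [0.125, 0.125, 0.125],  # task set B: Avg 0.125, Max 0.125
]

# Max/Avg: the maximum over task sets of each task set's average.
max_avg = max(avg(ts) for ts in tasksets)

# Avg/Max: the average over task sets of each task set's maximum.
avg_max = avg([max(ts) for ts in tasksets])

print(max_avg, avg_max)  # prints 0.25 0.3125
```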
Defaults: OUT_DIR, OUT_FILE = parse-data, data_dir1 = .
This script reads a directory or directories, parses the binary files inside for feather-trace or sched-trace data, then summarizes and organizes the results for output. The output can be to the console, to a python map, or to a directory tree of csvs (default). The python map (using -m) can be used for schedulability tests. The directory tree can be used to look at how changing parameters affects certain measurements.
The script will use all of the system CPUs to process data (changeable with -p).
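The first step described above, locating trace files inside a data directory, can be sketched with shell-style patterns. The patterns below are taken from the file names mentioned in this document (ft.bin, ft-xyz.bin, xyz.ft, st-*.bin). The function name and the decision to look only at the top level of the directory are assumptions.

```python
import fnmatch
import os

FT_PATTERNS = ('ft.bin', 'ft-*.bin', '*.ft')  # feather-trace overhead data
ST_PATTERNS = ('st-*.bin',)                   # sched_trace data

def find_traces(data_dir):
    """Return (ft_files, st_files) found directly inside data_dir."""
    ft_files, st_files = [], []
    for name in sorted(os.listdir(data_dir)):
        if any(fnmatch.fnmatch(name, p) for p in FT_PATTERNS):
            ft_files.append(name)
        elif any(fnmatch.fnmatch(name, p) for p in ST_PATTERNS):
            st_files.append(name)
    return ft_files, st_files
```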
In the following example, too little data was found to create csv files, so the data is output to the console despite the user not specifying the -v option. This use is the easiest for quick overhead evaluation and debugging. Note that for overhead measurements like these, parse_exps.py will use the clock-frequency parameter saved in a params.py file by run_exps.py to calculate overhead measurements. If a param file is not present, as in this case, the current CPU's frequency will be used.
$ ls run-data/
taskset_scheduler=C-FL-split-L3_host=ludwig_n=10_idx=05_split=randsplit.ft
$ parse_exps.py
Loading experiments...
Parsing data...
0.00%
Writing result...
Too little data to make csv files.
<ExpPoint-/home/hermanjl/tmp>
CXS: Avg: 5.053 Max: 59.925 Min: 0.241
SCHED: Avg: 4.410 Max: 39.350 Min: 0.357
TICK: Avg: 1.812 Max: 21.380 Min: 0.241
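The role of the clock frequency in the overhead numbers above can be illustrated with a one-line conversion. This sketch assumes that raw overheads are cycle counts, that the frequency is given in MHz, and that results are reported in microseconds. None of these units are stated in this document, and cycles_to_us is a hypothetical name.

```python
def cycles_to_us(cycles, clock_mhz):
    """Convert a cycle count to microseconds for a clock given in MHz."""
    # A clock of N MHz ticks N times per microsecond.
    return cycles / float(clock_mhz)
```

Under these assumptions, 12000 cycles on a 2400 MHz CPU would be reported as 5.0 microseconds, which is why using the wrong machine's frequency would skew every overhead value.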
In the next example, because the value of num-tasks varies, csvs can be created. The varying parameters used to create csvs were found by reading the params.py files under each run-data subdirectory.
$ ls run-data/
sched=C-EDF_num-tasks=4/ sched=GSN-EDF_num-tasks=4/
sched=C-EDF_num-tasks=8/ sched=GSN-EDF_num-tasks=8/
sched=C-EDF_num-tasks=12/ sched=GSN-EDF_num-tasks=12/
sched=C-EDF_num-tasks=16/ sched=GSN-EDF_num-tasks=16/
$ parse_exps.py run-data/*
$ ls parse-data/
avg-block/ avg-tard/ max-block/ max-tard/ miss-ratio/
You can use the -v option to print out the values measured even when csvs could be created.
You can use the -i option to ignore variations in a certain parameter (or parameters if a comma-separated list is given). In the following example, the user has decided the parameter option does not matter after viewing output. Note that the trial parameter, used by gen_exps.py to create multiple schedules with the same configuration, is always ignored.
$ ls run-data/
sched=C-EDF_num-tasks=4_option=1/ sched=C-EDF_num-tasks=4_option=2/
sched=C-EDF_num-tasks=8_option=1/ sched=C-EDF_num-tasks=8_option=2/
$ parse_exps.py run-data/*
$ for i in `ls parse-data/miss-ratio/tasks/Avg/Avg/`; do echo $i; cat $i; done
option=1.csv
4 .1
8 .2
option=2.csv
4 .2
8 .4
# Now ignore 'option' for more accurate results
$ parse_exps.py -i option run-data/*
$ for i in `ls parse-data/miss-ratio/tasks/Avg/Avg/`; do echo $i; cat $i; done
line.csv
4 .2
8 .3
The second command will also have run faster than the first. This is because parse_exps.py will save the data it parses in tmp/ directories before it attempts to sort it into csvs. Parsing takes far longer than sorting, so this saves a lot of time. The -f flag can be used to re-parse files and overwrite this saved data.
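How ignoring a parameter changes the set of varying parameters can be sketched as follows. This is an illustration, not the parse_exps.py logic: varying_params is a hypothetical helper that takes the dictionaries read from each params.py and assumes their values are hashable.

```python
def varying_params(param_sets, ignore=('trial',)):
    """Return the sorted names of parameters that take more than one value."""
    seen = {}
    for params in param_sets:
        for name, value in params.items():
            if name not in ignore:
                seen.setdefault(name, set()).add(value)
    return sorted(n for n, values in seen.items() if len(values) > 1)

# The four experiments from the example above.
exps = [
    {'scheduler': 'C-EDF', 'tasks': 4, 'option': 1},
    {'scheduler': 'C-EDF', 'tasks': 4, 'option': 2},
    {'scheduler': 'C-EDF', 'tasks': 8, 'option': 1},
    {'scheduler': 'C-EDF', 'tasks': 8, 'option': 2},
]
print(varying_params(exps))                              # prints ['option', 'tasks']
print(varying_params(exps, ignore=('trial', 'option')))  # prints ['tasks']
```

With option ignored, tasks is the only varying parameter left, so all four experiments fall on a single line.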
All output from the feather-trace-tools programs used to parse data is stored in the tmp/ directories created in the input directories. If the sched_trace repo is found in the user's PATH, st_show will be used to create a human-readable version of the sched-trace data that will also be stored there.
Usage: plot_exps.py [OPTIONS] [CSV_DIR]...
where a CSV_DIR is a directory or directory of directories (and so on) containing csvs, like:
CSV_DIR/[SUBDIR/...]
    line1.csv
    line2.csv
    line3.csv
Outputs: OUT_DIR/[CSV_DIR/]*[PLOT]*.pdf where a single plot exists for each directory of csvs, with a line for each csv file in that directory. If only a single CSV_DIR is specified, all plots are placed directly under OUT_DIR.
Defaults: OUT_DIR = plot-data/, CSV_DIR = .
This script takes directories of csvs (or directories formatted as specified below) and creates a pdf plot of each csv directory found. A line is created for each .csv file contained in a plot. Matplotlib is used to do the plotting. The script will use all of the system CPUs to process data (changeable with -p).
If the csv filenames are formatted like param=value_param2=value2.csv, the variation of these parameters will be used to color the lines in the most readable way. For instance, if there are three parameters, variations in one parameter will change line color, another line style (dashes/dots/etc), and a third line markers (triangles/circles/etc).
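The mapping from filename parameters to line appearance can be sketched as below. This is a simplified illustration, not the plot_exps.py algorithm: the helper names and the color, style, and marker pools are assumptions, parameters are assigned to color, style, and marker in alphabetical order rather than "in the most readable way", and parameter names that themselves contain an underscore are not handled.

```python
import itertools
import os

COLORS = ['b', 'g', 'r', 'c', 'm']
STYLES = ['-', '--', ':', '-.']
MARKERS = ['o', '^', 's', 'x']

def parse_name(filename):
    """Split 'a=1_b=2.csv' into {'a': '1', 'b': '2'}."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    pairs = (field.split('=', 1) for field in stem.split('_') if '=' in field)
    return dict(pairs)

def assign_styles(filenames):
    """Map each filename to a (color, linestyle, marker) tuple."""
    parsed = {f: parse_name(f) for f in filenames}
    names = sorted({n for p in parsed.values() for n in p})
    styles = {f: [] for f in filenames}
    for i, pool in enumerate((COLORS, STYLES, MARKERS)):
        # Parameters beyond the third are ignored; with fewer than three
        # parameters the remaining properties use the first pool entry.
        name = names[i] if i < len(names) else None
        values = sorted({p.get(name, '') for p in parsed.values()})
        choice = dict(zip(values, itertools.cycle(pool)))
        for f, p in parsed.items():
            styles[f].append(choice[p.get(name, '')])
    return {f: tuple(s) for f, s in styles.items()}
```

Each tuple could then be passed to Matplotlib as the color, linestyle, and marker keyword arguments of a plot call, so that lines sharing a parameter value share the corresponding visual property.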
If a directory of directories is passed in, the script will assume the top level directory is the measured value and the next level is the variable, i.e. value/variable/[..../]line.csv, and will put a title on the plot of "Value by variable (...)". Otherwise, the name of the top level directory will be the title, like "Value".
A directory with some lines:
$ ls
line1.csv line2.csv
$ plot_exps.py
$ ls plot-data/
plot.pdf
A directory with a few subdirectories:
$ ls test/
apples/ oranges/
$ ls test/apples/
line1.csv line2.csv
$ plot_exps.py test/
$ ls plot-data/
apples.pdf oranges.pdf
A directory with many subdirectories:
$ ls parse-data
avg-block/ avg-tard/ max-block/ max-tard/ miss-ratio/
$ ls parse-data/avg-block/tasks/Avg/Avg
scheduler=C-EDF.csv scheduler=PSN-EDF.csv
$ plot_exps.py parse-data
$ ls plot-data
avg-block_tasks_Avg_Avg.pdf avg-block_tasks_Avg_Max.pdf avg-block_tasks_Avg_Min.pdf
avg-block_tasks_Max_Avg.pdf avg-block_tasks_Max_Max.pdf avg-block_tasks_Max_Min.pdf
avg-block_tasks_Min_Avg.pdf avg-block_tasks_Min_Max.pdf avg-block_tasks_Min_Min.pdf
avg-block_tasks_Var_Avg.pdf avg-block_tasks_Var_Max.pdf avg-block_tasks_Var_Min.pdf
.......
If you run the previous example directly on the subdirectories, subdirectories will be created in the output:
$ plot_exps.py parse-data/*
$ ls plot-data/
avg-block/ max-tard/ avg-tard/ miss-ratio/ max-block/
$ ls plot-data/avg-block/
tasks_Avg_Avg.pdf tasks_Avg_Min.pdf tasks_Max_Max.pdf
tasks_Min_Avg.pdf tasks_Min_Min.pdf tasks_Var_Max.pdf
tasks_Avg_Max.pdf tasks_Max_Avg.pdf tasks_Max_Min.pdf
tasks_Min_Max.pdf tasks_Var_Avg.pdf tasks_Var_Min.pdf
However, when a single directory of directories is given, the script assumes the experiments are related and can make line styles match in different plots and more effectively parallelize the plotting.
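The output names in the examples above are consistent with a simple rule: join the path components between the input directory and the csv directory with underscores, and add a subdirectory per input only when several inputs are given. The sketch below reproduces that rule. It is inferred from the listings in this document rather than taken from plot_exps.py, and plot_path is a hypothetical name.

```python
import os

def plot_path(out_dir, csv_dir, leaf_dir, single_input=True):
    """Return the pdf path for the csvs stored in leaf_dir.

    csv_dir is the directory given on the command line and leaf_dir a
    directory at or below it that directly contains .csv files.
    """
    rel = os.path.relpath(leaf_dir, csv_dir)
    # csvs sitting directly in csv_dir produce plot.pdf
    name = 'plot' if rel == os.curdir else '_'.join(rel.split(os.sep))
    if single_input:
        return os.path.join(out_dir, name + '.pdf')
    top = os.path.basename(os.path.normpath(csv_dir))
    return os.path.join(out_dir, top, name + '.pdf')
```

For example, with csv_dir set to parse-data and leaf_dir set to parse-data/avg-block/tasks/Avg/Avg, the result is plot-data/avg-block_tasks_Avg_Avg.pdf. Passing parse-data/avg-block as one of several inputs gives plot-data/avg-block/tasks_Avg_Avg.pdf instead.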