Track benchmarks locally and interactively display the results.
A collection of chained tools is provided to create multiple entry points for different projects.
Your project can enter and exit at any point!
- clone to somewhere easily accessible:
  git clone https://github.com/LemonPi/benchtracker.git
- decide at what points your application enters and exits the benchtracker flow
- [OPTIONAL] create your first task directory and define the task with a config file
- [OPTIONAL] outline what parameters you want to track in a parse file
- run the task over time, creating a <run#> directory each time
Example paths are used instead of placeholders to minimize confusion. Replace occurrences of ~/benchtracker with your actual benchtracker directory.
A VTR user first runs the task, parses it, then populates the database with it:
bash$ cd ~/vtr/vtr_flow/scripts/
scripts$ ./run_vtr_task.pl checkin_reg
scripts$ ./parse_vtr_task.pl checkin_reg
scripts$ ~/benchtracker/populate_db.py ~/vtr/vtr_flow/tasks/checkin_reg -k arch circuit
populate_db.py arguments:
- ~/vtr/vtr_flow/tasks/checkin_reg : the required positional argument, the task directory
- -k arch circuit : the key parameters that uniquely define a VTR benchmark
VTR has its own run_task and parse_task scripts, so it enters benchtracker at the populate_db stage. checkin_reg is the name of the task to sweep benchmarks over, and it exists as a folder shown below:
vtr_flow
tasks
checkin_reg
config
config.txt
run1
run2
run3
...
A more advanced and safer usage would be to let populate_db call the parse script for the runs that are not parsed yet:
bash$ ~/vtr/vtr_flow/scripts/run_vtr_task.pl checkin_reg
bash$ ~/benchtracker/populate_db.py ~/vtr/vtr_flow/tasks/checkin_reg -k arch circuit -s "~/vtr/vtr_flow/scripts/parse_vtr_task.pl {task_dir} -run {run_number}"
The command string following the -s option (leave the {}-enclosed values literally as above):
- The first item is the script to execute, given such that the shell can find it
- The remaining items are arguments to pass to the script, custom for whichever script is run
  - {task_dir} is replaced in the call with ~/vtr/vtr_flow/tasks/checkin_reg
  - {run_number} is replaced with whichever run does not have a parsed results file
  - {task_name} is also available for substitution; checkin_reg in this case
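For instance, if run 4 were missing its parsed results file (run number 4 being hypothetical), populate_db would invoke:
~/vtr/vtr_flow/scripts/parse_vtr_task.pl ~/vtr/vtr_flow/tasks/checkin_reg -run 4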
A task list (a file with the name of a task on each line) can be used:
bash$ ~/vtr/vtr_flow/scripts/run_vtr_task.pl -l ~/vtr/vtr_flow/tasks/regression_tests/vtr_reg_strong/task_list.txt
bash$ ~/benchtracker/populate_db.py regression_tests/vtr_reg_strong/task_list.txt --root_directory ~/vtr/vtr_flow/tasks/ -k arch circuit -s "~/vtr/vtr_flow/scripts/parse_vtr_task.pl {task_dir} -run {run_number}"
Whether a task list or a task directory is given depends on whether the given path points to a file or a directory.
--root_directory expands all task names in the list with the ~/vtr/vtr_flow/tasks/ prefix. This is necessary here because the tasks listed in task_list.txt are relative paths from ~/vtr/vtr_flow/tasks/.
A consequence of defining --root_directory is that the task list name must also be relative to the root directory. This is not necessary if the tasks in the list are absolute paths.
The other commands, such as the parse script, remain the same.
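For reference, task_list.txt in this setup holds one task per line, relative to the root directory; hypothetical contents might be:
regression_tests/vtr_reg_strong/strong_timing
regression_tests/vtr_reg_strong/strong_sweep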
Now that the database is populated, the VTR user should host the database so that the online viewer can be used to visualize the data and other users can see it. Serving locally is as simple as a call to
bash$ ~/benchtracker/server_db.py
This is fine if your database is called results.db in the ~/benchtracker_data/ directory, and port 5000 is open.
For some common command line options:
bash$ ~/benchtracker/server_db.py --root_directory ./ --database testing.db --port 8080
- --root_directory tells the server where to find its databases (multiple databases can be served by the same server)
- --database gives the default database to serve if none is specified by the viewer; here it would be ./testing.db
- --port specifies the port to listen for requests on; be wary of firewall rules on your system
More options and their default values can be found by running server_db with -h.
The server is configured to serve in production by default, with its URL being the IP address of your machine and the port it listens on. It can be put into debug mode by changing the last line of server_db.py to app.run(debug=True, port=port).
The syntax for specifying the database is the same for populate_db and server_db:
bash$ ~/benchtracker/populate_db.py ~/vtr/vtr_flow/tasks/checkin_reg -k arch circuit -d ~/benchdata/testing.db
bash$ ~/benchtracker/server_db.py -d ~/benchdata/testing.db
For populate_db this is the database to put data in, while for server_db this is the default database to host if none is specified.
While server_db is running (assuming port 5000 and a valid database), the viewer can be reached at
http://localhost:5000/view
The first box lists the names of the tasks in the served database.
Clicking on tasks selects them if they were not selected and unselects them if they were.
Selecting tasks automatically populates the x and y parameter boxes with the tasks' shared parameters.
Note that the y-parameter has to be numeric.
The minimum selection for a full query is 1 task, 1 x parameter, and 1 y parameter; the output options on the dividing row then become available.
The viewer under /view allows you to select a portion of the stored data for further processing.
The current selection can be saved and shared by clicking Get query string; the query string is shared by all processing options.
The processing options include:
- Generating plots inside /view?[query_string] for visualization
- Saving plots as SVG and PNG after generating them by clicking Save plots (png and svg)
- Downloading raw data in CSV format under /csv?[query_string]
- Downloading raw data in JSON format under /data?[query_string]
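As a hypothetical example (the task and parameter names are made up; t, x, and y are the argument names documented under server_db.py below, and the exact value encoding may differ), a CSV download might look like:
http://localhost:5000/csv?t=checkin_reg&x=run&y=min_chan_width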
The plotter is designed to support basic and advanced usage. After clicking the "generate plot" (or "reset plotter") button, default plots are generated. The policy for generating default plots is as follows:
- if the x axis chosen is related to time (e.g. revision number, parse date, ...), then all the tuples of a single task (regardless of other parameter values) will be reduced into a single (x, y) series by calculating the geometric mean of the y values. So the number of resulting plots will equal the number of tasks, and the number of lines in one plot will be 1.
  - As an example, suppose the raw data before calculating the geometric mean is:

    | Tasks | run (x axis) | min channel width (y axis) | fc   | wire length |
    |-------|--------------|----------------------------|------|-------------|
    | 1     | 1            | 40                         | 0.1  | 1           |
    | 1     | 1            | 70                         | 0.1  | 2           |
    | 1     | 1            | 50                         | 0.4  | 1           |
    | 1     | 1            | 60                         | 0.25 | 4           |
    | 1     | 2            | 43                         | 0.1  | 1           |
    | 1     | 2            | 68                         | 0.1  | 2           |
    | 1     | 2            | 51                         | 0.4  | 1           |
    | 1     | 2            | 62                         | 0.25 | 4           |

    And the resulting data after the geometric mean of this default plotter is (a worked check follows this list):

    | Tasks | run (x axis) | min channel width (y axis) | fc | wire length |
    |-------|--------------|----------------------------|----|-------------|
    | 1     | 1            | 53.84                      | -- | --          |
    | 1     | 2            | 55.14                      | -- | --          |
  - A plot of this kind is generated when 2 tasks are selected and the x axis is run: one plot per task, each containing a single line.
- if the x axis chosen is not related to time (e.g. fc, circuit, ...), then the plotter will reduce the data by calculating the geometric mean of the y values over tuples with the same x and other parameters. The following tables illustrate the data reduction process:
  - Suppose the raw data is as follows:

    | Tasks | run | min channel width (y axis) | fc (x axis) | wire length (parameter) | circuit (geo mean) |
    |-------|-----|----------------------------|-------------|-------------------------|--------------------|
    | 1 | 1 | 70  | 0.1  | 1 | a.blif |
    | 1 | 1 | 110 | 0.1  | 1 | b.blif |
    | 1 | 1 | 68  | 0.1  | 2 | a.blif |
    | 1 | 1 | 107 | 0.1  | 2 | b.blif |
    | 1 | 1 | 73  | 0.1  | 4 | a.blif |
    | 1 | 1 | 98  | 0.1  | 4 | b.blif |
    | 1 | 1 | 60  | 0.25 | 1 | a.blif |
    | 1 | 1 | 73  | 0.25 | 1 | b.blif |
    | 1 | 1 | 43  | 0.25 | 2 | a.blif |
    | 1 | 1 | 88  | 0.25 | 2 | b.blif |
    | 1 | 1 | 63  | 0.25 | 4 | a.blif |
    | 1 | 1 | 86  | 0.25 | 4 | b.blif |

  - Then the data after reduction by geometric mean is:

    | Tasks | run | min channel width (y axis) | fc (x axis) | wire length (parameter) |
    |-------|-----|----------------------------|-------------|-------------------------|
    | 1 | 1 | 87.75 | 0.1  | 1 |
    | 1 | 1 | 85.30 | 0.1  | 2 |
    | 1 | 1 | 85.58 | 0.1  | 4 |
    | 1 | 1 | 66.73 | 0.25 | 1 |
    | 1 | 1 | 61.51 | 0.25 | 2 |
    | 1 | 1 | 73.61 | 0.25 | 4 |
- After the geometric mean is calculated, a legend is selected such that the resulting number of plots is minimized.
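As a check on the examples above, the reduced values are plain geometric means of the grouped y values: in the first example, geomean(40, 70, 50, 60) = (40 × 70 × 50 × 60)^(1/4) ≈ 53.84, and in the second, the fc = 0.1, wire length = 1 row is geomean(70, 110) = (70 × 110)^(1/2) ≈ 87.75.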
Task: A collection of benchmarks that are run together. It is defined by a config.txt file inside <task_dir>/config/. The structure should look like:
task_name
config
config.txt
run1
run2
run3
...
Run: One sweep through the benchmarks defined in a task.
Parameter: A value that either affects or measures the performance of a benchmark.
Key parameters: The set of parameters that in combination define a particular benchmark. They are usually input parameters. For example, architecture and circuit define a VTR benchmark, so each benchmark is one particular combination of an architecture and a circuit.
Input parameters: The set of parameters that affect the performance of a benchmark. They are usually the key parameters.
Output parameters: The set of parameters that measure the outcome of a benchmark run. For example minimum channel width and critical path delay are output parameters of a VTR benchmark.
Axis: Mainly used by the plotter. The data input to the plotter is stored in a 2D table whose columns are filled in by the values of either a parameter or a metric. This table can be transformed into a high-dimensional array, with each dimension corresponding to a parameter or a metric, so an axis is a synonym for dimension. In other words, for the plotter, an axis can be a filter, an x, or a y.
Raw: Data is called "raw" if it is not "compressed" or "reduced". Specifically, for the plotter, data is regarded as raw if it has not been reduced by "gmean" (see the definition below).
Gmean: "gmean" is the abbreviations for "geometric mean". For example, in the VTR experiment, if the metric in interest (i.e.: y axis) is minimum channel width, then the "gmean" operation over "circuits" will merge the circuit axis by calculating the geometric mean of all circuits' minimum channel width.
Line Series: In the plotter, if you choose a parameter to be a line series, a different line is produced on each plot for each distinct value of that parameter.
Each tool has usage information and additional options, found by running it with -h. It is highly recommended that those be consulted. The documentation here only presents the overall flow (pictured below); see the command line help for detailed usage of each tool.
Listed in the order that they should be run:
(TODO) run_task.py:
- preconditions:
- task-run structure
- inputs:
- task directory (relative or absolute path to it)
- actions:
- runs the user script specified in the configuration file and creates the next run directory
- creates subdirectories for all key parameters, with each level being one key parameter
- outputs:
- raw output files in the deepest subdirectories
(TODO) parse_task.py:
- preconditions:
- task-run structure
- output files under each run
- inputs:
- task directory
- a user-specified parse file, which can optionally be defined in the configuration file
- actions:
- tries to match the regexes corresponding to parameters in the parse file against the relevant output files
- outputs:
- a result file, defaulting to parsed_results.txt, in the run's root directory
populate_db.py:
- preconditions:
- task-run structure
- inputs:
- task directory
- actions:
- creates a SQLite database (defaults to results.db) if it does not exist
- creates a table for the input task if it does not exist (the table name is <task_name>|<host_name>|<user_name>|<abs_path>; see the example after this list)
- populates the table with all runs of that task, ignoring existing runs
- outputs:
- updates database
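For example, the checkin_reg task populated by a hypothetical user alice on host mymachine would be stored in a table named:
checkin_reg|mymachine|alice|/home/alice/vtr/vtr_flow/tasks/checkin_reg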
interface_db.py:
- preconditions:
- none
- inputs:
- database name
- task names
- filters
- x and y parameters
- actions:
- defines a library of Python functions for querying the database
- defines the Task_filter class
- outputs:
- none
server_db.py:
- preconditions:
- none
- inputs:
- database name
- actions:
- creates a WSGI web service to be hosted either locally or publicly (see instructions)
- renders the viewer template for a web interface into the database
- outputs:
- web API for querying the database, with query string arguments
  - arguments: t = tasks, p = parameter, m = mode, fp = filtered parameter, fm = filter method, fa = filter argument, x = x parameter, y = y parameter
  - list tasks: / , /tasks ()
  - describe parameter: /param (t,p,m)
  - retrieve data: /data (t,x,y,[fp,fm,fa]...)
- web viewer and plotter for a GUI into the database under /view; the online plotter is quite similar to the offline one in terms of functionality (see plotter-offline.py)
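For example, with the server running locally on port 5000 (as in the local serving example above), the task list can be fetched with:
bash$ curl http://localhost:5000/tasks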
plotter-offline.py:
- preconditions:
- must be in the same directory as interface_db.py
- inputs:
- data table with x and y axis specified, overlay axes, gmean axes
- actions:
- calculates the geometric mean of the y values over the gmean axes
- generates plots with the input x and y axes; the legend is given by the input overlay axes
- the user can choose whether to display the plot of the gmean data or the raw data
- the plot will ignore data points with invalid values and interpolate across them (when calculating the gmean, invalid values are also excluded)
- outputs:
- the resulting plot
The format of each line should be <parameter_name>=<parameter_value>.
Lines beginning with '#' are treated as comments.
Parameters include:
- user_script:
- script_params:
- key_param:
- key_param_dir:
- key_param_add:
- parse_file:
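A minimal hypothetical config.txt sketch (the values are made up for illustration; check the tools' -h output for the exact semantics of each parameter):
# comment lines start with '#'
user_script=run_benchmark.sh
script_params=-fast
key_param=arch
parse_file=parse_file.txt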
The format of each line should be <parameter_name>;<output_file>;<regex>;<default_value> where the regex should have one capture group, the match of which becomes the value of the parameter. The default value is optional, defaulting to "-1" if not specified. If nothing is matched, the parameter takes the default value.
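A hypothetical parse file line in this format (the parameter name, output file name, and regex are invented for illustration):
min_chan_width;vpr.out;Best routing used a channel width factor of (\d+);-1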
Each item on every line should end in a tab. This includes the last element.
The first line is the header with all the parameter names.
N lines follow, where N is the number of benchmarks for that task.
On each line is the value of each parameter for that benchmark.
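A sketch of such a results file for two benchmarks (the names and values are hypothetical; every field, including the last on each line, is followed by a tab):
arch	circuit	min_chan_width	
k6_n10.xml	a.blif	40	
k6_n10.xml	b.blif	70	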