Plotting Graphs

This file describes how to plot benchmark results. All paths in this file are given relative to the project root.

Plot Script Definition

A special plotting script, defined in the file ADBench/plot_graphs.py, is responsible for creating benchmark plots. It is a Python 3 script with the following dependencies:

  • matplotlib
  • plotly

These dependencies are fetched automatically during CMake configure.

By default, the script looks through the directory tmp and, for every subdirectory that contains results produced by the global runner, it creates graphs and stores them in the folder tmp/graphs. Two kinds of graphs are available: static PNG files and dynamic plotly HTML files. The script also saves two JSON files: graphs_index.json, which contains all test sizes, and plot_data.json, which contains all of the collected plot data (times, tool names, etc.).
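For orientation, the default scan can be pictured roughly as follows. This is a minimal sketch, not the script's actual implementation; it only illustrates how the time files produced by the global runner could be located under tmp:

import os

# Minimal sketch of the default input scan (not the actual implementation).
# The global runner stores its results under tmp/<BuildType>/<objective>/<tool>,
# and the benchmark time files end in "_times_<ToolName>.txt".
time_files = []
for root, _dirs, files in os.walk("tmp"):
    for name in files:
        if "_times_" in name and name.endswith(".txt"):
            time_files.append(os.path.join(root, name))

print(f"Found {len(time_files)} time files to plot")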

The script command line arguments are:

Argument | Definition
--- | ---
--save | Makes the script save the static PNG files with the created plots
--plotly | Makes the script save the dynamic plotly HTML files with the created plots
--show | Makes the script show the created static plots on the screen
--use-file <file> | Makes the script use the specified input file, which holds plot data, instead of the data files generated by the global runner; all outputs are saved to the directory where that file is located
-h, --help, or -? | Prints a reference for the script command line arguments

Note: if neither --save nor --plotly is specified then --show is used by default.
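For example, to generate both static and interactive graphs, or to re-plot from a previously saved plot data file, the script could be invoked as in the following sketch (run from the project root; the subprocess calls simply mirror the equivalent command lines):

import subprocess
import sys

# Produce both static PNG and interactive plotly HTML graphs.
subprocess.run([sys.executable, "ADBench/plot_graphs.py", "--save", "--plotly"],
               check=True)

# Re-plot from a previously saved plot data file instead of the global runner
# output; all outputs go next to the given file.
subprocess.run([sys.executable, "ADBench/plot_graphs.py", "--plotly",
                "--use-file", "tmp/graphs/plot_data.json"],
               check=True)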

Input Files

If the argument --use-file is not specified, the script processes information from the files generated by the global runner, which are stored in the directory tmp/<BuildType>. Here <BuildType> is the build type of the benchmark run (most often Release). This folder contains several subfolders that correspond to the different objective types (ba, gmm, etc.). Each of these subfolders contains per-tool directories holding calculated result files, correctness files, and benchmark time files. The plot script uses the correctness and time files to create graphs. Time file names have the following form:

<DataFileName>_times_<ToolName>.txt

Here <DataFileName> is the name of the input data file that was used for the run, and <ToolName> is the name of the tool that was used. Note that this part of the file name is also used for choosing the plot style in graphs.
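Since the tool name is recovered from the file name, the naming scheme above can be decomposed as in this sketch (the example file name is hypothetical):

import os

def split_time_file_name(path):
    # Split a "<DataFileName>_times_<ToolName>.txt" name into its two parts.
    # A sketch based on the naming scheme above, not the script's actual parser.
    stem = os.path.basename(path)[:-len(".txt")]
    data_file_name, _, tool_name = stem.rpartition("_times_")
    return data_file_name, tool_name

print(split_time_file_name("gmm_d2_K5_times_Tapenade.txt"))
# -> ('gmm_d2_K5', 'Tapenade')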

For more detailed information about the files generated by the global runner, see here.

By specifying the --use-file argument, the user can pass a file containing plot data to the script. In this case the script uses that file to generate plots and saves all outputs to the directory where the input file is located. A new plot_data.json file is not created in this case.

Output Files

Graphs

If the --save or --plotly option is specified, the script creates the folder tmp/graphs where it stores the resulting plots. The structure of this directory has the following form:

tmp/graphs
    /static
        /<BuildType>
            /jacobian
                PNG files
            /jacobian ÷ objective
                PNG files
            /objective
                PNG files
            /objective ÷ Manual
                PNG files
        /<BuildType>
            ...
    /plotly
        /<BuildType>
            /jacobian
                HTML files
            /jacobian ÷ objective
                HTML files
            /objective
                HTML files
            /objective ÷ Manual
                HTML files
        /<BuildType>
            ...
    graphs_index.json
    plot_data.json

Here <BuildType> is the build type of the run that the global runner has performed. If --save or --plotly was not specified, the respective subfolder is not created. Note that the file plot_data.json is not created when a plot data file is passed via --use-file.

The following table describes the graph types:

Graph type | What it shows
--- | ---
jacobian | Jacobian calculation time
jacobian ÷ objective | Ratio of the Jacobian calculation time to the objective calculation time
objective | Objective calculation time
objective ÷ Manual | Ratio of the objective calculation time of the current tool to that of Manual

Names of the files in the graph type directories have the form

<Objective> <Size> [<GraphType>] - <BuildType> Graph.<Extension>

Here <Objective> is the type of the objective in upper case (e.g. BA), <GraphType> is the type of the graph with the first letter capitalized (e.g. Jacobian), <BuildType> is the run build type, and <Extension> is png or html for the static and plotly graph versions respectively. <Size> contains information about the benchmark size of the tool run. For the Hand objective the size can be simple or complicated combined with big or small; for the GMM objective it can be 1k or 10k. This part of the file name is omitted when not applicable (the BA and LSTM objectives don't have different sizes).
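For illustration, a file name following this pattern could be assembled as in the sketch below (based only on the scheme described above, not on the script's code):

def graph_file_name(objective, size, graph_type, build_type, extension):
    # Assemble "<Objective> <Size> [<GraphType>] - <BuildType> Graph.<Extension>".
    # "size" may be empty for objectives without size variants (BA, LSTM).
    parts = [objective]
    if size:
        parts.append(size)
    parts.append(f"[{graph_type}]")
    return " ".join(parts) + f" - {build_type} Graph.{extension}"

print(graph_file_name("GMM", "1k", "Jacobian", "Release", "png"))
# -> 'GMM 1k [Jacobian] - Release Graph.png'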

Graphs Index

Information about the run benchmark sizes for all graph types is stored in the file graphs_index.json. This file has the following form:

{
    <BuildType>: {
        <GraphType>: {
            <Objective>: [ <Sizes> ],
            ...
        },
        ...
    },
    ...
}

Here <BuildType> is the run build type, <GraphType> is a graph type, <Objective> is an objective type in upper case, and <Sizes> is an array of all run benchmark sizes for that objective type.
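A small sketch of how this index could be consumed, for example to list the benchmark sizes recorded for every objective of a given build type (the build type value is an assumption about your run):

import json

with open("tmp/graphs/graphs_index.json") as f:
    index = json.load(f)

build_type = "Release"  # assumed build type of the run
for graph_type, objectives in index.get(build_type, {}).items():
    for objective, sizes in objectives.items():
        print(f"{graph_type} / {objective}: {sizes}")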

Plot Data File

The script collects all the data from the files generated by the global runner and saves it into the JSON file plot_data.json in the directory where the plots are stored. This file has the following structure:

[
    {
        "figure_info": {
            "build": <BuildType>,
            "objective": <Objective>,
            "function_type": <FunctionType>,
            "test_size": <TestSize>
        },
        "values": [
            {
                "tool": <ToolName>,
                "time": [ <RunningTimeValues> ],
                "variable_count": [ <VariableCount> ],
                "violations": [ <WasViolation> ]
            },
            ...
        ]
    },
    ...
]

Here <FunctionType> is the type of the metric, e.g. "jacobian", "objective", etc.; <TestSize> is a string that defines the test size and can be empty if the size is not applicable (e.g. for BA); <RunningTimeValues> is an array of measured time values corresponding to the respective <VariableCount> values; and <WasViolation> is an array of booleans showing whether the calculation for the respective <VariableCount> value was correct or not. <RunningTimeValues> entries can be either doubles or the special Infinity constant, which means the file is not strictly valid JSON.
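Python's standard json module accepts the Infinity constant by default, so the file can still be loaded directly. The following sketch loads plot_data.json and reports entries with infinite times or violations (it assumes a true value in "violations" marks an incorrect run):

import json
import math

with open("tmp/graphs/plot_data.json") as f:
    plot_data = json.load(f)  # json.load parses Infinity out of the box

for figure in plot_data:
    info = figure["figure_info"]
    for entry in figure["values"]:
        infinite_times = sum(1 for t in entry["time"] if math.isinf(t))
        violations = sum(1 for v in entry["violations"] if v)
        if infinite_times or violations:
            print(info["objective"], info["function_type"], entry["tool"],
                  f"infinite times: {infinite_times}, violations: {violations}")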

Plot Styles

Each tool has its own unique plot style used in all graphs. These styles are defined in the variable tool_styles in the file ADBench/plot_graphs.py. This variable has the following form:

tool_styles = {
    <ToolName>: ( <LineColor>, <Marker>, <DisplayName> ),
    ...
}

Here <ToolName> is the name of the tool as extracted from the time file (see the input files section), <LineColor> and <Marker> define the plot line color and marker style, and <DisplayName> is the name shown in the graph legend (it can be omitted, in which case <ToolName> is used as the display name). To add a new style, simply add a new element to this dictionary. Note that the new style must be unique. The display name is suggested to have the form <Language/Platform>, <ToolName>, where <Language/Platform> is the programming language or platform of the tool and <ToolName> is the tool name.
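For example, a new entry could look like the following (the tool name, color, marker, and display name here are purely illustrative):

tool_styles = {
    # ... existing entries ...
    # Hypothetical tool "MyTool" written in Python; pick a color and marker
    # that no other style uses.
    "MyTool": ("blue", "o", "Python, MyTool"),
}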

If no style is specified for a tool, a default style is used. Default styles have no display names and their markers are crosses. Note that the default styles are not stable: a default style is not tied to a particular tool, so that tool's plot can have different colors on different graphs.