Cian Wilson edited this page Jul 18, 2015 · 9 revisions


tfplot

tfplot is a plotting tool for quickly interrogating the contents of diagnostics files. It can be called as:

tfplot <diagnostics file name>

where the diagnostics file name corresponds to a statistics (.stat), steady state (.steady), detectors (.det) or convergence (.conv) file output by TerraFERMA.

Once opened, the plot can be manipulated using the keys:

  • r - refresh data (update bounds)
  • R - refresh data (maintain bounds)
  • l - switch to line plot
  • s - switch to scatter plot
  • x - toggle between linear and logarithmic x-axis
  • y - toggle between linear and logarithmic y-axis
  • q - quit

If called with multiple filenames, tfplot will attempt to interpret them as checkpointed files from the same simulation (requiring the content names to match). tfplot additionally accepts -h and -v command line arguments for help and verbose output respectively.

For more complicated plotting requirements see the buckettools python module below.

Acknowledgement: tfplot was modified from Fluidity's statplot tool.

tfplot_meshfunction

tfplot_meshfunction is a simple script to quickly plot DOLFIN meshfunctions. It is intended to be used with cellfunctions and facetfunctions output from functionals in TerraFERMA (/system/functional/output_cell_function and /system/functional/output_facet_function). It can be called with:

tfplot_meshfunction -m <mesh file name> <meshfunction file name>

where the mesh and meshfunction files are output from TerraFERMA (a mesh file is output if either a cell or facet function is requested even if an internal mesh is being used). Note that cell and facet function output is only available in serial.

tfbuild

tfbuild is a simple script to quickly [configure TerraFERMA simulations](Running TerraFERMA#tfbuild). It wraps a direct call to CMake to slightly simplify the build process. It can be called as:

tfbuild <tfml file name>

which will take the given tfml file, create a build directory and configure the build.

By default the build directory will be named build. This can be modified with the -d command line argument. Also, by default the build will be configured to name the executable the basename of the tfml file. This can be modified with the -e command line argument. Additionally -a allows required input files to be linked to the build directory.

To modify the CMake process, additional CMake arguments can be passed using -D or by setting the environment variable TF_CMAKE_EXTRA_DARGS. CMake can also be invoked in interactive mode using ccmake (if installed) using -i.

Command line help is available using -h.

tfsimulationharness

tfsimulationharness is a tool for managing suites of TerraFERMA simulations and their dependencies. It takes its own options file with the extension .shml for simulation harness markup language, an xml syntax with rules provided by the simulation harness schema. While a tfml file describes a single TerraFERMA simulation, a shml file (in combination with one or more tfml files) can be used to describe a suite of simulations and their dependencies over a range of parameters.

A shml file can be opened and edited in diamond, just like a tfml file:

diamond <shml file name>

Once open the simplest shml file should set the:

  • length - an approximate length of the simulation

    This is meant to be a guide to users but is only used by the simulation harness itself if asked to only run simulations of a certain length (see the -l command line argument below).

  • owner - indicate the simulation designer

    Again, this is intended as a guide to users who come across the shml file and is only used by the simulation harness itself if asked to filter based on owner (see the -o command line argument below).

  • description - a brief description of the simulation the shml file is designed to run

    Purely for documenting purposes, users are encouraged to enter a description of the simulation here.

  • input_file - provide the name of the input tfml file name for the simulation

    This option will need to be activated by turning on the /simulations tab and at least one /simulations/simulation in diamond. You should then enter the associated tfml file name.

    After activating a simulation (/simulations/simulation) it will also be necessary to give the simulation a name.

Having set these options the simulation harness is capable of reading the shml file and running TerraFERMA through it (obviously this is a very simple example where it would have been just as easy to run the simulation manually). The command:

tfsimulationharness --run <shml file name>

will configure, build and run the simulation. Repeated invocation of the command will rebuild only if necessary and rerun the simulation. To only configure, use --configure instead of --run. Similarly, to only build, use --build.
The simulation harness will automatically construct a directory structure to manage the build and output. In this simple case the build will be found in <tfml file name>.build/build while the simulation output will be in <tfml file name>.run/run_0.

Other options that may be useful to set in a simple shml file are:

  • required_input - provide paths to any files required by the simulation as input

    Files listed here will be automatically copied into the run directory so that they are available for the simulation.

  • required_output - provide the names of any files that this simulation must output

    This list is used to judge whether a simulation has successfully completed or not. By default a simulation will not rerun if all its required_output is found to be present (and its input hasn't changed). This behaviour can be changed by varying the run_when option from the default, input_changed_or_output_missing, to the other options, input_changed, output_missing, always, never.
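The run_when options can be read as a small decision table. The sketch below is only an interpretation of the option names listed above, not the harness's own code; the function should_run and its two boolean arguments are hypothetical names introduced for illustration.

```python
def should_run(run_when, input_changed, output_missing):
    """Return True if a simulation should be (re)run under the given policy."""
    policies = {
        "input_changed_or_output_missing": input_changed or output_missing,
        "input_changed": input_changed,
        "output_missing": output_missing,
        "always": True,
        "never": False,
    }
    if run_when not in policies:
        raise ValueError("unknown run_when option: %r" % run_when)
    return policies[run_when]


# With the default policy an up-to-date simulation is skipped...
print(should_run("input_changed_or_output_missing", False, False))  # False
# ...but it reruns as soon as a required output file is missing.
print(should_run("input_changed_or_output_missing", False, True))   # True
```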

Runs

The simulation tab(s) in the shml file always describe TerraFERMA simulations; however, the simulation harness is also capable of running other programs. This functionality is primarily intended to describe dependencies of simulations, like mesh generators or other preprocessors. These can be described under a run tab. If, as is usual, the run describes a dependency of a simulation, then it should be activated by turning on the /simulations/simulation/dependencies tab to reveal a run. Note that simulation dependencies can also be other simulations, hence simulation tabs are also available. Each run will need a name and the following suboptions should be completed or activated:

  • input_file - provide the name of the input to whatever program is being called

    This file can be any generic configuration file. If it happens to be a schema-driven SPuD file then you can indicate this with the suboption spud_file.

  • required_output - provide the filename(s) of the output of this run that is needed by its parent simulation (if any)

    If this run has been activated as a dependency of a simulation then any files listed here will be copied to the run directory for that simulation.

  • commands - provide a list of shell commands that will be run

    Commands listed here will be run and should produce the output listed above in required_output.

Parameter Sweeps

The real power of the simulation harness comes when running a simulation over some parameter space. This functionality can be turned on by activating the /simulations/simulation/parameter_sweep option in the highest level simulation. This allows you to specify any number of parameters to explore. After activating a parameter tab and giving it a name you can specify suboptions describing:

  • values - a comma, semicolon or space separated list of parameter values

    Enter the values of the parameter that are being considered in this sweep.

  • update - provide python code to update the input file with the parameter values

    If this simulation depends on this parameter then activate this option to provide python code to update the input_file. Generally if this simulation doesn't depend on this parameter then one of its dependencies does so there's no need to activate this option here - instead activate it beneath the dependency.

    When dealing with a simulation the input_file will be loaded by the simulation harness using SPuD's libspud python interface. It can then be edited by providing update code like:

    import libspud
    libspud.set_option(<path to relevant option in tfml file>, <type>(<parameter name>))
    

    Where the SPuD path to the option in the tfml file needs to be provided. The parameter value is supplied as a string (to avoid floating point inaccuracies and so that the values can be used consistently in the output directory structure) so needs to be converted to the correct type (e.g. float, int etc.).

    A suboption of the update tab is a switch for single_build, which indicates if this modification of the input file is a runtime option or (if left deactivated) a compile time option. If you are uncertain about this it is best to leave this option unchecked as it will only cost an extra recompilation. A general rule of thumb is that modifying UFL, forms, functionals, C++ code or elements will require a recompilation. Editing constant values, resolution or python code will not require recompilation and hence can have the single_build option turned on.

Dependencies can also access the parameter sweep. Beneath any dependency simply turn on the parameter_sweep option. Activate (and give the same names to) as many parameters from the dependency's direct parent as the dependency (or its children) require. Dependencies cannot modify the list of parameter values. This can only be done at the highest level of the tree. Instead dependencies inherit the values from their parents and can only modify their own input files based on the values provided to them using one option:

  • update - provide python code to update the input file with the inherited parameter values

    If the dependency is itself a simulation then the input_file is again loaded using the libspud python interface. The same applies to run dependencies where the input_file was listed as a SPuD file (input_file/spud_file). However, if the input_file for a run dependency is not a SPuD file then it is loaded as a simple string and made available through the python variable input_file. This can be edited most easily using the Template class from python's string module. For example:

    from string import Template as template
    input_file = template(input_file).safe_substitute({"<parameter name>":<parameter name>})
    

    where we have assumed that the initial input_file had suitable tags left for such substitution. Obviously this is just one way of updating these files and the simulation harness accepts any valid python in these code snippets so there is lots of scope for alternative methods.
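As a concrete illustration of the template approach for a non-SPuD input file, the sketch below performs the substitution on a made-up file. The file contents, the parameter name resolution and its value are all hypothetical; in the harness, input_file and the parameter would already be defined before the update code runs.

```python
from string import Template

# Hypothetical contents of a non-SPuD input file with a ${resolution} tag
# left in place for substitution (and one tag that is not a sweep parameter).
input_file = "cell_size = ${resolution};\nlabel = ${untouched};\n"

# Parameter values arrive as strings; this assignment stands in for that.
resolution = "0.05"

# safe_substitute replaces known tags and leaves unknown ones untouched
# instead of raising a KeyError.
input_file = Template(input_file).safe_substitute({"resolution": resolution})
print(input_file)
```

Using safe_substitute rather than substitute means that tags belonging to other parameters (or unrelated text using the same syntax) survive the update unchanged.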

Variables and Testing

The simulation harness can also be used to interrogate output from the simulations and runs the shml file describes. This is done by activating the variables tab and activating as many variable elements as desired. Each variable should be named and have python code describing how some output should be interrogated. The code must set a variable with the same name as given to the variable element it belongs to.

For example, to assign the final time of a simulation to the variable 'finish_time' using the buckettools python module (see below) use:

from buckettools.statfile import parser
stat = parser("<output base name>.stat")
finish_time = stat["ElapsedTime"]["value"][-1]

Variables assigned in this way are stored, collated and made available in the testing section of the shml file. This can be found by activating the /tests element and activating as many /tests/test as you want (as usual each test requires a name).
All variables (referenced by the names given above) are available here as:

  • a single value - if only a single simulation was run

  • a list of values - if a simulation was run multiple times (using e.g. /simulations/simulation/parameter_sweep/number_runs)

  • a NestedList of values - if a simulation was run over a parameter sweep

    A NestedList is a derived python list object that allows users to access the list of values both by index (as in a normal python list) and by dictionary of parameter name(s) and value(s).

    For example, given three parameters, a, b and c, each with three input values p0, p1 and p2 (where p = a, b or c respectively):

    /simulations/simulation/parameter_sweep/parameter::a/values : a0, a1, a2

    /simulations/simulation/parameter_sweep/parameter::b/values : b0, b1, b2

    /simulations/simulation/parameter_sweep/parameter::c/values : c0, c1, c2

    then we have a 27 simulation parameter sweep, which means that any variable we assign will have 27 possible values, depending on the parameter combination it used. These will be stored in a NestedList with the same name as the variable in question.

    For example, given a variable called var that gets assigned concatenated strings of the input parameter values:

    var[0][0][0]
    

    would return:

    'a0b0c0'
    

    However, we can also access this element using a dictionary of parameter names and values such that:

    var[{'a':'a0', 'b':'b0', 'c':'c0'}]
    

    also returns:

    'a0b0c0'
    

    The second syntax can be more useful as it doesn't require referencing back to determine the order in which the parameters were given. Indexing order does not matter as, for example, var[{'c':'c0', 'b':'b0', 'a':'a0'}] would return the same answer.

    NestedList indexing by dictionary can also be used to access multiple values:

    var[{'a':'a0', 'b':'b0', 'c':['c0', 'c1', 'c2']}]
    

    which, in this case, is the equivalent of:

    var[{'a':'a0', 'b':'b0'}]
    

    and returns a list:

    ['a0b0c0', 'a0b0c1', 'a0b0c2']
    

    Similarly:

    var[{'a':['a0', 'a1'], 'b':['b0', 'b2'], 'c':['c0', 'c1', 'c2']}]
    

    returns:

    [[['a0b0c0', 'a0b0c1', 'a0b0c2'], ['a0b2c0', 'a0b2c1', 'a0b2c2']],
     [['a1b0c0', 'a1b0c1', 'a1b0c2'], ['a1b2c0', 'a1b2c1', 'a1b2c2']]]
    

    Parameters unknown to the NestedList will be ignored so var[{'d':'d1'}] will return a standard list of all 27 possible values:

    [[['a0b0c0', 'a0b0c1', 'a0b0c2'],
      ['a0b1c0', 'a0b1c1', 'a0b1c2'],
      ['a0b2c0', 'a0b2c1', 'a0b2c2']],
     [['a1b0c0', 'a1b0c1', 'a1b0c2'],
      ['a1b1c0', 'a1b1c1', 'a1b1c2'],
      ['a1b2c0', 'a1b2c1', 'a1b2c2']],
     [['a2b0c0', 'a2b0c1', 'a2b0c2'],
      ['a2b1c0', 'a2b1c1', 'a2b1c2'],
      ['a2b2c0', 'a2b2c1', 'a2b2c2']]]
    

    The parameters dictionary being used to index into any particular NestedList is available as:

    var.parameters
    

    which, in this case, returns:

    {'a': ['a0', 'a1', 'a2'], 'b': ['b0', 'b1', 'b2'], 'c': ['c0', 'c1', 'c2']}
    

    Again, this is useful to iterate over values (or sub-sets of values) within a test without manually referring back to the parameters set.

  • a NestedList of lists of values - if a simulation was run multiple times (using e.g. /simulations/simulation/parameter_sweep/number_runs) over a parameter sweep

    The NestedList in this case is identical to that described above but at the lowest level a standard python list is returned containing the values of the variable for each run.
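The dictionary indexing described above can be mimicked with a small list subclass. This is a toy stand-in written for illustration only, not the buckettools NestedList implementation; the class name ToyNestedList, its constructor arguments and the names attribute are assumptions.

```python
class ToyNestedList(list):
    """Illustrative stand-in for NestedList (not the buckettools class)."""

    def __init__(self, values, names, parameters):
        super().__init__(values)
        self.names = names            # parameter names in nesting order
        self.parameters = parameters  # parameter name -> list of values

    def __getitem__(self, key):
        # Integer and slice indexing behave as for a normal list.
        if not isinstance(key, dict):
            return super().__getitem__(key)
        return self._select(self, 0, key)

    def _select(self, level, depth, key):
        if depth == len(self.names):
            return level
        name = self.names[depth]
        allowed = self.parameters[name]
        # Parameters missing from the key select every value at this level;
        # keys that are not known parameter names are never looked up.
        wanted = key.get(name, allowed)
        if isinstance(wanted, str):
            return self._select(level[allowed.index(wanted)], depth + 1, key)
        return [self._select(level[allowed.index(w)], depth + 1, key)
                for w in wanted]


names = ["a", "b", "c"]
parameters = {n: [n + str(i) for i in range(3)] for n in names}
values = [[[a + b + c for c in parameters["c"]]
           for b in parameters["b"]]
          for a in parameters["a"]]
var = ToyNestedList(values, names, parameters)

print(var[0][0][0])                             # a0b0c0
print(var[{"c": "c0", "b": "b0", "a": "a0"}])   # a0b0c0
print(var[{"a": "a0", "b": "b0"}])              # ['a0b0c0', 'a0b0c1', 'a0b0c2']
```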

Tests can be used to compare multiple runs; for example to test for convergence or generate post-processing output. If assert statements are included then any test returning an AssertionError will be flagged as failing and reported to screen. This functionality is widely used in the testing and benchmarking suites.
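A convergence test might look something like the sketch below. The names cell_sizes and error are hypothetical stand-ins for a swept parameter and a collated variable, and the error values are synthetic (generated to scale exactly as the square of the cell size) rather than output from any simulation. In a real test the parameter values would arrive as strings and need converting with float first.

```python
from math import log

# Hypothetical stand-ins for what the harness would provide to a test.
cell_sizes = [0.1, 0.05, 0.025]
error = [3.0 * h**2 for h in cell_sizes]  # synthetic second-order errors

# Observed order of convergence between successive refinements.
orders = [
    log(error[i] / error[i + 1]) / log(cell_sizes[i] / cell_sizes[i + 1])
    for i in range(len(error) - 1)
]

# An AssertionError raised here is what flags the test as failing.
assert all(order > 1.9 for order in orders), orders
```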

To make the simulation harness evaluate variables and run the tests it is necessary to invoke it as:

tfsimulationharness --test <shml file name>

This will configure, build (when necessary) and run the simulations described in the shml file before testing the output. If the output is already available and you don't want the simulations to rerun then use --just-test instead of --test.

Suites

So far we have discussed setting up a single shml file; however, the simulation harness is capable of searching for, building, running and testing multiple simulations (in a single or multiple shml files) simultaneously. For example:

tfsimulationharness --test -r -- '*.shml'

will recursively find all shml files in the current or lower directories and run and test them. The shml files that are run can be filtered based on length, -l (short, medium, long or special), owner, -o, parallelism, -p (serial, parallel, any), or tag, -t (to include tags) or -e (to exclude tags).

Multiple simulations can be run side by side by using the -n command line argument. Note that this does not limit the number of processes used as any parallel simulations found will still be run (and only count towards a single simulation count).

A full description of all the command line options is available using the -h option:

usage: tfsimulationharness [-h] [-n NTHREADS] [-r [depth]]
                           [-l length [length ...]] [-p parallelism]
                           [-o ownerid [ownerid ...]] [-t tag [tag ...]]
                           [-e tag [tag ...]] [--generate] [--configure]
                           [--build] [--run] [--test] [--just-test]
                           [--just-list] [--list-input] [--clean] [-f]
                           filename [filename ...]

Run simulations and manipulate the output data.

positional arguments:
  filename              specify filename(s)

optional arguments:
  -h, --help            show this help message and exit
  -n NTHREADS, --nthreads NTHREADS
                        number of threads
  -r [depth], --recursive [depth]
                        recursively search the directory tree for files (if no
                        depth is specified full recursion will be used)
  -l length [length ...], --length length [length ...]
                        length(s) of problem (if unspecified will run all
                        lengths)
  -p parallelism, --parallelism parallelism
                        parallelism of problem: options are serial, parallel
                        or any (default=any)
  -o ownerid [ownerid ...], --owner ownerid [ownerid ...]
                        run only tests that have specific owner ids (if
                        unspecified will include all owners)
  -t tag [tag ...], --tags tag [tag ...]
                        run only tests that have specific tags (if unspecified
                        will run all tags)
  -e tag [tag ...], --exclude tag [tag ...]
                        run only tests that do not have specific tags (takes
                        precedence over -t, if unspecified will not exclude
                        any tags)
  --generate            generate the simulation directories and input
  --configure           configure the simulations
  --build               build the simulations
  --run                 run the simulations
  --test                test the simulations
  --just-test           only test the current output of the simulations (do
                        not rerun)
  --just-list           only list the simulations
  --list-input          list the input to the simulations
  --clean               removes the run (and build) directories from previous
                        simulation runs
  -f, --force           force rebuild(s)

For examples of completed shml files see the tests and benchmarks. There are also worked examples of using shml files in the cookbook with corresponding worked examples in the tutorials directory of the source.

updatetfml

updatetfml is a simple script for updating tfml files following changes to the TerraFERMA schema. It is invoked on a single file using:

updatetfml <tfml file name>

If changes are found this will replace the given tfml file, first making a backup at <tfml file name>.bak.

Generally running updatetfml will be unnecessary. Small changes to the options tree, like adding a new option or removing a deprecated one, will be handled when a file is opened in diamond, with a warning to the user and any elements that require input being highlighted in blue. updatetfml only becomes necessary when some tree restructuring has happened, such as an option moving to a new location or being renamed. In these cases we add rules to updatetfml to handle the change.

updatetfml can also be invoked on a suite of tfml files using:

updatetfml -r -- '*.tfml'

which will find all tfml files in the current and lower directories and update them.

Note that updatetfml requires the python lxml package.

updateshml

updateshml is similar to updatetfml except that it handles changes to the TerraFERMA simulation harness schema. It is invoked using:

updateshml <shml file name>

for a single shml file, or:

updateshml -r -- '*.shml'

to recursively search for and update multiple shml files.

buckettools python module

The buckettools python module contains various python tools. These include preprocessing classes that generate UFL and C++ code that is later compiled and linked to TerraFERMA executables. Here, however, we focus on a few tools used for postprocessing and interacting with TerraFERMA output.

statfile parser

The statfile parser provides a way of accessing data in diagnostic files from TerraFERMA in a python shell.
It is compatible with statistics (.stat), steady state (.steady), detectors (.det) and convergence (.conv) files. Loading any of these files using the parser will return a nested tree of python dictionaries, at the base of which lie the values for the particular diagnostic.

For example, a statistics file may be loaded using:

from buckettools.statfile import parser
stat = parser("<output base name>.stat")

The stat object returned by this command is a dictionary so, for example:

stat.keys()

will return a list of available keys. These will include special values for the elapsed time (ElapsedTime), the wall time (ElapsedWallTime), the timestep size (dt) and the timestep number (timestep). An array containing the values of any of these special fields can be accessed using:

stat["<statistic name>"]["value"]

If any fields, coefficients or functionals were included in the statistics file their statistics will also be available, indexed by system name. So, for example:

stat["<system name>"]["<field or coefficient name>"]["max"]

will return the maximum value of a given field or coefficient in a given system. Similarly:

stat["<system name>"]["<functional name>"]["functional_value"]

will return the value of a given functional in a given system.
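Because the parsed file is a tree of nested dictionaries, a short recursive helper can list every available statistic. The sketch below uses a hand-built dictionary as a stand-in for a parsed file so that it runs on its own; the system name Stokes, the field name Pressure, all the values and the helper list_paths are hypothetical.

```python
def list_paths(tree, prefix=()):
    """Yield the key path to every non-dictionary leaf in a nested dictionary."""
    for key, value in tree.items():
        if isinstance(value, dict):
            yield from list_paths(value, prefix + (key,))
        else:
            yield prefix + (key,)


# Hypothetical stand-in for a parsed statistics file.
stat = {
    "ElapsedTime": {"value": [0.0, 0.5, 1.0]},
    "dt": {"value": [0.5, 0.5, 0.5]},
    "Stokes": {"Pressure": {"max": [1.0, 1.2, 1.1], "min": [0.0, -0.1, -0.2]}},
}

for path in list_paths(stat):
    print("/".join(path))
```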

Statistics that are constant for the simulation are accessible as a separate dictionary using:

stat.constants

which returns information like the compile time, the simulation start time, the TerraFERMA version number and the host name.

All diagnostic files can be interrogated in this way with a similar hierarchy of nested dictionaries representing data from different systems, fields, coefficients and functionals. Once the data array is reached, vector fields and coefficients are indexed by dimension first and output line second. Similarly, tensor fields and coefficients are indexed by dimension first (row priority ordering) and output line second. Arrays of detectors are grouped together (so higher rank fields and coefficients are split up over dimension), again with array entry indexed first then output line second.
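The dimension-first ordering for vector quantities can be illustrated with a mock entry. Nested lists are used here only to keep the sketch self-contained, and the system name Stokes, field name Velocity and all values are hypothetical.

```python
# Hypothetical "max" statistic of a 2D vector field over three output lines:
# the outer index is the dimension, the inner index is the output line.
stat = {"Stokes": {"Velocity": {"max": [[0.1, 0.2, 0.3],      # x-component
                                        [1.0, 2.0, 3.0]]}}}   # y-component

velocity_max = stat["Stokes"]["Velocity"]["max"]

# All output lines for the x-component.
x_history = velocity_max[0]
# The last output line for every component.
final_values = [component[-1] for component in velocity_max]

print(x_history)     # [0.1, 0.2, 0.3]
print(final_values)  # [0.3, 3.0]
```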

In combination with, for example, python's matplotlib module the statfile parser can be used to produce better plots than are available with tfplot.

The statfile parser is used in most tests and benchmarks.

Acknowledgement: the statfile parser is modified from Fluidity.

vtktools

vtktools is a small python module that wraps several useful python vtk commands to more easily perform certain common operations. It reads in .vtu and .pvtu files (not the .pvd wrappers):

from buckettools import vtktools
vtu = vtktools.vtu("<output base name><dump number>.[p]vtu")

where the returned vtu object is a python class that contains several useful commands.

For example:

  • vtu.GetFieldNames()

    will return the names of all fields and coefficients in the vtu file

  • vtu.GetField("<system name>::<field or coefficient name>")

    will return an array containing the values of the given field or coefficient at the vtu grid points

  • vtu.GetLocations()

    will return the locations of the vtu grid points

  • ...

vtktools is used in several tests and benchmarks.

Acknowledgement: vtktools was copied from Fluidity.
