Technical Documentation

Patrick Lauer edited this page Mar 6, 2020 · 3 revisions

The PROPTI Scripts

The PROPTI scripts are written in Python 3.x. Emphasis has been put on compatibility with different operating systems (macOS, Linux and Windows) by using the os package. At the most basic level, the scripts define data structures and elementary functions that are used throughout the package.

The scripts are open source, and contributions that maintain the code or add further functionality are very welcome. If you work on the scripts, keep in mind that the PROPTI framework follows some code style conventions. In general, PEP 8, the style guide for Python code, is to be followed.

PROPTI

The PROPTI scripts are the most basic scripts used within the PROPTI framework. These are:

  • data_structures.py
  • basic_functions.py
  • spotpy_wrapper.py
  • propti_analyse.py

Table of Contents

[[TOC]]

Data Structures

OptimiserProperties

class propti.OptimiserProperties

propti.OptimiserProperties is used to store information that needs to be provided to the optimiser itself. This would be information like what kind of optimisation algorithm to use, how many repetitions to perform or how to name the file the results are saved into.

The constructor of the OptimiserProperties class accepts the following input:

propti.OptimiserProperties(algorithm: str = 'sceua', 
                           repetitions: int = 1,
                           backup_every: int = 100,
                           ngs: int = None, 
                           db_name: str = 'propti_db',
                           db_type: str = 'csv',
                           db_precision=np.float64,
                           num_subprocesses: int = 1,
                           mpi: bool = False)
  • algorithm Defines the optimisation algorithm to be used. Default is set to SCEUA from SPOTPY.

  • repetitions Number of sampling repetitions to be performed by the algorithm. Note: The actual number of repetitions may be slightly different than what is entered here, depending on the algorithm.

    For now limited to the usage of SCEUA.

  • backup_every For algorithms that support this functionality, this value defines how often break points are written. These break points are used to restart the simulations in the event of a crash, or when the computing time token at a computing cluster has expired and the process is stopped.

  • ngs Number of complexes that are to be used by the optimisation algorithm, if necessary.

    For now limited to the usage of SCEUA.

  • db_name File name of the data base file that will be created during the optimisation process.

  • db_type File type of the data base. For now only comma separated value files (csv) are implemented.

  • db_precision Defines the precision of the output data, stored in the data base. NumPy floats are used, with a default precision of 64 bit.

  • num_subprocesses Defines the number of sub-processes that are to be used for each repetition during the inverse modelling process. This is used when a parameter set is to be tested in multiple different simulation setups simultaneously. For example, two different irradiance levels in the Cone Calorimeter would lead to num_subprocesses=2.

  • mpi This parameter is provided to SPOTPY to leverage the power of multiple computing cores via MPI, and thus speed up the execution of the optimisation algorithm. When set to False, sequential execution is performed on one core alone.

    For now limited to the usage of SCEUA.


Public Methods - OptimiserProperties

The OptimiserProperties class has the following public methods:

  • upgrade This method upgrades legacy object instances with missing default values. Used to ensure compatibility of legacy inverse modelling projects with newer PROPTI versions.

    Use with care!

  • __str__ When called, it provides a human-readable output of the OptimiserProperties, based on predefined definitions coded into this method (thus not exposed to the user).
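
The idea behind upgrade can be illustrated with a minimal sketch. It is not the PROPTI implementation: the class LegacyFriendly and its _defaults table are hypothetical, and the defaults are simply copied from the constructor signature above (db_precision is left out to avoid a NumPy dependency). It assumes that a legacy instance, for example one restored from an old pickle file, simply lacks some attributes.

```python
class LegacyFriendly:
    """Hypothetical stand-in, not the PROPTI class."""

    # Defaults copied from the documented constructor signature.
    _defaults = {
        "algorithm": "sceua",
        "repetitions": 1,
        "backup_every": 100,
        "ngs": None,
        "db_name": "propti_db",
        "db_type": "csv",
        "num_subprocesses": 1,
        "mpi": False,
    }

    def upgrade(self):
        """Add every attribute that is missing; return the names added."""
        added = []
        for key, default in self._defaults.items():
            if not hasattr(self, key):
                setattr(self, key, default)
                added.append(key)
        return added


# Simulate a legacy object that only knows two attributes.
old = LegacyFriendly()
old.algorithm = "sceua"
old.repetitions = 5000
added = old.upgrade()
```

Existing values are never overwritten in this sketch; only missing attributes are filled in. This is also why such a method should be used with care: a silently added default may not match what the legacy project actually used.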


Parameter

class propti.Parameter

propti.Parameter is used to store general parameter information. This class is used for parameters that are worked on by the optimisation algorithm, as well as general information (meta data) that describes environmental conditions.

The constructor of the Parameter class accepts the following input:

propti.Parameter(name: str, 
                 units: str = None, 
                 place_holder: str = None, 
                 value: Union[float, int, str] = None, 
                 distribution: str = 'uniform', 
                 min_value: float = None, 
                 max_value: float = None, 
                 max_increment: float = None)
  • name Character string that describes the parameter. Used for internal reference that is human readable.

  • units Character string that describes the measurement units.

  • place_holder Character string that is used in the template to mark the location where the parameter value is to be written. If no value is provided, name is used.

  • value Holds the current parameter value. Can be initialised with a specific value for optimisation algorithms that need a guess vector to start from. Type checking allows multiple different types, since the parameter could be used to provide numerical values for computations, or strings, e.g. for the naming of files.

  • distribution Specifies a distribution by which the algorithm shall sample individual parameters during the inverse modelling process (IMP), if needed by the algorithm. Possible values: 'uniform'.

  • min_value Lower limit of the range in which the algorithm is allowed to sample parameter values.

  • max_value Upper limit of the range in which the algorithm is allowed to sample parameter values.

  • max_increment Increment by which the algorithm is allowed to change the parameter value between simulations. This option is needed for some algorithms.


Public Methods - Parameter

The Parameter class has the following public methods:

  • create_spotpy_parameter No functionality right now - WIP.

  • upgrade This method upgrades legacy object instances with missing default values. Used to ensure compatibility of legacy inverse modelling projects with newer PROPTI versions.

    Use with care!

  • __str__ When called, it provides a human-readable output of the parameter, based on predefined definitions coded into this method (thus not exposed to the user).


ParameterSet

class propti.ParameterSet

propti.ParameterSet is a container for the parameters used by the optimisation algorithm. The meta data also needs to be collected, in a separate ParameterSet.

The constructor of the ParameterSet class accepts the following input:

propti.ParameterSet(name: str = None, 
                    params: list[Parameter] = None)
  • name Optional label for the parameter set. Encouraged to be used for human-readable internal referencing and easier error tracking.

  • params Initial list of the parameters (deep copy).


Public Methods - ParameterSet

The ParameterSet class has the following public methods:

  • upgrade This method upgrades legacy object instances with missing default values. Used to ensure compatibility of legacy inverse modelling projects with newer PROPTI versions.

    CAREFUL! Since lists, like params, will be initialised as [], this may have consequences that go unnoticed.

  • update Updates the already existing parameter set with a new one (other).

  • __len__ Returns the length (number of parameters) of the parameter set.

  • append Appends a new Parameter to the ParameterSet (as a deep copy).

  • __getitem__ Returns the Parameter at a given index of the ParameterSet.

  • __str__ When called, it provides a human-readable output of the ParameterSet, based on predefined definitions coded into this method (thus not exposed to the user).


DataSource

class propti.DataSource

The class propti.DataSource is used as a container for experimental and simulation data. It stores the meta data that identifies the desired data, i.e. the file name and the labels needed to read a data series, as well as the data itself. The pandas library works in the back end to extract the information, so the labels follow the pandas conventions.

The constructor of the DataSource class accepts the following input:

propti.DataSource(file_name: str = None,
                  header_line: int = None,
                  label_x: str = None,
                  label_y: str = None,
                  column_x: int = None,
                  column_y: int = None,
                  x_values: list = None,
                  y_values: list = None,
                  factor: float = None,
                  offset: float = None)
  • file_name Name of the file which contains the desired data, simulation or experiment.

  • header_line Row that contains the column labels. Follows the conventions of pandas data frames (which are working in the back end).

  • label_x Label of the column which contains the information of the x-axis (pandas data frames).

  • label_y Label of the column which contains the information of the y-axis (pandas data frames).

  • column_x Index of the column containing the data series of the x_values (not functional, yet).

  • column_y Index of the column containing the data series of the y_values (not functional, yet).

  • x_values Data of the x-axis (based on label_x or column_x).

  • y_values Data of the y-axis (based on label_y or column_y).

  • factor Factor to scale the data on-the-fly.

  • offset Offset to shift the data on-the-fly.
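
How header_line, label_x and label_y work together can be sketched as follows. PROPTI uses pandas for this step; the sketch uses the csv module from the standard library as a stand-in, so that it runs without further dependencies. The function read_columns and the sample data are made up for this illustration. It assumes that header_line is the zero-based index of the row holding the column labels, as in pandas.

```python
import csv
import io


def read_columns(text, header_line, label_x, label_y):
    """Return the two labelled columns as lists of floats."""
    rows = list(csv.reader(io.StringIO(text)))
    labels = [c.strip() for c in rows[header_line]]
    ix, iy = labels.index(label_x), labels.index(label_y)
    data = rows[header_line + 1:]
    return ([float(r[ix]) for r in data], [float(r[iy]) for r in data])


# Made-up data: the first row holds units, the second row the labels.
text = "s,kW\nTime,HRR\n0.0,0.0\n1.0,2.5\n2.0,5.0\n"
x_values, y_values = read_columns(text, header_line=1,
                                  label_x="Time", label_y="HRR")
```

Files with a units row above the label row need header_line=1, files that start directly with the labels need header_line=0.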


Public Methods - DataSource

The DataSource class has the following public methods:

  • __str__ When called, it provides a human-readable output of the DataSource, based on predefined definitions coded into this method (thus not exposed to the user).

Relation

class propti.Relation

propti.Relation utilises the propti.DataSource to connect specific experimental data with simulation results. Later, this created relationship is referred to when the fitness of a parameter set is to be evaluated. Multiple relations can be assigned, for example to account for different repetitions under the same experimental conditions.

The constructor of the Relation class accepts the following input:

propti.Relation(model: DataSource = None,
                experiment: DataSource = None,
                fitness_method: FitnessMethodInterface=None,
                weight: float=1.0)
  • model Data series that was produced by the model (simulation), which is to be compared to the experimental data. Will be initialised as empty propti.DataSource, if no input is provided.

  • experiment Data series of the experimental (target) data, which is to be compared to the model data. Will be initialised as empty propti.DataSource, if no input is provided.

  • fitness_method Fitness method used to compare experimental and model data. Available Methods:

    • FitnessMethodRMSE(n_points = None, x_def_range = None, scale_fitness = True)

    • FitnessMethodThreshold(threshold_type, threshold_value = None, threshold_range = None, scale_fitness = True)

  • weight Weight factor for the total fitness calculation.


Public Methods - Relation

The Relation class has the following public methods:

  • read_data Reads the specified experimental or model data and stores it in the Relation object.

  • map_to_def Maps the data series to the definition range. Takes factor and offset, defined in DataSource, into account. Furthermore, it performs a linear interpolation for the mapping.

  • __str__ When called, it provides a human-readable output of the Relation, based on predefined definitions coded into this method (thus not exposed to the user).
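
The mapping performed by map_to_def can be sketched in plain Python. This is an illustration, not the PROPTI code: the function names are hypothetical, and it is an assumption that the data is scaled by factor first and shifted by offset afterwards. Points outside the data range are clamped to the first or last value, which is also an assumption.

```python
def interpolate_linear(x_def, x, y):
    """Linearly interpolate the series (x, y) onto the points x_def.

    x must be strictly ascending; points outside the data range are
    clamped to the first or last y value.
    """
    result = []
    for xd in x_def:
        if xd <= x[0]:
            result.append(y[0])
        elif xd >= x[-1]:
            result.append(y[-1])
        else:
            # Find the interval [x[i], x[i + 1]] that contains xd.
            i = max(k for k in range(len(x) - 1) if x[k] <= xd)
            t = (xd - x[i]) / (x[i + 1] - x[i])
            result.append(y[i] + t * (y[i + 1] - y[i]))
    return result


def map_series(x_def, x, y, factor=1.0, offset=0.0):
    """Scale and shift y, then map it onto the definition range x_def."""
    # Assumption: scale first, then shift.
    scaled = [factor * v + offset for v in y]
    return interpolate_linear(x_def, x, scaled)


mapped = map_series([0.0, 0.5, 1.5, 2.0],
                    [0.0, 1.0, 2.0], [0.0, 10.0, 20.0],
                    factor=2.0, offset=1.0)
```

Mapping model and experiment onto the same definition range is what makes a point-wise comparison possible, even if both series were recorded at different x values.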


SimulationSetup

class propti.SimulationSetup

A propti.SimulationSetup is a specific set of information that completely describes an intended simulation within an inverse modelling run. It draws upon the classes that have been described above and adds further meta data. In general, it merges information on where the simulation is to be executed (working directory), what simulation software, template and data shall be used, as well as where to store the results.

One could regard it as an experimental setup, with information on a sample and the conditions to be tested in. The same material, tested in the same apparatus, but at different conditions would require different SimulationSetups.

The constructor of the SimulationSetup class accepts the following input:

propti.SimulationSetup(name: str,
                       work_dir: os.path = os.path.join('.'),
                       model_template: os.path = None,
                       model_input_file: os.path = 'model_input.file',
                       model_parameter: ParameterSet = None,
                       model_executable: os.path = None,
                       execution_dir: os.path = None,
                       execution_dir_prefix: os.path = None,
                       best_dir: os.path = 'best_para',
                       analyser_input_file: os.path = 'input_analyser.py',
                       relations: List[Relation] = None)
  • name Human-readable identifier of the SimulationSetup.

  • work_dir The working directory, where all the data connected to this SimulationSetup is stored.

  • model_template Points to the simulation input file template.

  • model_input_file Name of the simulation input file. It will be created from model_template and the model_parameter.

  • model_parameter The ParameterSet that contains the parameters that are to be worked on by the optimisation algorithm. Note: parameters that describe the environment (experimental conditions), like heat flux, are provided elsewhere.

  • model_executable The simulation software to be used to perform the simulation. This argument will be provided to the command line down the line.

  • execution_dir Location where the model execution (simulation) is performed. In this directory PROPTI creates temporary directories where the model execution is performed. Each of these directories gets a randomised name, to avoid conflicts during the parallel execution of multiple simulations. When the simulation is finished, the information specified in the Relation will be extracted and the remaining files deleted automatically. The extracted information is stored in the data base file.

  • execution_dir_prefix Identifier for the working sub-directories, prepended to the random part of the name. Allows easier error tracking.

  • best_dir Directory for performing simulation(s) with the best parameter set, after the conclusion of the overall inverse modelling process.

    WIP - not functional yet.

  • analyser_input_file File name used as input for the analyser.

    WIP - not functional yet. Likely to be deprecated in the future, since a different approach is pursued for now.

  • relations List of the relations between experimental and model data.
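
The handling of temporary directories described for execution_dir and execution_dir_prefix can be sketched with the standard library. The helper run_in_temporary_dir and the fake_simulation callback are hypothetical; whether PROPTI uses tempfile internally is not stated here, so this is only one possible way to get uniquely named directories.

```python
import os
import shutil
import tempfile


def run_in_temporary_dir(execution_dir, prefix, work):
    """Create a uniquely named sub-directory, run work(path), clean up."""
    os.makedirs(execution_dir, exist_ok=True)
    # mkdtemp appends a random part to the prefix, so parallel runs
    # never share a directory.
    tmp_dir = tempfile.mkdtemp(prefix=prefix, dir=execution_dir)
    try:
        return work(tmp_dir)
    finally:
        # Remove the remaining files once the results are extracted.
        shutil.rmtree(tmp_dir, ignore_errors=True)


def fake_simulation(path):
    # Stand-in for a model run: write one file, report the directory name.
    with open(os.path.join(path, "result.csv"), "w") as f:
        f.write("Time,HRR\n0.0,0.0\n")
    return os.path.basename(path)


base_dir = tempfile.mkdtemp()
dir_name = run_in_temporary_dir(base_dir, "cone_", fake_simulation)
leftovers = os.listdir(base_dir)
shutil.rmtree(base_dir)
```

A descriptive prefix makes it easier to tell which SimulationSetup a leftover directory belongs to when a run has crashed before the clean-up.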


Public Methods - SimulationSetup

The SimulationSetup class has the following public methods:

  • upgrade This method upgrades legacy object instances with missing default values. Used to ensure compatibility of legacy inverse modelling projects with newer PROPTI versions.

    Use with care!

  • __str__ When called, it provides a human-readable output of the SimulationSetup, based on predefined definitions coded into this method (thus not exposed to the user).


SimulationSetupSet

class propti.SimulationSetupSet

The propti.SimulationSetupSet class merges all the different SimulationSetups. Thus, a complete set of information, defining the overall inverse modelling process, is obtained.

The constructor of the SimulationSetupSet class accepts the following input:

propti.SimulationSetupSet(name: str,
                          setups: List[SimulationSetup] = None)
  • name Human-readable identifier of the SimulationSetupSet.

  • setups List of the different SimulationSetups.


Public Methods - SimulationSetupSet

The SimulationSetupSet class has the following public methods:

  • upgrade This method upgrades legacy object instances with missing default values. Used to ensure compatibility of legacy inverse modelling projects with newer PROPTI versions.

    CAREFUL! Since lists, like setups, will be initialised as [], this may have consequences that go unnoticed.

  • __len__ Computes and returns the length of the set (number of elements).

  • append Appends a deep copy of the SimulationSetup to the set.

  • __getitem__ Returns the SimulationSetup at a given index of the set.

  • __str__ When called, it provides a human-readable output of the SimulationSetupSet, based on predefined definitions coded into this method (thus not exposed to the user).


Version

class propti.Version

The propti.Version class is used to provide information on the current PROPTI version, as well as on the simulation software that is used. The user does not need to interact with it directly. When propti_prepare.py is executed, propti.Version is called automatically. The different version numbers are collected and stored in the pickle.init file, from where the version information can be extracted. This makes it possible to keep track of which programme versions were used for a specific inverse modelling process. This information may be of interest for tracking down errors. It is also beneficial to expose the software versions in publications. For example, plots can get a label containing information on what software was used to create the data for this specific plot.

Note: This class is heavily work in progress (WIP)!

The constructor of the Version class contains the following parameters:

propti.Version(flag_propti = 0,
               flag_exec = 0,
               ver_propti = self.propti_versionCall(),
               ver_exec = self.exec_versionCall(),
               ver_spotpy = spotpy.__version__)
  • flag_propti

  • flag_exec

  • ver_propti Stores version information of PROPTI. When initialised, propti_version_call is called.

  • ver_exec Stores version information of simulation software executable. When initialised, exec_version_call is called.

  • ver_spotpy Stores version information of SPOTPY. When initialised, the built-in attribute spotpy.__version__ is read.


Public Methods - Version

The Version class has the following public methods:

  • __str__ When called, it provides a human-readable output of the Version, based on predefined definitions coded into this method (thus not exposed to the user).

Private Methods - Version

The Version class has the following private methods:

  • propti_version_call Reads the PROPTI version text file and extracts the version number.

  • exec_version_call Determines the version of the simulation software executable.

    Note: Right now it is pretty much tailored to FDS.

  • __repr__


Basic Functions

create_input_file

propti.create_input_file()

Creates input files for the simulation software. Information on how to construct the file is taken from a simulation setup. The input file will be written to the working directory.


Parameters - create_input_file

  • setup The SimulationSetup from which the input file is to be created.

  • work_dir Flag to determine if the regular inverse modelling process is to be performed, or if the best parameter set is to be simulated. Possible values: 'execution', 'best'.

  • return Writes an input file to the directory specified by the SimulationSetup.


write_input_file

propti.write_input_file()

Writes data to a file; the input is expected to be a string. The file is created in a specified directory.


Parameters - write_input_file

  • content Information to be written to a file, expected to be string.

  • file_path File name and path of file to be created.

  • return File written to specified location.


fill_place_holder

propti.fill_place_holder()

Takes a string that contains specific markers, or place holders. The string gets parsed and the markers are compared with information provided by a ParameterSet. When a matching place holder is found, it is replaced by the parameter value needed for the simulation.

By convention, the markers are to be encapsulated by the '#' character, for example: '#example_marker#'. The fill_place_holder function adds this encapsulation automatically to the place holder of each parameter before it searches the template. Note: In the template file itself, the user needs to take care of the encapsulation when the file is prepared!


Parameters - fill_place_holder

  • template_content A string containing markers that are encapsulated by the '#' character.

  • paras A ParameterSet from which the parameter information is taken and implemented into the template.

  • return A string where the place holders have been replaced by parameter values.
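
The marker convention can be illustrated with a few lines of Python. The function fill_markers is a hypothetical stand-in that takes a plain dictionary instead of a ParameterSet, and the template line is made up for this example. How PROPTI formats numerical values when writing them is not covered here; the sketch simply uses str().

```python
def fill_markers(template_content, values):
    """Replace every '#key#' marker in template_content by its value."""
    result = template_content
    for place_holder, value in values.items():
        # The '#' characters are added here, so the keys are given bare.
        result = result.replace("#" + place_holder + "#", str(value))
    return result


# Made-up template line; the markers carry the '#' encapsulation.
template = "&MATL ID='sample', DENSITY=#density#, CONDUCTIVITY=#conductivity# /"
filled = fill_markers(template, {"density": 1150.0, "conductivity": 0.2})
```

Note that the place holders are passed without the '#' characters, while the template contains them. A marker in the template without a matching parameter is left untouched, which makes forgotten parameters easy to spot in the resulting input file.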


read_template

propti.read_template()

Reads a specified text file. Returns the content as a string.


Parameters - read_template

  • filename Name of the file to be read.

  • return File content as string.


run_simulations

propti.run_simulations()

Takes information from the SimulationSetupSet to find the necessary simulation input files and start the simulation(s). Multiple simulation setups can be processed at once, which is targeted at cases where the user wants to test the behaviour of a parameter set in different experimental conditions at once.

Distributes the information to either run_simulation_serial or run_simulation_mp, which actually handle simulation execution.


Parameters - run_simulations

  • setups Set of simulation setups, describes what simulations need to be executed.

  • num_subprocesses Defines how many sub-processes are to be used per repetition.

  • best_para_run Flag to switch between an inverse modelling run or simulation of best parameter set.

    WIP - not functional, yet.

  • return None.
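
The dispatch between serial and parallel execution can be sketched as follows. This is not the PROPTI code: run_one and run_all are hypothetical names, the "simulations" are Python one-liners instead of a real solver, and a thread pool from concurrent.futures is used here as a simple stand-in for whatever run_simulation_mp does internally.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(command):
    """Run one command and return its exit code."""
    return subprocess.run(command, stdout=subprocess.DEVNULL).returncode


def run_all(commands, num_subprocesses=1):
    """Run commands one after another, or concurrently if requested."""
    if num_subprocesses <= 1:
        return [run_one(c) for c in commands]
    # Threads are enough here: each one only waits for an external process.
    with ThreadPoolExecutor(max_workers=num_subprocesses) as pool:
        return list(pool.map(run_one, commands))


# Stand-in "simulations": two Python one-liners instead of a real solver.
commands = [[sys.executable, "-c", "print('setup %d')" % i] for i in range(2)]
serial_codes = run_all(commands, num_subprocesses=1)
parallel_codes = run_all(commands, num_subprocesses=2)
```

With two setups, e.g. two irradiance levels, num_subprocesses=2 lets both simulations of one repetition run at the same time, provided enough cores are available.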


run_simulation_serial

propti.run_simulation_serial()

Executes the simulation of a single simulation setup. Takes information from the SimulationSetup to find the necessary simulation input files and start the simulation.


Parameters - run_simulation_serial

  • setup Simulation setup, describes what simulation needs to be executed.

  • best_para_run Flag to switch between an inverse modelling run or simulation of best parameter set.

    WIP - not functional, yet.

  • return None.


run_simulation_mp

propti.run_simulation_mp()

Executes the simulation of multiple simulation setups in parallel. Takes information from the SimulationSetupSet to find the necessary simulation input files and start the simulation(s). Multiple simulation setups can be processed at once, which is targeted at cases where the user wants to test the behaviour of a parameter set in different experimental conditions at once.


Parameters - run_simulation_mp

  • setups Set of simulation setups, describes what simulations need to be executed.

  • num_threads Defines how many sub-processes are to be used per repetition.

  • return None.


extract_simulation_data

propti.extract_simulation_data()

WIP - thus no further information is provided here at this point.


map_data

propti.map_data()

WIP - thus no further information is provided here at this point.


SPOTPY Wrapper

SpotpySetup

class propti.SpotpySetup

propti.SpotpySetup is used to handle the communication with SPOTPY. It is responsible for extracting the proper information from the SimulationSetupSet and transferring it to SPOTPY, as well as for transferring guess vectors of new parameter sets back to PROPTI. For now it is mainly tailored to the SCEUA algorithm that is implemented in SPOTPY.

The constructor of the SpotpySetup class accepts the following input:

propti.SpotpySetup(params: ParameterSet,
                   setups: SimulationSetupSet,
                   optimiser: OptimiserProperties)
  • params Parameter set that provides information on the parameters which shall be optimised.

  • setups Information that describes the inverse modelling run.

  • optimiser Parameters passed to the optimiser, like the algorithm to be used or the distribution for guess vector sampling.


Public Methods - SpotpySetup

The SpotpySetup class has the following public methods:

  • parameters Provides a guess vector (parameter set) from SPOTPY.

  • simulation Starts the simulation of an individual parameter set that has been generated by SPOTPY, based on information provided by the 'SimulationSetupSet'. Creates input files from templates, fills in parameter values from SPOTPY, executes the simulation, collects the simulation results, deletes the remaining files and gives the simulation results back.

    Needs a parameter list (vector) as input.

  • evaluation Reads the target (experimental) data used for the evaluation of the performance of the parameter set against the target data. It returns an array of the target data.

  • objectivefunction Compares the simulation results with the target (experimental) data, using the root mean square error (RMSE) implemented in SPOTPY.
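
How the four methods play together can be shown with a toy example that runs without SPOTPY. ToySetup is a hypothetical stand-in: a linear model y = a * x + b replaces the external simulation, the guess vector is drawn with the random module, and the RMSE is computed by hand instead of calling SPOTPY. The argument names are assumptions.

```python
import math
import random


class ToySetup:
    """Stand-in with the four method names listed above; not SPOTPY code."""

    def __init__(self, bounds, x_values, target):
        self.bounds = bounds        # [(min_value, max_value), ...]
        self.x_values = x_values
        self.target = target

    def parameters(self):
        # Uniform guess vector inside the allowed ranges.
        return [random.uniform(lo, hi) for lo, hi in self.bounds]

    def simulation(self, vector):
        # Toy model y = a * x + b in place of an external simulation.
        a, b = vector
        return [a * x + b for x in self.x_values]

    def evaluation(self):
        # Target (experimental) data the simulation is compared against.
        return self.target

    def objectivefunction(self, simulation, evaluation):
        # Root mean square error between model response and target.
        squared = [(s - e) ** 2 for s, e in zip(simulation, evaluation)]
        return math.sqrt(sum(squared) / len(squared))


setup = ToySetup(bounds=[(0.0, 5.0), (-1.0, 1.0)],
                 x_values=[0.0, 1.0, 2.0],
                 target=[1.0, 3.0, 5.0])
guess = setup.parameters()
perfect = setup.objectivefunction(setup.simulation([2.0, 1.0]),
                                  setup.evaluation())
worse = setup.objectivefunction(setup.simulation([2.0, 0.0]),
                                setup.evaluation())
```

In an optimisation run, the algorithm repeats this cycle: it draws a guess vector, passes it to simulation, and rates the result with objectivefunction against the data from evaluation.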


PROPTI Analyse

The script propti_analyse.py is a collection of some basic tools to aid the user in accessing their data more easily. This collection is not meant to be the solution to all issues the user might come across. It is rather a starting point for new users, to develop tools they need for themselves.

As a basic concept, the tools are set up so that they query the pickle file to get the necessary information. This could be the labels for the different parameters, the number of individuals within a generation of a SCEUA run, and similar information. This ability to fall back on the pickle files makes the tools quite dynamic.

To interact with this script, the user uses the command line and provides a parameter that calls the desired functionality. For example:

python3 path/to/propti/propti_analyse.py --dump_plots .

This calls Python 3 with propti_analyse.py. The --dump_plots parameter creates predefined plots of the inverse modelling process (see below). Note the period at the end: it tells propti_analyse.py where to look for the pickle file. In this example it is thus assumed that the user starts in the main directory of the IMP run.

Note: This script is highly work in progress and may change a lot in the future. Here, only the most robust tools are described, which may also be the most useful to new users.

--dump_plots

This creates a number of plots, focused on the SCEUA. The following plots are created:

  • Scatter plot of the fitness values over the whole IMP run
  • Scatter plots of the single parameter values during the IMP, one for each parameter
  • Scatter plots for the "best" parameter value per generation, for each parameter
  • Boxplots of the fitness values, each generation represented as a boxplot

--clean_db

When using the restart functionality, where the batch script writes markers into the database file (csv), this function cleans the database file. It is focussed on the SCEUA. It will create two files. In the first one, only the markers are removed. In the second one, the markers are removed, as well as the partly completed generations that might be left over when a run is restarted after a system crash.

It is rather simple, in that it just counts the lines between the restart markers and takes only the first n generations times m individuals and stops when no full generation is left.
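
The counting approach can be sketched as follows. This is a simplified illustration, not the code of propti_analyse.py: the function clean_lines and the marker text '#Restart#' are hypothetical, the header line of the csv file is ignored, and generation_size stands for the number of individuals per generation (m).

```python
def clean_lines(lines, marker, generation_size):
    """Return (markers_removed, full_generations_only)."""
    # First output: everything except the restart markers.
    markers_removed = [ln for ln in lines if ln != marker]

    # Second output: per block between markers, keep full generations only.
    full_only = []
    block = []
    for ln in lines + [marker]:          # sentinel closes the last block
        if ln == marker:
            keep = (len(block) // generation_size) * generation_size
            full_only.extend(block[:keep])
            block = []
        else:
            block.append(ln)
    return markers_removed, full_only


# Made-up database content: the run crashed after individual a5.
lines = ["a1", "a2", "a3", "a4", "a5", "#Restart#", "b1", "b2", "b3", "b4"]
no_markers, full_generations = clean_lines(lines, "#Restart#", 2)
```

In this example each generation has two individuals, so the incomplete generation that only contains a5 is dropped from the second output, while the first output keeps it.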