# Simulation of Constrained Properties
### Abstract
In the physical sciences it is sometimes necessary to build a simulation of a physical property. When a simulation is built, the value of the physical property is calculated for various input values using a mathematical model of the property, the model having been obtained by fitting a function to experimental data. The functions most commonly used for modelling physical properties are often monotonic and unconstrained (for example linear or polynomial regression). This means that the function can take ever-increasing or ever-decreasing values at the extremes of the input values. However, most physical properties have constraints on the values they can take: most have a defined zero value and few can take negative values. Exceptions include properties such as gain/attenuation measured in dB and pH measured in non-aqueous solution. Some properties are measured relative to a datum and so may have a limited negative range, e.g. temperature in Celsius.

When a simulation is run, it is possible for it to produce physically invalid values, because the underlying mathematical functions are unconstrained. Even when a value is physically valid, it may still be so extreme that it is never observed in nature, i.e. it would be domain invalid.

Here a method for guaranteeing that simulated property values remain valid is investigated. If a simulation introduces noise (to produce a more realistic, natural-looking simulation), the effect of controlling the character of the noise by adding it after processing is compared with adding the noise before the final processing step.

## Introduction
Most physical properties are constrained to a range that is greater than zero (for absolute properties) or to fractional values in the range from 0 to 1, either inclusive or exclusive. Some properties are measured relative to a datum and so can take a limited range of negative values (e.g. temperature on the Fahrenheit or Celsius scale). Very few properties are physically unconstrained; examples include gain/attenuation measured in dB and pH in non-aqueous solutions (both of these properties are logarithms and can have negative values).

When fitting a model to experimental data, simple functions are often used (for example linear or polynomial regression). These functions have no inherent constraint on their range, so care must be taken when using them to build a simulation of the property. The simplest way to keep a simulation within the desired range is to clip it, that is, to replace simulation values that are out of range with the nearest valid value. This, however, can introduce undesirable artefacts in the distribution of property values (an excess of values at the limits of the valid range).
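The clipping artefact can be illustrated with a minimal sketch. The range, the Gaussian noise model and all variable names below are assumptions chosen for illustration, not values taken from any experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical property limited to the range [0, 1], e.g. a fraction.
p_min, p_max = 0.0, 1.0

# Unconstrained simulation: a mean close to the upper limit plus Gaussian noise.
raw = rng.normal(loc=0.9, scale=0.1, size=10_000)

# Hard clipping: out-of-range values are replaced by the nearest valid value.
clipped = np.clip(raw, p_min, p_max)

# Fraction of simulated values that now sit exactly on the upper limit.
frac_at_max = np.mean(clipped == p_max)
print(f"fraction exactly at p_max: {frac_at_max:.3f}")
```

Every value that was above the limit ends up at exactly `p_max`, so a histogram of `clipped` shows a spike at the boundary that is not present in the unclipped distribution.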

Compare this with using a neural network for classification: because a logistic activation function is used, the output of a classifier always lies within the range $(0, 1)$. Values in the range $(0,1)$ can be linearly transformed into any arbitrary range. Let $y$ be a number in the range $(0,1)$, that is $y \in (0,1)$; then a property value $p$ in the range $\left(p_{min}, p_{max}\right)$ is given by $p = p_{min} + y\left(p_{max} - p_{min}\right)$. Thus, if the last operation in building a simulation of a property is to apply a logistic transformation and then scale and translate the result into the desired range, it is guaranteed that the simulation values will be constrained to that range. This is the basis of soft-clipping.
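As a rough sketch of this final step, the logistic function followed by the scaling and translation can be written as below. The function names `logistic` and `soft_clip` and the example range of -40 to 60 are hypothetical, chosen only for illustration.

```python
import numpy as np

def logistic(z):
    """Map real numbers into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def soft_clip(z, p_min, p_max):
    """Logistic transform followed by scaling and translation into (p_min, p_max)."""
    return p_min + logistic(z) * (p_max - p_min)

# Unbounded inputs, including values far from zero.
z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])

# Assumed example range, e.g. a temperature limited to between -40 and 60.
p = soft_clip(z, p_min=-40.0, p_max=60.0)
print(p)
```

An input of zero maps to the midpoint of the range, and increasingly extreme inputs approach the limits without crossing them. In floating-point arithmetic very large magnitudes will eventually round to the limits themselves, so the strict inequality only holds for moderate inputs.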

All that remains is to find a function whose output, once the logistic function and the scaling and translation have been applied, correctly estimates the property value being simulated. This can be achieved by applying the inverse of the logistic function to the experimental data before fitting. This inverse is the *logit* function.
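For reference, the standard definitions of the two functions are given here, with $y \in (0,1)$ as above and $z$ an unbounded real number (the symbol $z$ is introduced only for this illustration):

$$
y = \mathrm{logistic}(z) = \frac{1}{1 + e^{-z}}, \qquad z = \mathrm{logit}(y) = \ln\left(\frac{y}{1 - y}\right)
$$

Rearranging the first expression gives $e^{-z} = (1 - y)/y$, and taking logarithms recovers the second, so $\mathrm{logit}\left(\mathrm{logistic}(z)\right) = z$.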

### Soft Clipping
Soft clipping uses a mathematical transformation as the final step, which always results in valid property values. This can be achieved for any range of property values by starting with a transform that produces values greater than 0 and less than 1 and rescaling these into the required range. The transform is based on a function which maps from the domain of real numbers to the range (0, 1), i.e. $\mathbb{R} \mapsto (0, 1)$. *Note that the domain of real numbers contains **all** real numbers, while the range (0, 1) contains only real numbers that are greater than 0 and less than 1.* The transform approaches the limits but never reaches them. Having found the transform for the final step, it must also be possible to transform the known data into a form from which the original values are retrieved when the final transform is applied. (That is, we need a transform which is reversible.) This inverse function maps the known values from a limited range (rescaled into the range (0,1) if necessary) into the domain of real numbers, for example $(x_{min}, x_{max}) \mapsto \mathbb{R}$.

The steps to analyse experimental data and produce models that can then be used to build a simulation with all values within the correct range are as follows.

1. Transform the experimental data from the valid range to (0,1), $\left(p_{min}, p_{max}\right) \mapsto (0,1), p_{exp} \mapsto p_{exp}^\prime$.
1. Apply the *logit* function $y_{exp} = \mathrm{logit}\left(p_{exp}^\prime\right): (0,1) \mapsto \mathbb{R}$
1. Perform the analysis (e.g. linear regression) to produce a model function $\hat{y} = \mathrm{f}(\mathbf{x})$, where $\mathbf{x}$ are the independent variables from which $\hat{y}$ is estimated.
1. Build the simulation by calculating values of $\hat{y} = \mathrm{f}(\mathbf{x})$ for various values of $\mathbf{x}$, where $\mathrm{f}: \mathbb{R}^n \mapsto \mathbb{R}$
1. Apply the *logistic* function $\hat{p}^\prime = \mathrm{logistic}\left(\hat{y}\right): \mathbb{R} \mapsto (0,1)$
1. Apply the scaling and translation required to return to the desired range $\hat{p} = p_{min} + \hat{p}^\prime\left(p_{max} - p_{min}\right): (0,1) \mapsto \left(p_{min}, p_{max}\right)$
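
The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions rather than a definitive implementation: the data are synthetic, there is a single independent variable, the model is a straight line fitted with `np.polyfit`, and the range of 0 to 100 is hypothetical.

```python
import numpy as np

def logit(q):
    """Inverse of the logistic function, (0, 1) -> real numbers."""
    return np.log(q / (1.0 - q))

def logistic(y):
    """Map real numbers into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-y))

rng = np.random.default_rng(1)

# Hypothetical valid range for the property (assumed for illustration).
p_min, p_max = 0.0, 100.0

# Synthetic stand-in for experimental data: one independent variable x and
# noisy measurements p_exp that lie strictly inside (p_min, p_max).
x_exp = np.linspace(-3.0, 3.0, 50)
noise = rng.normal(0.0, 0.3, x_exp.size)
p_exp = p_min + logistic(1.2 * x_exp + noise) * (p_max - p_min)

# Step 1: rescale the experimental data from (p_min, p_max) to (0, 1).
p_exp_prime = (p_exp - p_min) / (p_max - p_min)

# Step 2: apply the logit function, (0, 1) -> real numbers.
y_exp = logit(p_exp_prime)

# Step 3: fit a model in the unbounded space (here a straight line).
slope, intercept = np.polyfit(x_exp, y_exp, deg=1)

# Step 4: build the simulation, deliberately extrapolating beyond the data.
x_sim = np.linspace(-6.0, 6.0, 200)
y_hat = slope * x_sim + intercept

# Step 5: apply the logistic function, real numbers -> (0, 1).
p_hat_prime = logistic(y_hat)

# Step 6: scale and translate back into (p_min, p_max).
p_hat = p_min + p_hat_prime * (p_max - p_min)

print(f"simulated range: {p_hat.min():.4f} to {p_hat.max():.4f}")
```

Even though `x_sim` extends well beyond the fitted data, the simulated values stay inside the valid range. Note that measurements lying exactly on a limit would give an infinite logit in step 2, so such points would need separate handling before fitting.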