# Welcome to the Parameter Estimation Feature Example

The goal of this notebook is to show ProgPy users how to use the `estimate_params()` feature of a `PrognosticsModel`.

First, some background. Parameter estimation is used to tune the parameters of a general model so that its behavior matches that of a specific system. For example, the parameters of the battery model can be tuned so that the model describes the behavior of a specific battery.

Generally, parameter estimation is done by tuning the parameters of the model so that the simulation best matches the behavior observed in some available data. In ProgPy, this is done using the `prog_models.PrognosticsModel.estimate_params()` method. This method takes input and output data from one or more runs and uses the `scipy.optimize.minimize` function to estimate the parameters of the model. For more information, refer to our documentation [here](https://nasa.github.io/progpy/prog_models_guide.html#parameter-estimation).

A few definitions:
* __`keys`__ `(list[str])`: Parameter keys to optimize
* __`times`__ `(list[float])`: Array of times for each run
* __`inputs`__ `(list[InputContainer])`: Array of input containers where inputs[x] corresponds to times[x]
* __`outputs`__ `(list[OutputContainer])`: Array of output containers where outputs[x] corresponds to times[x]
* __`method`__ `(str, optional)`: Optimization method. See scipy.optimize.minimize for options
* __`tol`__ `(float, optional)`: Tolerance for termination. Depending on the provided minimization method, specifying tolerance sets solver-specific options to tol
* __`error_method`__ `(str, optional)`: Method to use in calculating error. See calc_error for options
* __`bounds`__ `(tuple or dict, optional)`: Bounds for optimization in format ((lower1, upper1), (lower2, upper2), ...) or {key1: (lower1, upper1), key2: (lower2, upper2), ...}
* __`options`__ `(dict, optional)`: Options passed to optimizer. See scipy.optimize.minimize for options

#### Example 1) Simple Example

Now we will show an example demonstrating the model parameter estimation feature. In this example, we will estimate the parameters of a model from data. In general, the data will be collected from the physical system or from a different model (model surrogacy).

First, we will import a model from the ProgPy Package. For this example we're using the simple ThrownObject model.

In [None]:
from prog_models.models import ThrownObject

Now we can build a model with a best guess for the parameters.

We will use a guess that our thrower is 20 meters tall. However, given our times, inputs, and outputs, we can clearly tell this is not true! Let's see if parameter estimation can fix this!

In [None]:
m = ThrownObject(thrower_height=20)

Next, we will collect data from the system. Let's pretend we threw the ball once, and collected position measurements.

In [None]:
times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
inputs = [{}]*9
outputs = [
    {'x': 1.83},
    {'x': 36.95},
    {'x': 62.36},
    {'x': 77.81},
    {'x': 83.45},
    {'x': 79.28},
    {'x': 65.3},
    {'x': 41.51},
    {'x': 7.91},
]

For this example, we will define specific parameters that we want to estimate.

We can pass the desired parameters to our __keys__ keyword argument.

In [None]:
keys = ['thrower_height', 'throwing_speed']

To really see what `estimate_params()` is doing, we will print out the model configuration and error before executing the estimation.

In [None]:
# Printing state before
print('Model configuration before')
for key in keys:
    print("-", key, m.parameters[key])
print(' Error: ', m.calc_error(times, inputs, outputs, dt=1e-4))

Notice that the error is quite high. This indicates that the parameters are not accurate.
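To see why the error is high, the comparison can be sketched without ProgPy. The snippet below is a minimal illustration, not how `calc_error()` is implemented: it assumes an ideal closed-form trajectory, a throwing speed of 40 m/s, and g = -9.81 m/s^2 (all assumptions made for illustration), and the helper names `position` and `mse` are hypothetical. It computes the mean squared error between the measurements above and the trajectory for a range of candidate thrower heights.

```python
# Measurements from the data-collection cell above.
times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
measured = [1.83, 36.95, 62.36, 77.81, 83.45, 79.28, 65.3, 41.51, 7.91]

SPEED = 40.0   # assumed throwing speed (m/s)
G = -9.81      # assumed gravitational acceleration (m/s^2)


def position(t, height, speed=SPEED, g=G):
    """Ideal height of the object at time t (no drag, no noise)."""
    return height + speed * t + 0.5 * g * t ** 2


def mse(height):
    """Mean squared error between the measurements and the ideal trajectory."""
    residuals = [z - position(t, height) for t, z in zip(times, measured)]
    return sum(r ** 2 for r in residuals) / len(residuals)


print("MSE with the 20 m guess:", mse(20.0))

# Crude one-parameter search: try heights 0.0, 0.1, ..., 25.0 and keep the best.
candidates = [i / 10 for i in range(251)]
best_height = min(candidates, key=mse)
print("Best candidate height:", best_height, "MSE:", mse(best_height))
```

The search here covers a single parameter on a fixed grid; `estimate_params()` instead hands the problem to `scipy.optimize.minimize`, which searches over all requested keys at once.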

Now, we will run `estimate_params()` with the data to correct these parameters.

In [None]:
m.estimate_params(times=times, inputs=inputs, outputs=outputs, keys=keys, dt=0.01)

Now, let's see what the new parameters are after estimation.

In [None]:
print('\nOptimized configuration')
for key in keys:
    print("-", key, m.parameters[key])
print(' Error: ', m.calc_error(times, inputs, outputs, dt=1e-4))

Sure enough, parameter estimation determined that the thrower's height wasn't 20 m; instead, it was closer to 1.9 m, a much more reasonable height!

#### Example 2) Using Tol

An additional feature of the `estimate_params()` function is the tolerance, or `tol`. The `tol` argument sets the termination tolerance of the optimizer: a lower tolerance makes `estimate_params()` keep optimizing for longer before it stops, which generally yields a lower error.

In our previous example, note that our total Error was roughly 0.5272 after the `estimate_params()` call. Now, let us see what happens to the parameters when we set a low tolerance and bounds to their respective keys!

First, let us create a more complicated example! In this example, we are setting our thrower_height to 3.1, our throwing_speed to 29, and our g to 10.

In [None]:
m = ThrownObject()
results = m.simulate_to_threshold(save_freq=0.5)
# Deliberately setting the parameters to incorrect values.
m.parameters['thrower_height'] = 3.1
m.parameters['throwing_speed'] = 29
m.parameters['g'] = 10

# Furthermore, changing our keys values to encompass all incorrectly set parameters
keys = ['thrower_height', 'throwing_speed', 'g']

Then, we'll set the bounds of our values. Let's say that we know our thrower_height will be some value between -15 and 15, our throwing_speed somewhere between 24 and 42, and the gravity is somewhere between -20 and 10.
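As a sketch of how these bounds could be written down, the two formats listed in the definitions at the top of this notebook are shown here in plain Python. The variable names are illustrative, and the assumption that the tuple form follows the order of `keys` is ours; check the ProgPy documentation before relying on it.

```python
keys = ['thrower_height', 'throwing_speed', 'g']

# Dictionary form: {key: (lower, upper)}
bounds_by_key = {
    'thrower_height': (-15, 15),
    'throwing_speed': (24, 42),
    'g': (-20, 10),
}

# Tuple form: ((lower1, upper1), (lower2, upper2), ...)
# Assumption: entries are listed in the same order as `keys`.
bounds_as_tuple = tuple(bounds_by_key[key] for key in keys)

# Sanity check that the initial guesses lie within their bounds.
initial_guess = {'thrower_height': 3.1, 'throwing_speed': 29, 'g': 10}
for key in keys:
    lower, upper = bounds_by_key[key]
    assert lower <= initial_guess[key] <= upper, f"{key} starts outside its bounds"

print(bounds_as_tuple)
```

Either form could then be supplied through the `bounds` keyword argument of `estimate_params()`, for example `bounds=bounds_by_key`.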

Note that we called `simulate_to_threshold()` above instead of using data that is readily available. Typically, either the data has been collected by the user, or the user can use our simulation features to predict how the system would change over time! More information can be found [here](https://nasa.github.io/progpy/prog_models_guide.html#simulation)!

Now that we have all our information, it's time to call our `estimate_params()`!

In [None]:
m.estimate_params(times=results.times, inputs=results.inputs, outputs=results.outputs, keys=keys)

In [None]:
print('\nOptimized configuration')
for key in keys:
    print("-", key, m.parameters[key])
print(' Error: ', m.calc_error(results.times, results.inputs, results.outputs))

Now, let's reset our parameters to their incorrect values, and then call `estimate_params()` with a low tolerance value passed in! In this case, we are passing in a value of __1e-9__ to `tol`.

In [None]:
m.parameters['thrower_height'] = 3.1
m.parameters['throwing_speed'] = 29
m.parameters['g'] = 10
m.estimate_params(times=results.times, inputs=results.inputs, outputs=results.outputs, keys=keys, tol=1e-9)
print('\nOptimized configuration')
for key in keys:
    print("-", key, m.parameters[key])
print(' Error: ', m.calc_error(results.times, results.inputs, results.outputs))

Note, if we were to set a high tolerance, such as 10, our error would consequently be very high!
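The effect of a termination tolerance can be illustrated with a toy optimizer. This is a minimal sketch, not scipy's algorithm: it assumes plain gradient descent on a one-dimensional quadratic and stops once the error improves by less than `tol` between iterations. The function names are hypothetical.

```python
def error(x):
    """Toy error surface with its minimum (error 0) at x = 3."""
    return (x - 3.0) ** 2


def minimize_with_tol(tol, x=0.0, step=0.1, max_iter=10_000):
    """Gradient descent that stops when the improvement drops below tol."""
    previous = error(x)
    for _ in range(max_iter):
        x -= step * 2.0 * (x - 3.0)   # gradient of (x - 3)^2 is 2(x - 3)
        current = error(x)
        if abs(previous - current) < tol:
            break
        previous = current
    return x, error(x)


loose_x, loose_error = minimize_with_tol(tol=10)
tight_x, tight_error = minimize_with_tol(tol=1e-9)
print("tol=10   -> x =", loose_x, "error =", loose_error)
print("tol=1e-9 -> x =", tight_x, "error =", tight_error)
```

With the loose tolerance the loop stops after a single step, far from the minimum, while the tight tolerance lets it continue until the error is close to zero.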

For more information on how the `tol` feature works, please see scipy's `minimize()` documentation located [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html).

`estimate_params()` runs `calc_error()` as a subroutine to evaluate how well each candidate set of parameters matches the data (more information about `calc_error()` can be found in our Calculating Error Example).

You can also adjust the metric used to estimate parameters by setting `error_method` to a different `calc_error()` method, e.g., `m.estimate_params(times=times, inputs=inputs, outputs=outputs, keys=keys, dt=0.01, error_method='MAE')`. The default is Mean Squared Error (MSE). See the `calc_error()` method for the list of options.
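To make the difference between the two metrics concrete, here is a minimal sketch in plain Python. It does not call ProgPy; the helper functions and the sample values are made up for illustration.

```python
def mean_squared_error(predicted, observed):
    """Average of the squared residuals."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def mean_absolute_error(predicted, observed):
    """Average of the absolute residuals."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


observed = [10.0, 20.0, 30.0]
predicted = [11.0, 19.0, 34.0]   # residuals: 1, -1, 4

print("MSE:", mean_squared_error(predicted, observed))   # (1 + 1 + 16) / 3 = 6.0
print("MAE:", mean_absolute_error(predicted, observed))  # (1 + 1 + 4) / 3 = 2.0
```

The single large residual dominates the MSE but contributes only linearly to the MAE, which is why the two metrics can lead the optimizer to slightly different parameters.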

In [None]:
m.parameters['thrower_height'] = 3.1
m.parameters['throwing_speed'] = 29
m.parameters['g'] = 10
# Using MAE, or Mean Absolute Error instead of the default Mean Squared Error.
m.estimate_params(times=results.times, inputs=results.inputs, outputs=results.outputs, keys=keys, tol=1e-9, error_method='MAE')
print('\nOptimized configuration')
for key in keys:
    print("-", key, m.parameters[key])
print(' Error: ', m.calc_error(results.times, results.inputs, results.outputs))

#### Example 3) Handling Noise with Multiple Runs

In the previous two examples, we demonstrated how to use `estimate_params()` with a clearly defined ThrownObject model. However, unlike most real systems, we assumed that there was no noise in our system!

In this example, we'll show how `estimate_params()` may not necessarily produce optimal results when handling a system with noise.

In [None]:
m = ThrownObject(process_noise=1)
results = m.simulate_to_threshold(save_freq=0.5)
# Deliberately setting the parameters to incorrect values.
m.parameters['thrower_height'] = 3.1
m.parameters['throwing_speed'] = 29
m.parameters['g'] = 10

# Furthermore, changing our keys values to encompass all incorrectly set parameters
keys = ['thrower_height', 'throwing_speed', 'g']

In [None]:
m.estimate_params(times=results.times, inputs=results.inputs, outputs=results.outputs, keys=keys)
print('\nOptimized configuration')
for key in keys:
    print("-", key, m.parameters[key])
print(' Error: ', m.calc_error(results.times, results.inputs, results.outputs))

Note that to get a good estimate of the error here, we should manually measure the Absolute Mean Error (AME) of the parameters rather than use `calc_error()`.

The reason is simple! `calc_error()` calculates the error between the simulated and observed data. However, in this case both the observed and the simulated data are generated from a model that has noise! In other words, we are comparing noise to noise, which can lead to inconsistent results!

Let's create a helper function to calculate the Absolute Mean Error between our original and estimated parameters!

In [None]:
# Function to determine the Absolute Mean Error (AME) of the model parameters.
def AME(m, keys):
    error = 0
    # Create a new model with the original (default) parameters to compare against.
    true_values = ThrownObject()
    for key in keys:
        error += abs(m.parameters[key] - true_values.parameters[key])
    return error / len(keys)  # Divide by the number of keys so the result is a mean

Note that this error changes for every simulated model, because each simulation draws different noise. In other words, repeating the simulation and estimation gives a different error each time:

In [None]:
for count in range(1, 11):
    m = ThrownObject(process_noise=1)
    results = m.simulate_to_threshold(save_freq=0.5)
    # Reset the parameters to the incorrect initial guesses.
    m.parameters['thrower_height'] = 3.1
    m.parameters['throwing_speed'] = 29
    m.parameters['g'] = 10

    m.estimate_params(times=results.times, inputs=results.inputs, outputs=results.outputs, keys=keys)
    error = AME(m, ['thrower_height', 'throwing_speed', 'g'])
    print(f'Estimate Call Number {count} - AME Error {error}')

Behind the scenes, `estimate_params()` applies the `calc_error()` method to each run independently (e.g., run 0 is `(times[0], inputs[0], outputs[0])`).

`estimate_params()` creates a structure of 'runs' by taking each index of times, inputs, and outputs and placing them into a tuple.

`runs = [(t, u, z) for t, u, z in zip(times, inputs, outputs)]`
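As a small illustration of this structure, the snippet below builds `runs` from two made-up runs of different lengths. The data values are invented for illustration only.

```python
# Two toy runs; in practice each entry would hold the data from one experiment.
times = [[0, 1, 2], [0, 1, 2, 3]]
inputs = [[{}, {}, {}], [{}, {}, {}, {}]]
outputs = [
    [{'x': 1.8}, {'x': 36.9}, {'x': 62.2}],
    [{'x': 1.9}, {'x': 37.1}, {'x': 62.5}, {'x': 77.9}],
]

# Group the i-th times, inputs, and outputs into one tuple per run.
runs = [(t, u, z) for t, u, z in zip(times, inputs, outputs)]

for index, (t, u, z) in enumerate(runs):
    print(f"Run {index}: {len(t)} time points, final output {z[-1]}")
```

Because each run is its own tuple, the runs do not need to share the same length or time stamps.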

In [None]:
runs = [[], [], []]  # [times, inputs, outputs], one entry per run
for _ in range(100):
    m = ThrownObject(process_noise=1)
    results = m.simulate_to_threshold(save_freq=0.5)
    # Reset the parameters to the incorrect initial guesses.
    m.parameters['thrower_height'] = 3.1
    m.parameters['throwing_speed'] = 29
    m.parameters['g'] = 10

    runs[0].append(results.times)
    runs[1].append(results.inputs)
    runs[2].append(results.outputs)

print(runs)

In [None]:
m.estimate_params(times=runs[0], inputs=runs[1], outputs=runs[2], keys=keys)
print('\nOptimized configuration')
for key in keys:
    print("-", key, m.parameters[key])
error = AME(m, ['thrower_height', 'throwing_speed', 'g'])
print('AME Error: ', error)

Notice that by using multiple runs, we are able to produce a lower AME than before! This is because each run contains a different realization of the noise, so `estimate_params()` has more data to work with and the effect of the noise averages out, producing a more accurate result.
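The benefit of pooling runs can be sketched without ProgPy. The example below makes several simplifying assumptions that differ from the notebook: it adds Gaussian noise to the measurements rather than to the process, treats the throwing speed and g as known, and estimates only the thrower height in closed form (the least-squares estimate is the mean offset between the measurements and the zero-height trajectory). All names are hypothetical.

```python
import random

TRUE_HEIGHT, SPEED, G = 1.83, 40.0, -9.81   # assumed "true" system
TIMES = [0, 1, 2, 3, 4, 5, 6, 7, 8]


def noisy_run(rng, noise_std=1.0):
    """Simulate one throw and add Gaussian measurement noise to each sample."""
    return [TRUE_HEIGHT + SPEED * t + 0.5 * G * t ** 2 + rng.gauss(0, noise_std)
            for t in TIMES]


def estimate_height(runs):
    """Least-squares height estimate: mean offset from the zero-height trajectory."""
    offsets = [z - (SPEED * t + 0.5 * G * t ** 2)
               for run in runs for t, z in zip(TIMES, run)]
    return sum(offsets) / len(offsets)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
single = estimate_height([noisy_run(rng)])
pooled = estimate_height([noisy_run(rng) for _ in range(100)])

print("Error using 1 run:   ", abs(single - TRUE_HEIGHT))
print("Error using 100 runs:", abs(pooled - TRUE_HEIGHT))
```

In this simplified setting the pooled estimate averages 900 samples instead of 9, so its expected error is roughly ten times smaller.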