# Welcome to the Parameter Estimation Feature Example

The goal of this notebook is to instruct ProgPy users on how to use the estimate_params feature for PrognosticModels.

First some background. Parameter estimation is used to tune the parameters of a general model so its behavior matches the behavior of a specific system. For example, parameters of the battery model can be tuned to configure the model to describe the behavior of a specific battery.

Generally, parameter estimation is done by tuning the parameters of the model so that simulation best matches the behavior observed in some available data. In ProgPy, this is done using the prog_models.PrognosticsModel.estimate_params() method. This method takes input and output data from one or more runs, and uses scipy.optimize.minimize function to estimate the parameters of the model. For more information, refer to our Documentation [here](https://nasa.github.io/progpy/prog_models_guide.html#parameter-estimation)

A few definitions:
* __`keys`__ `(list[str])`: Parameter keys to optimize
* __`times`__ `(list[float])`: Array of times for each run
* __`inputs`__ `(list[InputContainer])`: Array of input containers where inputs[x] corresponds to times[x]
* __`outputs`__ `(list[OutputContainer])`: Array of output containers where outputs[x] corresponds to times[x]
* __`method`__ `(str, optional)`: Optimization method- see scipy.optimize.minimize for options
* __`tol`__ `(int, optional)`: Tolerance for termination. Depending on the provided minimization method, specifying tolerance sets solver-specific options to tol
* __`error_method`__ `(str, optional)`: Method to use in calculating error. See calc_error for options
* __`bounds`__ `(tuple or dict, optional)`: Bounds for optimization in format ((lower1, upper1), (lower2, upper2), ...) or {key1: (lower1, upper1), key2: (lower2, upper2), ...}
* __`options`__ `(dict, optional)`: Options passed to optimizer. See scipy.optimize.minimize for options

### Simple Example

Now we will show an example demonstrating the model parameter estimation feature. In this example, we will be estimating the parameters for a model from data . In general, the data will usually be collected from the physical system or from a different model (model surrogacy). 

First, we will import a model from the ProgPy Package. For this example we're using the simple ThrownObject model.

In [1]:
from prog_models.models import ThrownObject

Now we can build a model with a best guess for the parameters.

We will use a guess that our thrower is 20 meters tall. However, given our times, inputs, and outputs, we can clearly tell this is not true! Let's see if parameter estimation can fix this!

In [2]:
m = ThrownObject()

Next, we will collect data from the system. Let's pretend we threw the ball once, and collected position measurements.

In [3]:
results = m.simulate_to_threshold(save_freq=0.5)

times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
inputs = [{}]*9
outputs = [
    {'x': 1.83},
    {'x': 36.95},
    {'x': 62.36},
    {'x': 77.81},
    {'x': 83.45},
    {'x': 79.28},
    {'x': 65.3},
    {'x': 41.51},
    {'x': 7.91},
]

Let us take a look at what these collected values are!

For this example, we will define specific parameters that we want to estimate.

We can pass the desired parameters to our __keys__ keyword argument.

In [4]:
keys = ['thrower_height', 'throwing_speed', 'g']

To really see what `estimate_params()` is doing, we will print out the state before executing the estimation

In [13]:
# Printing state before
print('Model configuration before')
for key in keys:
    print("-", key, m.parameters[key])
print(' Error: ', m.calc_error(times, inputs, outputs, dt=1e-4))

Model configuration before
- thrower_height -15.0
- throwing_speed 24.0
- g 10.0
 Error:  32788.728480144986


Notice that the error is quite high. This indicates that the parameters are not accurate

Now, we will run `estimate_params()` with the data to correct these parameters.

In [14]:
m.estimate_params(times = times, inputs = inputs, outputs = outputs, keys = keys, tol=10)

       message: Optimization terminated successfully.
       success: True
        status: 0
           fun: 64.68781551362554
             x: [-1.896e+01  4.533e+01 -1.083e+01]
           nit: 32
          nfev: 60
 final_simplex: (array([[-1.896e+01,  4.533e+01, -1.083e+01],
                       [-1.888e+01,  4.799e+01, -1.166e+01],
                       [-1.966e+01,  4.603e+01, -1.104e+01],
                       [-1.923e+01,  4.782e+01, -1.184e+01]]), array([ 6.469e+01,  6.641e+01,  6.744e+01,  7.370e+01]))

Now, let's see what the new parameters are after estimation.

In [15]:
print('\nOptimized configuration')
for key in keys:
    print("-", key, m.parameters[key])
print(' Error: ', m.calc_error(times, inputs, outputs))


Optimized configuration
- thrower_height -18.95582418824081
- throwing_speed 45.33042588714248
- g -10.830918521548558
 Error:  64.68781551362554


Sure enough- parameter estimation determined that the thrower's height wasn't 20m, instead was closer to 1.9m, a much more reasonable height!

Another feature of the `estimate_params()` function is the tolerance feature, or `tol`. The `tol` makes our parameter estimation function to continue optimizing until we reach a particular error. Furthermore, another feature that is typically useful for `estimate_params()` is `bounds`. `Bounds` allows for the user to set a constraint on the values that our parameters can be! This means if you your `estimate_params()` function to return a value within a particular set of constraints, then you can pass values to `bounds` to obtain the corresponding values.

In our previous example, note that our total Error was roughly 0.5272 after the `estimate_params()` call. Now, let us see what happens to the parameters when we set a low tolerance and bounds to their respective keys!

First, let us create a more complicated example! In this example, we are selecting our thrower_height to be 29, our throwing_speed to be 3.1, and our g to be 10.

In [16]:
# Resetting parameters to their originally incorrectly set values.
m.parameters['thrower_height'] = 3.1
m.parameters['throwing_speed'] = 29
m.parameters['g'] = 10

# Furthermore, changing our keys values to encompass all incorrectly set parameters
keys = ['thrower_height', 'throwing_speed', 'g']

Then, we'll set the bounds of our values. Let's say that we know our thrower_height will be some value between -15 and 15, our throwing_speed somewhere between 24 and 42, and the gravity is somewhere between -20 and 10.

In [17]:
bound = ((-15, 15), (24, 42), (-20, 10))
m.estimate_params(times = results.times, inputs = results.inputs, outputs = results.outputs, keys = keys, bounds = bound, dt = 0.1)

       message: Optimization terminated successfully.
       success: True
        status: 0
           fun: 824.6256006168826
             x: [-1.500e+01  2.400e+01  1.000e+01]
           nit: 22
          nfev: 41
 final_simplex: (array([[-1.500e+01,  2.400e+01,  1.000e+01],
                       [-1.500e+01,  2.400e+01,  1.000e+01],
                       [-1.500e+01,  2.400e+01,  1.000e+01],
                       [-1.500e+01,  2.400e+01,  1.000e+01]]), array([ 8.246e+02,  8.246e+02,  8.246e+02,  8.246e+02]))

In [18]:
print('\nOptimized configuration')
for key in keys:
    print("-", key, m.parameters[key])
print(' Error: ', m.calc_error(results.times, results.inputs, results.outputs))


Optimized configuration
- thrower_height -15.0
- throwing_speed 24.0
- g 10.0
 Error:  755.5129416299291


In [19]:
m.parameters['thrower_height'] = 3.1
m.parameters['throwing_speed'] = 29
m.parameters['g'] = 10
m.estimate_params(times = results.times, inputs = results.inputs, outputs = results.outputs, keys = keys, bounds = bound, dt = 0.1, tol=1e-9)
print('\nOptimized configuration')
for key in keys:
    print("-", key, m.parameters[key])
print(' Error: ', m.calc_error(results.times, results.inputs, results.outputs))


Optimized configuration
- thrower_height -15.0
- throwing_speed 24.0
- g 10.0
 Error:  755.5129416299291


Behind the scenes, `estimate_params()` applies the `calc_error()` method to each run independently (e.g., Run 0 = (times[]))

`estimate_params()` creates a structure of 'runs' by taking each index of times, inputs, and outputs and placing them into a tuple.

                `runs = [(t, u, z) for t, u, z in zip(times, inputs, outputs)]`


Using our optimization function, which runs `calc_error()` as a subroutine (more information about `calc_error()` found in our Calculating Error Example), given each of the runs.

You can also adjust the metric that is used to estimate parameters by setting the error_method to a different `calc_error()` method.
e.g., m.estimate_params([(times, inputs, outputs)], keys, dt=0.01, error_method='MAX_E')
Default is Mean Squared Error (MSE)
See calc_error method for list of options.

* Cover multiple inputs, range of inputs at different levels, to make sure the model works for all runs, then do it multiple times,
* Or if there is noise, and you'll need multiple runs.

=============
* ONE RUN WITH NOISE, THEN SHOW IT'S NOT VERY GOOD
* Why multiple runs with the noise, both noises work, run estimate_params(), not very good of a job with one run.
* Add tolerance, and change the error calculation metric, (just a note for error calculation metric).