Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regarding prediction and identified coefficient of input/output models. #52

Closed
SelmaCho opened this issue Jan 15, 2024 · 8 comments
Closed

Comments

@SelmaCho
Copy link

Dear SIPPY staff,

I am a beginner of SIPPY to use it for developing MPC framework.

I am studying system identification and SIPPY based on the book published by A. Kumar and Flores-Cerrillo, "Machine learning in python for dynamic process systems".

While I read this book and try some example codes, I wonders how to predict new output of ARX and extract coefficients with difference equation form.

In the manual, it is written that using fset.validation is proper to predict new output; However, its description is ambiguous.
I would appreciate you all if you let me know how to predict the new output or extract coefficients with difference equation format.

Here is my toy problem code:

from sippy import system_identification
from sippy import functionset as fset
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import r2_score # model diagnostics
from statsmodels.graphics.tsaplots import plot_acf # model diagnostics

temp = pd.read_csv('simpleProcess.csv')
data = np.array(temp)
u = data[:,0, None]; y = data[:,1, None]

u_scaler = StandardScaler(with_std=False)
y_scaler = StandardScaler(with_std=False)
u_centered = u_scaler.fit_transform(u)
y_centered = y_scaler.fit_transform(y)

Id_ARX = system_identification(y_centered,u_centered, 'ARX', ARX_orders=[3,3,2]) # y(k)-y(k-1)-y(k-2)-y(k-3)=u(k-1)+u(k-2)+u(k-3)
y_centered_pred = np.transpose(Id_ARX.Yid)
y_centered_pred2 = fset.validation(Id_ARX, y_centered, u_centered, np.linspace(1,999,999), 1)

plt.figure()
plt.plot(y_centered, 'o', label='raw data')
plt.plot(y_centered_pred, '--', label='ARX fit')
plt.legend(); plt.xlabel('k'); plt.ylabel('y')
print('Fit accuracy=', r2_score(y_centered,y_centered_pred))

res = y_centered-y_centered_pred
plot_acf(res,lags=50)
plt.xlabel('lag')

ValueError Traceback (most recent call last)
Cell In[112], line 21
19 Id_ARX = system_identification(y_centered,u_centered, 'ARX', ARX_orders=[3,3,2]) # y(k)-y(k-1)-y(k-2)-y(k-3)=u(k-1)+u(k-2)+u(k-3)
20 y_centered_pred = np.transpose(Id_ARX.Yid)
---> 21 y_centered_pred2 = fset.validation(Id_ARX, y_centered, u_centered, np.linspace(1,999,999), 1)
23 plt.figure()
24 plt.plot(y_centered, 'o', label='raw data')

ValueError: A value (0.0) in x_new is below the interpolation range's minimum value (1.0).

@RBdC
Copy link
Collaborator

RBdC commented Jan 15, 2024

Dear SelmaCho,
to test your code and check the error,
you need to share your data file ('simpleProcess.csv').. send it to us by mail, if the case.

Please consider that you find the output of your model - obtained with the LLS approach - within the output attribute as Id_ARX.Yid;
the ARX coefficients in the difference equation form can be extracted from the variable Id_ARX.G (which is a transfer function relating input u and output y) with a simple arrangement, recalling that yk-t = z^-t*y, where z is the shift operator.

The syntax for fset.validation - usually to be used for model validation on new data, not the same dataset used for model identification - appears wrong, as input/output position needs to be swapped, as:
fset.validation(Id_ARX, u_centered, y_centered, np.linspace(1,999,999), 1)

Best
RBdC

@SelmaCho
Copy link
Author

SelmaCho commented Jan 15, 2024

simpleProcess.csv

Thank you for your kind reply, RBDC.

I've attached it by github link because I cannot find your email address ^^;;
I have more questions.

  1. Regarding yk-t = z^(-t*y)
    the Id_ARX.G is following. Sorry for my lack knowledge, I couldn't get an idea to extract the coefficient from this transfer function.
    Can you give me a hint?

    (0.756 z^2 + 0.08442 z + 0.1733) / (z^5 - 0.2474 z^4 - 0.1489 z^3 - 0.1318 z^2)

dt = 1.0

  1. The syntax for fset.validation
    As you kindly explained, I changed the location of inputs. But it has still problem related to interpolation.
    fset.validation(Id_ARX, u_centered, y_centered, np.linspace(1,999,999), 1)

The reason that I need to use a method-function for predicting new output is to optimize manipulated variables..
can fset.validation be used like sklearn's predict method?


ValueError Traceback (most recent call last)
Cell In[126], line 1
----> 1 fset.validation(Id_ARX, u_centered, y_centered, np.linspace(1,999,999), 1)

File ~\Desktop\Python\SysID\sippy\functionset.py:192, in validation(SYS, u, y, Time, k, centering)
189 for i in range(ydim):
190 # one-step ahead predictor
191 if k == 1:
--> 192 T, Y_u = cnt.forced_response((1/SYS.H[i,0])*SYS.G[i,:], Time, u)
193 T, Y_y = cnt.forced_response(1 - (1/SYS.H[i,0]), Time, y[i,:] - y_rif[i])
194 Yval[i,:] = (Y_u + np.atleast_2d(Y_y) + y_rif[i])

File ~\anaconda3\envs\SysIDBox\Lib\site-packages\control\timeresp.py:407, in forced_response(sys, T, U, X0, transpose, interpolate, squeeze)
403 dsys = (A, B, C, D, sys_dt)
405 # Use signal processing toolbox for the discrete time simulation
406 # Transpose the input to match toolbox convention
--> 407 tout, yout, xout = sp.signal.dlsim(dsys, np.transpose(U), T, X0)
409 if not interpolate:
410 # If dt is different from sys.dt, resample the output
411 inc = int(round(dt / sys_dt))

File ~\anaconda3\envs\SysIDBox\Lib\site-packages\scipy\signal_ltisys.py:3556, in dlsim(system, u, t, x0)
3553 u = u[:, np.newaxis]
3555 u_dt_interp = interp1d(t, u.transpose(), copy=False, bounds_error=True)
-> 3556 u_dt = u_dt_interp(tout).transpose()
3558 # Simulate the system
3559 for i in range(0, out_samples - 1):

File ~\anaconda3\envs\SysIDBox\Lib\site-packages\scipy\interpolate_polyint.py:80, in _Interpolator1D.call(self, x)
59 """
60 Evaluate the interpolant
61
(...)
77
78 """
79 x, x_shape = self._prepare_x(x)
---> 80 y = self._evaluate(x)
81 return self._finish_y(y, x_shape)

File ~\anaconda3\envs\SysIDBox\Lib\site-packages\scipy\interpolate_interpolate.py:755, in interp1d._evaluate(self, x_new)
753 y_new = self._call(self, x_new)
754 if not self._extrapolate:
--> 755 below_bounds, above_bounds = self._check_bounds(x_new)
756 if len(y_new) > 0:
757 # Note fill_value must be broadcast up to the proper size
758 # and flattened to work here
759 y_new[below_bounds] = self._fill_value_below

File ~\anaconda3\envs\SysIDBox\Lib\site-packages\scipy\interpolate_interpolate.py:784, in interp1d._check_bounds(self, x_new)
782 if self.bounds_error and below_bounds.any():
783 below_bounds_value = x_new[np.argmax(below_bounds)]
--> 784 raise ValueError("A value ({}) in x_new is below "
785 "the interpolation range's minimum value ({})."
786 .format(below_bounds_value, self.x[0]))
787 if self.bounds_error and above_bounds.any():
788 above_bounds_value = x_new[np.argmax(above_bounds)]

ValueError: A value (0.0) in x_new is below the interpolation range's minimum value (1.0).


Regards

Selma Cho

@SelmaCho
Copy link
Author

SelmaCho commented Jan 15, 2024

I found the answer about question 1).
tf = (0.756 z^2 + 0.08442 z + 0.1733) / (z^5 - 0.2474 z^4 - 0.1489 z^3 - 0.1318 z^2)

y[k]-0.2474y[k-5]-0.1489y[k-4]-0.1318y[k-3] = 0.1733u[k-1]+0.08442u[k-2]+0.756u[k-3]

Really. thanks.

@RBdC
Copy link
Collaborator

RBdC commented Jan 15, 2024

Hi SelmaCho,

about Q1, you are close, but the right solution is as follows:

The general difference equation is:
yk + a1 yk- 1 + ...  + ana yk-na = b1 uk-1-theta + .... + bnb uk-nb-theta

here you have
from tf = (0.756 z^2 + 0.08442 z + 0.1733) / (z^5 - 0.2474 z^4 - 0.1489 z^3 - 0.1318 z^2)
y[k]-0.2474y[k-1]-0.1489y[k-2]-0.1318y[k-3] = 0.756u[k-1-2]+0.08442u[k-2-2]+0.1733u[k-3-2]
since you set ARX_orders=[3,3,2], you have 3 coef.s in NUM (na =3) and 3 coefs. in DEN (nb =3) ,
and theta (the time-delay) is indeed 2


about Q2, I run the code, and I did get no errors.. (e.g., Fit accuracy= 0.8843802125888518)
I see ..."control\timeresp.py", so guess your error comes from an "old issue"..
Please refer to this topic: #42
" we noted that the cnt.lsim function changed its input format depending on which version you have.
Testing EX_ARMAX and EX_ARMAX_MIMO both on control versions 0.8.3 and 0.9.0 it worked without errors. "

Let us know which version of control you have (control.version) and if updating it solves the problem.

Best
RBdC

@SelmaCho
Copy link
Author

SelmaCho commented Jan 16, 2024

Dear RBdC
Thank you for your kindly reply..
About Q2, I re-installed control 0.9.0 but it still has problem.
Also, control 0.8.3 version has a lot of problem like numpy integer problem and compatibility.
It seems Scipy's error since error appears on scipy interpolate problem.

Regards
Selma

@RBdC
Copy link
Collaborator

RBdC commented Jan 16, 2024

Dear Selma,
thanks for your specification.
Yeah, control is a bad beast in terms of compatibility.
Consider it works for me with control 0.8.3 and scipy 1.4.1
Best
RBdC

@SelmaCho
Copy link
Author

Dear RBdC

Thank you. Another example does not show an error when using control version 0.9.0 and scipy version 1.11.1.

I have one more question.

What is the role of "Time" in fset.validation?
The explanation in document is quite confusing, since it has k which delineate time-delay.
Is "Time" to allocate time stamp to every samples?

Do I correctly understand?
for example, we have u and y which are of 100 samples, respectively.
Time should be np.linspace(0,99,100).
Then, from the start point, samples of u and y have time stamp (0-th, 1-th, .... 99-th).
Running fset.validation(ARX, u, y, np.linspace(0,99,100), k=1) results predicted y (0+k-th, 1+k-th, .. 99-k-th).

P.S. Here is a code and data which does not cause an error.
IndustrialFiredHeater_SISO_Validation.csv
IndustrialFiredHeater_SISO.csv

--

import packages

import matplotlib.pyplot as plt, numpy as np, control
from sklearn.preprocessing import StandardScaler
from sippy import system_identification as SysID
from statsmodels.tsa.stattools import ccf
from sippy import functionset as fset

package settings

plt.rcParams.update({'font.size': 14})
#assert(control.version < '0.9'), "To avoid errors, downgrade the control package to a version < 0.9.0. See #48 for details."

read fired heater identification test data and plot

data = np.loadtxt('IndustrialFiredHeater_SISO.csv', delimiter=',')
FG = data[:,0, None]
TO = data[:,1, None]

plots

plt.figure(figsize=(6,2.5)), plt.plot(FG, 'steelblue', linewidth=0.8)
plt.ylabel('u(k)'), plt.xlabel('k'), plt.xlim(0)
plt.grid(which='both', axis='y', linestyle='--')

plt.figure(figsize=(6,2.5)), plt.plot(TO, 'g', linewidth=0.8)
plt.ylabel('y(k)'), plt.xlabel('k'), plt.xlim(0)
plt.grid(which='both', axis='y', linestyle='--')
plt.show()

center data before model fitting

u_scaler = StandardScaler(with_std=False)
FG_centered = u_scaler.fit_transform(FG)

y_scaler = StandardScaler(with_std=False)
TO_centered = y_scaler.fit_transform(TO)

fit ARX model using AIC

ARXmodel = SysID(TO_centered, FG_centered, 'ARX', IC='AIC', na_ord=[1,20], nb_ord=[1,20], delays=[0,5])
print(ARXmodel.G)

read fired heater validation data and plot

data_val = np.loadtxt('IndustrialFiredHeater_SISO_Validation.csv', delimiter=',')
FG_val, TO_val = data_val[:,0, None], data_val[:,1, None]

plots

plt.figure(figsize=(6,2.5))
plt.plot(FG_val, 'steelblue', linewidth=0.8)
plt.ylabel('u(k) validation'), plt.xlabel('k'), plt.xlim(0)
plt.grid(which='both', axis='y', linestyle='--')

plt.figure(figsize=(6,2.5))
plt.plot(TO_val, 'g', linewidth=0.8)
plt.ylabel('y(k) validation'), plt.xlabel('k'), plt.xlim(0)
plt.grid(which='both', axis='y', linestyle='--')
plt.show()

center validation data

FG_val_centered, TO_val_centered = u_scaler.fit_transform(FG_val), y_scaler.fit_transform(TO_val)

get model's 1-step ahead and 5-step ahead predictions, and simulation responses on validation dataset

from sippy import functionset as fset

TO_val_predicted_1step_centered = np.transpose(fset.validation(ARXmodel, FG_val_centered, TO_val_centered, np.linspace(0,299,300), k=1))
TO_val_predicted_5step_centered = np.transpose(fset.validation(ARXmodel, FG_val_centered, TO_val_centered, np.linspace(0,299,300), k=5))
TO_val_simulated_centered, _, _ = control.matlab.lsim(ARXmodel.G, FG_val_centered[:, 0], np.linspace(0,299,300))

TO_val_predicted_1step = y_scaler.inverse_transform(TO_val_predicted_1step_centered)
TO_val_predicted_5step = y_scaler.inverse_transform(TO_val_predicted_5step_centered)
TO_val_simulated = y_scaler.inverse_transform(TO_val_simulated_centered[:, None])

plt.figure(figsize=(18,7.5))
plt.plot(TO_val, 'g', linewidth=0.8, label='Measurements')
plt.plot(TO_val_predicted_1step, 'r', linewidth=0.8, label='1-step ahead predictions')
plt.plot(TO_val_predicted_5step, 'c', linewidth=0.8, label='5-step ahead predictions')
plt.plot(TO_val_simulated, 'm', linewidth=0.8, label='Simulation response')
plt.title('Validation data')
plt.ylabel('y(k): Measured vs predicted/simulated'), plt.xlabel('k')
plt.legend(), plt.xlim(0)
plt.grid(which='both', axis='y', linestyle='--')
plt.show()

Best regards.

Selma

@RBdC
Copy link
Collaborator

RBdC commented Jan 17, 2024

Dear Selma,
thank you for your new post, the data, and the code.
I ran it and worked ok.

What is the role of "Time" in fset.validation?
The explanation in document is quite confusing, since it has k which delineate time-delay.
Is "Time" to allocate time stamp to every samples?
--> Time is indeed the time vector to define the prediction of your model.

Do I correctly understand?
for example, we have u and y which are of 100 samples, respectively.
Time should be np.linspace(0,99,100).
Then, from the start point, samples of u and y have time stamp (0-th, 1-th, .... 99-th).
Running fset.validation(ARX, u, y, np.linspace(0,99,100), k=1) results predicted y (0+k-th, 1+k-th, .. 99-k-th).
--> Yes, you predict the output of the model k-steps ahead.

Best
RBdC

@RBdC RBdC closed this as completed Jan 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants