Regarding prediction and identified coefficient of input/output models. #52

SelmaCho · 2024-01-15T06:53:54Z

Dear SIPPY staff,

I am a beginner of SIPPY to use it for developing MPC framework.

I am studying system identification and SIPPY based on the book published by A. Kumar and Flores-Cerrillo, "Machine learning in python for dynamic process systems".

While I read this book and try some example codes, I wonders how to predict new output of ARX and extract coefficients with difference equation form.

In the manual, it is written that using fset.validation is proper to predict new output; However, its description is ambiguous.
I would appreciate you all if you let me know how to predict the new output or extract coefficients with difference equation format.

Here is my toy problem code:

from sippy import system_identification
from sippy import functionset as fset
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import r2_score # model diagnostics
from statsmodels.graphics.tsaplots import plot_acf # model diagnostics

temp = pd.read_csv('simpleProcess.csv')
data = np.array(temp)
u = data[:,0, None]; y = data[:,1, None]

u_scaler = StandardScaler(with_std=False)
y_scaler = StandardScaler(with_std=False)
u_centered = u_scaler.fit_transform(u)
y_centered = y_scaler.fit_transform(y)

Id_ARX = system_identification(y_centered,u_centered, 'ARX', ARX_orders=[3,3,2]) # y(k)-y(k-1)-y(k-2)-y(k-3)=u(k-1)+u(k-2)+u(k-3)
y_centered_pred = np.transpose(Id_ARX.Yid)
y_centered_pred2 = fset.validation(Id_ARX, y_centered, u_centered, np.linspace(1,999,999), 1)

plt.figure()
plt.plot(y_centered, 'o', label='raw data')
plt.plot(y_centered_pred, '--', label='ARX fit')
plt.legend(); plt.xlabel('k'); plt.ylabel('y')
print('Fit accuracy=', r2_score(y_centered,y_centered_pred))

res = y_centered-y_centered_pred
plot_acf(res,lags=50)
plt.xlabel('lag')

ValueError Traceback (most recent call last)
Cell In[112], line 21
19 Id_ARX = system_identification(y_centered,u_centered, 'ARX', ARX_orders=[3,3,2]) # y(k)-y(k-1)-y(k-2)-y(k-3)=u(k-1)+u(k-2)+u(k-3)
20 y_centered_pred = np.transpose(Id_ARX.Yid)
---> 21 y_centered_pred2 = fset.validation(Id_ARX, y_centered, u_centered, np.linspace(1,999,999), 1)
23 plt.figure()
24 plt.plot(y_centered, 'o', label='raw data')

ValueError: A value (0.0) in x_new is below the interpolation range's minimum value (1.0).

RBdC · 2024-01-15T12:40:32Z

Dear SelmaCho,
to test your code and check the error,
you need to share your data file ('simpleProcess.csv').. send it to us by mail, if the case.

Please consider that you find the output of your model - obtained with the LLS approach - within the output attribute as Id_ARX.Yid;
the ARX coefficients in the difference equation form can be extracted from the variable Id_ARX.G (which is a transfer function relating input u and output y) with a simple arrangement, recalling that yk-t = z^-t*y, where z is the shift operator.

The syntax for fset.validation - usually to be used for model validation on new data, not the same dataset used for model identification - appears wrong, as input/output position needs to be swapped, as:
fset.validation(Id_ARX, u_centered, y_centered, np.linspace(1,999,999), 1)

Best
RBdC

SelmaCho · 2024-01-15T13:02:26Z

simpleProcess.csv

Thank you for your kind reply, RBDC.

I've attached it by github link because I cannot find your email address ^^;;
I have more questions.

Regarding yk-t = z^(-t*y)
the Id_ARX.G is following. Sorry for my lack knowledge, I couldn't get an idea to extract the coefficient from this transfer function.
Can you give me a hint?

(0.756 z^2 + 0.08442 z + 0.1733) / (z^5 - 0.2474 z^4 - 0.1489 z^3 - 0.1318 z^2)

dt = 1.0

The syntax for fset.validation
As you kindly explained, I changed the location of inputs. But it has still problem related to interpolation.
fset.validation(Id_ARX, u_centered, y_centered, np.linspace(1,999,999), 1)

The reason that I need to use a method-function for predicting new output is to optimize manipulated variables..
can fset.validation be used like sklearn's predict method?

ValueError Traceback (most recent call last)
Cell In[126], line 1
----> 1 fset.validation(Id_ARX, u_centered, y_centered, np.linspace(1,999,999), 1)

File ~\Desktop\Python\SysID\sippy\functionset.py:192, in validation(SYS, u, y, Time, k, centering)
189 for i in range(ydim):
190 # one-step ahead predictor
191 if k == 1:
--> 192 T, Y_u = cnt.forced_response((1/SYS.H[i,0])*SYS.G[i,:], Time, u)
193 T, Y_y = cnt.forced_response(1 - (1/SYS.H[i,0]), Time, y[i,:] - y_rif[i])
194 Yval[i,:] = (Y_u + np.atleast_2d(Y_y) + y_rif[i])

File ~\anaconda3\envs\SysIDBox\Lib\site-packages\control\timeresp.py:407, in forced_response(sys, T, U, X0, transpose, interpolate, squeeze)
403 dsys = (A, B, C, D, sys_dt)
405 # Use signal processing toolbox for the discrete time simulation
406 # Transpose the input to match toolbox convention
--> 407 tout, yout, xout = sp.signal.dlsim(dsys, np.transpose(U), T, X0)
409 if not interpolate:
410 # If dt is different from sys.dt, resample the output
411 inc = int(round(dt / sys_dt))

File ~\anaconda3\envs\SysIDBox\Lib\site-packages\scipy\signal_ltisys.py:3556, in dlsim(system, u, t, x0)
3553 u = u[:, np.newaxis]
3555 u_dt_interp = interp1d(t, u.transpose(), copy=False, bounds_error=True)
-> 3556 u_dt = u_dt_interp(tout).transpose()
3558 # Simulate the system
3559 for i in range(0, out_samples - 1):

File ~\anaconda3\envs\SysIDBox\Lib\site-packages\scipy\interpolate_polyint.py:80, in _Interpolator1D.call(self, x)
59 """
60 Evaluate the interpolant
61
(...)
77
78 """
79 x, x_shape = self._prepare_x(x)
---> 80 y = self._evaluate(x)
81 return self._finish_y(y, x_shape)

File ~\anaconda3\envs\SysIDBox\Lib\site-packages\scipy\interpolate_interpolate.py:755, in interp1d._evaluate(self, x_new)
753 y_new = self._call(self, x_new)
754 if not self._extrapolate:
--> 755 below_bounds, above_bounds = self._check_bounds(x_new)
756 if len(y_new) > 0:
757 # Note fill_value must be broadcast up to the proper size
758 # and flattened to work here
759 y_new[below_bounds] = self._fill_value_below

File ~\anaconda3\envs\SysIDBox\Lib\site-packages\scipy\interpolate_interpolate.py:784, in interp1d._check_bounds(self, x_new)
782 if self.bounds_error and below_bounds.any():
783 below_bounds_value = x_new[np.argmax(below_bounds)]
--> 784 raise ValueError("A value ({}) in x_new is below "
785 "the interpolation range's minimum value ({})."
786 .format(below_bounds_value, self.x[0]))
787 if self.bounds_error and above_bounds.any():
788 above_bounds_value = x_new[np.argmax(above_bounds)]

ValueError: A value (0.0) in x_new is below the interpolation range's minimum value (1.0).

Regards

Selma Cho

SelmaCho · 2024-01-15T13:15:06Z

I found the answer about question 1).
tf = (0.756 z^2 + 0.08442 z + 0.1733) / (z^5 - 0.2474 z^4 - 0.1489 z^3 - 0.1318 z^2)

y[k]-0.2474y[k-5]-0.1489y[k-4]-0.1318y[k-3] = 0.1733u[k-1]+0.08442u[k-2]+0.756u[k-3]

Really. thanks.

RBdC · 2024-01-15T14:54:27Z

Hi SelmaCho,

about Q1, you are close, but the right solution is as follows:

The general difference equation is:
yk + a1 yk- 1 + ... + ana yk-na = b1 uk-1-theta + .... + bnb uk-nb-theta

here you have
from tf = (0.756 z^2 + 0.08442 z + 0.1733) / (z^5 - 0.2474 z^4 - 0.1489 z^3 - 0.1318 z^2)
y[k]-0.2474y[k-1]-0.1489y[k-2]-0.1318y[k-3] = 0.756u[k-1-2]+0.08442u[k-2-2]+0.1733u[k-3-2]
since you set ARX_orders=[3,3,2], you have 3 coef.s in NUM (na =3) and 3 coefs. in DEN (nb =3) ,
and theta (the time-delay) is indeed 2

about Q2, I run the code, and I did get no errors.. (e.g., Fit accuracy= 0.8843802125888518)
I see ..."control\timeresp.py", so guess your error comes from an "old issue"..
Please refer to this topic: #42
" we noted that the cnt.lsim function changed its input format depending on which version you have.
Testing EX_ARMAX and EX_ARMAX_MIMO both on control versions 0.8.3 and 0.9.0 it worked without errors. "

Let us know which version of control you have (control.version) and if updating it solves the problem.

Best
RBdC

SelmaCho · 2024-01-16T00:33:31Z

Dear RBdC
Thank you for your kindly reply..
About Q2, I re-installed control 0.9.0 but it still has problem.
Also, control 0.8.3 version has a lot of problem like numpy integer problem and compatibility.
It seems Scipy's error since error appears on scipy interpolate problem.

Regards
Selma

RBdC · 2024-01-16T09:06:28Z

Dear Selma,
thanks for your specification.
Yeah, control is a bad beast in terms of compatibility.
Consider it works for me with control 0.8.3 and scipy 1.4.1
Best
RBdC

SelmaCho · 2024-01-17T05:27:14Z

Dear RBdC

Thank you. Another example does not show an error when using control version 0.9.0 and scipy version 1.11.1.

I have one more question.

What is the role of "Time" in fset.validation?
The explanation in document is quite confusing, since it has k which delineate time-delay.
Is "Time" to allocate time stamp to every samples?

Do I correctly understand?
for example, we have u and y which are of 100 samples, respectively.
Time should be np.linspace(0,99,100).
Then, from the start point, samples of u and y have time stamp (0-th, 1-th, .... 99-th).
Running fset.validation(ARX, u, y, np.linspace(0,99,100), k=1) results predicted y (0+k-th, 1+k-th, .. 99-k-th).

P.S. Here is a code and data which does not cause an error.
IndustrialFiredHeater_SISO_Validation.csv
IndustrialFiredHeater_SISO.csv

--

import packages

import matplotlib.pyplot as plt, numpy as np, control
from sklearn.preprocessing import StandardScaler
from sippy import system_identification as SysID
from statsmodels.tsa.stattools import ccf
from sippy import functionset as fset

package settings

plt.rcParams.update({'font.size': 14})
#assert(control.version < '0.9'), "To avoid errors, downgrade the control package to a version < 0.9.0. See #48 for details."

read fired heater identification test data and plot

data = np.loadtxt('IndustrialFiredHeater_SISO.csv', delimiter=',')
FG = data[:,0, None]
TO = data[:,1, None]

plots

plt.figure(figsize=(6,2.5)), plt.plot(FG, 'steelblue', linewidth=0.8)
plt.ylabel('u(k)'), plt.xlabel('k'), plt.xlim(0)
plt.grid(which='both', axis='y', linestyle='--')

plt.figure(figsize=(6,2.5)), plt.plot(TO, 'g', linewidth=0.8)
plt.ylabel('y(k)'), plt.xlabel('k'), plt.xlim(0)
plt.grid(which='both', axis='y', linestyle='--')
plt.show()

center data before model fitting

u_scaler = StandardScaler(with_std=False)
FG_centered = u_scaler.fit_transform(FG)

y_scaler = StandardScaler(with_std=False)
TO_centered = y_scaler.fit_transform(TO)

fit ARX model using AIC

ARXmodel = SysID(TO_centered, FG_centered, 'ARX', IC='AIC', na_ord=[1,20], nb_ord=[1,20], delays=[0,5])
print(ARXmodel.G)

read fired heater validation data and plot

data_val = np.loadtxt('IndustrialFiredHeater_SISO_Validation.csv', delimiter=',')
FG_val, TO_val = data_val[:,0, None], data_val[:,1, None]

plots

plt.figure(figsize=(6,2.5))
plt.plot(FG_val, 'steelblue', linewidth=0.8)
plt.ylabel('u(k) validation'), plt.xlabel('k'), plt.xlim(0)
plt.grid(which='both', axis='y', linestyle='--')

plt.figure(figsize=(6,2.5))
plt.plot(TO_val, 'g', linewidth=0.8)
plt.ylabel('y(k) validation'), plt.xlabel('k'), plt.xlim(0)
plt.grid(which='both', axis='y', linestyle='--')
plt.show()

center validation data

FG_val_centered, TO_val_centered = u_scaler.fit_transform(FG_val), y_scaler.fit_transform(TO_val)

get model's 1-step ahead and 5-step ahead predictions, and simulation responses on validation dataset

from sippy import functionset as fset

TO_val_predicted_1step_centered = np.transpose(fset.validation(ARXmodel, FG_val_centered, TO_val_centered, np.linspace(0,299,300), k=1))
TO_val_predicted_5step_centered = np.transpose(fset.validation(ARXmodel, FG_val_centered, TO_val_centered, np.linspace(0,299,300), k=5))
TO_val_simulated_centered, _, _ = control.matlab.lsim(ARXmodel.G, FG_val_centered[:, 0], np.linspace(0,299,300))

TO_val_predicted_1step = y_scaler.inverse_transform(TO_val_predicted_1step_centered)
TO_val_predicted_5step = y_scaler.inverse_transform(TO_val_predicted_5step_centered)
TO_val_simulated = y_scaler.inverse_transform(TO_val_simulated_centered[:, None])

plt.figure(figsize=(18,7.5))
plt.plot(TO_val, 'g', linewidth=0.8, label='Measurements')
plt.plot(TO_val_predicted_1step, 'r', linewidth=0.8, label='1-step ahead predictions')
plt.plot(TO_val_predicted_5step, 'c', linewidth=0.8, label='5-step ahead predictions')
plt.plot(TO_val_simulated, 'm', linewidth=0.8, label='Simulation response')
plt.title('Validation data')
plt.ylabel('y(k): Measured vs predicted/simulated'), plt.xlabel('k')
plt.legend(), plt.xlim(0)
plt.grid(which='both', axis='y', linestyle='--')
plt.show()

Best regards.

Selma

RBdC · 2024-01-17T14:36:26Z

Dear Selma,
thank you for your new post, the data, and the code.
I ran it and worked ok.

What is the role of "Time" in fset.validation?
The explanation in document is quite confusing, since it has k which delineate time-delay.
Is "Time" to allocate time stamp to every samples?
--> Time is indeed the time vector to define the prediction of your model.

Do I correctly understand?
for example, we have u and y which are of 100 samples, respectively.
Time should be np.linspace(0,99,100).
Then, from the start point, samples of u and y have time stamp (0-th, 1-th, .... 99-th).
Running fset.validation(ARX, u, y, np.linspace(0,99,100), k=1) results predicted y (0+k-th, 1+k-th, .. 99-k-th).
--> Yes, you predict the output of the model k-steps ahead.

Best
RBdC

RBdC closed this as completed Jan 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Regarding prediction and identified coefficient of input/output models. #52

Regarding prediction and identified coefficient of input/output models. #52

SelmaCho commented Jan 15, 2024

RBdC commented Jan 15, 2024

SelmaCho commented Jan 15, 2024 •

edited

Loading

SelmaCho commented Jan 15, 2024 •

edited

Loading

RBdC commented Jan 15, 2024

SelmaCho commented Jan 16, 2024 •

edited

Loading

RBdC commented Jan 16, 2024 •

edited

Loading

SelmaCho commented Jan 17, 2024

RBdC commented Jan 17, 2024

Regarding prediction and identified coefficient of input/output models. #52

Regarding prediction and identified coefficient of input/output models. #52

Comments

SelmaCho commented Jan 15, 2024

RBdC commented Jan 15, 2024

SelmaCho commented Jan 15, 2024 • edited Loading

SelmaCho commented Jan 15, 2024 • edited Loading

RBdC commented Jan 15, 2024

Hi SelmaCho,

SelmaCho commented Jan 16, 2024 • edited Loading

RBdC commented Jan 16, 2024 • edited Loading

SelmaCho commented Jan 17, 2024

import packages

package settings

read fired heater identification test data and plot

plots

center data before model fitting

fit ARX model using AIC

read fired heater validation data and plot

plots

center validation data

get model's 1-step ahead and 5-step ahead predictions, and simulation responses on validation dataset

RBdC commented Jan 17, 2024

SelmaCho commented Jan 15, 2024 •

edited

Loading

SelmaCho commented Jan 15, 2024 •

edited

Loading

SelmaCho commented Jan 16, 2024 •

edited

Loading

RBdC commented Jan 16, 2024 •

edited

Loading