[BUG] Iteration problem when using ensemble SINDy and GeneralizedLibrary #158

Closed
BMP-TUD opened this issue Jan 27, 2022 · 14 comments
Labels: bug (Something isn't working)

@BMP-TUD
BMP-TUD commented Jan 27, 2022

Dear all,

Sorry for reaching out to you again after you already cleared up a few things with the WeakPDELibrary, but I have encountered a problem with the GeneralizedLibrary. I am creating one large library that contains only specific terms drawn from a set of 6 smaller libraries. I have a set of 5 variables, but not all variables appear in every library. This works perfectly fine with the GeneralizedLibrary, and I can fit the model normally. Here is how I create the library (I suppose there is an easier way):

    A_T=1
    P_T=1
    G_T=1
    E_T=3
    
    lf0 = [
        lambda x : x,
        ]
    lf0_names = [
        lambda x : x,
        ]
    l0=ps.CustomLibrary(library_functions=lf0,function_names=lf0_names,include_bias=False)
    
    lf1 = [
        lambda x,y : x*y,
        ]
    lf1_names = [
        lambda x,y : x+y,
        ]
    l1=ps.CustomLibrary(library_functions=lf1,function_names=lf1_names,include_bias=True)
    
    lf2=[
        lambda x: A_T-x]
    lf2_names=[
        lambda x: 'A_T-'+x]
    l2=ps.CustomLibrary(library_functions=lf2, function_names=lf2_names, include_bias=False)
    
    lf3=[
        lambda x: P_T-x]
    lf3_names=[
        lambda x: 'P_T-'+x]
    l3=ps.CustomLibrary(library_functions=lf3, function_names=lf3_names, include_bias=False)
    
    lf4=[
        lambda x: G_T-x]
    lf4_names=[
        lambda x: 'G_T-'+x]
    l4=ps.CustomLibrary(library_functions=lf4, function_names=lf4_names, include_bias=False)
    
    lf5=[
        lambda x,y: E_T-x-y]
    lf5_names=[
        lambda x,y: 'E_T-'+x+'-'+y]
    l5=ps.CustomLibrary(library_functions=lf5, function_names=lf5_names, include_bias=False)
    
   
    
    inputs_temp=np.tile([0,1,2,3,4],6)
    inputs_per_library = np.reshape(inputs_temp, (6, 5))
    inputs_per_library[2, :] = 0 # library 2 sees only input 0
    inputs_per_library[3, :] = 4 # library 3 sees only input 4
    inputs_per_library[4, :] = 3 # library 4 sees only input 3
    inputs_per_library[5, :3] = 3 # library 5 sees only inputs 3 and 4
    tensor_array=[[1,0,1,0,0,0],[1,0,0,1,0,0], [1,0,0,0,1,0], [1,0,0,0,0,1], [0,0,1,1,0,0], [0,0,1,0,1,0], [0,0,1,0,0,1],
                  [0,0,0,1,1,0],[0,0,0,1,0,1], [0,0,0,0,1,1]]
    custom_library=ps.GeneralizedLibrary(libraries=[l0,l1,l2,l3,l4,l5],inputs_per_library=inputs_per_library, tensor_array=tensor_array)
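For reference, the column routing that inputs_per_library performs can be sketched with plain NumPy. This is a minimal illustration, not pysindy code: the slicing below mirrors the `x[:, np.unique(inputs_per_library[i, :])]` line that appears in the traceback further down, using the same array values as in the snippet above.

```python
import numpy as np

# Rebuild the routing array from the snippet above: 6 libraries x 5 inputs.
inputs_per_library = np.reshape(np.tile([0, 1, 2, 3, 4], 6), (6, 5))
inputs_per_library[2, :] = 0
inputs_per_library[3, :] = 4
inputs_per_library[4, :] = 3
inputs_per_library[5, :3] = 3

# Dummy data: 100 samples, 5 state variables.
x = np.random.default_rng(0).random((100, 5))

# Library i only sees the input columns listed in row i
# (this mirrors the slicing inside GeneralizedLibrary.fit).
for i in range(inputs_per_library.shape[0]):
    cols = np.unique(inputs_per_library[i, :])
    x_i = x[:, cols]
    print(i, cols, x_i.shape)
```

Rows 0 and 1 pass all five columns through; rows 2-4 each collapse to a single column; row 5 is [3, 3, 3, 3, 4], so its library sees inputs 3 and 4.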

The problem occurs when I try to run the ensemble method together with the generalized library as follows (x_noisy consists of 5 variables):

ensemble_optimizer = ps.STLSQ(threshold=0.03, alpha=lam)
custom_library = mass_conservation_library()
model = ps.SINDy(feature_names=feature_names, optimizer=ensemble_optimizer, feature_library=custom_library)
model.fit(x_noisy, t=dt, ensemble=True, n_models=n_models, quiet=True)

The error that I get is the following:

  File "XXX.py", line 348, in <module>
    model.fit(x_noisy_t, t=dt, ensemble=True , n_models=n_models, quiet=True)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\pysindy\pysindy.py", line 471, in fit
    self.model.fit(x_ensemble, x_dot_ensemble)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\sklearn\pipeline.py", line 390, in fit
    Xt = self._fit(X, y, **fit_params_steps)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\sklearn\pipeline.py", line 348, in _fit
    X, fitted_transformer = fit_transform_one_cached(

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\joblib\memory.py", line 349, in __call__
    return self.func(*args, **kwargs)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\sklearn\pipeline.py", line 893, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\sklearn\base.py", line 855, in fit_transform
    return self.fit(X, y, **fit_params).transform(X)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\pysindy\feature_library\generalized_library.py", line 202, in fit
    fitted_libs = [

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\pysindy\feature_library\generalized_library.py", line 203, in <listcomp>
    lib.fit(x[:, np.unique(self.inputs_per_library_[i, :])], y)

IndexError: index 6 is out of bounds for axis 0 with size 6

The problem is that I tried to look at what is going on in the code, but changing anything there always breaks the GeneralizedLibrary script. Furthermore, I am confused that this did not happen when I ran the analysis with plain SINDy, because the library is the same in both cases. Maybe you can tell me whether I am doing something wrong, or what the issue could be. Thank you again in advance; I am looking forward to your answer.

Best wishes,
Bartosz

@akaptano
Collaborator

Hi Bartosz, glad you're using all the new functionality. What version of PySINDy are you using? We just corrected an ensembling-related bug in going from version 1.5.1 to 1.6.1.

@BMP-TUD
Author

BMP-TUD commented Jan 27, 2022

Thank you for the quick response; I am using the newest version, 1.6.1.

@akaptano
Collaborator

Good, let me try to reproduce this now.

@akaptano
Collaborator

akaptano commented Jan 27, 2022

Small oversight on our part. It looks like when ensemble=True the library is fit once per ensemble member, so the tensored libraries are appended on the first fit and the second call of fit gets confused. I have fixed this and will push the changes in a moment, so you'll need to get the latest version of the main branch. Let me know if you need anything else; here is the silly example I used to debug:

import numpy as np
import pysindy as ps
from scipy.integrate import solve_ivp

def lorenz(t, x, sigma=10, beta=2.66667, rho=28):
    return [
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
        0,
        0,
    ]
integrator_keywords = {}
integrator_keywords['rtol'] = 1e-12
integrator_keywords['method'] = 'LSODA'
integrator_keywords['atol'] = 1e-12

dt = .002
t_train = np.arange(0, 10, dt)
x0_train = [-8, 8, 27, 10, 4]
t_train_span = (t_train[0], t_train[-1])
x_train = solve_ivp(lorenz, t_train_span, x0_train, 
                    t_eval=t_train, **integrator_keywords).y.T

A_T=1
P_T=1
G_T=1
E_T=3

lf0 = [
    lambda x : x,
    ]
lf0_names = [
    lambda x : x,
    ]
l0=ps.CustomLibrary(library_functions=lf0,function_names=lf0_names,include_bias=False)

lf1 = [
    lambda x,y : x*y,
    ]
lf1_names = [
    lambda x,y : x+y,
    ]
l1=ps.CustomLibrary(library_functions=lf1,function_names=lf1_names,include_bias=True)

lf2=[
    lambda x: A_T-x]
lf2_names=[
    lambda x: 'A_T-'+x]
l2=ps.CustomLibrary(library_functions=lf2, function_names=lf2_names, include_bias=False)

lf3=[
    lambda x: P_T-x]
lf3_names=[
    lambda x: 'P_T-'+x]
l3=ps.CustomLibrary(library_functions=lf3, function_names=lf3_names, include_bias=False)

lf4=[
    lambda x: G_T-x]
lf4_names=[
    lambda x: 'G_T-'+x]
l4=ps.CustomLibrary(library_functions=lf4, function_names=lf4_names, include_bias=False)

lf5=[
    lambda x,y: E_T-x-y]
lf5_names=[
    lambda x,y: 'E_T-'+x+'-'+y]
l5=ps.CustomLibrary(library_functions=lf5, function_names=lf5_names, include_bias=False)



inputs_temp=np.tile([0,1,2,3,4],6)
inputs_per_library = np.reshape(inputs_temp, (6, 5))
inputs_per_library[2, :] = 0 # library 2 sees only input 0
inputs_per_library[3, :] = 4 # library 3 sees only input 4
inputs_per_library[4, :] = 3 # library 4 sees only input 3
inputs_per_library[5, :3] = 3 # library 5 sees only inputs 3 and 4
tensor_array=[[1,0,1,0,0,0],[1,0,0,1,0,0], [1,0,0,0,1,0], 
              [1,0,0,0,0,1], [0,0,1,1,0,0], [0,0,1,0,1,0], [0,0,1,0,0,1],
              [0,0,0,1,1,0],[0,0,0,1,0,1], [0,0,0,0,1,1]]
custom_library=ps.GeneralizedLibrary(libraries=[l0,l1,l2,l3,l4,l5],
                                     inputs_per_library=inputs_per_library, 
                                     tensor_array=tensor_array)
lam = 0 
n_models = 20
feature_names = ['x', 'y', 'z', 'v', 'w']
ensemble_optimizer = ps.STLSQ(threshold=0.03, alpha=lam)
model = ps.SINDy(feature_names=feature_names, 
                 optimizer=ensemble_optimizer, 
                 feature_library=custom_library)
model.fit(x_train, t=dt, ensemble=True, n_models=n_models, quiet=True)
model.print()
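The failure mode described above can also be reproduced in isolation. The toy class below is a hypothetical sketch, not the actual pysindy source: it only models the pattern where fit() appends the tensored libraries to the same list it iterates over, so a second fit() indexes past the rows of inputs_per_library and raises the same IndexError as in the report.

```python
import numpy as np

class ToyGeneralizedLibrary:
    """Toy stand-in for the buggy behavior (not the real pysindy code)."""

    def __init__(self, inputs_per_library, n_tensored):
        self.libraries_ = list(range(inputs_per_library.shape[0]))
        self.inputs_per_library_ = inputs_per_library
        self.n_tensored = n_tensored

    def fit(self, x):
        # Slice the inputs for each library, as in the traceback.
        fitted = [
            x[:, np.unique(self.inputs_per_library_[i, :])]
            for i in range(len(self.libraries_))
        ]
        # Bug: the tensored libraries are appended to the same list the
        # loop above iterates over, so a second fit() runs off the end
        # of inputs_per_library_, which still has only 6 rows.
        self.libraries_ += ["tensored"] * self.n_tensored
        return fitted

inputs_per_library = np.reshape(np.tile([0, 1, 2, 3, 4], 6), (6, 5))
x = np.zeros((100, 5))

lib = ToyGeneralizedLibrary(inputs_per_library, n_tensored=10)
lib.fit(x)      # first fit (plain SINDy) works
try:
    lib.fit(x)  # second fit (ensembling refits the library) fails
except IndexError as err:
    print(err)  # index 6 is out of bounds for axis 0 with size 6
```

This also explains why plain SINDy worked: fit() is called only once there, so the list never grows past the routing array.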

@BMP-TUD
Author

BMP-TUD commented Jan 27, 2022

Hi, and thank you for the fast response and for fixing the issue; I really appreciate it. 👍

@BMP-TUD
Author

BMP-TUD commented Jan 27, 2022

Hi, sorry for the possibly misleading answer before (I still appreciate your work ;) ), but I tried it now with the new 1.6.2 push and it does not seem to work: I get the same error message. I compared the code just now, and nothing has changed. Is the push already through, or has something gone wrong? Thanks, bw Bartosz

@akaptano
Collaborator

akaptano commented Jan 27, 2022

Are you cloning the GitHub repo directly, or using something else? The 1.6.2 push did not have the update; there is another git commit after it that contains the fix. If you are not cloning the repo directly, let me know and I will make a v1.6.3 release so you can get it.

@BMP-TUD
Author

BMP-TUD commented Jan 28, 2022

I normally use pip to manage my packages, or conda if necessary. I can just git-clone it and have it straight away, but maybe it would be good to have the fix accessible through pip or conda as well. Thank you for your help :)

@akaptano
Collaborator

Just published a new release (v1.6.3) with the change. Let me know if that does not fix your issue!

@znicolaou
Collaborator

Thanks for the progress on this, @akaptano! It looks like the last push and release went part of the way toward addressing this issue, but I had to add another line to the generalized library for the weak case (to set the number of samples to the number of domains times the number of trajectories, instead of x.shape[0]). I pushed this change to the weak_optimization branch for now.

I think some combinations of multiple_trajectories and nonweak/weak libraries may still cause issues, and we should probably run a few more tests before pushing another release. @BMP-TUD, I suggest you clone the GitHub repository and check out the weak_optimization branch for now. You can also use pip to install the development version, as in:

git clone https://github.com/dynamicslab/pysindy.git
cd pysindy
git checkout weak_optimization
pip install .

Since you're using the weak libraries, you will also benefit from faster fitting in the weak_optimization branch, but you should remove the num_pts_per_domain option when you declare the WeakPDELibrary, since it is being eliminated. See pull request #153, which I'm linking to this issue for now. Sometime soon we will merge this with master and create a new release.

@znicolaou reopened this Jan 28, 2022
@akaptano
Collaborator

@znicolaou I think you're right, but @BMP-TUD, if you aren't using the weak form functionality, I think you should be able to get away with using v1.6.3 for now.

@znicolaou
Collaborator

Oops, I had a typo in that last push, which I have now fixed, so I pushed again. FYI, I am also trying to use the generalized library with multiple trajectories to fit an integro-differential equation with the weak form library, so hopefully I can flesh out all these issues as I go.

@BMP-TUD
Author

BMP-TUD commented Feb 3, 2022

Thank you both. I accessed the package via git clone, and it works perfectly. :)

@akaptano
Collaborator

akaptano commented May 3, 2022

I believe this was addressed in the new release. Closing this now.

akaptano closed this as completed May 3, 2022
jpcurbelo pushed a commit to jpcurbelo/pysindy_fork that referenced this issue May 9, 2024