[BUG] Iteration problem when using ensemble SINDy and GeneralizedLibrary #158

Closed
BMP-TUD opened this issue Jan 27, 2022 · 14 comments
Labels: bug (Something isn't working)

@BMP-TUD
BMP-TUD commented Jan 27, 2022

Dear all,

Sorry for reaching out to you again after you already cleared up a few things with the WeakPDELibrary, but I have encountered a problem with the GeneralizedLibrary. I am creating one large library that contains only specific terms drawn from a set of 6 smaller libraries. I have a set of 5 variables, but not all variables appear in every library. This works perfectly fine with the GeneralizedLibrary, and I can fit the model normally. Here is how I create the library (I suppose there is an easier way):

    A_T=1
    P_T=1
    G_T=1
    E_T=3
    
    lf0 = [
        lambda x : x,
        ]
    lf0_names = [
        lambda x : x,
        ]
    l0=ps.CustomLibrary(library_functions=lf0,function_names=lf0_names,include_bias=False)
    
    lf1 = [
        lambda x,y : x*y,
        ]
    lf1_names = [
        lambda x,y : x+y,
        ]
    l1=ps.CustomLibrary(library_functions=lf1,function_names=lf1_names,include_bias=True)
    
    lf2=[
        lambda x: A_T-x]
    lf2_names=[
        lambda x: 'A_T-'+x]
    l2=ps.CustomLibrary(library_functions=lf2, function_names=lf2_names, include_bias=False)
    
    lf3=[
        lambda x: P_T-x]
    lf3_names=[
        lambda x: 'P_T-'+x]
    l3=ps.CustomLibrary(library_functions=lf3, function_names=lf3_names, include_bias=False)
    
    lf4=[
        lambda x: G_T-x]
    lf4_names=[
        lambda x: 'G_T-'+x]
    l4=ps.CustomLibrary(library_functions=lf4, function_names=lf4_names, include_bias=False)
    
    lf5=[
        lambda x,y: E_T-x-y]
    lf5_names=[
        lambda x,y: 'E_T-'+x+'-'+y]
    l5=ps.CustomLibrary(library_functions=lf5, function_names=lf5_names, include_bias=False)
    
   
    
    inputs_temp=np.tile([0,1,2,3,4],6)
    inputs_per_library = np.reshape(inputs_temp, (6, 5))
    inputs_per_library[2, :] = 0 # library 2 sees only input 0
    inputs_per_library[3, :] = 4 # library 3 sees only input 4
    inputs_per_library[4, :] = 3 # library 4 sees only input 3
    inputs_per_library[5, :3] = 3 # library 5 sees only inputs 3 and 4
    tensor_array=[[1,0,1,0,0,0],[1,0,0,1,0,0], [1,0,0,0,1,0], [1,0,0,0,0,1], [0,0,1,1,0,0], [0,0,1,0,1,0], [0,0,1,0,0,1],
                  [0,0,0,1,1,0],[0,0,0,1,0,1], [0,0,0,0,1,1]]
    custom_library=ps.GeneralizedLibrary(libraries=[l0,l1,l2,l3,l4,l5],inputs_per_library=inputs_per_library, tensor_array=tensor_array)
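For reference, the column routing that inputs_per_library performs can be sketched with plain NumPy. This is a minimal illustration, not pysindy code: the slicing below mirrors the `x[:, np.unique(inputs_per_library[i, :])]` line that appears in the traceback further down, using the same array values as in the snippet above.

```python
import numpy as np

# Rebuild the routing array from the snippet above: 6 libraries x 5 inputs.
inputs_per_library = np.reshape(np.tile([0, 1, 2, 3, 4], 6), (6, 5))
inputs_per_library[2, :] = 0
inputs_per_library[3, :] = 4
inputs_per_library[4, :] = 3
inputs_per_library[5, :3] = 3

# Dummy data: 100 samples, 5 state variables.
x = np.random.default_rng(0).random((100, 5))

# Library i only sees the input columns listed in row i
# (this mirrors the slicing inside GeneralizedLibrary.fit).
for i in range(inputs_per_library.shape[0]):
    cols = np.unique(inputs_per_library[i, :])
    x_i = x[:, cols]
    print(i, cols, x_i.shape)
```

Rows 0 and 1 pass all five columns through; rows 2-4 each collapse to a single column; row 5 is [3, 3, 3, 3, 4], so its library sees inputs 3 and 4.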

The problem occurs when I try to run the ensemble method together with the generalized library as follows (x_noisy consists of 5 variables):

ensemble_optimizer = ps.STLSQ(threshold=0.03, alpha=lam)
custom_library = mass_conservation_library()
model = ps.SINDy(feature_names=feature_names, optimizer=ensemble_optimizer, feature_library=custom_library)
model.fit(x_noisy, t=dt, ensemble=True, n_models=n_models, quiet=True)

The error that I get is the following:

  File "XXX.py", line 348, in <module>
    model.fit(x_noisy_t, t=dt, ensemble=True , n_models=n_models, quiet=True)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\pysindy\pysindy.py", line 471, in fit
    self.model.fit(x_ensemble, x_dot_ensemble)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\sklearn\pipeline.py", line 390, in fit
    Xt = self._fit(X, y, **fit_params_steps)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\sklearn\pipeline.py", line 348, in _fit
    X, fitted_transformer = fit_transform_one_cached(

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\joblib\memory.py", line 349, in __call__
    return self.func(*args, **kwargs)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\sklearn\pipeline.py", line 893, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\sklearn\base.py", line 855, in fit_transform
    return self.fit(X, y, **fit_params).transform(X)

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\pysindy\feature_library\generalized_library.py", line 202, in fit
    fitted_libs = [

  File "C:\XXX\Anaconda3\envs\mamc\lib\site-packages\pysindy\feature_library\generalized_library.py", line 203, in <listcomp>
    lib.fit(x[:, np.unique(self.inputs_per_library_[i, :])], y)

IndexError: index 6 is out of bounds for axis 0 with size 6

The problem is that I tried to look at what is going on in the code, but changing anything there always breaks the GeneralizedLibrary script. Furthermore, I am confused that this did not happen when I ran the analysis with plain SINDy, because the library is the same in both cases. Maybe you can tell me whether I am doing something wrong, or what the issue could be. Thank you again in advance; I am looking forward to your answer.

Best wishes,
Bartosz

@akaptano
Collaborator

Hi Bartosz, glad you're using all the new functionality. What version of PySINDy are you using? We just corrected an ensembling-related bug in going from version 1.5.1 to 1.6.1.

@BMP-TUD
Author

BMP-TUD commented Jan 27, 2022

Thank you for the quick response; I am using the newest version, 1.6.1.

@akaptano
Collaborator

Good, let me try to reproduce this now.

@akaptano
Collaborator

akaptano commented Jan 27, 2022

Small oversight on our part. It looks like when ensemble=True the library is fit once per ensemble member, so the tensored libraries are appended on the first fit and the second call of fit gets confused. I have fixed this and will push the changes in a moment, so you'll need to get the latest version of the main branch. Let me know if you need anything else; here is the silly example I used to debug:

import numpy as np
import pysindy as ps
from scipy.integrate import solve_ivp

def lorenz(t, x, sigma=10, beta=2.66667, rho=28):
    return [
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
        0,
        0,
    ]
integrator_keywords = {}
integrator_keywords['rtol'] = 1e-12
integrator_keywords['method'] = 'LSODA'
integrator_keywords['atol'] = 1e-12

dt = .002
t_train = np.arange(0, 10, dt)
x0_train = [-8, 8, 27, 10, 4]
t_train_span = (t_train[0], t_train[-1])
x_train = solve_ivp(lorenz, t_train_span, x0_train, 
                    t_eval=t_train, **integrator_keywords).y.T

A_T=1
P_T=1
G_T=1
E_T=3

lf0 = [
    lambda x : x,
    ]
lf0_names = [
    lambda x : x,
    ]
l0=ps.CustomLibrary(library_functions=lf0,function_names=lf0_names,include_bias=False)

lf1 = [
    lambda x,y : x*y,
    ]
lf1_names = [
    lambda x,y : x+y,
    ]
l1=ps.CustomLibrary(library_functions=lf1,function_names=lf1_names,include_bias=True)

lf2=[
    lambda x: A_T-x]
lf2_names=[
    lambda x: 'A_T-'+x]
l2=ps.CustomLibrary(library_functions=lf2, function_names=lf2_names, include_bias=False)

lf3=[
    lambda x: P_T-x]
lf3_names=[
    lambda x: 'P_T-'+x]
l3=ps.CustomLibrary(library_functions=lf3, function_names=lf3_names, include_bias=False)

lf4=[
    lambda x: G_T-x]
lf4_names=[
    lambda x: 'G_T-'+x]
l4=ps.CustomLibrary(library_functions=lf4, function_names=lf4_names, include_bias=False)

lf5=[
    lambda x,y: E_T-x-y]
lf5_names=[
    lambda x,y: 'E_T-'+x+'-'+y]
l5=ps.CustomLibrary(library_functions=lf5, function_names=lf5_names, include_bias=False)



inputs_temp=np.tile([0,1,2,3,4],6)
inputs_per_library = np.reshape(inputs_temp, (6, 5))
inputs_per_library[2, :] = 0 # library 2 sees only input 0
inputs_per_library[3, :] = 4 # library 3 sees only input 4
inputs_per_library[4, :] = 3 # library 4 sees only input 3
inputs_per_library[5, :3] = 3 # library 5 sees only inputs 3 and 4
tensor_array=[[1,0,1,0,0,0],[1,0,0,1,0,0], [1,0,0,0,1,0], 
              [1,0,0,0,0,1], [0,0,1,1,0,0], [0,0,1,0,1,0], [0,0,1,0,0,1],
              [0,0,0,1,1,0],[0,0,0,1,0,1], [0,0,0,0,1,1]]
custom_library=ps.GeneralizedLibrary(libraries=[l0,l1,l2,l3,l4,l5],
                                     inputs_per_library=inputs_per_library, 
                                     tensor_array=tensor_array)
lam = 0 
n_models = 20
feature_names = ['x', 'y', 'z', 'v', 'w']
ensemble_optimizer = ps.STLSQ(threshold=0.03, alpha=lam)
model = ps.SINDy(feature_names=feature_names, 
                 optimizer=ensemble_optimizer, 
                 feature_library=custom_library)
model.fit(x_train, t=dt, ensemble=True, n_models=n_models, quiet=True)
model.print()
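The failure mode described above can also be reproduced in isolation. The toy class below is a hypothetical sketch, not the actual pysindy source: it only models the pattern where fit() appends the tensored libraries to the same list it iterates over, so a second fit() indexes past the rows of inputs_per_library and raises the same IndexError as in the report.

```python
import numpy as np

class ToyGeneralizedLibrary:
    """Toy stand-in for the buggy behavior (not the real pysindy code)."""

    def __init__(self, inputs_per_library, n_tensored):
        self.libraries_ = list(range(inputs_per_library.shape[0]))
        self.inputs_per_library_ = inputs_per_library
        self.n_tensored = n_tensored

    def fit(self, x):
        # Slice the inputs for each library, as in the traceback.
        fitted = [
            x[:, np.unique(self.inputs_per_library_[i, :])]
            for i in range(len(self.libraries_))
        ]
        # Bug: the tensored libraries are appended to the same list the
        # loop above iterates over, so a second fit() runs off the end
        # of inputs_per_library_, which still has only 6 rows.
        self.libraries_ += ["tensored"] * self.n_tensored
        return fitted

inputs_per_library = np.reshape(np.tile([0, 1, 2, 3, 4], 6), (6, 5))
x = np.zeros((100, 5))

lib = ToyGeneralizedLibrary(inputs_per_library, n_tensored=10)
lib.fit(x)      # first fit (plain SINDy) works
try:
    lib.fit(x)  # second fit (ensembling refits the library) fails
except IndexError as err:
    print(err)  # index 6 is out of bounds for axis 0 with size 6
```

This also explains why plain SINDy worked: fit() is called only once there, so the list never grows past the routing array.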

@BMP-TUD
Author

BMP-TUD commented Jan 27, 2022

Hi, and thank you for the fast response and for fixing the issue; I really appreciate it. 👍

@BMP-TUD
Author

BMP-TUD commented Jan 27, 2022

Hi, sorry for the possibly misleading answer before (I still appreciate your work ;) ), but I tried it now with the new 1.6.2 push and it does not seem to work: I get the same error message. I compared the code just now, and nothing has changed. Is the push already through, or has something gone wrong? Thanks, bw Bartosz

@akaptano
Collaborator

akaptano commented Jan 27, 2022

Are you cloning the GitHub repo directly, or using something else? The 1.6.2 push did not have the update; there is another git commit after it that contains the fix. If you are not cloning the repo directly, let me know and I will make a v1.6.3 release so you can get it.

@BMP-TUD
Author

BMP-TUD commented Jan 28, 2022

I normally use pip to manage my packages, or conda if necessary. I can just git-clone it and have it straight away, but maybe it would be good to have the fix accessible through pip or conda as well. Thank you for your help :)

@akaptano
Collaborator

Just published a new release (v1.6.3) with the change. Let me know if that does not fix your issue!

@znicolaou
Collaborator

Thanks for the progress on this, @akaptano! It looks like the last push and release went part of the way toward addressing this issue, but I had to add another line to the generalized library for the weak case (to set the number of samples to the number of domains times the number of trajectories, instead of x.shape[0]). I pushed this change to the weak_optimization branch for now.

I think some combinations of multiple_trajectories and nonweak/weak libraries may still cause issues, and we should probably run a few more tests before pushing another release. @BMP-TUD, I suggest you clone the GitHub repository and check out the weak_optimization branch for now. You can also use pip to install the development version, as in:

git clone https://github.com/dynamicslab/pysindy.git
cd pysindy
git checkout weak_optimization
pip install .

Since you're using the weak libraries, you will also benefit from faster fitting in the weak_optimization branch, but you should remove the num_pts_per_domain option when you declare the WeakPDELibrary, since it is being eliminated. See pull request #153, which I'm linking to this issue for now. Sometime soon we will merge this with master and create a new release.

@znicolaou reopened this Jan 28, 2022
@akaptano
Collaborator

@znicolaou I think you're right, but @BMP-TUD, if you aren't using the weak form functionality, I think you should be able to get away with using v1.6.3 for now.

@znicolaou
Collaborator

Oops, I had a typo in that last push, which I have now fixed, so I pushed again. FYI, I am also trying to use the generalized library with multiple trajectories to fit an integro-differential equation with the weak form library, so hopefully I can flesh out all these issues as I go.

@BMP-TUD
Author

BMP-TUD commented Feb 3, 2022

Thank you both. I accessed the package via git clone, and it works perfectly. :)

@akaptano
Collaborator

akaptano commented May 3, 2022

I believe this was addressed in the new release. Closing this now.

akaptano closed this as completed May 3, 2022
jpcurbelo pushed a commit to jpcurbelo/pysindy_fork that referenced this issue May 9, 2024