model.score does not work with WeakPDELibrary #155

Closed
BMP-TUD opened this issue Jan 25, 2022 · 6 comments

BMP-TUD commented Jan 25, 2022

Hi, I noticed that when I analyse a spatiotemporal data set with a weak PDE formulation, the score function fails with an error about the dimensions of the data passed to predict. An excerpt of the error message is below. My data has spatial dimensions 300x300, 50 time points, and two variables, so u is an array of shape (300, 300, 50, 2).

error_pred=model.score(u)
Traceback (most recent call last):

  File "C:\Users\u0149745\AppData\Local\Temp/ipykernel_12676/3774757058.py", line 1, in <module>
    error_pred=model.score(u, metric=abs_error)

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\pysindy\pysindy.py", line 799, in score
    x_dot_predict = self.model.predict(x)

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\sklearn\utils\metaestimators.py", line 113, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)  # noqa

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\sklearn\pipeline.py", line 469, in predict
    Xt = transform.transform(Xt)

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\pysindy\feature_library\weak_pde_library.py", line 473, in transform
    x = check_array(x)

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\sklearn\utils\validation.py", line 794, in check_array
    raise ValueError(

ValueError: Found array with dim 4. Estimator expected <= 2.

If I modify the data to estimate the score for the time series of one point, the error message is as follows:

error_pred=model.score(u[100,100,:,:])
Traceback (most recent call last):

  File "C:\Users\u0149745\AppData\Local\Temp/ipykernel_12676/1236244835.py", line 1, in <module>
    error_pred=model.score(u[100,100,:,:])

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\pysindy\pysindy.py", line 799, in score
    x_dot_predict = self.model.predict(x)

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\sklearn\utils\metaestimators.py", line 113, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)  # noqa

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\sklearn\pipeline.py", line 469, in predict
    Xt = transform.transform(Xt)

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\pysindy\feature_library\weak_pde_library.py", line 492, in transform
    x_full = np.reshape(

  File "<__array_function__ internals>", line 180, in reshape

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\numpy\core\fromnumeric.py", line 298, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)

  File "C:\Users\u0149745\Anaconda3\envs\spirals\lib\site-packages\numpy\core\fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)

ValueError: cannot reshape array of size 200 into shape (1,300,300,50,2)

Interestingly, this works perfectly fine if I apply the normal PDELibrary to the same data set. Can you give me a hint where the problem could lie?

@akaptano
Collaborator

This is actually by design, although we could easily add some code to fix it. In the weak form, the code no longer computes x_dot itself; it computes integral(x_dot * w) = -integral(w_dot * x) (integration by parts, with the test function w vanishing on the boundary of each subdomain) and then does the regression on those integrals. score usually computes |x_dot_true - x_dot_pred|, but here x_dot_true is an integral quantity with different dimensions than the input data. I'll look into fixing this; in the meantime, the Example 12 notebook shows a "hack" to get around it.

Best,
Alan
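
For readers following along, here is a minimal sketch of the integration-by-parts identity mentioned above, in plain Python. It is not pysindy code: the signal sin(3t), the polynomial test function (1 - t^2)^p, and the helper name trapezoid are all assumptions chosen for illustration.

```python
import math

def trapezoid(y, dt):
    """Composite trapezoidal rule on a uniform grid."""
    return dt * (sum(y) - 0.5 * (y[0] + y[-1]))

# Uniform grid on a single time subdomain [-1, 1].
n = 2001
dt = 2.0 / (n - 1)
t = [-1.0 + i * dt for i in range(n)]

# Illustrative signal and its exact derivative (assumed, not from the issue).
x = [math.sin(3.0 * ti) for ti in t]
x_dot = [3.0 * math.cos(3.0 * ti) for ti in t]

# Polynomial test function that vanishes at both endpoints, and its derivative.
p = 4
w = [(1.0 - ti**2) ** p for ti in t]
w_dot = [-2.0 * p * ti * (1.0 - ti**2) ** (p - 1) for ti in t]

# integral(w * x_dot) needs the derivative of the data ...
lhs = trapezoid([wi * xd for wi, xd in zip(w, x_dot)], dt)
# ... while -integral(w_dot * x) only needs the data itself.
rhs = -trapezoid([wd * xi for wd, xi in zip(w_dot, x)], dt)

print(lhs, rhs)
```

The two printed values agree up to quadrature error. Each subdomain contributes one such number, so the regression target in the weak form has one row per subdomain rather than one row per grid point, which is why it cannot be compared directly with a pointwise x_dot.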

akaptano added the bug label Jan 25, 2022
@znicolaou
Collaborator

I'm working on some performance improvements to the WeakPDELibrary now in the weak_optimization branch. It should be easy to make the score function transform the data into the weak-form integrals in the weak case. I will add it with my next commit!

@akaptano
Collaborator

akaptano commented Jan 25, 2022

@znicolaou
A note here:

  1. If we compare the weak form of x_dot between the true and predicted models, this is really a different metric from what model.score usually reports. That is a bit frustrating, since it may confuse users, or report a good fit of integral(x_dot) even when the fit of x_dot itself is rather poor.
  2. If we compare the derivative forms of x_dot, then we have to decide which differentiation scheme to use to generate them. Usually the user specifies this (or leaves it at the default), but in the weak form case we might just have to default to finite differences. This has the advantage of being the same as model.score with other libraries, but it could also be misleading, since the weak form might be a good fit while the derivative form isn't.

It might be a good idea to report both metrics with the weak form models.
Any thoughts on this?
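
To make the two options concrete, here is a rough plain-Python sketch that computes both metrics for one toy problem. Everything in it is an assumption for illustration (the ODE x_dot = -x, the noise level, the subdomain layout, the test function, and the helper names r2 and trapezoid); it does not use or reproduce the pysindy implementation.

```python
import math
import random

def r2(y_true, y_pred):
    """Coefficient of determination (R^2)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

def trapezoid(y, dt):
    """Composite trapezoidal rule on a uniform grid."""
    return dt * (sum(y) - 0.5 * (y[0] + y[-1]))

random.seed(0)
dt = 0.01
n = 501
t = [i * dt for i in range(n)]
# Noisy samples of x(t) = exp(-t), which satisfies x_dot = -x.
x = [math.exp(-ti) + random.gauss(0.0, 0.01) for ti in t]

# Model being scored (here the true one), evaluated on the noisy data.
x_dot_model = [-xi for xi in x]

# Option 2: derivative form, with centered finite differences on interior points.
x_dot_fd = [(x[i + 1] - x[i - 1]) / (2.0 * dt) for i in range(1, n - 1)]
score_fd = r2(x_dot_fd, x_dot_model[1:-1])

# Option 1: weak form, one integral per (overlapping) subdomain.
half = 50  # grid points per subdomain half-width
p = 4      # exponent of the polynomial test function
weak_true, weak_model = [], []
for c in range(half, n - half, half):
    idx = range(c - half, c + half + 1)
    s = [(t[i] - t[c]) / (half * dt) for i in idx]  # local coordinate in [-1, 1]
    w = [(1.0 - si**2) ** p for si in s]
    w_dot = [-2.0 * p * si * (1.0 - si**2) ** (p - 1) / (half * dt) for si in s]
    # "True" weak derivative: -integral(w_dot * x), no differentiation of data.
    weak_true.append(-trapezoid([wd * x[i] for wd, i in zip(w_dot, idx)], dt))
    # Model weak derivative: integral(w * f(x)) with f(x) = -x.
    weak_model.append(trapezoid([wi * x_dot_model[i] for wi, i in zip(w, idx)], dt))
score_weak = r2(weak_true, weak_model)

print("finite-difference score:", score_fd)
print("weak-form score:", score_weak)
```

Under these assumptions the weak-form score comes out close to 1, while the finite-difference score is much lower because differentiating the noisy data amplifies the noise, even though the model is exact. That gap is one argument for reporting both numbers.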

@znicolaou
Collaborator

@akaptano

You're right, there are some ambiguities about what the score means in the weak case.

  1. I think it would be very unlikely for the integral(x_dot) to have a "small" error if the model is not accurate, but the meaning of the score would probably not be clear to the user. Some normalization over the number of domains K and the number of grid points in the spatiotemporal grid may make the scores from the PDELibrary and the WeakPDELibrary roughly comparable for non-noisy data.
  2. Using the derivative form with FiniteDifferences is not a bad option, but it may be misleading for noisy data (which is exactly what the weak form is trying to account for).

I'll run some benchmarks at some point to see what the differences are.

@BMP-TUD
Author

BMP-TUD commented Jan 26, 2022

Dear Alan and Zachary,

Thank you for your remarks and for answering my questions so fast. Indeed, I forgot that you presented a workaround in Example 12. Thanks also for sharing your ideas on the issues with weak-form score estimation.

Best wishes,
Bartosz

@akaptano
Collaborator

akaptano commented May 3, 2022

Addressed in the new release, closing this now.

akaptano closed this as completed May 3, 2022