
Enhanced subdomain integration for the weak form library #153

Merged
28 commits merged into master on Apr 30, 2022

Conversation

znicolaou
Collaborator

I implemented significant improvements to the weak form library! The performance increase depends on the number of spatial dimensions and the space-time resolution, but the example notebook 12_weakform_SINDy_examples.ipynb achieves better results than the old version in about 1/10th the runtime.

The old implementation interpolated the input onto a regular grid of points within each domain cell, as specified by num_pts_per_domain. These interpolated values were multiplied by corresponding derivatives of the polynomial test functions and integrated using the trapezoid rule to produce the output features.

The crucial observation here is that the linear interpolation and trapezoid rule approximate the integral of a piecewise polynomial in the large num_pts_per_domain limit. There is no need to interpolate the data--the integral of the polynomials can be performed exactly on each interval separating space-time points. By deriving the exact expressions for the integrals of the derivatives of the test functions at the space-time points of the input data, we can convert the integral into a weighted sum over the input data. This reduces the number of evaluations to the number of space-time points in the domain cells while achieving the accuracy of the num_pts_per_domain → ∞ limit. It also allows for vectorized evaluation of each domain-cell integral.
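To illustrate the observation in one dimension: because the linearly interpolated data is f(x) = Σ_j f_j φ_j(x) with hat functions φ_j, the integral against a polynomial test function w collapses exactly into a weighted sum over the samples, ∫ w f dx = Σ_j f_j ∫ w φ_j dx = Σ_j weights_j f_j. A minimal sketch (hypothetical helper, not the pysindy code; it computes the weights with Gauss-Legendre quadrature for a generic polynomial w, whereas the PR derives closed-form expressions):

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def piecewise_linear_weights(x, w, deg=8):
    """Exact weights for int w(x) f(x) dx, with f piecewise linear on grid x.
    Gauss-Legendre with deg nodes is exact here since w * phi_j is polynomial."""
    nodes, gw = leggauss(deg)
    weights = np.zeros_like(x)
    for j in range(len(x) - 1):
        a, b = x[j], x[j + 1]
        t = 0.5 * (b - a) * nodes + 0.5 * (a + b)  # map Gauss nodes to [a, b]
        scale = 0.5 * (b - a)
        # contributions of the two hat functions supported on [a, b]
        weights[j] += scale * np.sum(gw * w(t) * (b - t) / (b - a))
        weights[j + 1] += scale * np.sum(gw * w(t) * (t - a) / (b - a))
    return weights

x = np.linspace(-1.0, 1.0, 51)
w = lambda t: (1.0 - t**2) ** 2       # polynomial test function
f = np.sin(np.pi * x)                 # sampled data
feature = np.sum(piecewise_linear_weights(x, w) * f)  # one weak-form feature
```

The weighted sum reproduces the num_pts_per_domain → ∞ answer with one pass over the samples, which is the source of the speedup.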

In the new code, we find the indices self.inds_k[k] of the spatiotemporal grid that lie within domain cell k (which was recentered and resized from self.H_xt to self.H_xt_k[k], as determined by the first and last points along each axis). The weights weights[k] are determined from the derivative orders of each term and self.xtilde_k[k], which is a rescaled version of the coordinates XT_k[k]=self.spatiotemporal_grid[np.ix_(*self.inds_k[k])]. The values x_k[k]=x_shaped[np.ix_(*self.inds_k[k])] are weighted and summed to evaluate the integrals np.sum(x_k[k]*weights[k]).
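A schematic of the per-cell evaluation just described, with placeholder data and uniform weights (the real WeakPDELibrary computes the weights from the exact test-function integrals; variable names mirror the description but the internals differ in detail):

```python
import numpy as np

rng = np.random.default_rng(0)
x_shaped = rng.random((20, 30))                 # data on an (nx, nt) grid
spatiotemporal_grid = rng.random((20, 30, 2))   # hypothetical coordinate grid

# per-axis indices of the grid points inside one domain cell k
inds_k = [np.arange(3, 9), np.arange(10, 18)]

XT_k = spatiotemporal_grid[np.ix_(*inds_k)]     # cell coordinates, (6, 8, 2)
x_k = x_shaped[np.ix_(*inds_k)]                 # cell data values, (6, 8)

# placeholder uniform weights, just to show the final contraction; the real
# weights encode the derivative orders of each library term
weights = np.full(x_k.shape, 1.0 / x_k.size)
integral = np.sum(x_k * weights)                # one weak-form feature
```

np.ix_ builds an open mesh from the per-axis index lists, so each cell is extracted as a dense block that can be contracted against its weight tensor in one vectorized operation.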

I am pretty happy with the code now, but there are still opportunities for improvement. Since the size of self.inds_k[k] varies with k, it's challenging to vectorize further, but it's probably possible to do so. If we performed a cumsum on a flattened version of the ragged array x_k*weights, then the sum within each cell could be determined from differences with some clever indexing. This is a little tricky to write, so I haven't checked whether it would help. Also, the current implementation uses randomly placed domain cells, but it may be worth looking into slicing the whole spatiotemporal grid into a regular grid of cells.
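The cumsum idea can be sketched as follows (hypothetical data, not current pysindy code): if each cell k contributes a contiguous run of length len_k in the flattened x_k*weights array, the per-cell sums fall out of differences of the cumulative sum, with no Python loop over cells.

```python
import numpy as np

rng = np.random.default_rng(0)
lengths = np.array([4, 7, 3, 5])                 # points per domain cell
flat = rng.standard_normal(lengths.sum())        # flattened x_k * weights

# offset of each cell's first element in the flat array
starts = np.concatenate(([0], np.cumsum(lengths)[:-1]))

# cumsum-and-difference: per-cell sums via clever indexing
c = np.concatenate(([0.0], np.cumsum(flat)))
cell_sums = c[starts + lengths] - c[starts]

# equivalent, and arguably cleaner, with a segmented reduction
cell_sums_alt = np.add.reduceat(flat, starts)
```

np.add.reduceat performs the same segmented reduction directly and avoids the (minor) floating-point cancellation that long cumulative sums can introduce.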

@znicolaou added the enhancement (New feature or request) label on Jan 20, 2022
@codecov-commenter

codecov-commenter commented Jan 20, 2022

Codecov Report

Merging #153 (0748ec5) into master (9792beb) will decrease coverage by 1.16%.
The diff coverage is 86.79%.

@@            Coverage Diff             @@
##           master     #153      +/-   ##
==========================================
- Coverage   94.96%   93.79%   -1.17%     
==========================================
  Files          33       33              
  Lines        3178     3322     +144     
==========================================
+ Hits         3018     3116      +98     
- Misses        160      206      +46     
Impacted Files Coverage Δ
pysindy/utils/__init__.py 100.00% <ø> (ø)
pysindy/utils/odes.py 98.93% <0.00%> (ø)
pysindy/feature_library/base.py 92.51% <44.44%> (-3.17%) ⬇️
pysindy/pysindy.py 88.07% <65.06%> (-6.10%) ⬇️
pysindy/utils/base.py 89.76% <76.92%> (-3.22%) ⬇️
pysindy/feature_library/weak_pde_library.py 96.37% <97.98%> (-0.60%) ⬇️
pysindy/feature_library/generalized_library.py 96.99% <100.00%> (+0.05%) ⬆️
pysindy/optimizers/constrained_sr3.py 91.94% <100.00%> (ø)
pysindy/optimizers/frols.py 97.61% <100.00%> (ø)
pysindy/optimizers/sr3.py 94.20% <100.00%> (ø)
... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@akaptano
Collaborator

Hey Zach, are you free for a quick zoom call today? I'm free all day and think it will be easier for me to understand exactly what you did here. Looks amazing though!

@znicolaou
Collaborator Author

Sure, happy to chat after lunch today. How about 2?

@akaptano
Collaborator

sounds good! I'll shoot you a zoom link on the Slack if that works

@znicolaou
Collaborator Author

FYI, after benchmarking a little, at least half of the runtime is spent calculating the weights now, which is done in _set_up_grids when you create the WeakPDELibrary object. This is currently accomplished by multiplying the weights along each axis with python loops, and I expect there is a lot of opportunity for vectorization there, but those ragged arrays make it a bit hard.
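The structure of the weight computation suggests one route to vectorization. An illustrative sketch under the assumption that the weights within a cell separate into per-axis factors (names are hypothetical, not _set_up_grids itself): the full weight tensor is then the outer product of the 1D weight vectors, which numpy can form without explicit Python loops over grid points.

```python
import numpy as np
from functools import reduce

# hypothetical 1D weight vectors along the spatial and temporal axes
w_axis = [np.array([0.5, 1.0, 0.5]),
          np.array([0.25, 1.0, 1.0, 0.25])]

# loop version: what nested Python loops over the cell's grid points compute
loop = np.empty((3, 4))
for i in range(3):
    for j in range(4):
        loop[i, j] = w_axis[0][i] * w_axis[1][j]

# vectorized outer product of the per-axis factors
vec = np.einsum("i,j->ij", *w_axis)

# generalizes to any number of axes without hand-written loops
vec_nd = reduce(np.multiply.outer, w_axis)
```

The ragged cell sizes still force a loop over cells k, but the inner per-axis multiplication is then a single einsum/outer call per cell.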

@akaptano
Collaborator

Sounds good. Heads up, other work has picked up substantially, so it might take me a few weeks to get to this. At least you can use it in the meantime for your coarse-grain models. Let me know if there is some aspect that is time-sensitive and I'll try to get to that sooner.

@znicolaou
Collaborator Author

Ok, sounds good! I will keep using it and may make a few changes to this branch as I go.

@znicolaou
Collaborator Author

@akaptano I saw all those issues with the generalized and tensor libraries on the issues -- looks like there's some organizing to do! Also, noticed the branch on the temporal derivatives enhancement, which seems like a good idea. This last week got busy, but hopefully I can get back to finishing up this enhancement soon. I'll try to sort out the places in the tensor/generalized libraries where n_samples needs to be set, and make sure it works with multiple_trajectories too!

@akaptano
Collaborator

akaptano commented Feb 5, 2022

Thanks a lot, and I will try to help in a week or two. Just thought we might as well combine those changes with this pull request, and make a new release when we are confident we have resolved those errors + the new weak form stuff is working well.

@akaptano
Collaborator

akaptano commented Apr 24, 2022

First of all, amazing job and thanks very much for these additions!!!

Pushed some minor changes to the notebook and the weak library.

Some notes and things to do for both of us (although I will leave better documenting the new techniques for you):

  1. set_up_grids has gotten much more substantial. Could you add some documentation in set_up_grids and elsewhere, adding comments and describing the high-level view of what you are doing with the gridding?
  2. Add descriptions for _left_weights and _right_weights. Add comments above nested for loops around line 868 (over dx_k_j) describing what is done here.
  3. WeakPDELibrary class description should be extended significantly to detail exactly how you are doing the weak form now.
  4. Fix Issue #155 (model.score does not work with WeakPDELibrary)
  5. Fix Issue #158 ([BUG] Iteration problem when using ensemble SINDy and GenerlizedLibrary) (looks like this was already done?)
  6. Double check we addressed the problem brought up in Issue #159 (Weak SINDY for higher order ODEs): "Just a heads up for the weak formulation update... currently the TensoredLibrary does not allow for weak terms (and therefore neither does the GeneralizedLibrary if any of the libraries are tensored together) and this would be a good fix. The fix is straightforward... basically just need to make sure "n_samples" is correct, since this number changes in the weak form. I made a hack-ish fix to this branch so that it works for Bhumika's application."
  7. Maybe save this for a different pull request... get rid of SINDy-PI and incorporate a flag that decides whether or not to include the temporal derivatives in the PDELibrary and WeakPDELibrary (and deals with avoiding the trivial fit u_dot = d/dt u). This could also address Issue #168 (Extrapolating 1D PDE with discovered PDE and constraint of known timepoints).
  8. Double check this is compatible/fixed: Issue #164 (Request for advice about shape of control_features when using PDELibrary)
  9. Fix, or add an example that addresses, Issue #170 (Including derivatives of non-linear terms in PDELibrary)

Please update the pull request as you check off items here

Alan Kaptanoglu added 2 commits April 24, 2022 16:14
… now is more stringent about linting. This commit attempts to get the pysindy code up to the new standards.
@akaptano
Collaborator

Number 8 on the list completed, although other issues with GeneralizedLibrary + ensembling + control inputs are probably still lingering.

@znicolaou
Collaborator Author

Thanks for all the organizing, Alan! I just finished up adding comments and descriptions for numbers 1-3, and pushing in a moment. I think I'll check out number 4 and 9 either later tonight or tomorrow!

@akaptano
Collaborator

Awesome job, I'll be done with the changes getting 6 and 8 working in a moment. I would put items 7 and 9 on the low-priority list, although I think item 7 is big payoff for minimal extra work.

@znicolaou
Collaborator Author

I took a first pass at 4 above, implementing the score and predict in the weak form. I've only done the weak form equivalents (not using the coefficients in the derivative form to make a score more comparable to the PDELibrary score, as discussed in #155 ). I'm sure some more work needs to be done for the GeneralizedLibrary case and the case with control inputs (I'm not sure if the WeakPDELibrary has been tested at all with control yet). The SINDy.simulate also needs updating.

@akaptano
Collaborator

Cool I'll try to tie up some of these loose ends tomorrow night. Go team!

@znicolaou
Collaborator Author

Okay, I did a pretty big push to organize points 1-4 above!

The score and predict functions are working adequately in the weak case and pass the unit tests, but pysindy.py is a bit disorganized in general. In particular, to evaluate the derivatives x_dot in the fit and score functions when they are not provided by the user, there are many checks and reshapings for weak and generalized libraries, discrete time, and multiple trajectories that are nearly identical across functions. It would probably be better to expand the validate_input and validate_control_variables functions to consolidate all these checks and reshapings. The case where there is only one input_feature and the user provides a 1d array of shape (n_t,) rather than an array of shape (n_t, 1) is also handled with various hacks now. Updating the validate input functions and calling them at the beginning of each function would make things much cleaner.
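The single-feature case could be handled once in a consolidated validator. A hypothetical sketch of the suggestion (not the actual pysindy validate_input):

```python
import numpy as np

def validate_input_sketch(x):
    """Normalize user input to a 2d (n_t, n_features) array in one place,
    replacing the scattered ad hoc reshapings."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:            # single feature passed as a flat (n_t,) array
        x = x[:, np.newaxis]   # reshape to (n_t, 1)
    if x.ndim != 2:
        raise ValueError(f"expected 1d or 2d input, got {x.ndim}d")
    return x
```

Calling a validator like this at the top of fit, predict, and score would let the rest of each function assume a single canonical shape.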

@akaptano
Collaborator

akaptano commented Apr 26, 2022

Completely agree that at some point we will need to invest the work to get all the validation/reshaping fixed. Model.score looks good, but model.predict is predicting only the weak form of x_dot (which is fine for scoring, but not helpful if someone wants to take their fitted model and predict a new trajectory; the same issue affects model.simulate).

I think we could get more useful versions of model.predict and model.simulate (which relies on model.predict) working by building a non-weak model from the weak form (defaulting to finite differences -- this is basically the "hack" I do in the example 12 notebook, but we would be hiding everything in the backend). What is annoying about this is that the PySINDy predict function calls the sklearn predict function through self.model.predict, so somehow we have to fix this sklearn Pipeline object to return the right thing (or circumvent it if the weak form is being used). I am worried about this issue because we recommend users use this method for noisy data, but it is quite unwieldy to take it and run with it. Of course, if PDEs are being identified, users have to go elsewhere for numerical solvers anyway.
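The circumvention being discussed amounts to evaluating the pointwise model directly from the fitted coefficients, bypassing the sklearn Pipeline. A toy sketch under illustrative names (not pysindy's actual internals):

```python
import numpy as np

def predict_pointwise(x, theta, coef):
    """Evaluate x_dot = Theta(x) @ coef.T directly from fitted coefficients.
    theta maps (n_samples, n_features) data to the non-weak feature matrix;
    coef has shape (n_targets, n_library_terms)."""
    return theta(x) @ coef.T

# toy example: library [1, x, x^2] with coefficients fitted for x_dot = 2x
theta = lambda x: np.hstack([np.ones_like(x), x, x**2])
coef = np.array([[0.0, 2.0, 0.0]])
x = np.linspace(0.0, 1.0, 5)[:, None]
x_dot = predict_pointwise(x, theta, coef)   # pointwise prediction, (5, 1)
```

Since the weak-form fit and the pointwise library share the same coefficient vector, a routine like this is all that model.simulate would need for time stepping.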

This might be more than we bargained for, so maybe let's fix item 5 if it is still an issue, and consider doing item 7 before the merge. But I think we are getting very close!

@znicolaou
Collaborator Author

Yeah, sounds good to defer some reorganizing beyond this pull request. I think we've made good progress, and as long as we get things functional, we should move forward.

I also agree predict and simulate in the weak form should be expanded. Note that score in the weak form is currently relying on predict as well. In the score case, it may make sense to report the score based on the weak features, since it will reflect the "denoised" score that the weak form is designed for. But using the weak form for simulate would be confusing indeed. We may want flags to enable multiple behaviors in the weak case, but maybe we should deal with that in the future.

@akaptano
Collaborator

This sounds good, although I'll give the SINDy-PI incorporation into the PDE libraries a go. The SINDy-PI part of the code is really underdeveloped, and I think I'll see how far I can push things in a few hours this week.

Anything else you would like to do/check before the merge? Looks like code coverage has decreased slightly so might be worth some more unit tests of all the validation/reshaping.

@znicolaou
Collaborator Author

I think I am pretty much done for the moment. I just added a little more detail to the class description yesterday. Some more unit tests would probably be good for the reshaping in the weak and control cases, but I'm not sure I have the bandwidth in the next week or two for it. Maybe we organize a few pending tasks and start a new branch to work in after merging?

@akaptano akaptano merged commit d4e64c4 into master Apr 30, 2022
@akaptano akaptano deleted the weak_optimization branch April 30, 2022 20:14
jpcurbelo pushed a commit to jpcurbelo/pysindy_fork that referenced this pull request Apr 30, 2024
Enhanced subdomain integration for the weak form library
jpcurbelo pushed a commit to jpcurbelo/pysindy_fork that referenced this pull request May 9, 2024
* Fix metrics df function

* Fix docstring