Fix numerical instability #149
Conversation
Thanks @iangrooms, this sounds really exciting! 🚀 It is a bit difficult for me to review this PR because there are no notes on how the new method exactly operates. Ian, it would be helpful if you could share notes (if available) and/or provide more comments in the code. That way, I could try to match up theory and code. In addition, I am wondering if we can leverage the verification tests that @rabernat set up in #79 (where we would need verifications not only for kernels but for full filter operators). The rationale is the following: this PR swaps out the central filtering algorithm this package is based on for a new one. Don't we want to test that the new algorithm produces the same (or very similar) filtered fields as the old one?
I think I'll update the Filter Theory section to describe how it works. In exact arithmetic it should produce exactly the same answer as the old method, but the new method is more stable to roundoff errors. Sort of like re-arranging the order of the filter steps.
I probably have to look more closely at this. I guess we could put in some kind of test to compare the result of the new algorithm to the old one, but eventually I think we should just drop the old one and use the new one.
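To illustrate the general point with a generic floating-point example (unrelated to the filter code itself): two algebraically identical evaluation orders can round differently, which is exactly why rearranging the steps can change stability without changing the exact-arithmetic answer.

```python
# Associativity fails under floating-point roundoff: the same sum,
# grouped two ways, gives two different doubles.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
assert left != right  # 0.6000000000000001 vs 0.6
```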
Sounds great! I will hold off on the review until the changes to the filter theory are pushed.
Yes, exactly. As outlined in #79, the idea is to have tests like these:
Codecov Report
```
@@            Coverage Diff             @@
##           master     #149      +/-   ##
==========================================
+ Coverage   97.98%   98.32%   +0.33%
==========================================
  Files           9        9
  Lines        1044     1014      -30
==========================================
- Hits         1023      997      -26
+ Misses         21       17       -4
```
I added a section on the theory that underpins the new iterative algorithm for applying the filter to data. @NoraLoose can you take a look?
.. math:: \mathbf{A} = -\left(\frac{2}{s_{\text{max}}}\Delta + \mathbf{I}\right)

be the discrete Laplacian :math:`\Delta` with a rescaling and a shift.
In principle one could apply the filter to a vector of data :math:`\mathbf{f}` by computing the vectors :math:`T_i(\mathbf{A})\mathbf{f}` and then taking a linear combination with weights :math:`c_i`.
Can we reformulate to avoid confusion with the "vector" Laplacian further down?
Suggestions for how to do that are welcome. We don't want to be so loose with terminology that it's confusing, but at the same time we don't want to be pedantic.
gcm_filters/filter.py
Outdated
```python
T_minus_2 = T_minus_1.copy()
T_minus_1 = T_minus_0.copy()
```
Suggested change:
```diff
- T_minus_2 = T_minus_1.copy()
- T_minus_1 = T_minus_0.copy()
+ T_minus_2 = T_minus_1
+ T_minus_1 = T_minus_0
```
gcm_filters/filter.py
Outdated
```python
uT_minus_2 = uT_minus_1.copy()
uT_minus_1 = uT_minus_0.copy()
vT_minus_2 = vT_minus_1.copy()
vT_minus_1 = vT_minus_0.copy()
```
Suggested change:
```diff
- uT_minus_2 = uT_minus_1.copy()
- uT_minus_1 = uT_minus_0.copy()
- vT_minus_2 = vT_minus_1.copy()
- vT_minus_1 = vT_minus_0.copy()
+ uT_minus_2 = uT_minus_1
+ uT_minus_1 = uT_minus_0
+ vT_minus_2 = vT_minus_1
+ vT_minus_1 = vT_minus_0
```
gcm_filters/filter.py
Outdated
```python
uT_minus_2 = ufield_bar.copy()
vT_minus_2 = vfield_bar.copy()
```
Suggested change:
```diff
- uT_minus_2 = ufield_bar.copy()
- vT_minus_2 = vfield_bar.copy()
+ uT_minus_2 = ufield_bar
+ vT_minus_2 = vfield_bar
```
gcm_filters/filter.py
Outdated
```python
field_bar += (
    temp_l * 2 * np.real(s_b) / np.abs(s_b) ** 2
    + temp_b * 1 / np.abs(s_b) ** 2
)
T_minus_2 = field_bar.copy()
```
Suggested change:
```diff
- T_minus_2 = field_bar.copy()
+ T_minus_2 = field_bar
```
I'm responding here for all of the `.copy()` changes. I'm afraid to remove these because Python might just have `T_minus_2` and `field_bar` point to the same thing, such that every time we update `T_minus_2` it also updates `field_bar`, which we don't want.
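As a side note on the semantics involved, here is a minimal NumPy sketch (generic Python/NumPy behaviour, not the gcm_filters code paths): plain assignment only rebinds a name, so a later `=` does not write back to the other array; the hazard arises only with in-place operators like `+=` on a shared array.

```python
import numpy as np

field_bar = np.zeros(3)

# Rebinding: `T_minus_2 = T_minus_2 + 1.0` builds a brand-new array,
# so the original `field_bar` is left untouched.
T_minus_2 = field_bar
T_minus_2 = T_minus_2 + 1.0
assert field_bar[0] == 0.0

# In-place update: `+=` on a NumPy array mutates the shared buffer,
# so `field_bar` changes too -- this is the hazard `.copy()` guards against.
T_minus_2 = field_bar
T_minus_2 += 1.0
assert field_bar[0] == 1.0
```

So the copies are only strictly needed where the arrays are later mutated in place; whether that applies at each site depends on the surrounding code.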
gcm_filters/filter.py
Outdated
```python
1: {"offset": 2.2, "factor": 0.6, "exponent": 2.5, "max_filter_factor": 67},
2: {"offset": 3.2, "factor": 0.7, "exponent": 2.7, "max_filter_factor": 77},
```
Do we still need `max_filter_factor`? Do we still want to throw numerical instability warnings? What are these values based on?
Looks great @iangrooms! I left some minor comments above. Two bigger comments are:
Co-authored-by: Nora Loose <NoraLoose@users.noreply.github.com>
Thanks for the code review! In response to the two big comments:
The point of the verification tests would be that we don't have to "believe". They would catch any bugs in the derivation or implementation of the new algorithm, which we may have overlooked by simply looking at the code.
Sure, we can make the verification tests their own issue and PR. But I think those should be resolved first, before we can merge this PR. I can help with these tests.
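Such a verification test could look roughly like the following sketch (toy stand-ins and invented names; the real tests were developed separately in #79 and #153): run the old and new filter implementations on the same field and require agreement to near roundoff.

```python
import numpy as np

def check_filters_agree(apply_old, apply_new, field, rtol=1e-6, atol=1e-12):
    """Assert that two filter implementations give (nearly) the same result."""
    np.testing.assert_allclose(apply_new(field), apply_old(field),
                               rtol=rtol, atol=atol)

# Toy stand-ins: two algebraically identical 1-D smoothers written
# with different evaluation orders.
apply_old = lambda f: 0.25 * (np.roll(f, 1) + 2.0 * f + np.roll(f, -1))
apply_new = lambda f: f / 2.0 + (np.roll(f, 1) + np.roll(f, -1)) / 4.0

check_filters_agree(apply_old, apply_new, np.sin(np.linspace(0.0, 6.0, 50)))
```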
I ran the old and new algorithms on the numerical instability notebook. I saved the result of the old algorithm.
Hi @iangrooms, we merged #153 so we can finalize this PR too. Here are some final minor things that have to be done / discussed:
I'll merge the new tests. In the meantime, I do plan to remove the numerical instability notebook as well as any related comments in the docs, and the 'factoring the Gaussian.' It's a big update, so I think it makes sense to bump the version number. The old version will still be available if anyone needs it.
If we remove the "Factoring the Gaussian" section in the docs, we should probably also delete the parameter |
- Removes discussion of the factored/iterated Gaussian filter from the docs
- Removes the numerical instability notebook
- Slightly updates the theory section of the docs
I've removed the numerical instability notebook, removed the factored Gaussian from the docs and code, and merged the latest updates to testing. For some reason building the docs fails (all other tests pass); I noticed that this happened after I pulled in the latest changes from the master branch and before I made my own changes. Not sure how to fix it. Can @NoraLoose or @rabernat take a look?
I can build the docs locally for this branch. Not sure what is going on either.
I am guessing that some version has changed in the readthedocs environment, since we are not pinning specific versions. It looks a bit like pydata/pydata-sphinx-theme#511. I will try to dig deeper. If you want, you can merge this PR and we will fix the docs in a separate PR.
I can also build the docs locally, so if you both agree then I think it's OK to merge this PR and fix the docs separately.
From the beginning we've had problems with numerical stability, i.e. roundoff error that accumulates during the iterative application of the polynomial filter to data. See, for example, #33, #124, and #135, and the things we've tried to improve this, e.g. #67, #130, and #134.
This PR uses exactly the same polynomial approximation of the target filter, but applies the polynomial to the data in a completely different way. The old way was based on factoring the polynomial in terms of its roots and then iteratively applying each factor to the data. The new method evaluates the polynomial using its coordinates in the Chebyshev basis, relying on the three-term recurrence for Chebyshev polynomials.
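As a rough illustration of the new evaluation strategy (a simplified sketch with invented names, not the actual gcm_filters implementation): the three-term recurrence T_k(A)f = 2A T_{k-1}(A)f - T_{k-2}(A)f lets one accumulate the filtered field sum_k c_k T_k(A)f with a single operator application per degree, keeping only the two previous iterates.

```python
import numpy as np

def apply_chebyshev_poly(A, f, coeffs):
    """Apply p(A) f = sum_k coeffs[k] * T_k(A) f via the Chebyshev
    three-term recurrence. Simplified sketch, not gcm_filters itself."""
    T_prev = f                        # T_0(A) f = f
    result = coeffs[0] * T_prev
    if len(coeffs) == 1:
        return result
    T_curr = A @ f                    # T_1(A) f = A f
    result = result + coeffs[1] * T_curr
    for c in coeffs[2:]:
        # T_k(A) f = 2 A T_{k-1}(A) f - T_{k-2}(A) f
        T_prev, T_curr = T_curr, 2 * (A @ T_curr) - T_prev
        result = result + c * T_curr
    return result
```

For a diagonal A this reduces to evaluating the scalar Chebyshev series at each diagonal entry, which makes it easy to check against `numpy.polynomial.chebyshev.chebval`.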
For now I have left the "Numerical Instability" example notebook in, showing that the new code can filter 1/4-degree vorticity data to 10 degrees with no problems. In fact, I have not been able to break the new code in any example I've tried, including the Taper filter with very large filter factors.