Add smooth_radar_mask functionality from steps blending to pca_kalman_filter_blending #528

RubenImhoff · 2025-11-18T09:20:47Z

The current reduced-space ensemble kalman filter blending is working well in our test setup at KNMI. However, we notice some edge artefacts sometimes in the forecast, especially for the first lead times where the nowcast is dominant, e.g. (I believe this was the probability of >0.1mm/h rainfall for all members):

What we think happens is that the transition nowcast - NWP at the edge of the radar domain can be sharp and because the advection step is now all the way at the end of the forecast (probability matching and the Kalman filter correction are before it), you also get discontinuities in the nowcast rainfall if a field is right at the edge of the domain. The smooth_radar_mask_range option that we have already implemented in the STEPS blending procedure might be a solution. This PR implements this option in the pca_kalman_filter_blending code.

@m-rempel, I couldn't add you as a reviewer, but maybe good if you have a look at it as well. :)

…ens_kalman_filter blending

codecov · 2025-11-18T09:32:43Z

Codecov Report

❌ Patch coverage is 95.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.16%. Comparing base (fe3a98a) to head (07725ca).
⚠️ Report is 2 commits behind head on master.

Files with missing lines	Patch %	Lines
pysteps/blending/pca_ens_kalman_filter.py	95.00%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #528      +/-   ##
==========================================
+ Coverage   84.14%   84.16%   +0.02%     
==========================================
  Files         168      168              
  Lines       14507    14540      +33     
==========================================
+ Hits        12207    12238      +31     
- Misses       2300     2302       +2

Flag	Coverage Δ
unit_tests	`84.16% <95.00%> (+0.02%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

mats-knmi · 2025-11-19T11:58:25Z

pysteps/blending/pca_ens_kalman_filter.py

+        # forecast outside the radar domain. Therefore, fill these
+        # areas with NWP, if requested.
+        if self.__forecast_state.config.smooth_radar_mask_range != 0:
+            nan_indices = np.isnan(


I would call this nan_mask in stead of nan_indices, as this contains a boolean mask array.

It doesn't work yet

…dar mask

m-rempel · 2025-11-19T19:24:28Z

Hi @RubenImhoff !

Thank you for that nice additional work! Please keep in mind, that I've followed the strategy to store all the nan values in the variable no_data_mask and replaced these nan values within the forecast loop with thr_min. The idea behind was that the Kalman filter does all the weighting---even beyond the radar domain. So, please test whether there are actually nan values within your defined mask.

Regarding your example, actually, I've not tested the case where the domain edge is equal or smaller than the radar coverage, because in my cases thr_min values are replicated at the edges instead of precipitation values.

I will have a closer look on that tomorrow.

RubenImhoff · 2025-11-20T07:58:42Z

Hi @m-rempel, thanks for the feedback! Yes, we also saw that the majority of the problem originates from the fact that with the extrapolation_kwargs["map_coordinates_mode"] = "nearest" method, the rainfall at the edges of the domain is copied and kept during the advection steps. This results in the error you see above.

For now, the way to fix it is, is to use extrapolation_kwargs["map_coordinates_mode"] = "constant", but that can give nans at the edges if extrap_kwargs["interp_order"] = 3 (as in your example), or hard transitions with a fixed value. What was your reason to go for extrap_kwargs["interp_order"] = 3"?

Our solution for now is to try some smoothing at the edges, but that may not be necessary anymore if we find a better solution to fill the transition at the edges well. A fixed value would do or perhaps a smart resampling. What do you think?

m-rempel · 2025-11-20T09:21:08Z

Hi @RubenImhoff , I have to interpolate the NWC fields at each forecast time step to do the combination since I use the intrinsic motion of the NWP precipitation fields. Therefore my intention for the bicubic interpolation was to reduce interpolation artefacts within quasi-stationary situations.

May be one could think about an improved way to do the position allocation of each precipitation value for the PCA to overcome that forecast step-wise interpolation. However, I don't have an idea to do that in a sufficient amount of time.

pysteps/blending/pca_ens_kalman_filter.py

Co-authored-by: mats-knmi <145579783+mats-knmi@users.noreply.github.com>

RubenImhoff · 2025-11-20T10:10:19Z

Hi @RubenImhoff , I have to interpolate the NWC fields at each forecast time step to do the combination since I use the intrinsic motion of the NWP precipitation fields. Therefore my intention for the bicubic interpolation was to reduce interpolation artefacts within quasi-stationary situations.

May be one could think about an improved way to do the position allocation of each precipitation value for the PCA to overcome that forecast step-wise interpolation. However, I don't have an idea to do that in a sufficient amount of time.

Thanks, @m-rempel! Then we continue with trying to make work this minor improvement at the edges and if we can think of an even better idea to fix this, we can implement that later.

m-rempel · 2025-11-20T10:48:27Z

Hi @RubenImhoff and @mats-knmi , can you give my latest changes a try in your environment? The smooth mask should work now for nwc_prediction_btf.

RubenImhoff · 2025-11-20T12:30:13Z

Thanks, @m-rempel! @mats-knmi is back in the office on Monday, so we will have to wait for his test case for a little longer.
In the meantime, we saw that the blending moves faster to NWP with the current setup, possibly because we are throwing in NWP at the nans at the edges. I was thinking, would it make sense to change self.__forecast_state.nwc_prediction[self.__ens_member][nan_mask] = nwp[nan_mask] to a constant and to fill it with NWP once we are in the 'if' statement with smooth_radar_mask_range?

m-rempel · 2025-11-20T12:51:22Z

Yep @RubenImhoff , that would be a better strategy. I'll change this today in the evening.

m-rempel · 2025-11-20T18:58:32Z

It's a little bit more complex than I thought, but it works now for timesteps > 2. The actual observation doesn't contain the NWP data and timesteps 1 and 2 have the same data due to the pure advection step without any correction at the beginning.

RubenImhoff · 2025-11-21T13:53:14Z

Really nice work, @m-rempel! We will try it out early next week. :)
Let's see if that does the trick.

We also saw that, in our test case for the Netherlands, the ensemble spread of the blended foreccast is much smaller than of the ensemble nowcasts for the same case (for the first hour or so). Is that something you are familiar with? If it is fixable with a parameter setting, we will try that out. If not, we will make an issues.

m-rempel · 2025-11-21T14:39:26Z

Yep, I saw that also @RubenImhoff. There are at least two reasons for that: First, the spread of the analysis ensemble is in general smaller than the spread of background and observation ensemble. Second, spread generation due to STEPS is suppressed by areas without precipitation in NWP forecast as well as by the probability matching at each timestep.

For the latter, I have some improvements.

RubenImhoff · 2025-11-21T14:41:21Z

Good to know, @m-rempel!

mats-knmi · 2025-11-25T07:25:36Z

Yep, I saw that also @RubenImhoff. There are at least two reasons for that: First, the spread of the analysis ensemble is in general smaller than the spread of background and observation ensemble. Second, spread generation due to STEPS is suppressed by areas without precipitation in NWP forecast as well as by the probability matching at each timestep.

For the latter, I have some improvements.

Ah this is nice to hear. I was looking into a specific case for us (one where we had almost 0 spread in the nowcast untill 2-3 hours lead time when NWP fully took over) and it seems like in that case the second issue you mentioned was the biggest contributor. So if you can improve on that that would be great.

…dar_mask_range is larger than zero

RubenImhoff · 2025-11-26T14:59:37Z

@m-rempel, @mats-knmi and I added the functionality to also advect the domain mask. When doing so in combination with your changes, everything works as expected. I have, for now, changed the advection of the domain mask to being conditional on the smooth_radar_mask_range being larger than zero. However, when we were discussing about it, it seemed odd to us to keep overlaying a static radar domain mask in a blended product, especially when the radar data is advecting out of that domain, while there is NWP data to fill it up. What are your thoughts on this? Other than that, we are good to go, if all tests pass. :)

m-rempel · 2025-11-26T19:39:50Z

Hi @RubenImhoff and @mats-knmi, my idea for that static radar domain mask arose from the utilization of radar products in observation space such as reflectivity. For the latter, the simulated counterpart is also only computed within the observational domain. Therefore, in my opinion, it is more user-friendly to keep the domain constant.

For retrieved products in model space, and especially in the case of filling up with NWP data, it makes more sense to advect the mask. Thank you to bring up this question!

mats-knmi · 2025-11-28T12:34:49Z

Hi @RubenImhoff and @mats-knmi, my idea for that static radar domain mask arose from the utilization of radar products in observation space such as reflectivity. For the latter, the simulated counterpart is also only computed within the observational domain. Therefore, in my opinion, it is more user-friendly to keep the domain constant.

For retrieved products in model space, and especially in the case of filling up with NWP data, it makes more sense to advect the mask. Thank you to bring up this question!

While I do agree that keeping the mask constant is more user friendly, it is still less accurate even for just a radar nowcast. The radar data will drift out of your original mask and now new data will come in to replace it.

I would propose to just merge this PR as is and only advect the mask when we use this smooth option, since this is the only case in which it actually is an issue.

m-rempel · 2025-12-04T11:41:42Z

Hi @mats-knmi, I agree with you that this can be less accurate for a combined forecast. It is definitely less accurate for a pure radar nowcast when one set no echo values instead of no data values. Visually and for subsequent forecast products in a process chain based on that extrapolations.

However, the question about accuracy isn't that clear when it comes to forecast verification. Do you focus about the forecast (extrapolation) quality within the entire observation domain or within a common domain of forecast (extrapolation) and observation. With the latter you may artificially increase the quality of that extrapolation.

Regarding a combined forecast, the main concern are inconsistencies at the domain edges in both cases when (a) observation and NWP output are valid on the same domain and (b) NWP output is on a larger domain than the observation.

For (a), ideally, the ensemble spread at the respective domain edges is sufficiently large to include NWP information even at a early forecast timestep. However, I confess that this is currently not the case. Ensemble spread is computed only on a more or less small subdomain, due to the Lien criterion. This only includes grid boxes where NWC as well as NWP forecast precipitation in at least ten ensemble members.

For (b), either one doesn't use any mask and rely only on the Kalman filter resulting in a strong increase of precipitation values outside the observation domain at a certain point in forecast time and may lead to strong inconsistencies when there is precipitation at the domain edges in observation. Or one work entirely on the observation domain and adds NWP information by such a linear blending at the domain edges.

Since we have case (b) and considering all these points, I have to admit that such a linear blending of NWP information at the edges of the advected observation domain makes perfect sense. With this, one uses all available information with a minimum of inconsistencies at the domain edges.

feat: add smooth_radar_mask functionality from steps blending to pca_…

b6dd04b

…ens_kalman_filter blending

RubenImhoff requested a review from mats-knmi November 18, 2025 09:20

RubenImhoff self-assigned this Nov 18, 2025

mats-knmi reviewed Nov 19, 2025

View reviewed changes

mats-knmi previously approved these changes Nov 19, 2025

View reviewed changes

fix: ensure no nans are left after forecast step and adjust smooth ra…

da9eb61

…dar mask

dnerini requested a review from m-rempel November 19, 2025 19:49

mats-knmi reviewed Nov 20, 2025

View reviewed changes

pysteps/blending/pca_ens_kalman_filter.py Outdated Show resolved Hide resolved

mats-knmi reviewed Nov 20, 2025

View reviewed changes

pysteps/blending/pca_ens_kalman_filter.py Outdated Show resolved Hide resolved

RubenImhoff and others added 2 commits November 20, 2025 11:08

fix: indices when filling with NWP

9088868

Co-authored-by: mats-knmi <145579783+mats-knmi@users.noreply.github.com>

fix: fill nans in mask

120d850

Co-authored-by: mats-knmi <145579783+mats-knmi@users.noreply.github.com>

m-rempel added 2 commits November 20, 2025 10:43

fix: correct mask for nan values outside radar domain

4ab35b5

fix: avoid overwriting of NWP values in backtransform()

702e43e

fix: working smooth radar mask for timesteps > 2

0f8b510

fix: resolve minor codacy issue

ced9c4d

fix: working smooth radar mask for all timesteps

5f2f74e

m-rempel approved these changes Nov 21, 2025

View reviewed changes

mats-knmi and others added 2 commits November 26, 2025 15:50

advect domain mask

2122c44

fix: remove unnecessary line and only advect domain mask if smooth_ra…

07725ca

…dar_mask_range is larger than zero

mats-knmi approved these changes Nov 28, 2025

View reviewed changes

RubenImhoff merged commit fa1c5d2 into master Nov 28, 2025
10 checks passed

Add smooth_radar_mask functionality from steps blending to pca_kalman_filter_blending #528

Add smooth_radar_mask functionality from steps blending to pca_kalman_filter_blending #528

Uh oh!

Conversation

RubenImhoff commented Nov 18, 2025

Uh oh!

codecov bot commented Nov 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

mats-knmi Nov 19, 2025

Choose a reason for hiding this comment

Uh oh!

m-rempel commented Nov 19, 2025

Uh oh!

RubenImhoff commented Nov 20, 2025

Uh oh!

m-rempel commented Nov 20, 2025

Uh oh!

Uh oh!

Uh oh!

RubenImhoff commented Nov 20, 2025

Uh oh!

m-rempel commented Nov 20, 2025

Uh oh!

RubenImhoff commented Nov 20, 2025

Uh oh!

m-rempel commented Nov 20, 2025

Uh oh!

m-rempel commented Nov 20, 2025

Uh oh!

RubenImhoff commented Nov 21, 2025

Uh oh!

m-rempel commented Nov 21, 2025

Uh oh!

RubenImhoff commented Nov 21, 2025

Uh oh!

mats-knmi commented Nov 25, 2025

Uh oh!

RubenImhoff commented Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

m-rempel commented Nov 26, 2025

Uh oh!

mats-knmi commented Nov 28, 2025

Uh oh!

Uh oh!

m-rempel commented Dec 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov bot commented Nov 18, 2025 •

edited

Loading

RubenImhoff commented Nov 26, 2025 •

edited

Loading