Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement STEPS blending scheme #212

Closed
11 of 12 tasks
cvelascof opened this issue Jul 13, 2021 · 18 comments · Fixed by #255
Closed
11 of 12 tasks

Implement STEPS blending scheme #212

cvelascof opened this issue Jul 13, 2021 · 18 comments · Fixed by #255
Assignees
Labels
enhancement New feature or request

Comments

@cvelascof
Copy link
Member

cvelascof commented Jul 13, 2021

A great addition to pySTEPS should be to include a module and support routines to blend multiple sources of forecast rainfall.
For example, blending a radar rainfall nowcast with a Numerical Weather Prediction (NWP) rainfall forecast or multiple sources of rainfall estimates.

The branch steps_blending (https://github.com/pySTEPS/pysteps/tree/steps_blending) is created to develop a version of STEPS blending scheme as proposed by Seed et al, 2013 (https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/wrcr.20536)

Proposed modules and routines:

Modules:

  • blend_multiple_power_spectrum: weighted combination of multiple power spectrum
  • blend_multiple_optical_flow_fields: weighted combination of multiple optical flow fields
  • calculate_weights: Calculate blending weights for STEPS blending
  • blend_cascades: weighted combination of cascade levels based on STEPS weights
  • blend_means_sigmas: weighted combination of input means and standard deviations using STEPS weights
  • recompose_cascade: recompose rainfall field from blended cascade fields
  • main module: to drive and integrate multiple sources of rainfall

Support:

  • Docs
  • Tests for all modules
  • Examples using notebooks

Data - case studies:

  • radar rainfall datasets
  • NWP rainfall datasets

This is just a first trial to describe what would be needed to complete this feature.
Early versions of some of those modules are already in the branch but they need more work and development.

@dnerini
Copy link
Member

dnerini commented Jul 13, 2021

Hello @cvelascof, very happy to see this important branch back on track, I look forward to the new blending module!

@dnerini
Copy link
Member

dnerini commented Jul 14, 2021

Do I get it right that all methods will assume that NWP and radar data come at the same spatio-temporal resolution as well as extent?

Also, do you aim at blending with an ensemble or a deterministic NWP?

@cvelascof
Copy link
Member Author

It is easier to assume to start that NWP and radar data come at the same spatial projection and at the same spatial-temporal resolution. First version should combine one deterministic NWP with the radar nowcast. The critical aspect now is to define an structure in the module that allows us to add more complications to this first version later on, for example more than one NWP, or an ensemble NWP.

@RubenImhoff
Copy link
Contributor

The idea is indeed to start simple, but to later allow for different spatio-temporal resolutions, extents and ensemble members.

@dnerini
Copy link
Member

dnerini commented Jul 14, 2021

Very good, I like the approach!

It would be easier if the blending module could fit into the existing data model, but I'm personally very much open to explore alternatives if needed.

@ladc
Copy link
Contributor

ladc commented Jul 20, 2021

Hi everyone, Indeed, good approach to start simple (while keeping in mind the desired features). Things will need refactoring anyway. Also it would be good to (eventually) have an interface that's generic enough to allow for more input fields than simply precipitaiton. I'm thinking for example about machine learning / deep learning based blending methods which require multiple predictors. Also some NWP models can output stratiform and convective precipitation separately, and that could also be exploited. But these are worries for later.

@ladc
Copy link
Contributor

ladc commented Jul 20, 2021

Wout (@wdewettin) is progressing well with the linear blending. But indeed the linear blending method will need a totally different interface than advanced blending methods. The problem is that the blending generally is not simply a post-processing step that combines a radar-based nowcast with NWP. For the linear blending this is indeed the case - we calculate a nowcast, and then combine the two forecast sources post-hoc. However the STEPS scale- and skill-dependent blending needs to be done inside the STEPS nowcast.

  • One approach is to re-implement the STEPS nowcast allowing for another precipitation source (NWP), but then we risk having a lot of duplicate code.
  • Another approach would be to make the STEPS nowcast function more modular so that much of the code can be used both in the nowcast and in the blended nowcast.
  • A third option is to extend the STEPS nowcast function to return the blended forecast if the NWP argument is passed and the original one if NWP it is not provided.

Keeping in mind potential machine learning-based nowcast methods, I think the line between blending and extrapolation-based nowcast will be blurred anyway, as multiple predictors (including, but not limited to NWP precip) can be used.

Thoughts?

@dnerini
Copy link
Member

dnerini commented Jul 20, 2021

Hey Lesley

You raise some very good points here.
Personally, I'd like to take this opportunity to refactor the main steps method: we need to try to break it down into smaller pieces, so that we can more easily test and reuse it elsewhere (ie. your option 2).

I also like option 3, although this risks making the whole implementation even more complicated. But from the point of view of the application, this seems the most natural way: add as many sources of information you have available to make your nowcast better. This relates to your last point concerning machine learning: NWP forecasts should be seen as additional predictors to the prediction problem.

We should also consider eventually going for an object-oriented approach if we realize this could make our life simpler (initially we decided to avoid classes but this doesn't need to stay like this).

This was referenced Jul 23, 2021
@RubenImhoff
Copy link
Contributor

Hi everyone,

We (@ladc, @wdewettin and me) discussed today how to implement the STEPS blending scheme. The implementation of the blending weights by Seed et al. 2013 was hard to follow for us, as it seems that equations 7 and 8 do not entirely resemble what is described in the text and what we see in the C-implementation of STEPS. If anyone exactly knows how to implement this, we would be happy to hear it and see if we can and will change to that.

However, this weights calculation method is particularly made for blending with more than two (that is nowcast and NWP) models, e.g. multiple NWP models (by focussing on covariations). We were wondering if we will make blended nowcasts this way. I can imagine that if we have multiple NWP models or ensemble NWP models, that we use every model/ensemble member separately in a blending procedure. This means that we will not first blend all models with the nowcast and add the noise cascade, but instead that every model is blended independently with the nowcast.

Therefore, we propose to stay for now (except when the discussion here results in a different conclusion) with the calculation of the weights as initially proposed by Bowler et al. (2006). @cvelascof already made a good start with this in the steps_blending branch. I am continuing with this today and will make one adjustment compared to Bowler et al. (2006): the weights have to sum to 1.0. Hence, the weight of the noise cascade is 1.0 - the sum of the other weights. @cvelascof, no worries, I still have your initial code, so if we really have to change it back, that will be possible.

@dnerini
Copy link
Member

dnerini commented Aug 12, 2021

Hi @RubenImhoff
Unfortunately I'm not very familiar with the Seed et al 2013 paper either (we could get in touch with Alan directly... does he have a github account?^^ ), but from what I remember, they introduced this covariance term which might prove very important for blending (@aitaten knows a lot about this), especially when working with NWP RUC systems that use radar data assimilation.
Regarding the case of multiple NWP runs, I'm not sure that we want to blend them sequentially as you mentioned, precisely because we cannot assume that they are independent to each other.
This said, I'm perfectly fine with starting with a simpler implementation, especially at this point when there are a lot of important changes to be made.
I'm sure @loforest would also have valuable inputs on this.

@RubenImhoff
Copy link
Contributor

I totally agree, @dnerini! Let's see what the others think (and whether they know how the Seed et al 2013 paper is really implemented concerning the weights ^^). For now, I'll just implement the Bowler et al. (2006) method and we can continue from there.

@cvelascof, I'm going to change the noise weight back to the original formulation in eq. 13 in Bowler et al. (2006), instead of making it the missing noise weight in summing all weights to one. The reason for this is that we saw that the weights can sum up to more than 1.0 (also without the noise weight). After some digging in the C-code, we found out that the merged cascade is first renormalized before the recomposition with mu and sigma takes place. So, that will work out. Some things are missing in the papers and technical notes, so that (as you can see) keeps us busy, haha. ;)

@RubenImhoff RubenImhoff added the enhancement New feature or request label Aug 12, 2021
@aitaten
Copy link
Contributor

aitaten commented Aug 12, 2021

Hi! I just had a quick look to Seed et al., 2013 and what I understand is that they take into account the covariance matrix (Eq 8) as a form of having into account the variability of the model (0,0) and radar forecast (1,1) and how much of this variability is common/shared (1,0 = 0,1). As @dnerini said this is important for NWP with radar assimilation, but it can also be important to take into account improvements due to variance reduction. So I would say it is not constraint to more than 2 models as it is shown in the M=2 example in the very same paper.

Said that, I don't know how this paper is implementing the weights :( ... please can you point me where the weights are computed now so I can try to see if I can figure out where the covariance matrices can be added (I am still figuring out the pySTEPS package). Thanks!

@ladc
Copy link
Contributor

ladc commented Aug 12, 2021

Yes this is also what we understood, taking into account the covariances seems like a sensible thing to do (regardless of radar data assimilation). The matrix in Eqn. (8) doesn't seem to be a covariance matrix to me, though. I'm also a bit puzzled by how the \rho_e and the other \rhos are defined exactly (Lagrangian autocorrelation for the radar extrapolation component?).
Screenshot from 2021-08-12 15-22-54
Screenshot from 2021-08-12 15-23-02

Also note that the weights, apart from the addition of the covariance matrix, appear to be implemented quite differently here than in the Bowler 2006 paper which defines the weights in Eqn. (11)-(13) in terms of ratios of explained variance.
Screenshot from 2021-08-12 15-24-47

@RubenImhoff
Copy link
Contributor

Hi everyone, another question w.r.t. the blending. I am working on the main module now (getting there, actually!) and we were wondering what to do with the probability matching. In the original pysteps code, the prob. matching takes place prior to the extrapolation of the recomposed precipitation fields. However, in the blending procedure, the extrapolation step takes place on all cascade levels prior to the blending with NWP. Using the blended precipitation field for the prob. matching with the initial radar observations does not make much sense.
Do you have any ideas when and how to perform the prob. matching within the blending procedure? We haven't found a clear procedure in the earlier papers.

@loforest
Copy link
Member

Hello :-)
I try to give my input on some of the topics hoping that it helps.

  • Probability matching: @RubenImhoff you are right that it does not make sense to always use as target distribution the original radar field. One promising approach is to build a target distribution by sampling from both the radar and NWP fields, like in the EnKF paper of Daniele. Particular care should be given to account for rainfall leaving the domain to avoid biasing the final field.
  • Blending weights: I also remember that the STEPS 1 version implenents the Bowler 2006 weighting scheme. I also had the instinct to normalize the weights to 1 but Alan told me that there was a reason for it to be like that (preserving variance). I can contact Alan and ask about the implementation of the weights to account for covariances. I am quite sure they have it in STEPS 2 (UKMO version) and also STEPS 3 (the new super-fast version of the BOM). @cvelascof what do you think?

@ladc
Copy link
Contributor

ladc commented Aug 23, 2021

If I understand correctly (please correct me if I'm wrong here), the goal of the probability matching is to use a reference rain rate cumulative distribution function and remap the forecast rain rates to match this cdf. The main motivation was to reduce the spurious appearance of drizzle throughout the previously non-rainy areas due to the noise injection. A second reason to perform it was to correct the intensification of rain as it leaves the domain (the mean is added to the recomposed cascade, even if some rain has already left the domain, leading to a spurious intensification of the remaining rain).

So given this information I see several options (although they may all be flawed in some way):

  • Use the cdf of the masked composite for the PM and apply it for the advected and blended forecast, over the whole domain. A disadvantage I see here is that if intense precipitation is forecast by the NWP, this might be erroneously tuned down by the PM. This I believe is what's done in the original STEPS code.
  • Use the cdf of the composite with NODATA replaced by NWP forecasts as a reference. Similar issues can arise of course.
  • Use the time-synchronous cdf of the NWP to perform the PM of the blended nowcast. This might be very computationally intensive and also in the case of a poor NWP forecast, the cdf might represent the observed one badly for short lead times, leading to weird behaviour that users will hate.

I think PM works well with a pure advection nowcast, but I'm not sure how much sense it makes to perform PM when new rainfall can be produced, or reduced, by NWP. I believe probability matching was finally abandoned (replaced by scale and shift) for the STEPS blending algorithm at BOM, and I suppose with good reason ;-) Not sure if this new method is described in the literature?

@loforest
Copy link
Member

I now remember that PM was turned off in the latest implementation. Thanks @ladc for reminding me! Indeed, PM was used, among others, to remove the strong bias arising from the cascade parameters and dBR mean being fixed and not accounting for rainfall leaving the domain. If we cannot avoid doing PM we should use the approach of @dnerini, which will start from the radar cdf and finally converge to the NWP cdf. The assumption is of course that the NWP cdf is kind of ok, which is not true for lower resolution models. Otherwise, we should do shift and scale, though I remember vaguely that Alan added a third parameter to that scheme. We should also ensure not to get too high rainrates (saturate to a given value?).

@RubenImhoff RubenImhoff mentioned this issue Aug 24, 2021
9 tasks
@RubenImhoff RubenImhoff linked a pull request Aug 24, 2021 that will close this issue
9 tasks
@RubenImhoff RubenImhoff removed a link to a pull request Aug 24, 2021
9 tasks
@ladc
Copy link
Contributor

ladc commented Aug 24, 2021

Thanks for the information Loris! We've been brainstorming about the optimal solution to the aforementioned problems and we would propose to use a kind of Lagrangian blended probability matching instead. As a reference for the CDF we would use a deterministic extrapolation, blended with NWP, using the blending weights from the second cascade level (as is also done for the velocity field blending). The idea is to use the weights obtained in the STEPS scale- and skill-dependent blending algorithm to perform a straightforward (but skill-dependent) blending without any cascade decomposition or noise injection, so there's no introduction of spurious drizzle or enhancement of precipitation as it leaves the domain. Therefore we expect the CDF to be a representative baseline. Then the precipitation intensities in the STEPS nowcasts, which are blended per cascade level and recomposed, are matched with the reference CDF. Are we missing something here, or does this sound like it could work?

@RubenImhoff RubenImhoff linked a pull request Jan 19, 2022 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants