netCDF exporter does not contain any time steps #200

RubenImhoff · 2021-01-05T12:55:25Z

I may have a potential issue:
I have a test script for pysteps to make ensemble nowcasts, just to check whether everything works fine. It saves the nowcasts in a netcdf, but the netcdf does not contain any timesteps in the rainfall data when using pysteps 1.4. It seems that this is caused by the changes in ./nowcasts/steps.py. If I use steps.py from version 1.3, the script works fine and the output netcdf contains two timesteps with the ensemble nowcast in it. I can't figure out why. Perhaps my script is just outdated, but maybe it is an actual bug. I have put the script in the comment below, it uses the KNMI data from the pysteps data repository, so it should be relatively easy to run.

RubenImhoff · 2021-01-05T12:57:27Z

#!/bin/env python

"""
Ensemble nowcast with pySTEPS. Files are saved as netcdf.

More info: https://pysteps.github.io/
"""
import os
#os.environ['PROJ_LIB'] = r'c:\Anaconda3\pkgs\proj4-5.1.0-hfa6e2cd_1\Library\share'

import csv
import datetime
import matplotlib.pylab as plt
import netCDF4
import numpy as np
import pprint
import sys
import time

start = time.time()

import pysteps as stp

###########################################################
# Initial settings
###########################################################

os.chdir('c:\\Users\\imhof_rn\\pysteps')

data_source = "knmi"

# Verification settings (not needed)
verification = {
    "experiment_name"   : "pysteps_radar",
    "overwrite"         : True,            # to recompute nowcasts
    "v_thresholds"      : [0.1, 1.0, 5.0],       # [mm/h]                 
    "v_leadtimes"       : [10, 20, 30, 45, 60],     # [min]
    "v_accu"            : None,             # [min]
    "seed"              : 42,               # for reproducibility
    "doplot"            : True,            # save figures
    "dosaveresults"     : True              # save verification scores to csv
}

# Forecast settings
forecast = {
    "n_lead_times"      : 6,       # timesteps per nowcast (36 = 3 hour lead time)
    "r_threshold"       : 0.1,      # rain/no rain threshold [mm/h]
    "unit"              : "mm/h",   # mm/h or dBZ (this is not the unit of the input data, that's defined in the importer)
    "transformation"    : "dB",     # None or dB (more options nowadays, see the ReadTheDocs)
    "adjust_domain"     : None      # None or square
}

# The experiment set-up
## this includes tuneable parameters
experiment = {
    ## the events           event start     event end       update cycle  data source   (NOTE, see line 160: right now, the loop makes the last nowcast at time = end time - timesteps * interval (3 hours here))
    "data"              : [("201008260200", "201008260300", 5,           "knmi")],
                     #      ("201701311030", "201701311300", 30,           "mch"),
                      #     ("201609281530", "201609281800", 30,           "fmi"),
                       #    ("201705091130", "201705091400", 30,           "fmi")],
    
    ## the methods
    "oflow_method"      : ["lucaskanade"],      # e.g. lucaskanade, darts
    "adv_method"        : ["semilagrangian"],   # semilagrangian, eulerian
    "nwc_method"        : ["steps"],            # (Only option right now - anvil should be possible in the newest release)
    "noise_method"      : ["nonparametric"],    # parametric, nonparametric, ssft (better not to use parametric. ssft is better but about three times slower)
    "decomp_method"     : ["fft"],
    
    ## the parameters (buffer not yet set here, but standard set to 5 or 10 in the latest version)
    "n_ens_members"     : [2],                 # Number of ensemble members           
    "ar_order"          : [2],                  # Generally 2
    "n_cascade_levels"  : [8],                  # Generally 8, 6 for S-PROG
    "noise_adjustment"  : ['auto'],         
    "conditional"       : [False],
    "precip_mask"       : [True],
    "mask_method"       : ["incremental"],      # obs, incremental, sprog
    "prob_matching"     : ["cdf"],
}

# Note: it is possible to define a number of workers (for parallel computing) now. 
# Right now, this is hard-coded to 1 in the nowcast loop.

###########################################################
# Get the correct paths
###########################################################

root_path = stp.rcparams.data_sources[data_source]["root_path"]
path_fmt = stp.rcparams.data_sources[data_source]["path_fmt"]
fn_pattern = stp.rcparams.data_sources[data_source]["fn_pattern"]
fn_ext = stp.rcparams.data_sources[data_source]["fn_ext"]
importer_name = stp.rcparams.data_sources[data_source]["importer"]
importer_kwargs = stp.rcparams.data_sources[data_source]["importer_kwargs"]
timestep = stp.rcparams.data_sources[data_source]["timestep"]

# Conditional parameters
## parameters that can be directly related to other parameters
def cond_pars(pars):
    for key in list(pars):
        if key == "oflow_method":
            if pars[key].lower() == "darts":  pars["n_prvs_times"] = 9 # For DARTS, more previous time steps are necessary
            else:                             pars["n_prvs_times"] = 3 # So, in total 4 radar images are used to construct the motion field. Perhaps good to place this in the inital settings at some point.
        elif key.lower() == "n_cascade_levels":
            if pars[key] == 1 : pars["bandpass_filter"] = "uniform"
            else:               pars["bandpass_filter"] = "gaussian"
        elif key.lower() == "nwc_method":
            if pars[key] == "extrapolation" : pars["n_ens_members"] = 1
    return pars
    
# Prepare the list of all parameter sets 
parsets = [[]]
for _, items in experiment.items():
    parsets = [parset+[item] for parset in parsets for item in items]

# Now loop through all parameter sets
for n, parset in enumerate(parsets):
    
    # Build parameter set
    
    p = {}
    for m, key in enumerate(experiment.keys()):
        p[key] = parset[m]
    ## apply conditional parameters
    p = cond_pars(p)
    ## include all remaining parameters
    p.update(verification)
    p.update(forecast)
    
    print("************************")
    print("* Parameter set %02d/%02d: *" % (n+1, len(parsets)))
    print("************************")
    
    pprint.pprint(p)
    
    # If necessary, build path to results
    path_to_experiment = os.path.join(stp.rcparams.outputs["path_outputs"], p["experiment_name"])
    # subdir with event date
    path_to_nwc = os.path.join(path_to_experiment, '-'.join([p["data"][0], p["data"][3]]))
    for key, item in p.items():
		# include only variables that change
        if len(experiment.get(key,[None])) > 1 and key.lower() is not "data":
            path_to_nwc = os.path.join(path_to_nwc, '-'.join([key, str(item)]))
    try:
        os.makedirs(path_to_nwc)
    except FileExistsError:
        pass
        
    # **************************************************************************
    # NOWCASTING
    # ************************************************************************** 
    
    # Loop forecasts within given event using the prescribed update cycle interval
  
    if p["v_accu"] is None:
        p["v_accu"] = timestep
    
    # Loop forecasts for given event
    startdate   = datetime.datetime.strptime(p["data"][0], "%Y%m%d%H%M")
    enddate     = datetime.datetime.strptime(p["data"][1], "%Y%m%d%H%M")
    countnwc = 0
    while startdate + datetime.timedelta(minutes = p["n_lead_times"]*timestep) <= enddate:
        try:
            # filename of the nowcast netcdf
            outfn = os.path.join(path_to_nwc, "%s_nowcast.netcdf" % startdate.strftime("%Y%m%d%H%M"))
        
            ## check if results already exists
            run_exist = False
            if os.path.isfile(outfn):
                fid = netCDF4.Dataset(outfn, 'r')
                if fid.dimensions["time"].size == p["n_lead_times"]:
                    run_exist = True
                    if p["overwrite"]:
                        os.remove(outfn)
                        run_exist = False    
                else:
                    os.remove(outfn)
                    
            if run_exist:
                print("Nowcast %s_nowcast already exists in %s" % (startdate.strftime("%Y%m%d%H%M"),path_to_nwc))
    
            else:
                countnwc += 1
                print("Computing the nowcast (%02d) ..." % countnwc)
                
                print("Starttime: %s" % startdate.strftime("%Y%m%d%H%M"))
                
                ## redirect stdout to log file
                logfn =  os.path.join(path_to_nwc, "%s_log.txt" % startdate.strftime("%Y%m%d%H%M")) 
                print("Log: %s" % logfn)
                orig_stdout = sys.stdout
                f = open(logfn, 'w')
                sys.stdout = f
                
                print("*******************")
                print("* %s *****" % startdate.strftime("%Y%m%d%H%M"))
                print("* Parameter set : *")
                pprint.pprint(p)
                print("*******************")
                
                print("--- Start of the run : %s ---" % (datetime.datetime.now()))
                
                ## time
                t0 = time.time()
            
                # Read inputs
                print("Read the data...")
                
                ## find radar field filenames
                input_files = stp.io.find_by_date(startdate, root_path, path_fmt, fn_pattern,
                                                  fn_ext, timestep, p["n_prvs_times"])
                
        
                ## read radar field files
                importer    = stp.io.get_method(importer_name, "importer")
                R, _, metadata = stp.io.read_timeseries(input_files, importer, **importer_kwargs)
                metadata0 = metadata.copy()
                metadata0["shape"] = R.shape[1:]
                
                print("Metadata is:")
                print(metadata0)
                
                # Prepare input files
                print("Prepare the data...")
                
                ## if requested, make sure we work with a square domain
                reshaper = stp.utils.get_method(p["adjust_domain"])
                R, metadata = reshaper(R, metadata)
        
                ## if necessary, convert to rain rates [mm/h]    
                converter = stp.utils.get_method("mm/h")
                R, metadata = converter(R, metadata)
                
                ## threshold the data
                R[R < p["r_threshold"]] = 0.0
                metadata["threshold"] = p["r_threshold"]
                
                ## convert the data
                converter = stp.utils.get_method(p["unit"])
                R, metadata = converter(R, metadata)
                    
                ## transform the data
                transformer = stp.utils.get_method(p["transformation"])
                R, metadata = transformer(R, metadata)
                
                ## set NaN equal to zero
                R[~np.isfinite(R)] = metadata["zerovalue"]
                
                # Compute motion field
                oflow_method = stp.motion.get_method(p["oflow_method"])
                UV = oflow_method(R)
                
                ####
                # Perform the nowcast       
                ####
        
                ## define the callback function to export the nowcast to netcdf (#TODO: place the function outside the loop)
                converter   = stp.utils.get_method("mm/h")
                def export(X):
                    ## convert to mm/h
                    X,_ = converter(X, metadata)
                    # readjust to initial domain shape
                    X,_ = reshaper(X, metadata, inverse=True)
                    # export to netcdf
                    stp.io.export_forecast_dataset(X, exporter)
                
                ## initialize netcdf file
                incremental = "timestep" if p["nwc_method"].lower() == "steps" else None
                exporter = stp.io.initialize_forecast_exporter_netcdf(outpath = path_to_nwc, outfnprefix = startdate.strftime("%Y%m%d%H%M"), startdate = startdate,
                                  timestep = timestep, n_timesteps = p["n_lead_times"], shape = metadata0["shape"], 
                                  n_ens_members = p["n_ens_members"], metadata = metadata0, incremental=incremental)
                
                ## start the nowcast
                nwc_method = stp.nowcasts.get_method(p["nwc_method"])
                R_fct = nwc_method(R, UV, p["n_lead_times"], p["n_ens_members"],
                                p["n_cascade_levels"], kmperpixel=metadata["xpixelsize"], 
                                timestep=timestep, R_thr=metadata["threshold"], 
                                extrap_method=p["adv_method"], 
                                decomp_method=p["decomp_method"], 
                                bandpass_filter_method=p["bandpass_filter"], 
                                noise_method=p["noise_method"], 
                                noise_stddev_adj=p["noise_adjustment"],
                                ar_order=p["ar_order"],conditional=p["conditional"], 
                                probmatching_method=p["prob_matching"], 
                                mask_method=p["mask_method"], 
                                callback=export, 
                                return_output=False, seed=p["seed"],
                                measure_time = True, num_workers = 1)
                
                ## save results
                stp.io.close_forecast_files(exporter)
                R_fct = None
                
                # save log
                print("--- End of the run : %s ---" % (datetime.datetime.now()))
                print("--- Total time : %s seconds ---" % (time.time() - t0))
                sys.stdout = orig_stdout
                f.close()
                
            # next forecast
            startdate += datetime.timedelta(minutes = p["data"][2])
        
        except ValueError:
            print('No nowcast for this date')
            
            # next forecast
            startdate += datetime.timedelta(minutes = p["data"][2])
            
            
end = time.time()
print('Total process took', (end - start)/3600.0, 'hours')

pulkkins · 2021-01-05T21:30:01Z

Transition between pysteps 1.3 and 1.4 introduced a bug in the callback function used in steps and sseps. The arrays supplied to the callback had incorrect shapes. The problem should be fixed with commit cf34300.

RubenImhoff · 2021-01-06T07:25:46Z

Thanks a lot, @pulkkins! I'll test it and close the issue if everything works.

RubenImhoff · 2021-01-06T08:00:43Z

Works perfectly! I'm closing this issue now.

dnerini · 2021-01-06T15:14:30Z

@pulkkins, @RubenImhoff thanks a lot for solving this issue. I'm re-opening it because I'd like to include a specific test for the netcdf exporter, so to make sure that we'd detect the same problem in case.

RubenImhoff · 2021-01-07T07:28:40Z

Good idea, Daniele!

RubenImhoff assigned dnerini, pulkkins and aperezhortal Jan 5, 2021

RubenImhoff added the bug Something isn't working label Jan 6, 2021

RubenImhoff closed this as completed Jan 6, 2021

dnerini reopened this Jan 6, 2021

dnerini unassigned aperezhortal and pulkkins Jan 6, 2021

dnerini added this to the release v1.4.1 milestone Jan 6, 2021

dnerini mentioned this issue Jan 7, 2021

Test steps netcdf exporter #202

Merged

2 tasks

dnerini closed this as completed in #202 Jan 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

netCDF exporter does not contain any time steps #200

netCDF exporter does not contain any time steps #200

RubenImhoff commented Jan 5, 2021

RubenImhoff commented Jan 5, 2021

pulkkins commented Jan 5, 2021

RubenImhoff commented Jan 6, 2021

RubenImhoff commented Jan 6, 2021

dnerini commented Jan 6, 2021

RubenImhoff commented Jan 7, 2021

netCDF exporter does not contain any time steps #200

netCDF exporter does not contain any time steps #200

Comments

RubenImhoff commented Jan 5, 2021

RubenImhoff commented Jan 5, 2021

pulkkins commented Jan 5, 2021

RubenImhoff commented Jan 6, 2021

RubenImhoff commented Jan 6, 2021

dnerini commented Jan 6, 2021

RubenImhoff commented Jan 7, 2021