Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

netCDF exporter does not contain any time steps #200

Closed
RubenImhoff opened this issue Jan 5, 2021 · 6 comments · Fixed by #202
Closed

netCDF exporter does not contain any time steps #200

RubenImhoff opened this issue Jan 5, 2021 · 6 comments · Fixed by #202
Assignees
Labels
bug Something isn't working

Comments

@RubenImhoff
Copy link
Contributor

I may have a potential issue:
I have a test script for pysteps to make ensemble nowcasts, just to check whether everything works fine. It saves the nowcasts in a netcdf, but the netcdf does not contain any timesteps in the rainfall data when using pysteps 1.4. It seems that this is caused by the changes in ./nowcasts/steps.py. If I use steps.py from version 1.3, the script works fine and the output netcdf contains two timesteps with the ensemble nowcast in it. I can't figure out why. Perhaps my script is just outdated, but maybe it is an actual bug. I have put the script in the comment below, it uses the KNMI data from the pysteps data repository, so it should be relatively easy to run.

@RubenImhoff
Copy link
Contributor Author

#!/bin/env python

"""
Ensemble nowcast with pySTEPS. Files are saved as netcdf.

More info: https://pysteps.github.io/
"""
import os
#os.environ['PROJ_LIB'] = r'c:\Anaconda3\pkgs\proj4-5.1.0-hfa6e2cd_1\Library\share'

import csv
import datetime
import matplotlib.pylab as plt
import netCDF4
import numpy as np
import pprint
import sys
import time

start = time.time()

import pysteps as stp

###########################################################
# Initial settings
###########################################################

os.chdir('c:\\Users\\imhof_rn\\pysteps')

data_source = "knmi"

# Verification settings (not needed)
verification = {
    "experiment_name"   : "pysteps_radar",
    "overwrite"         : True,            # to recompute nowcasts
    "v_thresholds"      : [0.1, 1.0, 5.0],       # [mm/h]                 
    "v_leadtimes"       : [10, 20, 30, 45, 60],     # [min]
    "v_accu"            : None,             # [min]
    "seed"              : 42,               # for reproducibility
    "doplot"            : True,            # save figures
    "dosaveresults"     : True              # save verification scores to csv
}

# Forecast settings
forecast = {
    "n_lead_times"      : 6,       # timesteps per nowcast (36 = 3 hour lead time)
    "r_threshold"       : 0.1,      # rain/no rain threshold [mm/h]
    "unit"              : "mm/h",   # mm/h or dBZ (this is not the unit of the input data, that's defined in the importer)
    "transformation"    : "dB",     # None or dB (more options nowadays, see the ReadTheDocs)
    "adjust_domain"     : None      # None or square
}

# The experiment set-up
## this includes tuneable parameters
experiment = {
    ## the events           event start     event end       update cycle  data source   (NOTE, see line 160: right now, the loop makes the last nowcast at time = end time - timesteps * interval (3 hours here))
    "data"              : [("201008260200", "201008260300", 5,           "knmi")],
                     #      ("201701311030", "201701311300", 30,           "mch"),
                      #     ("201609281530", "201609281800", 30,           "fmi"),
                       #    ("201705091130", "201705091400", 30,           "fmi")],
    
    ## the methods
    "oflow_method"      : ["lucaskanade"],      # e.g. lucaskanade, darts
    "adv_method"        : ["semilagrangian"],   # semilagrangian, eulerian
    "nwc_method"        : ["steps"],            # (Only option right now - anvil should be possible in the newest release)
    "noise_method"      : ["nonparametric"],    # parametric, nonparametric, ssft (better not to use parametric. ssft is better but about three times slower)
    "decomp_method"     : ["fft"],
    
    ## the parameters (buffer not yet set here, but standard set to 5 or 10 in the latest version)
    "n_ens_members"     : [2],                 # Number of ensemble members           
    "ar_order"          : [2],                  # Generally 2
    "n_cascade_levels"  : [8],                  # Generally 8, 6 for S-PROG
    "noise_adjustment"  : ['auto'],         
    "conditional"       : [False],
    "precip_mask"       : [True],
    "mask_method"       : ["incremental"],      # obs, incremental, sprog
    "prob_matching"     : ["cdf"],
}

# Note: it is possible to define a number of workers (for parallel computing) now. 
# Right now, this is hard-coded to 1 in the nowcast loop.

###########################################################
# Get the correct paths
###########################################################

root_path = stp.rcparams.data_sources[data_source]["root_path"]
path_fmt = stp.rcparams.data_sources[data_source]["path_fmt"]
fn_pattern = stp.rcparams.data_sources[data_source]["fn_pattern"]
fn_ext = stp.rcparams.data_sources[data_source]["fn_ext"]
importer_name = stp.rcparams.data_sources[data_source]["importer"]
importer_kwargs = stp.rcparams.data_sources[data_source]["importer_kwargs"]
timestep = stp.rcparams.data_sources[data_source]["timestep"]

# Conditional parameters
## parameters that can be directly related to other parameters
def cond_pars(pars):
    for key in list(pars):
        if key == "oflow_method":
            if pars[key].lower() == "darts":  pars["n_prvs_times"] = 9 # For DARTS, more previous time steps are necessary
            else:                             pars["n_prvs_times"] = 3 # So, in total 4 radar images are used to construct the motion field. Perhaps good to place this in the inital settings at some point.
        elif key.lower() == "n_cascade_levels":
            if pars[key] == 1 : pars["bandpass_filter"] = "uniform"
            else:               pars["bandpass_filter"] = "gaussian"
        elif key.lower() == "nwc_method":
            if pars[key] == "extrapolation" : pars["n_ens_members"] = 1
    return pars
    
# Prepare the list of all parameter sets 
parsets = [[]]
for _, items in experiment.items():
    parsets = [parset+[item] for parset in parsets for item in items]

# Now loop through all parameter sets
for n, parset in enumerate(parsets):
    
    # Build parameter set
    
    p = {}
    for m, key in enumerate(experiment.keys()):
        p[key] = parset[m]
    ## apply conditional parameters
    p = cond_pars(p)
    ## include all remaining parameters
    p.update(verification)
    p.update(forecast)
    
    print("************************")
    print("* Parameter set %02d/%02d: *" % (n+1, len(parsets)))
    print("************************")
    
    pprint.pprint(p)
    
    # If necessary, build path to results
    path_to_experiment = os.path.join(stp.rcparams.outputs["path_outputs"], p["experiment_name"])
    # subdir with event date
    path_to_nwc = os.path.join(path_to_experiment, '-'.join([p["data"][0], p["data"][3]]))
    for key, item in p.items():
		# include only variables that change
        if len(experiment.get(key,[None])) > 1 and key.lower() is not "data":
            path_to_nwc = os.path.join(path_to_nwc, '-'.join([key, str(item)]))
    try:
        os.makedirs(path_to_nwc)
    except FileExistsError:
        pass
        
    # **************************************************************************
    # NOWCASTING
    # ************************************************************************** 
    
    # Loop forecasts within given event using the prescribed update cycle interval
  
    if p["v_accu"] is None:
        p["v_accu"] = timestep
    
    # Loop forecasts for given event
    startdate   = datetime.datetime.strptime(p["data"][0], "%Y%m%d%H%M")
    enddate     = datetime.datetime.strptime(p["data"][1], "%Y%m%d%H%M")
    countnwc = 0
    while startdate + datetime.timedelta(minutes = p["n_lead_times"]*timestep) <= enddate:
        try:
            # filename of the nowcast netcdf
            outfn = os.path.join(path_to_nwc, "%s_nowcast.netcdf" % startdate.strftime("%Y%m%d%H%M"))
        
            ## check if results already exists
            run_exist = False
            if os.path.isfile(outfn):
                fid = netCDF4.Dataset(outfn, 'r')
                if fid.dimensions["time"].size == p["n_lead_times"]:
                    run_exist = True
                    if p["overwrite"]:
                        os.remove(outfn)
                        run_exist = False    
                else:
                    os.remove(outfn)
                    
            if run_exist:
                print("Nowcast %s_nowcast already exists in %s" % (startdate.strftime("%Y%m%d%H%M"),path_to_nwc))
    
            else:
                countnwc += 1
                print("Computing the nowcast (%02d) ..." % countnwc)
                
                print("Starttime: %s" % startdate.strftime("%Y%m%d%H%M"))
                
                ## redirect stdout to log file
                logfn =  os.path.join(path_to_nwc, "%s_log.txt" % startdate.strftime("%Y%m%d%H%M")) 
                print("Log: %s" % logfn)
                orig_stdout = sys.stdout
                f = open(logfn, 'w')
                sys.stdout = f
                
                print("*******************")
                print("* %s *****" % startdate.strftime("%Y%m%d%H%M"))
                print("* Parameter set : *")
                pprint.pprint(p)
                print("*******************")
                
                print("--- Start of the run : %s ---" % (datetime.datetime.now()))
                
                ## time
                t0 = time.time()
            
                # Read inputs
                print("Read the data...")
                
                ## find radar field filenames
                input_files = stp.io.find_by_date(startdate, root_path, path_fmt, fn_pattern,
                                                  fn_ext, timestep, p["n_prvs_times"])
                
        
                ## read radar field files
                importer    = stp.io.get_method(importer_name, "importer")
                R, _, metadata = stp.io.read_timeseries(input_files, importer, **importer_kwargs)
                metadata0 = metadata.copy()
                metadata0["shape"] = R.shape[1:]
                
                print("Metadata is:")
                print(metadata0)
                
                # Prepare input files
                print("Prepare the data...")
                
                ## if requested, make sure we work with a square domain
                reshaper = stp.utils.get_method(p["adjust_domain"])
                R, metadata = reshaper(R, metadata)
        
                ## if necessary, convert to rain rates [mm/h]    
                converter = stp.utils.get_method("mm/h")
                R, metadata = converter(R, metadata)
                
                ## threshold the data
                R[R < p["r_threshold"]] = 0.0
                metadata["threshold"] = p["r_threshold"]
                
                ## convert the data
                converter = stp.utils.get_method(p["unit"])
                R, metadata = converter(R, metadata)
                    
                ## transform the data
                transformer = stp.utils.get_method(p["transformation"])
                R, metadata = transformer(R, metadata)
                
                ## set NaN equal to zero
                R[~np.isfinite(R)] = metadata["zerovalue"]
                
                # Compute motion field
                oflow_method = stp.motion.get_method(p["oflow_method"])
                UV = oflow_method(R)
                
                ####
                # Perform the nowcast       
                ####
        
                ## define the callback function to export the nowcast to netcdf (#TODO: place the function outside the loop)
                converter   = stp.utils.get_method("mm/h")
                def export(X):
                    ## convert to mm/h
                    X,_ = converter(X, metadata)
                    # readjust to initial domain shape
                    X,_ = reshaper(X, metadata, inverse=True)
                    # export to netcdf
                    stp.io.export_forecast_dataset(X, exporter)
                
                ## initialize netcdf file
                incremental = "timestep" if p["nwc_method"].lower() == "steps" else None
                exporter = stp.io.initialize_forecast_exporter_netcdf(outpath = path_to_nwc, outfnprefix = startdate.strftime("%Y%m%d%H%M"), startdate = startdate,
                                  timestep = timestep, n_timesteps = p["n_lead_times"], shape = metadata0["shape"], 
                                  n_ens_members = p["n_ens_members"], metadata = metadata0, incremental=incremental)
                
                ## start the nowcast
                nwc_method = stp.nowcasts.get_method(p["nwc_method"])
                R_fct = nwc_method(R, UV, p["n_lead_times"], p["n_ens_members"],
                                p["n_cascade_levels"], kmperpixel=metadata["xpixelsize"], 
                                timestep=timestep, R_thr=metadata["threshold"], 
                                extrap_method=p["adv_method"], 
                                decomp_method=p["decomp_method"], 
                                bandpass_filter_method=p["bandpass_filter"], 
                                noise_method=p["noise_method"], 
                                noise_stddev_adj=p["noise_adjustment"],
                                ar_order=p["ar_order"],conditional=p["conditional"], 
                                probmatching_method=p["prob_matching"], 
                                mask_method=p["mask_method"], 
                                callback=export, 
                                return_output=False, seed=p["seed"],
                                measure_time = True, num_workers = 1)
                
                ## save results
                stp.io.close_forecast_files(exporter)
                R_fct = None
                
                # save log
                print("--- End of the run : %s ---" % (datetime.datetime.now()))
                print("--- Total time : %s seconds ---" % (time.time() - t0))
                sys.stdout = orig_stdout
                f.close()
                
            # next forecast
            startdate += datetime.timedelta(minutes = p["data"][2])
        
        except ValueError:
            print('No nowcast for this date')
            
            # next forecast
            startdate += datetime.timedelta(minutes = p["data"][2])
            
            
end = time.time()
print('Total process took', (end - start)/3600.0, 'hours')

@pulkkins
Copy link
Member

pulkkins commented Jan 5, 2021

Transition between pysteps 1.3 and 1.4 introduced a bug in the callback function used in steps and sseps. The arrays supplied to the callback had incorrect shapes. The problem should be fixed with commit cf34300.

@RubenImhoff
Copy link
Contributor Author

Thanks a lot, @pulkkins! I'll test it and close the issue if everything works.

@RubenImhoff RubenImhoff added the bug Something isn't working label Jan 6, 2021
@RubenImhoff
Copy link
Contributor Author

Works perfectly! I'm closing this issue now.

@dnerini
Copy link
Member

dnerini commented Jan 6, 2021

@pulkkins, @RubenImhoff thanks a lot for solving this issue. I'm re-opening it because I'd like to include a specific test for the netcdf exporter, so to make sure that we'd detect the same problem in case.

@dnerini dnerini reopened this Jan 6, 2021
@dnerini dnerini added this to the release v1.4.1 milestone Jan 6, 2021
@RubenImhoff
Copy link
Contributor Author

Good idea, Daniele!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants