Feature #2833 pcp_combine_missing (#2886)
* Per #2883, add -input_thresh command line option to configure allowable missing input files.

* Per #2883, update pcp_combine usage statement.

* Per #2883, update existing pcp_combine -derive unit test example by adding 3 new missing file inputs at the beginning, middle, and end of the file list. The first two are ignored since they include the MISSING keyword, but the third without that keyword triggers a warning message as desired. The -input_thresh option is added to only require 70% of the input files be present. This should produce the exact same output data.

* Per #2883, update the pcp_combine logic for the sum command to allow missing data files based on the -input_thresh threshold. Add a test in unit_pcp_combine.xml to demonstrate.

* Update docs/Users_Guide/reformat_grid.rst

Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>

* Per #2883, update pcp_combine usage statement in the code to be more similar to the User's Guide.

* Per #2883, switch to using derive_file_list_missing as the one containing missing files and recreate derive_file_list as it had existed for the test named pcp_combine_derive_VLD_THRESH.

* Per #2883, move initialization inside the same loop to resolve SonarQube issues.

* Per #2883, update sum_data_files() to switch from allocating memory to using STL vectors to satisfy SonarQube.

* Per #2883, changes to declarations of variables to satisfy SonarQube.

* Per #2883, address more SonarQube issues

* Per #2883, backing out an unintended change I made to tcrmw_grid.cc. This change belongs on a different branch.

* Per #2883, update the logic of the parse_file_list_type() function to handle Python input strings. Also update pcp_combine to parse the type of input files being read and to log the expected non-missing Python input files.

---------

Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
JohnHalleyGotway and georgemccabe committed May 15, 2024
1 parent 79ac568 commit a960cc6
Showing 4 changed files with 284 additions and 161 deletions.
11 changes: 7 additions & 4 deletions docs/Users_Guide/reformat_grid.rst
@@ -38,6 +38,7 @@ The usage statement for the Pcp-Combine tool is shown below:
out_file
[-field string]
[-name list]
+ [-input_thresh n]
[-vld_thresh n]
[-log file]
[-v level]
@@ -79,13 +80,15 @@ Optional Arguments for pcp_combine

4. The **-name list** option is a comma-separated list of output variable names which override the default choices. If specified, the number of names must match the number of variables to be written to the output file.

- 5. The **-vld_thresh n** option overrides the default required ratio of valid data for at each grid point for an output value to be written. The default is 1.0.
+ 5. The **-input_thresh n** option overrides the default required ratio of valid input files. This option does not apply to the -subtract command where exactly two valid inputs are required. The default is 1.0.

- 6. The **-log file** option directs output and errors to the specified log file. All messages will be written to that file as well as standard out and error. Thus, users can save the messages without having to redirect the output on the command line. The default behavior is no log file.
+ 6. The **-vld_thresh n** option overrides the default required ratio of valid data at each grid point for an output value to be written. The default is 1.0.

- 7. The **-v level** option indicates the desired level of verbosity. The contents of "level" will override the default setting of 2. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity above 1 will increase the amount of logging.
+ 7. The **-log file** option directs output and errors to the specified log file. All messages will be written to that file as well as standard out and error. Thus, users can save the messages without having to redirect the output on the command line. The default behavior is no log file.

- 8. The **-compress level** option indicates the desired level of compression (deflate level) for NetCDF variables. The valid level is between 0 and 9. The value of "level" will override the default setting of 0 from the configuration file or the environment variable MET_NC_COMPRESS. Setting the compression level to 0 will make no compression for the NetCDF output. Lower number is for fast compression and higher number is for better compression.
+ 8. The **-v level** option indicates the desired level of verbosity. The contents of "level" will override the default setting of 2. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity above 1 will increase the amount of logging.

+ 9. The **-compress level** option indicates the desired level of compression (deflate level) for NetCDF variables. The valid level is between 0 and 9. The value of "level" will override the default setting of 0 from the configuration file or the environment variable MET_NC_COMPRESS. Setting the compression level to 0 will make no compression for the NetCDF output. Lower number is for fast compression and higher number is for better compression.
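
The -input_thresh option described above is a required ratio of valid input files. A minimal sketch of such a check follows. It is not the MET implementation: the function name `enough_inputs` is hypothetical, and the sketch assumes the ratio is compared with `>=` against the threshold.

```cpp
#include <cassert>

// Hypothetical helper, not taken from the MET source.
// Returns true when the fraction of valid input files meets the
// requested threshold (assumed comparison: ratio >= threshold).
bool enough_inputs(int n_valid, int n_expected, double input_thresh) {
    assert(n_expected > 0);
    assert(input_thresh >= 0.0 && input_thresh <= 1.0);
    const double ratio = static_cast<double>(n_valid) / n_expected;
    return ratio >= input_thresh;
}
```

For example, with 8 valid files out of 9 expected, the ratio is about 0.89, which passes a threshold of 0.7 but fails the default of 1.0.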

Required Arguments for the pcp_combine Sum Command
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
36 changes: 31 additions & 5 deletions internal/test_unit/xml/unit_pcp_combine.xml
@@ -33,6 +33,19 @@
</output>
</test>

+ <test name="pcp_combine_sum_GRIB1_MISSING">
+ <exec>&MET_BIN;/pcp_combine</exec>
+ <param> \
+ 20120409_00 3 20120412_15 12 \
+ &OUTPUT_DIR;/pcp_combine/nam_2012040900_F087_APCP12.nc \
+ -pcpdir &DATA_DIR_MODEL;/grib1/nam \
+ -input_thresh 0.75
+ </param>
+ <output>
+ <grid_nc>&OUTPUT_DIR;/pcp_combine/nam_2012040900_F087_APCP12.nc</grid_nc>
+ </output>
+ </test>
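
The sum test above requests a 12-hour accumulation built from 3-hour inputs with -input_thresh 0.75. The arithmetic can be sketched as follows. The helper names are hypothetical, and the rounding rule (smallest file count whose ratio reaches the threshold) is an assumption rather than something confirmed by the MET source.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for illustration only, not MET code.

// Number of input files a sum needs: output accumulation divided by
// input accumulation (both in hours).
int expected_sum_inputs(int in_accum_hours, int out_accum_hours) {
    assert(in_accum_hours > 0);
    return out_accum_hours / in_accum_hours;
}

// Smallest file count whose ratio to n_expected reaches input_thresh.
// The small epsilon guards against floating-point error in the product.
int min_required_inputs(int n_expected, double input_thresh) {
    return static_cast<int>(std::ceil(input_thresh * n_expected - 1e-9));
}
```

Under these assumptions the test expects 12 / 3 = 4 input files and tolerates one missing file, since 3 of 4 gives a ratio of exactly 0.75.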

<test name="pcp_combine_sum_GRIB1_MULTIPLE_FIELDS">
<exec>&MET_BIN;/pcp_combine</exec>
<param> \
@@ -292,22 +305,26 @@
<!-- - multiple -field options -->
<!-- -->
<test name="pcp_combine_derive_MULTIPLE_FIELDS">
- <exec>echo "&DATA_DIR_MODEL;/grib1/arw-fer-gep1/arw-fer-gep1_2012040912_F024.grib \
+ <exec>echo "MISSING \
+ &DATA_DIR_MODEL;/grib1/arw-fer-gep1/arw-fer-gep1_2012040912_F024.grib \
&DATA_DIR_MODEL;/grib1/arw-fer-gep5/arw-fer-gep5_2012040912_F024.grib \
&DATA_DIR_MODEL;/grib1/arw-sch-gep2/arw-sch-gep2_2012040912_F024.grib \
&DATA_DIR_MODEL;/grib1/arw-sch-gep6/arw-sch-gep6_2012040912_F024.grib \
+ MISSING/optional/path/to/missing/file \
&DATA_DIR_MODEL;/grib1/arw-tom-gep0/arw-tom-gep0_2012040912_F024.grib \
&DATA_DIR_MODEL;/grib1/arw-tom-gep7/arw-tom-gep7_2012040912_F024.grib \
&DATA_DIR_MODEL;/grib1/nmm-fer-gep4/nmm-fer-gep4_2012040912_F024.grib \
- &DATA_DIR_MODEL;/grib1/nmm-fer-gep8/nmm-fer-gep8_2012040912_F024.grib" \
- > &OUTPUT_DIR;/pcp_combine/derive_file_list; \
+ &DATA_DIR_MODEL;/grib1/nmm-fer-gep8/nmm-fer-gep8_2012040912_F024.grib \
+ &DATA_DIR_MODEL;/path/to/missing/file" \
+ > &OUTPUT_DIR;/pcp_combine/derive_file_list_missing; \
&MET_BIN;/pcp_combine</exec>
<param> \
-derive mean,stdev,vld_count \
- &OUTPUT_DIR;/pcp_combine/derive_file_list \
+ &OUTPUT_DIR;/pcp_combine/derive_file_list_missing \
-field 'name="TMP"; level="Z2";' \
-field 'name="UGRD"; level="Z10";' \
-field 'name="VGRD"; level="Z10";' \
+ -input_thresh 0.7 \
&OUTPUT_DIR;/pcp_combine/derive_2012040912_F024_MULTIPLE_FIELDS.nc
</param>
<output>
@@ -322,7 +339,16 @@
<!-- - multiple -field options -->
<!-- -->
<test name="pcp_combine_derive_VLD_THRESH">
- <exec>&MET_BIN;/pcp_combine</exec>
+ <exec>echo "&DATA_DIR_MODEL;/grib1/arw-fer-gep1/arw-fer-gep1_2012040912_F024.grib \
+ &DATA_DIR_MODEL;/grib1/arw-fer-gep5/arw-fer-gep5_2012040912_F024.grib \
+ &DATA_DIR_MODEL;/grib1/arw-sch-gep2/arw-sch-gep2_2012040912_F024.grib \
+ &DATA_DIR_MODEL;/grib1/arw-sch-gep6/arw-sch-gep6_2012040912_F024.grib \
+ &DATA_DIR_MODEL;/grib1/arw-tom-gep0/arw-tom-gep0_2012040912_F024.grib \
+ &DATA_DIR_MODEL;/grib1/arw-tom-gep7/arw-tom-gep7_2012040912_F024.grib \
+ &DATA_DIR_MODEL;/grib1/nmm-fer-gep4/nmm-fer-gep4_2012040912_F024.grib \
+ &DATA_DIR_MODEL;/grib1/nmm-fer-gep8/nmm-fer-gep8_2012040912_F024.grib" \
+ > &OUTPUT_DIR;/pcp_combine/derive_file_list; \
+ &MET_BIN;/pcp_combine</exec>
<param> \
-derive mean,stdev,vld_count \
&OUTPUT_DIR;/pcp_combine/derive_file_list \
9 changes: 8 additions & 1 deletion src/libcode/vx_data2d_factory/parse_file_list.cc
@@ -213,11 +213,18 @@ GrdFileType ftype = FileType_None;

for ( int i=0; i<file_list.n(); i++ ) {

+ //
+ // check for python inputs
+ //
+
+ bool is_python = (file_list[i].find(conf_val_python_xarray) == 0) ||
+ (file_list[i].find(conf_val_python_numpy) == 0);

//
// skip missing files
//

- if( !file_exists(file_list[i].c_str()) ) continue;
+ if( !file_exists(file_list[i].c_str()) && !is_python ) continue;

//
// get the current file type
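
The hunk above stops skipping entries that begin with a Python keyword, since those strings name embedded Python inputs rather than files on disk. A standalone sketch of that prefix test follows. The constant values `"PYTHON_NUMPY"` and `"PYTHON_XARRAY"` are assumptions about what `conf_val_python_numpy` and `conf_val_python_xarray` hold, and `should_skip` with its `exists_on_disk` flag is a hypothetical stand-in for the `file_exists()` call in the diff.

```cpp
#include <cassert>
#include <string>

// Assumed values for the constants referenced in the diff.
const std::string conf_val_python_numpy  = "PYTHON_NUMPY";
const std::string conf_val_python_xarray = "PYTHON_XARRAY";

// True when the entry starts with one of the Python keywords.
// rfind(prefix, 0) can only match at position 0, so it is a pure
// prefix test and does not scan the rest of the string.
bool is_python_input(const std::string &entry) {
    return entry.rfind(conf_val_python_xarray, 0) == 0 ||
           entry.rfind(conf_val_python_numpy, 0) == 0;
}

// Hypothetical stand-in for the skip condition: an entry is skipped
// only if it is absent on disk and is not a Python input string.
bool should_skip(const std::string &entry, bool exists_on_disk) {
    return !exists_on_disk && !is_python_input(entry);
}

// A few checks showing the intended behaviour.
void check_examples() {
    assert(should_skip("/no/such/file.grib", false));
    assert(!should_skip("PYTHON_NUMPY", false));
}
```

The diff itself uses `find(...) == 0`, which gives the same result; `rfind(prefix, 0)` is used here only because it avoids searching past the first position.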