This repository has been archived by the owner on Sep 1, 2022. It is now read-only.

From ncML to netCDF file using netCDF-Java: FillValue is not handled properly #1036

Open
risquez opened this issue Feb 15, 2018 · 7 comments
@risquez

risquez commented Feb 15, 2018

I am generating an empty netCDF file using the ncML description as input. However, I cannot define a FillValue.

This issue could be related to "Ncml mishandling signedness" #923.

I follow these steps:

  1. The following ncML code is the input description. It defines only 2 variables (short and unsigned short) and tries to set their FillValue attribute to "1". (Note: the same problem applies to other data types such as double, float, int, etc.)
<?xml version="1.0" encoding="UTF-8"?>
<ncml:netcdf xmlns:ncml="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">

   <ncml:variable name="my_short" shape="" type="short">
      <ncml:attribute name="_FillValue" type="short" value="1"/>
   </ncml:variable>

   <ncml:variable name="my_ushort" shape="" type="short">
      <ncml:attribute name="_Unsigned" value="true" />
      <ncml:attribute name="_FillValue" type="short" value="1" isUnsigned="true" />
   </ncml:variable>

</ncml:netcdf>
  2. Generate the netCDF file using this command:

java -Xmx1g -classpath netcdfAll-4.6.11.jar ucar.nc2.write.Nccopy --input test_FillValue.ncml --output test_FillValue.nc --format netcdf4

(Note: I am using the latest netCDF-Java library, 4.6.11 from 4/Dec/2017; and Java 1.8.0).

  3. Dump the netCDF file with the usual command (netCDF-C version 4.3.2):

ncdump -s test_FillValue.nc

and the output is:

netcdf test_FillValue {
variables:
        short my_short ;
                my_short:_FillValue = 1s ;
                my_short:_Endianness = "little" ;
        ushort my_ushort ;
                my_ushort:_Unsigned = "true" ;
                my_ushort:_Endianness = "little" ;
// global attributes:
                :_Format = "netCDF-4" ;
data:
 my_short = -32767 ;
 my_ushort = 32769 ;
}

The value for short equals NC_FILL_SHORT=-32767, however NC_FILL_USHORT=65535, not 32769.
https://www.unidata.ucar.edu/software/netcdf/docs/netcdf_8h_source.html
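
One possible reading of the 32769 above, offered as an inference rather than something confirmed in this thread: it is exactly what the signed default fill value -32767 looks like when the same 16 bits are interpreted as unsigned. The sketch below only demonstrates that arithmetic; the helper names `as_unsigned16` and `as_signed16` are hypothetical and are not part of netCDF-Java or netCDF-C.

```python
# Default fill values quoted in this issue (from netcdf.h).
NC_FILL_SHORT = -32767
NC_FILL_USHORT = 65535


def as_unsigned16(value):
    """Reinterpret a 16-bit two's-complement value as unsigned."""
    return value & 0xFFFF


def as_signed16(value):
    """Reinterpret a 16-bit unsigned value as two's-complement signed."""
    value &= 0xFFFF
    return value - 0x10000 if value >= 0x8000 else value


# The signed default fill, viewed as unsigned, gives the value seen in the dump.
print(as_unsigned16(NC_FILL_SHORT))   # 32769
# The unsigned default fill, viewed as signed, would be -1 instead.
print(as_signed16(NC_FILL_USHORT))    # -1
```

If this reading is right, the variable was filled with the signed default and only later labelled as unsigned, which would fit the signedness problem referenced above.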

And in any case, the "_FillValue" is defined in the ncML (="1"), therefore it should overwrite any default value.

Summarizing:

  • FillValue attributes cannot be defined for any data type using the netCDF-Java library with ncML as input (there is no problem with other attributes, including attributes that start with an underscore, like "_example").
  • When the FillValue is defined, all data is automatically populated with that value (and therefore it requires some processing time). Am I correct?
  • The data written for unsigned data types does not match the netCDF default fill value (32769 instead of NC_FILL_USHORT=65535).
@lesserwhirls
Collaborator

@DennisHeimbigner - can you take a look at this one? I seem to remember we fixed something like this awhile back, but I can't seem to find it.

@risquez
Author

risquez commented Feb 21, 2018 via email

@cwardgar
Contributor

@risquez You need to set the /netcdf@enhance attribute to "all". Enhancement is responsible for processing fill values, missing values, signedness conversion, etc. See this. If the /netcdf@enhance attribute is not explicitly set in an NcML document, no enhancement is done. I'm not sure why that's the default; "all" seems more useful, and less likely to be a gotcha for users. I'll bring it up in our meeting tomorrow.
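
Since a missing /netcdf@enhance attribute silently disables enhancement, it can help to check a document for it before running the conversion. The following is a minimal sketch using only Python's standard XML parser. It does not call netCDF-Java; the function name `enhance_mode` and the two sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"

# Two tiny sample documents (illustrative only).
WITHOUT_ENHANCE = f'<netcdf xmlns="{NCML_NS}"/>'
WITH_ENHANCE = f'<netcdf xmlns="{NCML_NS}" enhance="all"/>'


def enhance_mode(ncml_text):
    """Return the value of /netcdf@enhance, or None if it is not set."""
    root = ET.fromstring(ncml_text)
    # The root element must be <netcdf> in the NcML namespace.
    if root.tag != f"{{{NCML_NS}}}netcdf":
        raise ValueError(f"not an NcML document: root is {root.tag}")
    # The attribute itself carries no namespace prefix.
    return root.get("enhance")


print(enhance_mode(WITHOUT_ENHANCE))  # None
print(enhance_mode(WITH_ENHANCE))     # all
```

A result of `None` corresponds to the case described above, where no enhancement is done.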

So, suppose I have this NcML:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" enhance="all">
   <variable name="my_short" shape="" type="short">
       <attribute name="_FillValue" type="short" value="1"/>
   </variable>

   <variable name="my_ushort" shape="" type="short">
      <attribute name="_Unsigned" value="true" />
      <attribute name="_FillValue" type="short" value="1" isUnsigned="true" />
   </variable>
</netcdf>

If I use NetCDF-Java v4.6.11 to generate a NetCDF-4 file from it, as you have done, I get:

$ ncdump 1036_4.nc4 
netcdf \1036_4 {
variables:
	short my_short ;
		my_short:_FillValue = 1s ;
	ushort my_ushort ;
		my_ushort:_Unsigned = "true" ;
data:
 my_short = _ ;
 my_ushort = 1 ;
}

Better, but my_ushort's _FillValue is still mishandled. Next I'll try on NetCDF-Java v5.0.0, which includes the signedness changes I made in #934:

$ ncdump 1036_5.nc4 
netcdf \1036_5 {
variables:
	short my_short ;
		my_short:_FillValue = 1s ;
	ushort my_ushort ;
		my_ushort:_Unsigned = "true" ;
		my_ushort:_FillValue = 1US ;
data:
 my_short = _ ;
 my_ushort = _ ;
}

Bingo. Everything seems to be working there (the underscore indicates that a datum matches the fill value declared for a variable). On version 5, you can also simplify your NcML somewhat. This will work just as well:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" enhance="all">
   <variable name="my_short" shape="" type="short">
       <attribute name="_FillValue" type="short" value="1"/>
   </variable>

   <variable name="my_ushort" shape="" type="ushort">
      <attribute name="_FillValue" type="ushort" value="1" />
   </variable>
</netcdf>
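
As a rough illustration of the underscore convention in the ncdump output above, the sketch below prints `_` for any datum equal to the declared fill value and the plain number otherwise. It is not ncdump's implementation; the function name `render_data` is hypothetical.

```python
def render_data(values, fill_value=None):
    """Format values the way the dumps above read: '_' marks a fill value."""
    rendered = []
    for v in values:
        if fill_value is not None and v == fill_value:
            rendered.append("_")
        else:
            rendered.append(str(v))
    return ", ".join(rendered)


# With _FillValue = 1 declared, a datum of 1 is shown as "_".
print(render_data([1], fill_value=1))     # _
# With no _FillValue declared on the variable, the same datum is shown as 1.
print(render_data([1], fill_value=None))  # 1
```

This matches the difference between the two dumps: in the v4.6.11 output my_ushort carries no _FillValue attribute and shows 1, while in the v5.0.0 output the attribute is present and the datum shows as `_`.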

Note that we haven't yet released v5.0, but we're getting close. If you'd like to experiment with a recent snapshot build, you can grab one here.

@risquez
Author

risquez commented Feb 22, 2018

I tried both suggestions on my computer (adding the "enhance" attribute and trying the latest snapshot). Both work well, as you indicate. Christian, thank you very much for your support.

@risquez
Author

risquez commented Mar 12, 2018

I would like to revisit this issue, just to summarize its status.

My problem is:

I understand that I have to choose:

  • Either I define enhance="all" in the ncML (as indicated above), in which case FillValue is OK but enumerations are not (explanation here).
  • Or I do not set enhance="all" in the ncML, in which case FillValue is not OK but enumerations are (this was my situation before raising this issue).

Could you please confirm that my understanding is correct? That is, FillValue and enumerations cannot both behave as I need at the same time.

@cofinoa
Contributor

cofinoa commented Mar 12, 2018

@risquez
Try enhance="ScaleMissing". It is supposed to apply only the scale/offset and missing-value enhancements, not the enum conversion. The opposite should be enhance="ConvertEnums", i.e. convert enums to strings without applying the scale/offset and missing-value enhancements.

@erget

erget commented Apr 12, 2018

This issue is still alive and well for us at EUMETSAT, as when NetcdfDataset.Enhance is set to ScaleMissingDefer, netCDF-Java "promotes" enums to strings, although ScaleMissingDefer shouldn't touch enums.

Using ScaleMissing isn't an option, because the data should not be converted.

@risquez made a nice overview using:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">

   <variable name="my_short" shape="" type="short">
       <attribute name="_FillValue" type="short" value="1"/>
   </variable>

   <variable name="my_ushort" shape="" type="short">
      <attribute name="_Unsigned" value="true" />
      <attribute name="_FillValue" type="short" value="1" isUnsigned="true" />
   </variable>
</netcdf>

Setting different values for enhance results in:

| Attribute | NetcdfDataset.Enhance not set | NetcdfDataset.Enhance == "ScaleMissing" | NetcdfDataset.Enhance == "all" |
|---|---|---|---|
| valid_range | yes | no | no |
| add_offset | yes | no | no |
| scale_factor | yes | no | no |
| _FillValue | no | no | no |
| _unsigned | yes | yes | yes |
| Resulting data type | native short | native unsigned short | native float |

As you can see, without enhancements the attributes are preserved correctly, but no _FillValue is applied. Using enhancements breaks other parts of the data as well.

At the moment, this is a barrier for us to use netCDF-Java in our operations for Meteosat Third Generation.

@lesserwhirls reopened this on Apr 12, 2018