dap4.test.TestNc4Iosp failure on Jenkins #126
This appears to be an example of the
It appears that the variables primary_cloud and secondary_cloud should be using the typedef cloud_class_t instead of creating their own typedefs. The current CDM ncdump of ~netcdf-java/dap4/d4tests/src/test/data/resources/testfiles/test_atomic_types.nc is:
(ncdump output omitted)
At the moment I'm not seeing the error. The Data Object "primary_cloud" has a Datatype Message inside of it that duplicates cloud_class_t, complete with a duplicated enumValue/enumName map. I don't (yet) see any reference to cloud_class_t indicating that it should use that type. So either I've missed it, or there is some convention (e.g., for detecting duplicate Datatype Messages) that is undocumented or that I didn't notice. It's puzzling why hdf5 would duplicate the value/name map. The C library ncdump for this file is (dap4/d4tests/src/test/data/resources/testfiles/test_atomic_types.ncdump):
(ncdump output omitted)
Note the part of that output which the CDM doesn't reproduce:
I think that was a conscious decision on my part to have opaque types be anonymous in the CDM (using the KISS principle). The netcdf4 implementors ignored the CDM data model, which, ahem, predated netcdf4 on hdf5, and simply followed the hdf5 data model. This and other discrepancies are probably the source of Dennis' comment that "the handling of netcdf4/hdf5 translation to CDM is incomplete". I think I ignored the changes they made, if they didn't seem critical, at the cost that the ncdumps would differ. Whereas Dennis (probably rightly so) is testing that they are the same. So maybe this should be reevaluated at this point, or maybe we can live with it, with Dennis' tests always using the jni interface to netcdf4.
Correct. The CDM and netcdf-4 models differ in many respects.
Do you have any insight into why the enum variables in test_atomic_types.nc duplicate the typedef (including the value/name maps) inside of themselves, instead of referencing a common datatype?
I see the same thing with
That is, the variable vo has its own data type, and I'm not seeing the reference to the standalone type. OTOH, I see hdf5 sample files using enums that don't have a standalone DataTypeMessage. So I think the general question may be: "How does netcdf4 implement user-defined types? Where does a variable using one keep a reference to it? Is this documented anywhere?"
In my opinion, we should properly map the CDM to the data model that's expected based on usage of netCDF-Java, as long as "properly" is well defined. Is it the case that the CDM objects lack the necessary data and metadata to compare with what the C library provides?
If it's not documented, then that's a problem with the spec in general that would only be revealed by having multiple implementations. That's one reason why I resist using the C library to do everything HDF/netCDF4 I/O related.
Added TestEnumTypedef
CDM has the correct semantics, but it doesn't implement user-defined types. Instead of "factoring out" the type of a variable into a standalone type, it duplicates the type information in the variable. Just to keep things confusing, so does the internal representation in hdf5 (at least in the example files I'm looking at). There must be a way to find the reference to the standalone type, since that's what the netcdf4 library does. However, the CDM data model (aka CDMDM ;^) doesn't have user-defined types. For reading, this only shows up in ncdump output. For EnumTypedefs, there's no reason not to detect the standalone type and use it if it exists. As you see in the example above, one only needs cloud_class_t, not 3 separate identical ones. So I consider that a bug to fix. The question of adding user-defined types to the CDM is TBD. It would be good to accumulate real-world examples. Also, are they CF compliant?
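The "detect the standalone type and use it" idea can be sketched in isolation. This is a minimal toy sketch, not the actual CDM API: `EnumDef` and `findMatching` are hypothetical stand-ins for an enum typedef (a name plus a value/name map) and for the structural-match lookup that would let primary_cloud resolve to cloud_class_t instead of growing its own identical typedef.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: reuse a standalone enum typedef when a variable's inline enum
 *  duplicates it. Names (EnumDef, findMatching) are hypothetical, not the
 *  real CDM classes. */
public class EnumDedup {
  /** Minimal stand-in for an enum typedef: a name plus a value->name map. */
  static class EnumDef {
    final String name;
    final Map<Integer, String> map;
    EnumDef(String name, Map<Integer, String> map) {
      this.name = name;
      this.map = map;
    }
  }

  /** Return the first standalone typedef whose value/name map equals the
   *  variable's inline map, or null if none matches. */
  static EnumDef findMatching(Map<Integer, String> inlineMap, List<EnumDef> standalone) {
    for (EnumDef def : standalone) {
      if (def.map.equals(inlineMap)) {
        return def; // structurally identical: reuse it instead of duplicating
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Map<Integer, String> cloud = new LinkedHashMap<>();
    cloud.put(0, "Clear");
    cloud.put(1, "Cumulonimbus");
    cloud.put(2, "Stratus");
    EnumDef cloudClass = new EnumDef("cloud_class_t", cloud);

    // primary_cloud carries an identical inline map; with this lookup it
    // resolves to cloud_class_t rather than creating a separate typedef.
    Map<Integer, String> inline = new LinkedHashMap<>(cloud);
    EnumDef match = findMatching(inline, List.of(cloudClass));
    System.out.println(match == null ? "no match" : match.name); // prints cloud_class_t
  }
}
```

Matching on the full value/name map (rather than on the name) is what makes this robust against the duplicated Datatype Messages described above, where the inline copy has no visible name reference back to the standalone type.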
I agree that having an independent implementation for reading HDF5 is important, and I'm willing for now to keep the Java code updated and fix bugs as a Service to Humanity. As long as there's an easy way for a user to switch between the Java and JNA implementations, I think it's a win. Is there a command line argument for controlling which iosp is used?
Thanks for the clarification.
You’re a data format saint. 😇 I’ll follow closely and learn all that I can.
Not currently, but could certainly create one. Bigger question: what's the right level to insert the JNA? Currently we use JNA for reading/writing netCDF-4 through netCDF-C. We could add a new JNA-based HDF5 IOSP and call into the HDF5 C library. Or, the new IOSP could use the Java wrappers produced by the HDF5 build and skip a direct JNA-based IOSP.
According to the list of data types supported by the next version of CF (1.8) (http://cfconventions.org/cf-conventions/cf-conventions.html#_data_types) and the guidance in cf-convention/cf-conventions#191, I would say that in general, no.
Can the netcdf-c library read any hdf5, or just netcdf4?
…On Thu, Jan 2, 2020 at 11:37 AM Sean Arms wrote: (quoted reply and notification footer omitted)
Just netcdf-4, as far as I understand, although better flexibility in the C library is a desired feature. I think the thing to do is implement a new HDF5 IOSP (directly using JNA, or indirectly by using JHI5), and at least verify that the netCDF-4 spec (perhaps better named the netCDF HDF convention) remains documented thoroughly enough. I would be tempted to go with direct JNA calls over JHI5. That would give us an "on par" implementation of netcdf-4 reading in addition to the current completely independent implementation.
I need to think about all that more. But just to clarify, I'm just volunteering to update the pure Java code.
…On Tue, Jan 7, 2020 at 7:12 PM Sean Arms wrote: (quoted reply and notification footer omitted)
FYI
Totally. Any JNA-based HDF5 IOSP will be my special level of...well...there's a South Park joke in here somewhere...ah, yes, Detroit.
Does it expose enough of HDF5 to build a full HDF5 IOSP? If not, then calls to the HDF5 library are probably the right level for the IOSP.
Not even close.
## Description of Changes

re: Issue Unidata#126

The code for handling enum types in Group.java is incorrect. When creating a new enum type, it is appropriate to search only the current group for a conflicting name, and this is what the current code does. But when an enum-typed variable searches for the matching enum type, it must search not only the current group but all parent groups, and the current code does not do that.

The fix consists of two similar parts.

1. Modify Group.findEnumeration to take an extra boolean parameter to control whether the search covers this group only or also searches up through the group's parents.
2. Modify Group.Builder.findEnumTypedef to act like part 1 but to search the sequence of parent Group.Builder instances.

As a consequence, this PR modifies a number of other files to adapt to the modified signatures.

## Misc. Other Changes

1. Fix the handling of special attributes so that they are accessible.

## Note

This same problem appears to affect the opaque type also, but fixing that may be a bit more difficult because CDM appears to convert opaque types to byte types.

## PR Checklist
<!-- This will become an interactive checklist once the PR is opened -->
- [ ] Link to any issues that the PR addresses
- [ ] Add labels
- [ ] Open as a [draft PR](https://github.blog/2019-02-14-introducing-draft-pull-requests/) until ready for review
- [ ] Make sure GitHub tests pass
- [ ] Mark PR as "Ready for Review"
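The two-part search described above can be sketched with a toy group tree. This is a hedged, simplified model, not the real Group/Group.Builder API: the `Group` class and the `findLocal`/`findWithParents` methods here are illustrative stand-ins showing why a conflict check and a typedef resolution need different scopes.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the two search scopes: current group only (for name-conflict
 *  checks) vs. current group plus all parents (for resolving the typedef of
 *  an enum-typed variable). Names are illustrative, not the CDM API. */
public class EnumSearch {
  /** Minimal stand-in for a CDM group: a name, a parent, and its typedef names. */
  static class Group {
    final String name;
    final Group parent;
    final List<String> enumTypedefs = new ArrayList<>();
    Group(String name, Group parent) { this.name = name; this.parent = parent; }
  }

  /** Search only this group (appropriate when creating a new typedef and
   *  checking for a conflicting name). */
  static boolean findLocal(Group g, String typedefName) {
    return g.enumTypedefs.contains(typedefName);
  }

  /** Search this group and then walk up through all parent groups
   *  (appropriate when an enum-typed variable resolves its typedef). */
  static boolean findWithParents(Group g, String typedefName) {
    for (Group cur = g; cur != null; cur = cur.parent) {
      if (findLocal(cur, typedefName)) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    Group root = new Group("root", null);
    root.enumTypedefs.add("cloud_class_t");
    Group child = new Group("child", root);
    // A local-only search from the child misses the typedef in the root...
    System.out.println(findLocal(child, "cloud_class_t"));       // prints false
    // ...while the parent-walking search finds it.
    System.out.println(findWithParents(child, "cloud_class_t")); // prints true
  }
}
```

The bug described in the PR is exactly the gap between these two methods: the old code used the local-only scope for both operations, so a variable in a subgroup could not see a typedef declared in an ancestor group.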
* ckp
* Fix EnumTypedef name problem. re: Issue #126. The code for handling enum types in Group.java is incorrect. When creating a new enum type, it is appropriate to search only the current group for a conflicting name, and this is what the current code does. But when an enum-typed variable searches for the matching enum type, it must search not only the current group but all parent groups, and the current code does not do that. The fix consists of two similar parts: (1) modify Group.findEnumeration to take an extra boolean parameter to control whether the search covers this group only or also searches up through the group's parents; (2) modify Group.Builder.findEnumTypedef to act like (1) but to search the sequence of parent Group.Builder instances. As a consequence, this PR modifies a number of other files to adapt to the modified signatures.
* (1) Remove unused import in H5headerNew. (2) Add overloaded functions to Group.java to restore access to the original versions of findEnumTypedef and findEnumeration.
* Force re-test
* It turns out that I missed the error in the code in H5headerNew that attempts to match an enum-typed variable with the proper enumeration type declaration. The problem and fix (as described by a comment in H5headerNew) is a bit of a hack. It should be fixed in H5Object.read(). Unfortunately, information is not being passed down so that the proper fix can be applied early in the HDF5->CDM translation; fixing this would affect a lot of function signatures. Also modified TestEnumTypedef.java to test two cases: (1) the actual enum typedef is in the same group as the variable that uses it; (2) the actual enum typedef is in a parent group of the variable that uses it.
* Misc. other changes: suppress printing of _NCProperties to simplify text-based comparison testing.
* Addendum 2: Sigh! Apparently NetcdfFile.java defaulted to using H5iosp instead of H5iospNew. This meant that many of my changes were being bypassed. So, modify NetcdfFile to default to H5iospNew.
* Undo change to NetcdfFile.java
* test4
* NCProperties fix
* Additional modifications:
  * NetcdfFile.java: convert to use H5iospNew (needed by TestH5iosp.java)
  * H5headerNew.java: provide a get function for accessing the btree (needed by TestDataBTree.java)
  * H5iospNew.java: make the getRandomAccessFile() method public (needed by tests)
  * CompareNetcdf2.java: add a constructor to specify whether attribute name comparison should ignore case. It turns out that some tests require case-sensitive name matching, specifically TestCoordSysCompare.java and TestN3iospCompare.java.
* Apply Spotless
* Remove debugging

Co-authored-by: haileyajohnson <hailey.johnson@ufl.edu>
## Description of Changes

re: Issue Unidata#126

The code for handling enum types in Group.java is incorrect. When creating a new enum type, it is appropriate to search only the current group for a conflicting name, and this is what the current code does. But when an enum-typed variable searches for the matching enum type, it must search not only the current group but all parent groups, and the current code does not do that.

The fix consists of two similar parts.

1. Modify Group.findEnumeration to take an extra boolean parameter to control whether the search covers this group only or also searches up through the group's parents.
2. Modify Group.Builder.findEnumTypedef to act like part 1 but to search the sequence of parent Group.Builder instances.

As a consequence, this PR modifies a number of other files to adapt to the modified signatures.

## Note

This same problem appears to affect the opaque type also, but fixing that may be a bit more difficult because CDM appears to convert opaque types to byte types.

## PR Checklist
<!-- This will become an interactive checklist once the PR is opened -->
- [x] Link to any issues that the PR addresses
- [ ] Add labels
- [x] Open as a [draft PR](https://github.blog/2019-02-14-introducing-draft-pull-requests/) until ready for review
- [x] Make sure GitHub tests pass
- [x] Mark PR as "Ready for Review"

## Addendum 1

It turns out that I missed the error in the code in H5headerNew that attempts to match an enum-typed variable with the proper enumeration type declaration. The problem and fix (as described by a comment in H5headerNew) is a bit of a hack. It should be fixed in H5Object.read(). Unfortunately, information is not being passed down so that the proper fix can be applied early in the HDF5->CDM translation; fixing this would affect a lot of function signatures.

Also modified TestEnumTypedef.java to test two cases:
1. the actual enum typedef is in the same group as the variable that uses it.
2. the actual enum typedef is in a parent group of the variable that uses it.

## Misc. Other Changes

* Suppress printing of _NCProperties to simplify text-based comparison testing.

## Addendum 2 (12/19/2022)

* NetcdfFile.java: convert to use H5iospNew (needed by TestH5iosp.java)
* H5headerNew.java: provide a get function for accessing the btree (needed by TestDataBTree.java)
* H5iospNew.java: make the getRandomAccessFile() method public (needed by tests)
* CompareNetcdf2.java: add a constructor to specify whether attribute name comparison should ignore case. It turns out that some tests require case-sensitive name matching, specifically TestCoordSysCompare.java and TestN3iospCompare.java.

## Addendum 3 (2/25/2023)

Make additional changes to fix Jenkins failures.

* Revert Group.attributes().
* Revert the use of H5iospNew in NetcdfFile.
* Add to Group a method that looks for a matching enumeration purely structurally, as opposed to by name matching. This corresponds to the same method in Group.Builder.
* Rebuild the EnumTypedef search algorithm in H5headerNew and H5header to handle cases in the Jenkins-only enum tests.
* Revert various calls to Group.findEnumeration().
* Add some test case data: ref_anon_enum.h5 and test_enum_type.nc.
* Convert TestEnumTypedef to JUnit parameterized form.
* Revert TestH5iosp.
* The URL used in TestSequence is no longer valid, so change to an equivalent URL on the remotetest server.
We have a new failure on Jenkins for dap4.test.TestNc4Iosp.testNc4Iosp. This started showing up when the way we load IOSPs changed with PR 101 (see https://github.com/Unidata/netcdf-java/pull/101/files for the changes). My suspicion is that the change caused the dap4 library to use the Hdf5Iosp instead of the Nc4Iosp. The code should handle both cases, but I think what we see is a difference in the way the two IOSPs see variable metadata (in this case, something about the way enum is handled). Here is the output from Jenkins: