Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gdl testsuite fails with 4.4.1 - ../../libsrc/nc_hashmap.c:106: rehashVar: Assertion `count == hm->count' failed #282

Closed
8 tasks
opoplawski opened this issue Jul 5, 2016 · 6 comments

Comments

@opoplawski
Copy link
Contributor

Environment Information

  • What platform are you using? (please provide specific distribution/version in summary)
    • [x ] Linux
    • Windows
    • OSX
    • Other
    • NA
  • 32 and/or 64 bit?
    • [x ] 32-bit
    • 64-bit
  • What build system are you using?
    • [x ] autotools (configure)
    • cmake
  • Can you provide a sample netCDF file or C code to recreate the issue?
    • Yes (please attach to this issue, thank you!)
    • No
    • [ x] Not at this time

Summary of Issue

Since updating Fedora to netcdf 4.4.1 from 4.4.0, I'm seeing this test failure in gdl on 32-bit only:

116: Adding dim, var, rename dim, var
116: -quiet: ../../libsrc/nc_hashmap.c:106: rehashVar: Assertion `count == hm->count' failed.

This assertion appears to have been added as part of commit 1f9eb80

Steps to reproduce the behavior

gdl/idl code is:

pro nct_create_complex2, f1

catch, error
if(error eq 0) then begin
    id=ncdf_create( f1, /clobber)
    d1=ncdf_dimdef(id, "dim1", 10)
    v1=ncdf_vardef(id, "dim1", d1,/float)
    v2=ncdf_vardef(id, "var1", d1,/float)


    ncdf_control, id, /endef
    ncdf_varput, id, v1, findgen(10)
    ncdf_varput, id, v2, findgen(10)*2.0


    ncdf_close, id

    id=ncdf_open(f1,/nowrite)
    inq=ncdf_inquire(id)

    n=strarr(inq.nvars)
    for i=0, inq.nvars-1 do begin
        v=ncdf_varinq(id,i)
        n[i]=v.name
    endfor

    m=strarr(inq.ndims)
    for i=0, inq.ndims-1 do begin
        ncdf_diminq,id, i,name,s
        m[i]=name
    endfor
    print, "var: ",n
    print, "dim: ",m

    ncdf_varget,id, 0,q
    print, n[0],": ",q
    ncdf_close,id

print, "Adding dim, var, rename dim, var"

    id=ncdf_open(f1, /write)
    ncdf_control,id,/redef
    ncdf_varrename, id, 0,"dim_new1"
    ncdf_dimrename,id, 0, "dim_new1"

    d2=ncdf_dimdef(id, "dim2", 20)
    v3=ncdf_vardef(id, "dim2", d2,/float)
    v4=ncdf_vardef(id, "var2", [d1,d2],/float)

    ncdf_control, id,/endef
    ncdf_varput, id, v3, findgen(20)
    ncdf_varput, id, v4, findgen(10,20)

    ncdf_varget, id, 0, q
    print, "start"

Crash must happen somewhere between the last two prints. At the abort:

(gdb) print count
$1 = 3
(gdb) print *hm
$4 = {table = 0x572769c0, size = 11, count = 2}

On entry to rehashVar():

2: *hm = {table = 0x57295680, size = 5, count = 3}

Then hm->size gets set to 11 on line 92. table entries:

(gdb) print table[4]
$26 = {data = 1, flags = 1, key = 756578944}
(gdb) print table[3]
$27 = {data = 1, flags = 1, key = 3503044073}
(gdb) print table[2]
$28 = {data = 0, flags = 0, key = 0}
(gdb) print table[1]
$29 = {data = 2, flags = 1, key = 3922301096}

It looks like that one of the calls to NC_hashmapAddVar when copying over the elements doesn't update count in the new table. Not sure why.

@gsjaardema
Copy link
Contributor

Try the following:

diff --git a/libsrc/var.c b/libsrc/var.c
index a510ba5..7729a7e 100644
--- a/libsrc/var.c
+++ b/libsrc/var.c
@@ -740,14 +740,14 @@ NC3_rename_var(int ncid, int varid, const char *unewname)
            return NC_ENOMEM;
        if(NC_indef(ncp))
        {
+               /* Remove old name from hashmap; add new... */
+               NC_hashmapRemoveVar(&ncp->vars, old->cp);
+
                newStr = new_NC_string(strlen(newname),newname);
                free(newname);
                if(newStr == NULL)
                        return(-1);
                varp->name = newStr;
-
-               /* Remove old name from hashmap; add new... */
-               NC_hashmapRemoveVar(&ncp->vars, old->cp);
                NC_hashmapAddVar(&ncp->vars, varid, newStr->cp);
                free_NC_string(old);

@@ -755,13 +755,14 @@ NC3_rename_var(int ncid, int varid, const char *unewname)
        }

        /* else, not in define mode */
+       /* Remove old name from hashmap; add new... */
+       NC_hashmapRemoveVar(&ncp->vars, old->cp);
+
        status = set_NC_string(varp->name, newname);
        free(newname);
        if(status != NC_NOERR)
                return status;

-       /* Remove old name from hashmap; add new... */
-       NC_hashmapRemoveVar(&ncp->vars, old->cp);
        NC_hashmapAddVar(&ncp->vars, varid, varp->name->cp);

        set_NC_hdirty(ncp);

The problem is that the name was being updated prior to the old variable being removed from the hashmap. It checks whether the key and the name of the variable being removed match, but since the name had already been updated, the names did not match so the variable was not removed. The patch above removes the variable from the hashmap first, then updates the name, and then adds the variable with the new name to the hashmap.

Here is a c program that reproduces the gdi/idl code above and shows the same issue:

#include <netcdf.h>
int main()
{
  int  status;
  int  id;
  int  rh_id, varid, v1, v2, v3, v4;
  int  dimids[2];


  nc_create("foo.nc", NC_CLOBBER, &id);
  nc_redef(id);

  status = nc_def_dim(id, "dim1", 10, &dimids[0]);
  status = nc_def_var(id, "dim1", NC_FLOAT, 1, dimids, &v1);
  status = nc_def_var(id, "var1", NC_FLOAT, 1, dimids, &v2);

  nc_close(id);

  nc_open("foo.nc", NC_WRITE, &id);

  nc_redef(id);
  nc_rename_var(id, v1,"dim_new1");
  nc_rename_dim(id, dimids[0], "dim_new1");

  status = nc_def_dim(id, "dim2", 20, &dimids[1]);
  nc_def_var(id, "dim2", NC_FLOAT, 1, &dimids[1], &v3);
  nc_def_var(id, "var2", NC_FLOAT, 2, dimids,    &v4);

  nc_close(id);
}

@gsjaardema
Copy link
Contributor

Note that we probably need the same restructure on the NC3_rename_dim code. Not sure why it didn't trip the assert also.

@gsjaardema
Copy link
Contributor

diff --git a/libsrc/dim.c b/libsrc/dim.c
index ef7dfb3..7d904ee 100644
--- a/libsrc/dim.c
+++ b/libsrc/dim.c
@@ -475,10 +475,12 @@ NC3_rename_dim( int ncid, int dimid, const char *unewname)
                free(newname);
                if(newStr == NULL)
                        return NC_ENOMEM;
-               dimp->name = newStr;

                /* Remove old name from hashmap; add new... */
                NC_hashmapRemoveDim(&ncp->dims, old->cp);
+
+               dimp->name = newStr;
+
                NC_hashmapAddDim(&ncp->dims, dimid, newStr->cp);
                free_NC_string(old);

@@ -487,13 +489,15 @@ NC3_rename_dim( int ncid, int dimid, const char *unewname)

        /* else, not in define mode */

+
+       /* Remove old name from hashmap; add new... */
+       NC_hashmapRemoveDim(&ncp->dims, old->cp);
+
        status = set_NC_string(dimp->name, newname);
        free(newname);
        if(status != NC_NOERR)
                return status;

-       /* Remove old name from hashmap; add new... */
-       NC_hashmapRemoveDim(&ncp->dims, old->cp);
        NC_hashmapAddDim(&ncp->dims, dimid, dimp->name->cp);

@WardF
Copy link
Member

WardF commented Jul 7, 2016

Thanks for the bug report, @opoplawski and thanks for the C test and patches, @gsjaardema . I'm catching up on other things now post 4.4.1 release, but will be incorporating the test into our set of tests ASAP. I'll also look at adding gdl to our docker-based regression tests, along with things like the python interface and NCO.

@WardF WardF self-assigned this Jul 7, 2016
@opoplawski
Copy link
Contributor Author

gdl tests pass with this patch. Thanks.

@WardF
Copy link
Member

WardF commented Jul 13, 2016

The fix has been tested and merged. Thanks!

@WardF WardF closed this as completed Jul 13, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants