New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return NaN coordinate immediately when receiving NaN input #949

Merged
merged 7 commits into from May 24, 2018

Conversation

Projects
None yet
2 participants
@kbevers
Member

kbevers commented Apr 23, 2018

Ideally NaNs shouldn't get special treatment but since double to int
conversion is undefined behavior we are forced to treat NaNs as
something special. Therefor we return a NaN coordinate when ever we
encounter a NaN in one of the dimensions of an input coordinate. The
error level is not changed since this all math operations on NaNs should
return a NaN.

This policy of returning NaNs immediately should also eliminate a bunch
of potential double to int conversion bugs.

Alternative version of #943

@kbevers kbevers force-pushed the kbevers:return-nans-quickly branch from dd1b2be to a111d43 Apr 23, 2018

@kbevers kbevers force-pushed the kbevers:return-nans-quickly branch 2 times, most recently from f760ded to ff89a5e Apr 24, 2018

@cffk

This comment has been minimized.

Contributor

cffk commented Apr 24, 2018

C99 has the function, lround, which rounds a double to a long returning
an "implementation-defined value" (LONG_MIN on my system) if the double
does not fit in a long. Thus if we just replaced

int(floor(double))

with

lround(floor(double))

we would no longer have to include special tests for NaN. Of course for
non-C99 systems we would need to define lround

long lround(double x) {
  x = x < 0 ? ceil(x - 0.5) : floor(x + 0.5);
  return fabs(x) < -(double)LONG_MIN ? (long)x : LONG_MIN;
}

@schwehr Could you verify that lround(NaN) is safe (doesn't cause the
program to abort)?

@kbevers

This comment has been minimized.

Member

kbevers commented Apr 24, 2018

You mean (int)floor(double), right? We would have to find all occurrences of that to be on the safe side. I can easily grep a handful of those but I don't find all of them, for instance these lines from PJ_isea.c (that sparked the whole issue in the first place):

    rx = floor(x + 0.5);
    ix = (int)rx;
    ry = floor(y + 0.5);
    iy = (int)ry;
    rz = floor(z + 0.5);
    iz = (int)rz;

I am good with the lround solution granted that we can create a safe non-C99 version of it.

@cffk

This comment has been minimized.

Contributor

cffk commented Apr 24, 2018

Yes, I meant (inf)floor(double). (I was thinking C++.) I think that double to int conversions are only needed in a handful of situations: computing a UTM zone, getting the right face of an icosahedron...

@kbevers

This comment has been minimized.

Member

kbevers commented Apr 24, 2018

I get these cases from grepping a bit:

λ grep "(int)[[:space:]]*floor" *.c
PJ_robin.c:    i = isnan(lp.phi) ? -1 : (int)floor(dphi * C1);
PJ_robin.c:        i = isnan(lp.phi) ? -1 : (int)floor(lp.phi * NODES);
PJ_unitconvert.c:    year = (int)floor(decimalyear);
PJ_unitconvert.c:    int year  = (int)floor( yyyymmdd / 10000 );
PJ_unitconvert.c:    int month = (int)floor((yyyymmdd - year*10000) / 100);
PJ_unitconvert.c:    int day   = (int)floor( yyyymmdd - year*10000 - month*100);
nad_intr.c:     indx.lam = isnan(t.lam) ? 0 : (int)floor(t.lam);
nad_intr.c:     indx.phi = isnan(t.phi) ? 0 : (int)floor(t.phi);
pj_apply_vgridshift.c:        grid_ix = (int) floor(grid_x);
pj_apply_vgridshift.c:        grid_iy = (int) floor(grid_y);

and then the rest of the (int) typecasts:

λ grep -v "(int)[[:space:]]*floor" *.c | grep (int)
PJ_healpix.c:        return pnpoly((int)sizeof(healpixVertsJit)/
PJ_healpix.c:        return pnpoly((int)sizeof(rhealpixVertsJit)/
PJ_horner.c:    int n = (int)horner_number_of_coefficients(order);
PJ_horner.c:        n = 2*(int)order + 2;
PJ_horner.c:    h->order = (int)order;
PJ_isea.c:    ix = (int)rx;
PJ_isea.c:    iy = (int)ry;
PJ_isea.c:    iz = (int)rz;
PJ_isea.c:     * (int)sidelength * 2 + 1 might be just as good
PJ_isea.c:    maxcoord = (int) (sidelength * 2.0 + 0.5);
PJ_isea.c:        sidelength = (int) (pow(g->aperture, g->resolution / 2.0) + 0.5);
PJ_isea.c:    hexes = (int) (pow(g->aperture, g->resolution) + 0.5);
PJ_isea.c:        height = (int) (pow(g->aperture, (g->resolution - 1) / 2.0));
PJ_isea.c:        sn = ((int) di->x) * height;
PJ_isea.c:        sn += ((int) di->y) / height;
PJ_isea.c:        sidelength = (int) (pow(g->aperture, g->resolution / 2.0) + 0.5);
PJ_isea.c:        sn = (int) ((quad - 1) * hexes + sidelength * di->x + di->y + 2);
PJ_isea.c:    hex->x = ((int)v.x << 4) + quad;
PJ_isea.c:    d = (int)v.x;
PJ_isea.c:    i = (int)v.y;
PJ_isea.c:        int offset = (int)(pow(3.0, g->resolution - 1) + 0.5);
PJ_isea.c:    sidelength = (int) (pow(g->aperture, g->resolution / 2.0) + 0.5);
PJ_pipeline.c:    P->opaque->steps = (int)steps;
PJ_pipeline.c:    argc = (int)argc_params (P->params);
PJ_unitconvert.c:    day = (int)(mjd - mjd_iter + 1);
cct.c:            print (PJ_LOG_ERROR, "Read error in record %d\n", (int) o->record_index);
cct.c:            print (PJ_LOG_NONE, "# Record %d UNREADABLE: %s", (int) o->record_index, buf);
cct.c:                   (int) o->record_index, buf, pj_strerrno (proj_errno(P)));
geod_set.c:                     n_S = (int)(geod_S / del_S + .5);
geodesic.c:  q = r == r ? (int)(floor(r / 90 + (real)(0.5))) : 0;
gie.c:    ntrips = (int) (endp==args? 100: fabs(ans));
gie.c:        fprintf (T.fout, "     FAILURE in %s(%d):\n", opt_strip_path (T.curr_file), (int) F->lineno);
gie.c:    fprintf (T.fout, "     FAILURE in %s(%d):\n", opt_strip_path (T.curr_file), (int) F->lineno);
gie.c:        fprintf (T.fout, "     FAILURE in %s(%d):\n     Too few args: %s\n", opt_strip_path (T.curr_file), (int) F->lineno, args);
gie.c:    fprintf (T.fout, "     FAILURE in %s(%d):\n",    opt_strip_path (T.curr_file), (int) F->lineno);
gie.c:            delim, (int) T.operation_lineno, pj_strerrno(proj_errno(T.P)),
gie.c:            delim, (int) T.operation_lineno
gie.c:    int i = (int) proj_atof (args);
gie.c:    ret = (int) pj_atof (err_const);
gie.c:    if (0==fgets (G->next_args, (int) G->next_args_size - 1, G->f))
gie.c:        if (0==fgets (G->next_args, (int) G->next_args_size - 1, G->f))
mk_cheby.c:    if (!(ncu = (int *)vector1(nu + nv, sizeof(int)))) {
nad_init.c:    for( id_end = (int)strlen(ct->id)-1; id_end > 0; id_end-- )
nad_init.c:        swap_words( ct->cvs, 4, (int)a_size * 2 );
nad_init.c:    for( id_end = (int)strlen(ct->id)-1; id_end > 0; id_end-- )
pj_gridinfo.c:                (int)header_size );
pj_malloc.c:    p->baz = pj_calloc (10, sizeof(int));
pj_open_lib.c:            fname[n = (int)strlen(fname)] = DIR_CHAR;
pj_open_lib.c:        fname[n = (int)strlen(fname)] = DIR_CHAR;
pj_param.c:    l = (int) strlen(opt);
pj_pr_list.c:                   l = (int)strlen(t->param) + 1;
pj_pr_list.c:        l = (int)strlen(t->param) + 1;
proj_4D_api.c:    P = pj_init_ctx (ctx, (int) argc, argv);
proj_etmerc.c:        zone = (int)(floor ((adjlon (P->lam0) + M_PI) * 30. / M_PI));
proj_strtod.c:    printf ("res = %20.15g. Rest = [%s],  errno = %d\n", res, endptr, (int) errno);
rtodms.c:       min = (int)fmod(r, 60.);
rtodms.c:       deg = (int)r;

It is not a huge task to go through each of them and change them to behave as we want them to. I can do that one of the coming days.

@cffk

This comment has been minimized.

Contributor

cffk commented Apr 24, 2018

You will note that several of these (int) casts are actually rounds. Thus in geodesic.c

q = r == r ? (int)(floor(r / 90 + (real)(0.5))) : 0

becomes the simpler

q = lround(r / 90);

You still need to worry about how ties are handled (no problem in geodesic.c, however). And in this case, I don't need to worry about what arbitrary value gets returned with a NaN.

@cffk

This comment has been minimized.

Contributor

cffk commented Apr 24, 2018

Here's my implementation of lround...

#include <math.h>
#include <limits.h>

double roundx(double x) {
  /* The handling of corner cases is copied from boost; see
   *   https://github.com/boostorg/math/pull/8
   * with improvements to return -0.0 when appropriate */
  double t;
  if (x == 0)
    return x;               /* Retain sign of 0 */
  else if (0 < x && x <  0.5)
    return +0.0;
  else if (0 > x && x > -0.5)
    return -0.0;
  else if (x > 0) {
    t = ceil(x);
    return 0.5 < t - v ? t - 1 : t;
  } else {                  /* Includes NaN */
    t = floor(x);
    return 0.5 < x - t ? t + 1 : t;
  }
}

long lroundx(double x) {
  /* Default value for overflow + NaN + (x == LONG_MIN) */
  long r = LONG_MIN;
  x = roundx(x);
  if (fabs(x) < -(double)LONG_MIN) /* Assume (double)LONG_MIN is exact */
    r = (int)x;
  return r;
}

@kbevers kbevers force-pushed the kbevers:return-nans-quickly branch 12 times, most recently from e3df7a1 to d9f0d6c May 6, 2018

@kbevers kbevers force-pushed the kbevers:return-nans-quickly branch from d9f0d6c to 58cbb9f May 8, 2018

@kbevers kbevers force-pushed the kbevers:return-nans-quickly branch 2 times, most recently from 7b7804b to dc1802e May 23, 2018

@kbevers kbevers force-pushed the kbevers:return-nans-quickly branch from dc1802e to 6db260f May 23, 2018

@kbevers

This comment has been minimized.

Member

kbevers commented May 23, 2018

@cffk I finally got round to fixing the build errors in this. I plan on merging this tomorrow so that it can be included in PROJ 5.1.0RC1 (which I intend to release tomorrow afternoon CEST). If you have the time, could you give this a quick look and check if everything is sane?

There's some stupid things regarding pj_isnan because it is used by gie and hence need to be exported in the Windows DLL. That makes it necessary to declare the function at all times and not only when C99 math is unavailable.

@cffk

This comment has been minimized.

Contributor

cffk commented May 23, 2018

Replacing (int)floor(x) by lround(floor(x)) is fine. But there are cases, e.g., maxcoord = (int) (sidelength * 2.0 + 0.5); where the use of lround changes from truncation to rounding. If you know the quantity is non-negative, you can fix by throwing in a call to floor.

@kbevers

This comment has been minimized.

Member

kbevers commented May 24, 2018

With http://c-faq.com/fp/round.html in mind, I think the cases in PJ_isea can handled by removing the + 0.5 and using lround, so that

int offset = (int)(pow(3.0, g->resolution - 1) + 0.5)

becomes

long offset = lround(pow(3.0, g->resolution - 1))

The original intension must have been to round the numbers to the nearest integer value.

@cffk

This comment has been minimized.

Contributor

cffk commented May 24, 2018

Indeed.. The problem is now how ties are treated: lround(x) rounds away from zero, floor(x+0.5) rounds up. These are the same if x >= 0. So the exercise is to check each case. (And perhaps in some cases, it's immaterial how ties are treated.)

Handle double to int typecasts in ISEA better
Originally the code was doing double to int conversions like

    y = (int)(x + 0.5)

which results in rounding when typecasting. In an earlier attempt to
avoid buffer overflows in integer typecasts this was changed to

    y = lround(x + 0.5)

which doesn't give the origial results. We fix that here by instead
doing

    y = lround(x)

It is safe to so as long as x is positive.
@kbevers

This comment has been minimized.

Member

kbevers commented May 24, 2018

I have checked all of the cases and I believe they are all cases of positive numbers, so this should be safe.

@cffk

This comment has been minimized.

Contributor

cffk commented May 24, 2018

OK, looks good.

@kbevers kbevers added this to the 5.1.0 milestone May 24, 2018

@kbevers

This comment has been minimized.

Member

kbevers commented May 24, 2018

OK, looks good.

Awesome. Thanks for supervising this!

@kbevers kbevers merged commit a45ad8c into OSGeo:master May 24, 2018

3 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.02%) to 77.585%
Details

@kbevers kbevers deleted the kbevers:return-nans-quickly branch Jul 9, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment