
rasterize() loses precision on large integer burn values (float64 cast) #3056

Description

@brendancol

Describe the bug

rasterize() casts burn values to float64 before rasterization, so large integer burn values lose precision even when dtype is an integer type. The internal raster and the merge contract are float64 throughout: internal props are float64 (xrspatial/rasterize.py near line 102), GeoDataFrame columns are cast with .astype(np.float64) (near lines 2773 and 2783), and (geometry, value) pairs go through float() (near line 2797). float64 has a 53-bit significand, so not every integer with magnitude above 2**53 is exactly representable; the cast rounds those that are not to the nearest representable value, without warning.

This hits identifiers that often get burned into rasters: zone IDs, parcel IDs, category IDs, and uint64 identifiers.
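The rounding can be seen in isolation, without xrspatial, by round-tripping integers through float64. This is a minimal sketch; the helper name `survives_float64` is made up for the illustration and is not part of any library.

```python
import numpy as np


def survives_float64(value: int) -> bool:
    """Return True if an integer is unchanged by a round trip through float64.

    Hypothetical helper for illustration only.
    """
    return int(float(value)) == value


limit = 2**53  # float64 has a 53-bit significand

print(survives_float64(limit - 1))  # True
print(survives_float64(limit))      # True
print(survives_float64(limit + 1))  # False: rounds to 2**53
print(survives_float64(limit + 2))  # True: even, so still representable

# The same rounding happens with the NumPy cast used on GeoDataFrame columns.
ids = np.array([limit + 1], dtype=np.int64)
roundtrip = ids.astype(np.float64).astype(np.int64)
print(int(roundtrip[0]))  # 9007199254740992
```

Above 2**53 only some integers survive (here the even ones), so a fixed threshold check would reject values that are in fact safe; a round-trip comparison is more precise.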

Reproduction

import numpy as np
from shapely.geometry import box
from xrspatial.rasterize import rasterize

v = 2**53 + 1  # 9007199254740993
r = rasterize([(box(0, 0, 5, 5), v)], width=4, height=4, fill=0, dtype=np.int64)
print(int(r.values[0, 0]))  # 9007199254740992 -- off by one

Expected behavior

When the output dtype is an integer type, a burn value that cannot be represented exactly as float64 should not be silently corrupted. Reject it with a clear ValueError that names the offending value, the same way the existing guard rejects a NaN fill against an integer dtype.
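One possible shape for such a guard is sketched below. The function name `check_burn_value`, its signature, and the error message are assumptions made for illustration; they are not taken from xrspatial, and the real fix may validate values at a different point in the pipeline.

```python
import numpy as np


def check_burn_value(value, dtype):
    """Raise ValueError if an integer burn value would change under a float64 cast.

    Hypothetical helper, not part of xrspatial. Only applies when the
    requested output dtype is an integer type.
    """
    if not np.issubdtype(np.dtype(dtype), np.integer):
        return  # float output: rounding is the expected behavior
    if isinstance(value, (bool, np.bool_)):
        return  # booleans are always exact
    if not isinstance(value, (int, np.integer)):
        return  # non-integer inputs are out of scope for this sketch

    as_int = int(value)
    try:
        exact = int(float(as_int)) == as_int
    except OverflowError:
        exact = False  # too large for float64 at all
    if not exact:
        raise ValueError(
            f"burn value {as_int} cannot be represented exactly as float64 "
            f"and would be rounded before being written as {np.dtype(dtype)}"
        )


check_burn_value(2**53, np.int64)          # passes silently
check_burn_value(2**53 + 1, np.float64)    # passes: float output requested
try:
    check_burn_value(2**53 + 1, np.int64)
except ValueError as exc:
    print(exc)
```

The check compares the round-tripped value instead of testing against 2**53 directly, so representable large integers such as 2**53 + 2 are still accepted.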

Additional context

The internal pipeline is float64 end to end, so preserving integer props throughout would be a large change. Rejecting unrepresentable integer values when the output dtype is integer is the smaller fix, and it follows the precedent of the NaN-fill-vs-integer-dtype guard already in the function.

Labels: bug
