polygonize: GeoDataFrame output drops input attrs['transform'] while keeping attrs['crs'] #2536

@brendancol

Describe the bug

polygonize(raster, return_type="geopandas") reads raster.attrs['crs'] and puts it on the returned GeoDataFrame, but never reads raster.attrs['transform']. The geometry coordinates stay in raw pixel space (0,0)-(nx,ny). The GeoDataFrame ends up with a CRS that claims projected space while the geometries are not in that space.

This is worse than the missing-CRS bug fixed in #2149. There the GeoDataFrame came back with crs=None, which a downstream caller can detect. Here the metadata says "EPSG:3857" and lies about the data, so spatial joins, file writes, and reprojections will silently misalign: every geometry is off by the raster's origin offset and pixel scale.

The xrspatial geotiff reader stores a rasterio-ordered 6-tuple in attrs['transform'] whenever has_georef=True (see xrspatial/geotiff/_attrs.py lines 782-783). The format is (pixel_width, 0.0, origin_x, 0.0, pixel_height, origin_y), per xrspatial/geotiff/_coords.py:218-234. That tuple is exactly what polygonize's _transform_points already consumes, so the auto-detection path is short.
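For reference, here is a minimal sketch of how a rasterio-ordered 6-tuple maps pixel coordinates to world coordinates. The helper names apply_transform and transform_bounds are hypothetical and are not xrspatial functions; the arithmetic is the standard affine form x' = a*x + b*y + c, y' = d*x + e*y + f.

```python
def apply_transform(transform, x, y):
    """Map one pixel-space point to world space with a rasterio-ordered tuple."""
    a, b, c, d, e, f = transform
    return (a * x + b * y + c, d * x + e * y + f)


def transform_bounds(transform, minx, miny, maxx, maxy):
    """Transform a pixel-space bounding box and return world-space bounds.

    All four corners are transformed because a negative pixel height
    flips the y axis, so pixel miny becomes world maxy.
    """
    corners = [
        apply_transform(transform, x, y)
        for x in (minx, maxx)
        for y in (miny, maxy)
    ]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# Transform from the reproduction in this issue.
transform = (10.0, 0.0, 1_000_000.0, 0.0, -10.0, 5_000_000.0)

# Pixel-space bounds (0, 0, 2, 2) of the first polygon.
print(transform_bounds(transform, 0.0, 0.0, 2.0, 2.0))
# (1000000.0, 4999980.0, 1000020.0, 5000000.0)
```
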

Expected behavior

When the caller did not pass an explicit transform= argument and the raster carries attrs['transform'] (or exposes rio.transform() through rioxarray), polygonize should auto-detect and apply that transform, just like it already does for CRS. An explicit transform= argument always wins over the attr (same precedence rule the geotiff writer uses).

Reproduction

import numpy as np
import xarray as xr
from xrspatial.polygonize import polygonize

raster = xr.DataArray(
    np.array([[1, 1, 2], [1, 1, 2], [2, 2, 2]], dtype=np.int32),
    dims=('y', 'x'),
    attrs={
        'crs': 'EPSG:3857',
        'transform': (10.0, 0.0, 1_000_000.0, 0.0, -10.0, 5_000_000.0),
    },
)
gdf = polygonize(raster, return_type='geopandas')
print(gdf.crs)             # EPSG:3857
print(gdf.geometry.bounds) # minx/miny in 0..3 pixel range, not near 1e6, 5e6

Observed:

   minx  miny  maxx  maxy
0   0.0   0.0   2.0   2.0
1   0.0   0.0   3.0   3.0

Expected (with the transform applied):

        minx       miny       maxx       maxy
0  1000000.0  4999980.0  1000020.0  5000000.0
1  1000000.0  4999970.0  1000030.0  5000000.0
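Until this is fixed, passing transform= explicitly should sidestep the problem. For a GeoDataFrame that has already come back in pixel space, one possible repair is geopandas' GeoSeries.affine_transform, which takes the shapely coefficient order [a, b, d, e, xoff, yoff] rather than the rasterio order used in attrs['transform']. The sketch below only shows the reordering; to_shapely_matrix is a hypothetical helper name, and the geopandas call is left as a comment so the snippet runs without geopandas installed.

```python
def to_shapely_matrix(transform):
    """Reorder a rasterio-style (a, b, c, d, e, f) tuple into the
    shapely-style [a, b, d, e, xoff, yoff] list."""
    a, b, c, d, e, f = transform
    return [a, b, d, e, c, f]


transform = (10.0, 0.0, 1_000_000.0, 0.0, -10.0, 5_000_000.0)
matrix = to_shapely_matrix(transform)
print(matrix)
# [10.0, 0.0, 0.0, -10.0, 1000000.0, 5000000.0]

# Assumed usage on the pixel-space output from the reproduction:
# gdf = gdf.set_geometry(gdf.geometry.affine_transform(matrix))
```
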

Additional context

Resolution order for the proposed auto-detect helper, parallel to _detect_raster_crs:

  1. raster.attrs['transform'] (xrspatial.geotiff convention, 6-tuple)
  2. raster.rio.transform() if rioxarray is installed (convert from Affine to the rasterio-ordered tuple polygonize expects)
  3. Fall back to None (current behaviour)
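A rough sketch of that resolution order, including the "explicit argument wins" rule from the expected behavior, might look like the following. The name detect_raster_transform is a placeholder rather than an existing xrspatial function, and the stand-in raster objects are built from SimpleNamespace so the snippet runs without xarray or rioxarray. It assumes the object returned by rio.transform() unpacks to (a, b, c, d, e, f, ...) the way affine.Affine does.

```python
from types import SimpleNamespace


def detect_raster_transform(raster, transform=None):
    """Return a rasterio-ordered 6-tuple, or None if nothing is found."""
    # An explicit transform= argument always wins.
    if transform is not None:
        return tuple(float(v) for v in transform)[:6]

    # 1. attrs['transform'] (xrspatial.geotiff convention).
    attr = getattr(raster, "attrs", {}).get("transform")
    if attr is not None:
        values = tuple(float(v) for v in attr)
        if len(values) >= 6:
            return values[:6]

    # 2. The rio accessor, present only when rioxarray has been imported.
    rio = getattr(raster, "rio", None)
    if rio is not None:
        # Assumption: the result iterates as (a, b, c, d, e, f, ...).
        return tuple(float(v) for v in tuple(rio.transform())[:6])

    # 3. Nothing found: keep the current pixel-space behaviour.
    return None


# Stand-in objects for illustration only (not real DataArrays).
with_attr = SimpleNamespace(
    attrs={"transform": (10.0, 0.0, 1_000_000.0, 0.0, -10.0, 5_000_000.0)}
)
with_rio = SimpleNamespace(
    attrs={},
    rio=SimpleNamespace(
        transform=lambda: (30.0, 0.0, 500.0, 0.0, -30.0, 900.0, 0.0, 0.0, 1.0)
    ),
)
bare = SimpleNamespace(attrs={})

print(detect_raster_transform(with_attr))
print(detect_raster_transform(with_rio))
print(detect_raster_transform(bare))
print(detect_raster_transform(with_attr, transform=(1, 0, 0, 0, 1, 0)))
```

Normalising every source to a plain tuple of floats keeps the downstream code path identical regardless of where the transform came from.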

Found during the polygonize metadata-propagation sweep on 2026-05-27 (Cat 1 attrs preservation: the function reads one attr and drops the sibling attr needed to make the first one meaningful).

Labels: api, bug, conversion tools