Merged
1 change: 1 addition & 0 deletions .claude/sweep-api-consistency-state.csv
@@ -1,3 +1,4 @@
module,last_inspected,issue,severity_max,categories_found,notes
geotiff,2026-05-18,2106,MEDIUM,3,"Sweep 2026-05-18 (deep-sweep-api-consistency-geotiff-2026-05-18-1779164255). 1 MEDIUM Cat 3 finding fixed in this branch: open_geotiff(max_cloud_bytes=...) was the only kwarg on the public reader/writer surface without a Python type annotation. Docstring already declared ``int or None``; the surface and the docs disagreed. Fix adds ``int | None`` to the annotation; default stays the module-internal _MAX_CLOUD_BYTES_SENTINEL. Regression test in test_open_geotiff_max_cloud_bytes_annot_2106.py pins the immediate gap and parametrises over every public reader/writer to catch future ungenerated annotations. Prior sweep findings (#1922/#1935 kwarg ordering, #2052 mask_nodata parity, #2097 GPU MinIsWhite, #2095 zero-band 3D writes, #1946 write_vrt path/vrt_path shim) all confirmed fixed. Cross-sibling return-type drift (Cat 2): write_vrt returns str while to_geotiff and write_geotiff_gpu return path which is str | BinaryIO -- inspected and still LOW (callers do not substitute writers; the return-type drift is documented in each writer's docstring). Cross-cutting cross-module drift (chunk_size in reproject vs chunks in geotiff; target_crs vs crs) documented but not filed per sweep template (cross-cutting). cuda-validated."
polygonize,2026-05-19,2148,HIGH,1;3,"Sweep 2026-05-19 (deep-sweep-api-consistency-polygonize-2026-05-19). 1 MEDIUM Cat 3 finding fixed in this branch (#2148): polygonize() was the only public vector/raster conversion function without a return type annotation. Sieve/contours/rasterize/clip_polygon all declare one. Fix adds a Union return annotation (numpy tuple | awkward tuple | geopandas GeoDataFrame | spatialpandas GeoDataFrame | geojson dict) using TYPE_CHECKING forward refs for optional deps, and expands the docstring Returns section to enumerate the per-return_type shapes. 1 HIGH Cat 1 finding NOT fixed in this PR -- cross-module rename: polygonize uses `connectivity` (int 4|8) while sieve uses `neighborhood` (int 4|8) for the identical rook/queen pixel-connectivity concept. Industry convention (GDAL, rasterio.features.sieve) favours `connectivity`; the deprecation shim belongs in sieve.py, not polygonize, so this is out of scope for the polygonize-scoped sweep branch. Documented here for the next sieve sweep pass. 1 LOW Cat 1 cross-cutting: polygonize/sieve/clip_polygon use `raster` while contours and many older modules use `agg` for the input DataArray -- library-wide drift, not filed per-module per sweep template. Cat 2 return-shape: polygonize returns tuple/GeoDataFrame/dict by return_type; consistent with contours' tuple/GeoDataFrame dispatch. No Cat 4 (no mutable defaults; connectivity=4 default matches sieve neighborhood=4 default). No Cat 5 (polygonize re-exported in xrspatial/__init__.py; no orphan API; no __all__ but consistent with module convention). cuda-validated: cupy backend accepts identical kwargs, smoke-tested with cupy DataArray on host with CUDA_AVAILABLE."
reproject,2026-05-10,1570,HIGH,2;5,"Filed cross-module attrs['vertical_crs'] type collision (string vs EPSG int) vs xrspatial.geotiff. Fixed in PR (TBD): reproject now writes EPSG int and preserves friendly token under vertical_datum. MEDIUM kwarg-order drift (transform_precision vs chunk_size) and missing type hints vs geotiff documented but not fixed (cosmetic, kwarg-only)."
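The geotiff entry above mentions a regression test that parametrises over every public reader/writer to catch missing annotations. A minimal sketch of that kind of check, using only the standard-library `inspect` module, might look like the following. `reader_ok` and `reader_gap` are hypothetical stand-ins rather than xrspatial functions, and this is not the content of the actual test file.

```python
import inspect
from typing import Callable, List, Optional


def unannotated_params(func: Callable) -> List[str]:
    """Return the names of parameters that carry no type annotation."""
    sig = inspect.signature(func)
    return [
        name
        for name, param in sig.parameters.items()
        if param.annotation is inspect.Parameter.empty
    ]


def missing_return_annotation(func: Callable) -> bool:
    """Return True when the function declares no return annotation."""
    return inspect.signature(func).return_annotation is inspect.Signature.empty


# Hypothetical stand-ins for a public reader surface (assumption, not real API).
def reader_ok(source: str, max_cloud_bytes: Optional[int] = None) -> bytes:
    return b""


def reader_gap(source: str, max_cloud_bytes=None):
    return b""
```

In a real suite the two helpers would typically be driven by `pytest.mark.parametrize` over the list of public functions, so that a newly added kwarg without an annotation fails the test.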
27 changes: 24 additions & 3 deletions xrspatial/polygonize.py
@@ -27,7 +27,12 @@
# x and y coordinates are monotonically increasing or decreasing.

from enum import Enum
-from typing import Dict, List, Optional, Tuple, Union
+from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple, Union
+
+if TYPE_CHECKING:
+    import awkward as ak
+    import geopandas as gpd
+    import spatialpandas

import numba as nb
import numpy as np
@@ -1550,7 +1555,13 @@ def polygonize(
return_type: str = "numpy",
simplify_tolerance: Optional[float] = None,
simplify_method: str = "douglas-peucker",
-):
+) -> Union[
+    Tuple[List[Union[int, float]], List[List[np.ndarray]]],
+    Tuple[List[Union[int, float]], "ak.Array"],
+    "gpd.GeoDataFrame",
+    "spatialpandas.GeoDataFrame",
+    Dict[str, Any],
+]:
"""
Polygonize creates vector polygons for connected regions of pixels in a
raster that share the same pixel value. It is a raster to vector
@@ -1610,7 +1621,17 @@ def polygonize(
Returns
-------
Polygons and their corresponding values in a format determined by
-return_type.
+``return_type``:
+
+- ``"numpy"`` (default): ``(column, polygon_points)`` where ``column``
+  is a list of pixel values and ``polygon_points`` is a list of polygons,
+  each polygon a list of ``Nx2`` ``np.ndarray`` rings (exterior first,
+  then holes).
+- ``"awkward"``: ``(column, ak.Array)`` of polygon coordinates.
+- ``"geopandas"``: ``geopandas.GeoDataFrame`` with ``column_name`` and
+  ``geometry`` columns.
+- ``"spatialpandas"``: ``spatialpandas.GeoDataFrame``.
+- ``"geojson"``: ``dict`` representing a GeoJSON ``FeatureCollection``.

Notes
-----
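The annotation change in this diff relies on `TYPE_CHECKING` imports plus quoted forward references, so that optional dependencies are named in the signature without being imported at runtime. A minimal sketch of the pattern is shown below. `to_features` is a hypothetical function invented for illustration, not the `polygonize` implementation, and its return shapes are simplified.

```python
from typing import TYPE_CHECKING, Any, Dict, List, Tuple, Union

if TYPE_CHECKING:
    # Seen only by static type checkers; never executed at runtime,
    # so geopandas stays an optional dependency.
    import geopandas as gpd


def to_features(
    values: List[int],
    return_type: str = "numpy",
) -> Union[Tuple[List[int], List[str]], "gpd.GeoDataFrame", Dict[str, Any]]:
    """Hypothetical converter that dispatches on ``return_type``."""
    if return_type == "numpy":
        labels = [f"region-{v}" for v in values]
        return values, labels
    if return_type == "geojson":
        features = [
            {"type": "Feature", "properties": {"value": v}, "geometry": None}
            for v in values
        ]
        return {"type": "FeatureCollection", "features": features}
    if return_type == "geopandas":
        # Lazy import: the optional dependency is needed only on this path.
        import geopandas as gpd

        return gpd.GeoDataFrame({"value": values})
    raise ValueError(f"unsupported return_type: {return_type!r}")
```

The quoted `"gpd.GeoDataFrame"` is stored as a forward reference, so defining and calling the function on the `"numpy"` or `"geojson"` paths should not require geopandas to be installed.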