Skip to content

Epic: Library-wide feature tiers and production-readiness contract #2415

@brendancol

Description

@brendancol

Goal

Adopt the GeoTIFF feature-tier strategy across the full xarray-spatial
library so users can distinguish what is production-ready from what is useful
but still maturing.

The library now advertises 150+ functions across raster I/O, reprojection,
surface analysis, hydrology, flood, fire, interpolation, raster/vector
conversion, classification, focal operations, proximity, MCDA, and utility
layers. The README already documents backend availability (NumPy, Dask,
CuPy, and Dask+CuPy) and sometimes signals individual experimental tools.
It does not yet communicate one consistent answer to the more important
question: what behavior can a production user rely on?

GeoTIFF provides a working precedent:

  • tier strings with clear meanings;
  • a runtime source of truth (xrspatial.geotiff.SUPPORTED_FEATURES);
  • documentation that mirrors the runtime mapping;
  • narrow stable contracts backed by regression/release-gate tests;
  • explicit opt-in for particularly risky experimental or internal paths;
  • explicit unsupported combinations rather than overclaiming.

This epic generalizes that approach for every public tool and meaningful
subfeature without claiming that all existing functions or all their backends
are stable on day one.

Product Outcome

After this work, a user evaluating a workflow such as Dask hydrology, GPU
terrain generation, rasterization, kriging, or MCDA should be able to answer:

  • Is the tool stable, advanced, experimental, internal, or
    unsupported?
  • Does that tier apply to the core NumPy behavior only, or also to Dask/GPU
    implementations?
  • Are important modes, algorithms, options, data shapes, or integrations at a
    different tier from the tool's default path?
  • What correctness, parity, scalability, and interoperability behavior has
    actually been tested?
  • What must happen before an advanced or experimental path can be promoted?
  • Will a breaking change be deprecated, release-noted, or allowed to change
    while the feature matures?

The result is a user-facing production-readiness contract, not a marketing
badge or a claim that experimental tools are low quality.

Scope

In scope

  • Every public callable exported from xrspatial, public subpackages
    (hydro, mcda, interpolate, reproject, geotiff, and
    experimental), and documented .xrs accessor methods.
  • Every user-visible tool in the README function catalog, including functions
    which are documented but are not currently top-level exports.
  • Material subfeatures whose maturity can differ from their parent callable:
    algorithms/methods, input/output modes, distributed or GPU backends,
    remote/external integrations, metadata behavior, optional-dependency paths,
    and compatibility aliases/wrappers.
  • Documentation, API representation, test evidence, promotion/demotion rules,
    release-process integration, and contribution policy.

Out of scope

  • Rewriting implementations simply to make all tools stable.
  • Treating backend availability as proof of production readiness.
  • Tiering private helpers, internal kernels, or implementation-only utilities
    unless they define a public behavior contract.
  • Guaranteeing numerical identity between fundamentally different algorithms;
    the contract must state the relevant tolerance/behavior instead.
  • Replacing ordinary semantic-versioning and deprecation policy. Tiers make
    that policy legible; they do not remove it.

Tier Semantics

The library catalog should use five consistent status values:
stable, advanced, experimental, internal, and unsupported.
The first three describe adoptable public functionality at different levels
of maturity. internal and unsupported make boundaries visible so internal
implementation paths and deliberate non-features cannot be mistaken for
production promises.

stable

A supported production contract:

  • public behavior, accepted input domain, output semantics, and documented
    metadata/coordinate behavior are defined;
  • the promised backend(s) have focused regression tests in normal CI;
  • distributed/GPU parity or stated numerical tolerance is tested when those
    backends are part of the stable contract;
  • error behavior for unsafe or unsupported inputs is tested where material;
  • breaking changes require deprecation/release-note handling under project
    policy;
  • stable claims are checked during release readiness.

stable does not mean every optional mode or every backend of that callable
is stable. For example, tool.slope could be stable for CPU and Dask while
tool.slope.backend.cupy remains advanced until its parity evidence is
recorded.

advanced

A usable, documented feature for informed production adoption:

  • implementation exists and has meaningful tests;
  • supported input/mode limitations are documented;
  • users may deploy it when they can validate results and follow migration
    notes;
  • the surface, uncommon edge-case behavior, performance envelope, or
    cross-backend parity may not be fully pinned down;
  • breaking refinements before a stable promotion must be called out in
    release notes, but do not necessarily require the full stable guarantee.

experimental

An evaluation-stage feature:

  • it is visible so users can try it and provide feedback;
  • it has at least smoke/validation coverage appropriate to its current risk;
  • its behavior or API may change without a deprecation window;
  • production reliance is not recommended;
  • experimental status must be visible at the import/docs/README touchpoints,
    and an opt-in may be appropriate where accidental use creates a significant
    correctness, interoperability, cost, or safety risk.

internal

An internal path exists to support library implementation needs or one
deliberately narrow xrspatial use case, but is not an adoptable public
production contract:

  • it may be documented where users can encounter it, so its boundary is clear;
  • it must not be advertised as a normal public tool or as production-ready;
  • if it is callable through a public entry point, accidental use should be
    blocked by an explicit opt-in or an equally strong guard when misuse is
    consequential;
  • compatibility and external interoperability are not promised unless stated
    separately.

GeoTIFF JPEG-in-TIFF is the motivating existing example. At the library
policy level this should be called internal, not internal_only.

unsupported

An unsupported capability or combination is explicitly outside the
library's contract for the current release:

  • it is recorded when users could reasonably otherwise assume support from a
    broader tool or backend claim;
  • calling it should fail with an actionable error whenever returning a result
    would mislead users, lose information, or silently produce incorrect output;
  • a documentation-only unsupported entry is acceptable where there is no
    executable path to guard;
  • it cannot be presented as an adoptable tool until implemented, audited, and
    moved to an appropriate public tier.

Examples include meaningful method/backend combinations not implemented for a
tool, or GeoTIFF write combinations deliberately outside the current contract.

Classification Unit: Tool Plus Subfeatures

A single tier on a callable is insufficient when a function dispatches over
four backends or exposes algorithms with different confidence levels. Use a
two-level model:

  1. Core tool tier: the documented default or narrowest recommended
    production path for the callable, with its backend/input scope stated.
  2. Subfeature overrides: only for dimensions whose evidence or risks
    materially differ from that core contract.

Examples of subfeature dimensions to audit:

Dimension Representative examples
Backend numpy, dask, cupy, dask_cupy, CPU fallback behavior
Algorithm/method D8/D-infinity/MFD; resampling modes; kriging variograms; MCDA combination methods; classification methods
Execution mode lazy/chunked behavior, memory-bounded streaming, tiled processing, GPU strict/fallback mode
Input/output contract nodata/NaN semantics, dtype behavior, coordinates/CRS, multi-band/multidimensional input, vector geometry validity
Integration optional dependency, remote data/grid downloads, GeoTIFF/COG/VRT, DataArray/Dataset accessor surface
Risk-sensitive behavior stochastic seeding, approximate computation, overflow/precision, security/resource caps, external interoperability

Do not explode the registry into a row for every parameter value. Create an
override only when it changes what a production user may safely conclude.

Proposed Source Of Truth

Introduce a library-level machine-readable registry, provisionally:

from xrspatial import FEATURE_SUPPORT

FEATURE_SUPPORT["surface.slope"]
FEATURE_SUPPORT["surface.slope.backend.dask"]
FEATURE_SUPPORT["hydrology.flow_direction.mfd"]
FEATURE_SUPPORT["raster_vector.rasterize.backend.cupy"]

The implementation PR should settle the precise name/module and schema. The
schema must support more than a bare tier string because the GeoTIFF model is
being expanded across heterogeneous tools. Minimum fields:

Field Purpose
tier stable, advanced, experimental, internal, or unsupported
public_api Canonical import/callable or documented accessor path
category README/reference-doc grouping
scope Plain-language boundary of what the tier covers
subfeatures Backend/method/mode overrides, if any
evidence Test file(s), parity/golden-reference suite, benchmark/resource check, or tracked gap
limitations Known boundaries and unsupported combinations
owner_issue Audit/promotion/follow-up issue responsible for the classification

Requirements for the registry:

  • one canonical key naming convention with aliases pointing at canonical
    entries rather than duplicating tier decisions;
  • existing xrspatial.geotiff.SUPPORTED_FEATURES is either incorporated
    without breaking public consumers or bridged into the library registry;
  • the library-level vocabulary uses internal; the implementation plan must
    decide how GeoTIFF's existing runtime internal_only value migrates or is
    compatibility-aliased without silently breaking callers;
  • unsupported entries may describe deliberate combinations rather than
    callable implementations, but remain queryable/documentable through the
    same status model;
  • import-time footprint remains small: the registry cannot import optional
    GPU/geospatial dependencies or trigger compute/network work;
  • docs tables are rendered from or checked against this source of truth;
  • missing public APIs and unrecognized tier strings fail CI;
  • users can inspect status programmatically without parsing documentation.

Audit Rubric

Classification should be evidence-based and deliberately conservative. Every
tool audit records the following:

Area Questions
Public/API surface Is the documented import path intentional? Are aliases and unified wrappers clear? Is the signature/docstring coherent?
Correctness baseline Is behavior compared to a published algorithm, trusted implementation, hand-computed fixtures, invariants, or golden outputs?
Input contract Are dtype, NaN/nodata, coordinate/grid, dimensionality, invalid-input, and boundary behaviors specified and tested?
Backend contract Which README-advertised backends are native, fallback, untested, approximately equivalent, or parity-tested?
Dask behavior Is it actually lazy? Are chunk boundary results correct? Are graph/memory risks bounded for large rasters?
GPU behavior Does it have device-available CI/smoke coverage and defined tolerance versus CPU? Is fallback visible rather than silently changing expectations?
Numerical/algorithm modes Do mode-specific algorithms need separate tiers or tolerances?
Performance/resources Can ordinary production-sized inputs unexpectedly allocate unbounded memory, trigger downloads, or scale unexpectedly?
Safety/interoperability Are file/network/vector/metadata/optional-dependency interactions constrained and failure modes actionable?
Documentation Are tier, scope, limitations, examples, and production guidance visible where users encounter the tool?

Initial classification rules:

  • Do not infer stable from age, popularity, top-level export, or a green
    unit-test file.
  • Default an unaudited public tool to experimental or unclassified
    during migration; do not silently label it stable.
  • Mark a path internal only when it is intentionally not an adoptable
    public contract; mark a combination unsupported when users need an
    explicit negative answer rather than a maturity label.
  • A tool may become stable for its narrow NumPy path before the remaining
    advertised backends are promoted.
  • A CPU fallback on a GPU/Dask input is a documented execution mode, not
    native backend parity; classify it separately when it affects performance
    or scalability expectations.
  • For wrappers (for example unified hydrology entry points), the wrapper
    cannot have a stronger broad claim than the algorithms/backends it selects;
    its stable scope can explicitly identify the stable choices.

Workstreams By Library Area

The audit should be divided into owned, reviewable batches based on the
current README/reference layout:

Workstream Surfaces to classify
Existing precedent GeoTIFF/COG/VRT mapping and compatibility with the new library registry
Georeferencing and grid transforms Reproject, merge, resample; projection/datum/vertical-shift methods and backends
Terrain/surface/visibility Aspect, slope, hillshade, terrain metrics, viewshed/visibility, synthetic terrain/noise/erosion/bump
Hydrology and flood D8/D-infinity/MFD families, unified wrappers, HAND/TWI, flood and vegetation-driven flood tools
Raster/vector and spatial movement Rasterize, polygonize, contours, polygon clip, proximity, surface distance, cost/corridor, pathfinding
Neighborhood/image analysis Focal, bilateral, GLCM, edge detection, morphology, sieve, diffusion
Statistical/domain analytics Classification, multispectral, fire, KDE, multivariate, dasymetric, interpolation, MCDA
Cross-cutting public surfaces Utilities, accessors, diagnostics, dataset interactions, optional dependency and backend-fallback behavior

Each workstream should produce classification entries, supporting evidence
links, documented gaps, and focused follow-up issues for anything that needs
hardening before promotion.

Phased Implementation Plan

Phase 0: Establish policy and design the registry

Deliverables:

  • Add a short library-level feature-support policy document defining all five
    statuses: stable, advanced, experimental, internal, and
    unsupported.
  • Decide the registry location, schema, canonical key grammar, alias strategy,
    treatment of .xrs accessors, and whether an interim unclassified status
    is required during rollout.
  • Define what stable means for API changes, numerical tolerances, lazy
    behavior, fallback behavior, optional dependencies, and release blocking.
  • Decide how generated docs/reference tables consume or validate the
    registry.
  • Decide which risky experimental paths require explicit runtime opt-in
    versus conspicuous documentation only, and apply the same decision to
    publicly reachable internal paths.
  • Decide the compatibility migration from GeoTIFF's current
    internal_only runtime spelling to the library-wide internal spelling.

Acceptance criteria:

  • Maintainers agree on the tier definitions and audit rubric before
    individual tools are assigned production-readiness labels.
  • A schema example handles a straightforward tool, a multi-backend tool, a
    method family such as hydrology, a wrapper/alias, and the existing GeoTIFF
    registry.
  • The migration approach does not break
    xrspatial.geotiff.SUPPORTED_FEATURES.
  • The resulting policy unambiguously explains when to use internal versus
    unsupported.

Phase 1: Build complete public-surface inventory

Deliverables:

  • Generate/review an inventory from top-level exports, public subpackages,
    documented README rows, accessor methods, and reference pages.
  • Reconcile gaps where something is exported but not documented, documented
    but not exported, duplicated under aliases, or intentionally experimental.
  • Assign each inventory row a category owner and identify required
    subfeature overrides (backend, method, mode, integration).
  • Seed the registry with all inventory rows at unclassified during the
    migration, or conservatively experimental if no transitional label is
    adopted.

Acceptance criteria:

  • CI has a structural test that detects a newly exported/documented public
    tool lacking a registry entry.
  • Every existing README tool row maps to a canonical registry entry or an
    explicit documented exclusion.
  • Aliases and hydrology unified wrappers cannot produce conflicting claims
    with their canonical implementations.

Phase 2: Pilot outside GeoTIFF and validate the system

Choose several representative areas before batch-classifying the whole
library:

  • a simple numerically defined family with broad backend coverage, such as
    multispectral indices or morphology;
  • a backend/memory-sensitive family, such as focal operations or proximity;
  • a method-rich family, such as hydrology;
  • a geospatial integration family, such as reproject/resample.

Deliverables:

  • Classify the selected callable and subfeature rows using the audit rubric.
  • Add or identify evidence tests required to defend each initial tier.
  • Render the first library-wide status documentation and update relevant
    docstrings/README rows to make the tier visible.
  • Evaluate registry ergonomics, table readability, and whether the override
    level is too coarse or too granular.

Acceptance criteria:

  • The pilot demonstrates how a user distinguishes core tool status from a
    backend or algorithm override.
  • At least one real promotion gap is represented honestly rather than
    papered over by a broad label.
  • Maintainers approve the schema/documentation presentation before bulk
    migration.

Phase 3: Batch classify every tool family

Create child issues or PR batches for each workstream listed above. Each
batch must:

  • fill registry rows and subfeature overrides;
  • state the default/recommended path and production guidance;
  • cite current test evidence;
  • add focused tests for claims that are cheap and clearly intended already;
  • classify gaps conservatively and file promotion/hardening follow-ups rather
    than inflating the tier;
  • update corresponding reference documentation and user-facing catalog
    presentation.

Required review questions for every batch:

  • Are any README checkmarks being mistaken for parity/stability guarantees?
  • Is a Dask or Dask+CuPy claim backed by chunk-boundary/laziness evidence?
  • Is any GPU claim backed by a defined tolerance and device test strategy?
  • Are algorithm variants bundled together despite different evidence?
  • Are fallback modes and optional dependency behavior visible to users?
  • Does any feature write files, download resources, consume network inputs,
    or accept complex vector/metadata inputs that require additional
    failure-mode coverage?

Acceptance criteria:

  • Every public inventory row is classified.
  • No tool appears as production-ready without stated scope and evidence.
  • Known unstable or unsupported combinations are documented close to the
    callable and do not disappear inside a general tool label.

Phase 4: Documentation and discoverability rollout

Deliverables:

  • Add a library-level feature tier/reference page, linked prominently from
    README and docs index, with filters/tables by category and tier.
  • Add a compact tier column or badge to the README function catalog without
    obscuring the existing backend matrix.
  • Add tier/status and scope to public API docstrings or generated API
    reference output, especially for advanced/experimental paths.
  • Document how users inspect the registry programmatically.
  • Add guidance for production workflows: use stable scope by default,
    understand advanced limitations, validate experimental behavior locally.
  • Keep GeoTIFF pages linked as the detailed precedent rather than duplicating
    their full contract.

Acceptance criteria:

  • A user can find a tool's tier from the README, reference docs, and runtime
    API/status registry.
  • Docs and registry tiers cannot drift silently in CI.
  • Advanced/experimental limitations are stated in the same page/section as
    the tier, not buried only in an issue.

Phase 5: Evidence gates and release process

Deliverables:

  • Generalize the existing release_gate concept so stable non-GeoTIFF
    promises have lightweight, deterministic release-gate coverage or a
    documented CI suite that constitutes the gate.
  • Add registry shape/completeness tests and doc parity tests.
  • Establish backend parity/tolerance helpers so each family need not invent
    incompatible proof standards.
  • Add release checklist steps: run stable gates, inspect skips/xfails,
    reconcile classification changes, and mention promotions/demotions in
    release notes.
  • Define demotion handling when a stable or advanced promise is found false.

Acceptance criteria:

  • Every stable classification cites passing tests appropriate to its stated
    scope.
  • Stable gate failures block release or force an explicit corrected contract
    and release note; they cannot ship unnoticed.
  • Skipped optional/GPU evidence is handled according to a written rule rather
    than silently counted as proof.

Phase 6: Promotion backlog and ongoing governance

Deliverables:

  • Create follow-up issues for advanced/experimental features with concrete
    promotion requirements: missing parity matrices, reference comparisons,
    scalability limits, backend tests, error handling, or docs.
  • Add contribution guidance requiring new public tools/subfeatures and new
    advertised backends to include a tier classification and evidence plan.
  • Add a recurring release audit for tier promotions/demotions and stale
    experimental paths.
  • Define a policy for retiring or moving long-lived experimental features
    that do not progress.

Acceptance criteria:

  • New public functionality cannot be merged without a declared tier and
    visible status documentation.
  • Promotions are evidence-driven PRs updating registry, tests, docs, and
    release notes together.
  • The tier map remains a maintained product contract after the initial audit.

Test And Evidence Strategy

The initial audit may classify many features as advanced or experimental
because evidence is incomplete. It must not require exhaustive testing before
the registry exists. Evidence should scale with the promised tier:

Evidence type Stable expectation Advanced expectation Experimental expectation
Basic functional tests Required Required Smoke coverage
Invalid/boundary inputs Required for documented contract Important limitations covered Add for serious risks
Reference/invariant comparison Required where an algorithm has an objective reference/invariant At least key cases Optional/targeted
Backend parity/tolerance Required for each backend claimed by stable scope Document tested subset/gaps No parity claim unless stated
Dask laziness/chunk behavior Required when Dask is stable Cover principal mode Document uncertainty
GPU test execution Required for stable GPU scope through available GPU CI/release gate Smoke/parity where available, with skips visible Smoke where practical
Scalability/resource bound Required for memory/network/file-sensitive stable paths Known limits documented May be unknown but disclosed
Interoperability/security Required where the stable feature touches external formats or inputs Material checks/limits documented Opt-in/fail loudly where hazardous

Runtime And API Design Requirements

  • Do not add warnings on every normal call merely to announce that an
    advanced feature exists; that makes useful functionality noisy. Prefer
    discoverable registry/docs/docstrings.
  • Use runtime opt-ins or warnings only for paths where accidental adoption
    has concrete consequences: misleading scientific results, non-interoperable
    output, hidden high costs, security exposure, or data loss.
  • Do not demote a callable solely because one optional backend is less mature;
    represent backend-specific scope accurately.
  • Do not label a callable stable when its default dispatch can unexpectedly
    select an experimental implementation without explicit user choice.
  • Keep tier metadata lightweight and importable in minimal installations.

Deliverables Checklist

  • Library-wide tier policy defining stable, advanced, experimental,
    internal, and unsupported.
  • Canonical feature registry schema and public inspection API.
  • Automated inventory/completeness check for public tools and accessors.
  • Compatibility bridge or incorporation strategy for
    xrspatial.geotiff.SUPPORTED_FEATURES, including migration from
    internal_only to internal.
  • Pilot classification PRs across representative tool families.
  • Child audit issues/PRs for every library workstream.
  • Complete classification of each public tool and meaningful subfeature.
  • Library-level tier reference documentation and README discoverability.
  • Registry/docs parity checks.
  • Stable-contract tests/release-gate strategy beyond GeoTIFF.
  • Release checklist and release-note rules for promotions/demotions.
  • Contribution policy for new features and backend expansions.
  • Promotion backlog for non-stable paths.

Epic Acceptance Criteria

  • Every supported public tool/subfeature has a canonical, discoverable tier
    with explicit scope; internal paths and meaningful unsupported behavior are
    called out using the shared vocabulary.
  • Users can determine production worthiness without reading implementation
    code or searching closed issues.
  • A callable's backend availability matrix and its robustness tier no longer
    get conflated.
  • Every stable claim points to test evidence and participates in the agreed
    release contract.
  • Advanced and experimental claims state limitations and promotion work.
  • Runtime registry, README/reference docs, and API documentation agree, with
    automated checks preventing drift.
  • Existing GeoTIFF status remains compatible and becomes the first integrated
    example of the library-wide system, with an intentional compatibility path
    from internal_only to internal.
  • New public tools and backend/mode additions must declare tier/evidence as
    part of review.

Suggested Child Issues

  1. Policy/schema: define the five library statuses and design
    FEATURE_SUPPORT.
  2. Inventory: reconcile exports, subpackages, accessors, README rows, and
    reference pages.
  3. Compatibility: integrate/bridge xrspatial.geotiff.SUPPORTED_FEATURES
    and migrate/alias internal_only to internal.
  4. Pilot: tier morphology/multispectral plus docs rendering.
  5. Pilot: tier hydrology wrappers/algorithm variants and backend overrides.
  6. Pilot: tier reproject/resample/merge integration behavior.
  7. Audit terrain/surface/visibility.
  8. Audit hydrology/flood after the pilot pattern is accepted.
  9. Audit raster/vector, proximity, cost, and pathfinding.
  10. Audit focal/neighborhood/image and diffusion functions.
  11. Audit classification, multispectral/fire, KDE, interpolation,
    dasymetric, multivariate, and MCDA.
  12. Audit utilities, diagnostics, accessors, optional dependencies, and
    fallback behavior.
  13. Docs: add library feature-tier pages and update README/API visibility.
  14. CI/release: registry completeness, docs parity, stable gates, and
    release checklist.
  15. Governance: contribution rules and evidence-based promotion backlog.

Related Work

Metadata

Metadata

Assignees

No one assigned

    Labels

    apiAPI design and consistencydocumentationImprovements or additions to documentationenhancementNew feature or requesttestsTest coverage and parity

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions