Skip to content

refactor(waterdata): coherent error handling + safe cleanups for non-OGC getters#298

Merged
thodson-usgs merged 2 commits into
DOI-USGS:mainfrom
thodson-usgs:refactor/waterdata-non-ogc-getter-coherence
May 30, 2026
Merged

refactor(waterdata): coherent error handling + safe cleanups for non-OGC getters#298
thodson-usgs merged 2 commits into
DOI-USGS:mainfrom
thodson-usgs:refactor/waterdata-non-ogc-getter-coherence

Conversation

@thodson-usgs
Copy link
Copy Markdown
Collaborator

Summary

Coherence pass on the non-OGC getters (Samples, codes, ratings) — the backlog split out of PR #295's review. Two themes: error-handling coherence and safe cleanups.

All public signatures, kwargs (including the Samples camelCase, which mirrors the upstream API), and return shapes are unchanged. The one behavior-affecting change is the exception type on HTTP errors — flagged below for an explicit decision.

Error coherence (behavior: exception type)

The OGC and stats paths raise the module's typed errors (RateLimited on 429, ServiceUnavailable on 5xx, with Retry-After and USGS-aware messages via _raise_for_non_200). The direct-httpx.get getters did not — they used a bare response.raise_for_status() (generic httpx.HTTPStatusError). This PR routes them through _raise_for_non_200:

  • get_samples, get_samples_summary, get_codes (api.py)
  • ratings._search, ratings._download_and_parse (ratings.py)

So a caller can now except RateLimited / except ServiceUnavailable across the whole module.

⚠️ Behavior change: code that caught httpx.HTTPStatusError from these getters will no longer match (the typed errors are RuntimeError subclasses). This is the deliberate coherence change — happy to gate it behind discussion, or make the typed errors also subclass httpx.HTTPStatusError if you'd prefer strict backward-compat.

Safe cleanups (behavior-preserving)

  • _get_samples_csv helper — collapses the duplicated GET + read_csv + metadata tail shared by get_samples / get_samples_summary.
  • get_channel docstring driftchannel_flow was described as units (it's the discharge value); channel_flow_unit was undocumented; measurement_type carried last_modified's pasted RFC-3339 block.
  • _format_api_dates — resolve the local timezone only after the all-NA / duration / interval early-return guards (it was computed then discarded on those paths).

Tests

  • New typed-error regression tests (429 → RateLimited, 5xx → ServiceUnavailable).
  • 118 samples/ratings/utils tests pass; ruff clean.

Intentionally excluded (per scope + backward-compat)

  • No kwarg renames — the Samples camelCase mirrors the upstream API; a rename would break callers.
  • No get_codes return-shape change, no ssl_check additions (those were the "breaking" option).
  • The _aggregate_paginated_response / _combine_chunk_responses response-merge dedup is deferred to avoid conflicting with the in-flight chunking PR fix: repair WQP_Metadata.site_info and streamstats.Watershed #295.

🤖 Generated with Claude Code

thodson-usgs and others added 2 commits May 30, 2026 10:15
…OGC getters

Non-OGC getter error coherence + safe cleanups. All public signatures, kwargs
(incl. the upstream-driven Samples camelCase), and return shapes are unchanged
— the only behavior change is the exception TYPE on HTTP errors.

Error coherence:
- Route the direct-httpx getters (get_samples, get_samples_summary, get_codes,
  ratings._search/_download_and_parse) through the module's typed
  _raise_for_non_200 instead of a bare response.raise_for_status(), so a 429
  raises RateLimited and a 5xx raises ServiceUnavailable (with Retry-After and
  USGS-aware messages) — the same typed errors the OGC/stats path already
  raises. NOTE: this changes the exception type from httpx.HTTPStatusError;
  flagged for review.

Safe cleanups:
- Extract `_get_samples_csv` and call it from get_samples/get_samples_summary
  (collapses the duplicated GET + read_csv tail).
- Fix get_channel docstring drift: channel_flow was labeled as units (it's the
  discharge value), channel_flow_unit was undocumented, and measurement_type
  carried last_modified's pasted RFC-3339 block.
- _format_api_dates: resolve the local timezone only after the all-NA /
  duration / interval guards (skip the lookup when it's discarded).

Tests: add typed-error regression tests (429 -> RateLimited, 5xx ->
ServiceUnavailable). 118 samples/ratings/utils tests pass; ruff clean.

Deliberately NOT included (kept backward-compatible / out of scope): no kwarg
renames (Samples camelCase mirrors the upstream API), no get_codes return-shape
change, no ssl_check additions. The _aggregate_paginated_response /
_combine_chunk_responses response-merge dedup is deferred to avoid colliding
with the in-flight chunking PR (DOI-USGS#295).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@thodson-usgs thodson-usgs marked this pull request as ready for review May 30, 2026 16:18
@thodson-usgs thodson-usgs merged commit 484a86a into DOI-USGS:main May 30, 2026
8 checks passed
@thodson-usgs thodson-usgs deleted the refactor/waterdata-non-ogc-getter-coherence branch May 30, 2026 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant