chore(typing): set up mypy and tighten the package to mypy --strict#314
Merged
Conversation
9d5fbd7 to
75f50d3
Compare
The package ships a ``py.typed`` marker (advertising itself as typed to
downstream users) but nothing type-checked it. Add mypy and get a clean run.
Setup:
- [tool.mypy] in pyproject.toml: a lenient first-pass config
(ignore_missing_imports, target python_version 3.9), scoped to the
dataretrieval package.
- mypy<2 added to the [test] extra (<2 so it can still target 3.9).
- a type-check job in the CI workflow, parallel to the ruff lint job.
Fixes (mypy went from 78 errors to 0 on the tracked package):
- HTTPX_DEFAULTS annotated dict[str, Any] so **-splatting it into
httpx.get / httpx.AsyncClient type-checks -- cleared ~55 errors across
7 call sites at once.
- utils.py gains `from __future__ import annotations`: mypy (targeting
3.9) caught that the new `str | None` annotations there would be a
runtime error on 3.9, because this module -- unlike the rest of the
package -- lacked the future import.
- BaseMetadata.comment annotated `str | None` (was inferred `None`, which
rejected every subclass that assigns a comment string).
- _format_api_dates: accept Sequence[str | None] (covariant) so a
list[str] caller type-checks, and build the formatted list with an
early return so the final join sees list[str].
- _as_str_list: delegate to _normalize_str_iterable then wrap, so the
declared return type list[str] | None holds.
- _next_req_url: declare next_host / cur_host as `str | None`.
- ratings._search: build the query dict in a non-Optional local before
aliasing it to the loop's `params` (which toggles to None per page).
- nldi: drop the bool->str / Literal->str variable reuse; guard the basin
branch so feature_source / feature_id are non-None before get_basin.
- chunking: narrow the optional filter before _is_chunkable; fix a stale
`# type: ignore` error code.
The fixes are annotations and type-narrowing guards. The only runtime-visible
change is that nldi.search() now raises a clear ValueError up front when a
basin search is missing feature_source/feature_id, where the same condition
previously raised deeper inside get_basin. 259 tests pass across the affected
suites.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Builds on the lenient baseline: annotate the 71 previously-unannotated
functions, resolve the remaining strict findings, and flip the mypy config to
``strict = true`` (keeping only ``ignore_missing_imports`` for untyped
third-party libraries).
- nwis: type the ``@_deprecated`` decorator with a signature-preserving
``TypeVar`` (3.9-safe; ParamSpec would need 3.10), which un-erases the
decorated public getters; annotate the getters and helpers; defunct stubs
that only ``raise`` are typed ``-> NoReturn``.
- wqp / nldi / nadp / streamstats / samples / utils / waterdata{utils,api,
nearest,ratings,chunking}: precise parameter and return annotations
(``tuple[pd.DataFrame, <Metadata>]`` for the getters), parameterized bare
``dict`` generics, and a few justified ``cast``s where a callee's union
return is statically wider than the single concrete type a caller is
guaranteed (NLDI FeatureCollection endpoints, format-keyed StreamStats).
- __init__: ``get_ratings`` (nwis vs waterdata) and ``what_sites`` (nwis vs
wqp) collide across the star-imports; the modern definitions win by import
order. mypy can't model last-binding-wins, so the two re-binding mismatches
are silenced with an explanatory comment. (The collision itself is a
pre-existing public-namespace question, left as-is here.)
Annotations only -- no runtime behavior change. ``mypy --strict`` is clean and
the full test suite (469 passed, 2 skipped) is unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The type-check job ran ``pip install .[test]``, pulling pytest/coverage/ruff/ pytest-httpx just to run mypy. Add a minimal ``type-check`` extra (just ``mypy<2``, pinned once; ``test`` self-references it) and install ``.[type-check]`` instead — the job's install drops from 26 to 16 packages. mypy --strict is unchanged (0 errors). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
20a0088 to
8096a54
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
The package ships a
py.typedmarker — it advertises itself as typed to downstream users — but nothing type-checked it. This adds mypy, gets a clean run, and tightens all the way tomypy --strict.Two commits:
strict = true— annotate the remaining 71 functions and resolve every strict finding.Setup
[tool.mypy]inpyproject.toml—strict = true, targetpython_version = "3.9"(the project floor), scoped todataretrieval. The one remaining relaxation isignore_missing_imports(untyped third-party libs like pandas/geopandas/anyio →Any); dropping that viapandas-stubs+ per-module overrides is a possible follow-up.mypy<2in the[test]extra (pinned<2so it can target 3.9; mypy 2.x requires ≥3.10).type-checkCI job, parallel to the rufflintjob.What got fixed
Baseline (commit 1):
HTTPX_DEFAULTS: dict[str, Any]so**HTTPX_DEFAULTStype-checks when splatted intohttpx.get/AsyncClient.utils.pylackedfrom __future__ import annotations, so the newstr | Noneannotations would have been a runtime error on Python 3.9 — mypy (targeting 3.9) caught it.BaseMetadata.commenttypedNone, list-invariance in_format_api_dates, thedict | Noneindexing inratings._search, nldibool/Literalreuse, …).Strict (commit 2):
@_deprecateddecorator with a signature-preservingTypeVar(3.9-safe), un-erasing the decorated nwis getters.tuple[pd.DataFrame, <Metadata>], parameterized baredictgenerics, a few justifiedcasts (NLDI FeatureCollection endpoints, format-keyed StreamStats), defunct stubs as-> NoReturn.Behavior
Annotations + type-narrowing only. The single runtime-visible change (from commit 1) is
nldi.search()raising a clearerValueErrorup front for a basin search missingfeature_source/feature_id. 469 tests pass (2 skipped), unchanged.Notes / surfaced findings
dataretrieval.get_ratings(nwis vs waterdata) anddataretrieval.what_sites(nwis vs wqp) collide across theimport *in__init__.py— the modern definitions win by import order. mypy flagged the last-binding-wins shadowing; it's silenced with a comment here, but it's a real API-surface question worth resolving (e.g. via__all__).dataretrieval/ngwmn.py(still onrequests) isn't in the repo, so CI won't see it; when committed it'll need its own annotations to pass strict.🤖 Generated with Claude Code