Fix nldi: validate data source on every call, not just cache-miss#246
Draft
thodson-usgs wants to merge 3 commits intoDOI-USGS:mainfrom
Draft
Fix nldi: validate data source on every call, not just cache-miss#246thodson-usgs wants to merge 3 commits intoDOI-USGS:mainfrom
thodson-usgs wants to merge 3 commits intoDOI-USGS:mainfrom
Conversation
The validation check `data_source not in _AVAILABLE_DATA_SOURCES` was nested inside `if _AVAILABLE_DATA_SOURCES is None:`, so once any caller warmed the cache, subsequent invalid `data_source`/`feature_source` values were silently accepted. Move the check outside the cache-load branch so all calls are validated. Also guard the `_validate_data_source(feature_source)` call in `get_features` with `if feature_source:` — the previous code unconditionally validated `None` when the user provided only `comid`. The cache bug masked this; with the check now active it would otherwise raise "Invalid data source 'None'". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 3, 2026
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes NLDI data-source validation so invalid data_source / feature_source values are rejected on every call, not just while populating the cache. It fits the NLDI client layer by restoring early, user-facing argument validation instead of letting bad inputs fall through to confusing server-side 4xx responses.
Changes:
- Moved
_validate_data_source()’s membership check out of the cache-miss branch so cached calls are still validated. - Added a guard in
get_features()to avoid validatingfeature_source=Noneon comid-based calls. - Added regression tests and a test fixture to reset the module-level data-source cache between tests.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
dataretrieval/nldi.py |
Fixes cached validation behavior and adjusts get_features() feature-source validation flow. |
tests/nldi_test.py |
Adds regression tests for cold-cache and warm-cache invalid data-source handling, plus cache reset fixture. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| _validate_data_source(data_source) | ||
| # validate feature source | ||
| _validate_data_source(feature_source) | ||
| if feature_source: |
Per copilot review on PR DOI-USGS#246. The previous truthiness guard skipped validation for empty-string feature_source/data_source, letting bad inputs build invalid URLs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
_validate_data_sourcenested itsdata_source not in cachecheck inside theif cache is None:branch, so the validator was effectively a no-op after the first call. Any invaliddata_source/feature_sourcevalue passed thereafter was silently accepted and used to construct an invalid URL, surfacing as a confusing 4xx from the server rather than a clean ValueError._validate_data_source(feature_source)call inget_featureswithif feature_source:— without this guard the now-active validator would raise "Invalid data source 'None'" whenever a user provided onlycomid(the cache bug previously masked this).Test plan
tests/nldi_test.py: invalid source raises after the cache is warm; invalid source raises on first call._AVAILABLE_DATA_SOURCEScache between tests to keep test order independent.Related PRs
nldi-cleanup) is a broader cleanup ofnldi.pythat adds the sameif feature_source:guard at the same line inget_features. Trivial textual conflict at merge time; the fix is identical so either can land first.nldi-navigation-mode-valueerror) touches a different function (_validate_navigation_mode) in the same file. No conflict.