Merged
Pull Request Overview
This PR merges the sen2_testing branch into main, introducing major new features and refactors across querying, caching, and processing geospatial data.
- Refactored core processing to use odc-geo for lazy merging, reprojection, and merged timeseries support (`prepare_timeseries`).
- Added persistent caching for expensive operations and a new helper (`get_pc_collections`) to list STAC collections.
- Simplified STAC query logic with thread-based timeouts, retry/backoff, and updated I/O functions.
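As a rough illustration of the "thread-based timeouts, retry/backoff" pattern mentioned above, here is a minimal standard-library sketch. It is not the code from this PR: `call_with_retries` and its parameters are hypothetical names chosen for the example, and the real query logic may differ.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_with_retries(func, *args, timeout=5.0, retries=3, backoff=0.1, **kwargs):
    """Run func in a worker thread, retrying with exponential backoff.

    Hypothetical helper for illustration only; not part of pcxarray.
    """
    last_error = None
    for attempt in range(retries):
        executor = ThreadPoolExecutor(max_workers=1)
        future = executor.submit(func, *args, **kwargs)
        try:
            # result() raises a timeout error if the call takes too long,
            # or re-raises whatever exception func itself raised.
            return future.result(timeout=timeout)
        except Exception as exc:
            last_error = exc
        finally:
            # Do not block on a stuck call; the worker thread cannot be
            # killed and simply keeps running in the background.
            executor.shutdown(wait=False)
        if attempt < retries - 1:
            # Wait backoff, 2*backoff, 4*backoff, ... between attempts.
            time.sleep(backoff * 2 ** attempt)
    raise RuntimeError(f"call failed after {retries} attempts") from last_error
```

One trade-off of using threads rather than processes here: a timed-out call is abandoned, not cancelled, so the wrapped function should itself be safe to leave running.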
Reviewed Changes
Copilot reviewed 8 out of 13 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/pcxarray/utils.py | Added caching decorator, renamed progress param, updated utilities |
| src/pcxarray/query.py | Replaced multiprocessing with threading, added retries & caching, introduced get_pc_collections |
| src/pcxarray/processing.py | New processing module: lazy_merge_arrays, refactored prepare_data, added prepare_timeseries |
| src/pcxarray/io.py | Enhanced raster loading (load_from_url), single-item reading logic |
| src/pcxarray/cache.py | New caching module using joblib.Memory |
| src/pcxarray/__init__.py | Updated public API exports |
| pyproject.toml | Added dependencies: joblib, odc-geo, bottleneck |
| README.md | Expanded overview, usage examples, and documentation updates |
Comments suppressed due to low confidence (6)
src/pcxarray/io.py:59
- The docstring references a `timeout` parameter, but the function signature does not include `timeout`. Either remove it from the docstring or add the parameter.
timeout : float, default 60.0
src/pcxarray/processing.py:496
- [nitpick] New `prepare_timeseries` API introduces complex behavior but lacks unit tests to validate grouping, parallel execution, and output structure.
def prepare_timeseries(
src/pcxarray/utils.py:120
- The function uses `requests.get` but `import requests` is missing at the top of this file, causing a NameError.
with requests.get(url, stream=True, verify=verify) as r:
src/pcxarray/processing.py:250
- Iterating `for item in items_full_overlap` loops over column names, not rows. To try each feature record, use `.iterrows()` or `.itertuples()` and pass the correct row to `read_single_item`.
for item in items_full_overlap:
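The difference the comment above describes can be seen in a small sketch. It assumes `items_full_overlap` behaves like a pandas DataFrame (a GeoDataFrame iterates the same way); the frame and its column names below are made up for illustration.

```python
import pandas as pd

# Stand-in for the item table; the column names are hypothetical.
items = pd.DataFrame({"id": ["a", "b"], "percent_overlap": [1.0, 1.0]})

# Iterating the frame directly yields column labels, not records.
column_labels = [item for item in items]

# itertuples() yields one namedtuple per row, with columns as attributes.
row_ids = [row.id for row in items.itertuples(index=False)]

# iterrows() yields (index, Series) pairs, one per row.
row_ids_via_iterrows = [row["id"] for _, row in items.iterrows()]
```

Here `column_labels` holds the two column names, while both row-based variants hold the two `id` values.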
src/pcxarray/processing.py:89
- This check will raise an error when all three arguments are None, but the intention is to auto-compute them when all are missing. Adjust the condition to allow `sum(...) == 3` to proceed with automatic determination.
if sum([geom is None, crs is None, resolution is None]) > 1:
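One way to read the suggested adjustment is sketched below. The function name and return values are hypothetical, and the assumption that exactly two missing arguments is the only invalid case comes from the review comment, not from the actual source file.

```python
def resolve_grid_mode(geom=None, crs=None, resolution=None):
    """Decide how the output grid is determined.

    Hypothetical helper for illustration; not the code in processing.py.
    """
    n_missing = sum(arg is None for arg in (geom, crs, resolution))
    if n_missing == 3:
        # Nothing supplied: derive geometry, CRS, and resolution automatically.
        return "auto"
    if n_missing > 1:
        # Exactly two missing: assumed too ambiguous to proceed.
        raise ValueError(
            "Provide at least two of geom, crs, and resolution, or none of them."
        )
    # Zero or one missing: passes the check, as in the original condition.
    return "explicit"
```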
src/pcxarray/processing.py:116
- [nitpick] Typo in comment: "rekli=ommon" should be "common".
# reproject all arrays to the rekli=ommon geobox
This PR contains many changes, the most notable of which are listed below:
- `pcxarray.prepare_timeseries()`
- `odc-geo` for fast, Dask-compliant reproject, resampling, merge, and clip operations
- `pcxarray.get_pc_collections()` function to list available Planetary Computer collections

While this branch was originally created to test Sentinel-2 collections, it has expanded into an avenue for incorporating many new features into the base package.