Merge sen2_testing into main #1

Merged
DakotaHester merged 17 commits into main from sen2_testing
Jul 10, 2025

@DakotaHester
Collaborator

This PR contains many changes; the most notable are listed below:

  • New function, pcxarray.prepare_timeseries(), for creating time series
  • Switch to odc-geo for fast, Dask-compliant reprojection, resampling, merge, and clip operations
  • New pcxarray.get_pc_collections() function to list available Planetary Computer collections
  • Enable multiprocessing functionality for prepare_data
  • Automatic caching of query results
  • Add examples demonstrating the new functionality on various datasets (gNATSGO, Landsat, Sentinel-2, HLS)

While this branch was originally created to test Sentinel-2 collections, it has expanded into an avenue for incorporating many new features into the base package.

@DakotaHester DakotaHester requested a review from Copilot July 10, 2025 00:49
Contributor

Copilot AI left a comment

Pull Request Overview

This PR merges the sen2_testing branch into main, introducing major new features and refactors across querying, caching, and processing geospatial data.

  • Refactored core processing to use odc-geo for lazy merging, reprojection, and merged timeseries support (prepare_timeseries).
  • Added persistent caching for expensive operations and a new helper (get_pc_collections) to list STAC collections.
  • Simplified STAC query logic with thread-based timeouts, retry/backoff, and updated I/O functions.
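
The thread-based timeout and retry/backoff pattern mentioned above can be sketched roughly as follows. This is an illustration only, not the code in src/pcxarray/query.py: the helper name call_with_retries, its parameters, and the flaky_query stand-in are all hypothetical, and only the standard library is used.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def call_with_retries(func, *args, timeout=5.0, max_retries=3, base_delay=0.1):
    """Run func(*args) in a worker thread; retry with exponential backoff.

    Hypothetical helper for illustration; not part of pcxarray.
    """
    last_error = None
    for attempt in range(max_retries):
        pool = ThreadPoolExecutor(max_workers=1)
        future = pool.submit(func, *args)
        try:
            # Wait at most `timeout` seconds for the worker thread.
            return future.result(timeout=timeout)
        except FutureTimeout:
            last_error = TimeoutError(f"attempt {attempt + 1} exceeded {timeout}s")
        except Exception as exc:  # any error raised by func counts as a failed attempt
            last_error = exc
        finally:
            # Do not block on a worker that may still be running.
            pool.shutdown(wait=False)
        if attempt < max_retries - 1:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"all {max_retries} attempts failed") from last_error


# Stand-in for a remote query that fails twice before succeeding.
attempts = []


def flaky_query(value):
    attempts.append(value)
    if len(attempts) < 3:
        raise ConnectionError("transient failure")
    return value * 2


result = call_with_retries(flaky_query, 21, base_delay=0.01)
```

One caveat of this approach is that a timed-out call is abandoned rather than cancelled: the worker thread keeps running until the underlying function returns.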

Reviewed Changes

Copilot reviewed 8 out of 13 changed files in this pull request and generated no comments.

Summary per file:

| File | Description |
| --- | --- |
| src/pcxarray/utils.py | Added caching decorator, renamed progress param, updated utilities |
| src/pcxarray/query.py | Replaced multiprocessing with threading, added retries & caching, introduced get_pc_collections |
| src/pcxarray/processing.py | New processing module: lazy_merge_arrays, refactored prepare_data, added prepare_timeseries |
| src/pcxarray/io.py | Enhanced raster loading (load_from_url), single-item reading logic |
| src/pcxarray/cache.py | New caching module using joblib.Memory |
| src/pcxarray/__init__.py | Updated public API exports |
| pyproject.toml | Added dependencies: joblib, odc-geo, bottleneck |
| README.md | Expanded overview, usage examples, and documentation updates |

Comments suppressed due to low confidence (6)

src/pcxarray/io.py:59

  • The docstring references a timeout parameter, but the function signature does not include timeout. Either remove or add the parameter.
    timeout : float, default 60.0

src/pcxarray/processing.py:496

  • [nitpick] New prepare_timeseries API introduces complex behavior but lacks unit tests to validate grouping, parallel execution, and output structure.
def prepare_timeseries(

src/pcxarray/utils.py:120

  • The function uses requests.get but import requests is missing at the top of this file, causing a NameError.
        with requests.get(url, stream=True, verify=verify) as r:

src/pcxarray/processing.py:250

  • Iterating for item in items_full_overlap loops over column names, not rows. To try each feature record, use .iterrows() or .itertuples() and pass the correct row to read_single_item.
        for item in items_full_overlap:
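
A small sketch of the behavior this comment describes, assuming items_full_overlap is a pandas DataFrame (or a GeoDataFrame, which iterates the same way). The column names and URLs below are made up for illustration and are not taken from pcxarray.

```python
import pandas as pd

# Hypothetical stand-in for items_full_overlap.
items = pd.DataFrame(
    {
        "item_id": ["a", "b"],
        "href": ["https://example.com/a.tif", "https://example.com/b.tif"],
    }
)

# Iterating the frame directly yields column labels, not records.
column_labels = [item for item in items]

# itertuples() yields one namedtuple per row.
ids_from_tuples = [row.item_id for row in items.itertuples(index=False)]

# iterrows() yields (index, Series) pairs, one per row.
hrefs_from_rows = [row["href"] for _, row in items.iterrows()]
```

itertuples() is generally the faster of the two; iterrows() is convenient when the downstream function expects a Series with named fields.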

src/pcxarray/processing.py:89

  • This check will raise an error when all three arguments are None, but the intention is to auto‐compute them when all are missing. Adjust the condition to allow sum(...) == 3 to proceed with automatic determination.
        if sum([geom is None, crs is None, resolution is None]) > 1:
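
One way to read the suggested adjustment, as a sketch only: the helper name resolve_grid_mode and its return values are hypothetical, and the intended semantics (auto-compute when all three are missing, reject only ambiguous partial input) are taken from this review comment rather than confirmed against the pcxarray source.

```python
def resolve_grid_mode(geom=None, crs=None, resolution=None):
    """Decide how the target grid is determined (hypothetical helper)."""
    n_missing = sum(arg is None for arg in (geom, crs, resolution))
    if n_missing == 3:
        # Nothing supplied: derive everything automatically.
        return "auto"
    if n_missing > 1:
        # Exactly two missing: too little information to build a grid.
        raise ValueError(
            "Provide at least two of geom, crs, and resolution, or none of them."
        )
    # Zero or one missing: proceed with what the caller supplied.
    return "explicit"


mode_all_missing = resolve_grid_mode()
mode_one_missing = resolve_grid_mode(geom=object(), crs="EPSG:4326")

try:
    resolve_grid_mode(crs="EPSG:4326")
    two_missing_rejected = False
except ValueError:
    two_missing_rejected = True
```

With the original condition, `> 1` alone, the all-missing call would raise instead of falling through to automatic determination.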

src/pcxarray/processing.py:116

  • [nitpick] Typo in comment: "rekli=ommon" should be "common".
    # reproject all arrays to the rekli=ommon geobox

@DakotaHester DakotaHester merged commit 8d75a62 into main Jul 10, 2025