v0.1.3

Latest

github-actions released this 13 Apr 04:00
· 56 commits to main since this release

Added

  • wildsat now supports variant selection via model_config={"variant": "..."} (or the variant= keyword). Available variants are vitb16 (default), resnet50, and swint, each backed by its own ImageNet-pretrained checkpoint that auto-downloads from Google Drive. The previous vitl16 arch has been removed as no upstream checkpoint exists. Users who need other pre-training initializations (CLIP, Prithvi, SatCLIP, etc.) can still point RS_EMBED_WILDSAT_CKPT at a local checkpoint. The RS_EMBED_WILDSAT_ARCH environment variable is still respected as a fallback but is now overridden by variant when both are set.
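
The precedence described above can be sketched as follows. This is a minimal illustration, not rs-embed's actual code: the function name `resolve_wildsat_variant`, its signature, and the choice to let the `variant=` keyword win over `model_config` are assumptions. Only the variant names, the `vitb16` default, and the rule that `variant` overrides `RS_EMBED_WILDSAT_ARCH` come from the release notes.

```python
import os

# Variant names and default as listed in the release notes.
SUPPORTED_VARIANTS = ("vitb16", "resnet50", "swint")
DEFAULT_VARIANT = "vitb16"


def resolve_wildsat_variant(variant=None, model_config=None, env=None):
    """Hypothetical resolver: explicit variant > env fallback > default."""
    env = os.environ if env is None else env
    config_variant = (model_config or {}).get("variant")
    # Assumption: the keyword argument wins over model_config when both are set.
    chosen = (
        variant
        or config_variant
        or env.get("RS_EMBED_WILDSAT_ARCH")
        or DEFAULT_VARIANT
    )
    chosen = chosen.strip().lower()
    if chosen not in SUPPORTED_VARIANTS:
        raise ValueError(
            f"unknown wildsat variant {chosen!r}; expected one of {SUPPORTED_VARIANTS}"
        )
    return chosen
```

With this sketch, `resolve_wildsat_variant(model_config={"variant": "swint"})` selects `swint` even if `RS_EMBED_WILDSAT_ARCH=resnet50` is set, and the removed `vitl16` value is rejected rather than silently accepted.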

Changed

  • GEEProvider initialization now prefers an explicit Google Cloud project when one is provided via project=..., EE_PROJECT, or GOOGLE_CLOUD_PROJECT, but no longer hard-requires rs-embed callers to pass one explicitly. When no project is supplied, rs-embed now lets ee.Initialize() and geemap.ee_initialize() resolve Earth Engine's configured default project first, while still surfacing a clear error message when authentication is missing or no usable Cloud/quota project is configured.
  • thor now defaults RS_EMBED_THOR_PATCH_SIZE to 8 instead of 16, increasing the default token-grid density while keeping RS_EMBED_THOR_IMG=288. THOR also now defaults to a bounded native_snap preprocessing policy for ordinary resize-mode inputs: near-square inputs can keep a snapped native side when they stay within configured side/token limits, while tiled inputs still force fixed per-tile resize so stitched grids remain geometrically stable. The THOR model page now documents the patch-size/image-size coupling, native-snap limits, and concrete environment-variable examples for common tuning patterns.
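
Both changes above can be illustrated with a short sketch. The helper names `resolve_ee_project` and `token_grid` are hypothetical, and the lookup order among `project=`, `EE_PROJECT`, and `GOOGLE_CLOUD_PROJECT` is assumed to follow the order in which the release notes list them. The token arithmetic assumes square, non-overlapping patches and ignores any class token.

```python
import os


def resolve_ee_project(project=None, env=None):
    """Hypothetical helper: return an explicit project if one is available."""
    env = os.environ if env is None else env
    for candidate in (project, env.get("EE_PROJECT"), env.get("GOOGLE_CLOUD_PROJECT")):
        if candidate:
            return candidate
    # No explicit project: a caller would initialize Earth Engine without one
    # and let it fall back to its configured default project.
    return None


def token_grid(img_size, patch_size):
    """Return (tokens per side, total tokens) for a square image."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be a multiple of patch_size")
    side = img_size // patch_size
    return side, side * side


old_grid = token_grid(288, 16)  # previous default patch size
new_grid = token_grid(288, 8)   # new default patch size
print(old_grid, new_grid)       # (18, 324) (36, 1296)
```

Under these assumptions, halving the patch size at a fixed 288-pixel input quadruples the token count, which is the cost side of the denser default grid.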

Fixed

  • anysat and satmaepp_s2_10b now validate the variant keyword in a single place. Previously, _normalize_*_variant accepted a wider alias set and the runtime resolver then raised a second ModelError deeper in the call stack: passing variant="tiny" / "small" to anysat or variant="base" to satmaepp_s2_10b was first silently normalized and then rejected with a confusingly located "currently exposes only variant=..." error. The normalize helpers now accept only the variants that actually map to a wired checkpoint (base for anysat, large for satmaepp_s2_10b) and raise an immediate, descriptive ModelError for anything else; the duplicate runtime guards have been removed. The satmaepp_s2_10b env-var path (RS_EMBED_SATMAEPP_S2_MODEL_FN) now also raises a clear error for unknown model_fn values instead of silently producing a variant=None runtime config. The describe() output for both adapters already advertises choices: ["base"] / choices: ["large"], so the schema side was already correct; this fix makes the validation code match it.
  • BBox.validate() now enforces geographic bounds: longitudes must be in [-180, 180] and latitudes in [-90, 90]. Out-of-range coordinates previously passed validation and caused confusing downstream errors from the GEE provider.
  • describe_model() now returns a cached copy of the embedder's describe() output instead of instantiating a new embedder class on every call. The cache is keyed by canonical model name and is cleared by reset_runtime(). The returned dict is always a shallow copy, so callers cannot add, remove, or rebind top-level keys of the cached entry (nested values are still shared).
  • fetch_api_side_inputs() now wraps per-spatial fetch errors in a ModelError that includes the spatial index and the original exception, making it easier to pinpoint which location caused a failure when running get_embeddings_batch() with input_prep="tile" or "auto".
  • run_embedding_request() now uses strict=True in the zip of spatials and prefetched inputs. A length mismatch between the two lists now raises immediately instead of silently truncating the result.
  • A failure to load checkpoint arrays during combined-export resume now emits a warnings.warn instead of being silently swallowed. Users will see a clear message indicating that array loading failed and that all inputs will be re-fetched.
  • _write_per_item_chunk no longer accesses the private _shutdown attribute of ThreadPoolExecutor to guard against double-shutdown. The outer finally block now relies on the documented idempotency of ThreadPoolExecutor.shutdown() instead, removing a fragile dependency on CPython internals that could break on future Python versions.
  • sensor_key() no longer applies int() truncation to scale_m and cloudy_pct when building the embedder instance cache key. Previously, float values such as 10.1 and 10.9 were both mapped to 10, allowing two sensors with different resolutions to share a cached embedder instance incorrectly. The raw field values are now used directly.
  • _run_per_item now closes all progress bars (main and per-model) inside the finally block of the chunk-pipeline loop. Previously the cleanup ran after the try/finally, so an unhandled exception (e.g. with continue_on_error=False) would leave progress bars open and leak display resources in notebook environments.