download only specific bands from source gribs #2
Merged
Add additional data variables
… in get_template Support processing only a subset of variables
aldenks commented Dec 10, 2024
…variables isn't starting earlier
aldenks pushed a commit that referenced this pull request May 25, 2026
- Drop the Materialized/Virtual DynamicalDataset split: single base. Keep sibling MaterializedRegionJob/VirtualRegionJob under RegionJob to host their substantial per-variant code.
- Reframe the update process around indexed CronJobs (same scheduling pattern as materialized): worker 0 expands dims on main; workers fill chunks in parallel; no temp branch for operational updates. Avoids concurrent dim-expansion conflicts on init_time coord chunks.
- Promote filtering-already-ingested from open question to the core efficiency mechanism for steady-state updates (region jobs span shards that are mostly already populated).
- Walk through a concurrent-update scenario explicitly and explain why ConflictDetector accepts it. Call out the integration test PR #2 needs to verify icechunk 2.x rebase semantics.
- Document three options for per-variable serializer (encoding factory, metadata-only common config, inherit-and-replace) instead of picking one prematurely.
- Minimize __main__.py surface: source virtual chunk containers declared on the VirtualRegionJob class; store factory picks them up automatically.
- Address each unresolved review comment from Alden inline.
- Add appendices with concrete code patterns from PR #511, context from PR #510, and an inventory of existing infrastructure we reuse.

https://claude.ai/code/session_01YbsupHKaGd11C8RaW6gQVQ
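The indexed-worker split and the filtering-already-ingested step described in the commit message above can be sketched roughly as follows. This is a minimal illustration only: `ChunkRef`, `jobs_for_worker`, and `filter_already_ingested` are hypothetical names, not identifiers from this repository, and the round-robin assignment of jobs to workers is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRef:
    """Hypothetical identifier for one chunk a region job wants to write."""
    init_time: str
    variable: str


def jobs_for_worker(jobs, worker_index, workers_total):
    """Assumed round-robin split: worker i takes every workers_total-th job."""
    return jobs[worker_index::workers_total]


def should_expand_dims(worker_index):
    """Per the commit message, only worker 0 expands dims on main."""
    return worker_index == 0


def filter_already_ingested(wanted, populated):
    """Keep only the chunks that are not in the store yet, preserving order."""
    return [ref for ref in wanted if ref not in populated]
```

In steady state most of `wanted` is already in `populated`, so the filtered list is short and each worker touches only the newly arrived chunks.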
aldenks pushed a commit that referenced this pull request May 26, 2026
Previous draft routed multi-new-init scenarios through a "catchup edge case" with two divergent strategies (app-level retry vs backfill fallback). That's operational complexity for no good reason.

Replace with a single uniform model: every batch commit opens a fresh session, computes refs against current state (filter-already-ingested + lazy index lookup), expands the dim if needed, sets refs, commits. On ConflictDetector rejection, throw away the session and retry with a fresh one; the retry's "recompute against current state" step naturally picks up whatever the other pod committed and recomputes target indices.

Retries are cheap: byte ranges from parsed index files are already in hand, set_virtual_ref calls are microseconds, only the chunk-key indices need recomputation. Steady state sees ~0 conflicts; the rare multi-new-init scenario pays a few extra retries and converges.

Update PR #2 integration tests to verify this uniform model converges in both the disjoint-write and concurrent-expansion scenarios.

https://claude.ai/code/session_01YbsupHKaGd11C8RaW6gQVQ
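The uniform commit-and-retry model described above could look roughly like the sketch below. It deliberately avoids the real icechunk API: `ToyStore`, `ToySession`, `ConflictError`, and `commit_batch` are hypothetical, single-threaded in-memory stand-ins, and the toy rejects any commit made against a stale version, which is stricter than what the commit message says ConflictDetector accepts. The `before_commit` hook exists only so a competing writer can be simulated.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Stand-in for a commit rejected by conflict detection."""


class ToyStore:
    """Hypothetical in-memory store with one expandable init_time dimension."""

    def __init__(self):
        self.version = 0
        self.init_times = []  # position in this list is the chunk index
        self.refs = {}        # (init_time index, variable) -> byte range

    def open_session(self):
        # A session is a snapshot of the state it was opened against.
        return ToySession(self, self.version, list(self.init_times), set(self.refs))


@dataclass
class ToySession:
    store: ToyStore
    base_version: int
    init_times: list
    existing: set
    pending: dict = field(default_factory=dict)

    def commit(self):
        if self.store.version != self.base_version:
            raise ConflictError("store changed since this session was opened")
        self.store.init_times = self.init_times
        self.store.refs.update(self.pending)
        self.store.version += 1


def commit_batch(store, batch, max_retries=5, before_commit=None):
    """batch maps (init_time, variable) -> byte range. Returns attempts used."""
    for attempt in range(max_retries):
        session = store.open_session()          # fresh session on every attempt
        for (init_time, variable), byte_range in batch.items():
            if init_time not in session.init_times:
                session.init_times.append(init_time)        # expand the dim
            key = (session.init_times.index(init_time), variable)
            if key in session.existing:
                continue                        # filter already-ingested chunks
            session.pending[key] = byte_range   # only the index was recomputed
        if not session.pending:
            return attempt + 1                  # nothing new to write
        if before_commit is not None:
            before_commit(attempt)              # hook for simulating another pod
        try:
            session.commit()
            return attempt + 1
        except ConflictError:
            continue                            # discard the session and retry
    raise RuntimeError("batch did not converge within max_retries")
```

On a retry the byte ranges in `batch` are reused as-is; only the target index is looked up again against the refreshed `init_times`, which mirrors the "retries are cheap" argument in the commit message.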