feat: support starting_timestamp to start from an offset #12
Merged
mdrakiburrahman merged 2 commits into main from Apr 7, 2026
Add a starting_timestamp config parameter (ISO-8601 UTC) that allows skipping historical source files when no checkpoint exists. When a checkpoint watermark is present, the parameter is silently ignored.

- Add _parse_starting_timestamp() validator in impl.py
- Extend discover_files() and has_unprocessed_files() with the new param
- Create synthetic Watermark from starting_timestamp when no checkpoint
- Throw DbtRuntimeError on invalid or future timestamps
- Update table.sql and incremental.sql macros to pass the parameter
- Add 12 unit tests covering all behaviors
- Update integration test models with starting_timestamp config

Closes #11

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When any step fails, upload /tmp/imds-router.log and .logs/ as a downloadable artifact with 7-day retention.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Why this change is needed
Closes #11
When a model has no checkpoint yet, dbt-scope processes all source files from the beginning of time. For large datasets with years of historical files, this can be very expensive. Users need a way to skip historical files and start processing from a given point in time, analogous to Delta Streaming's `startingTimestamp`.

How
A new `starting_timestamp` config parameter (ISO-8601 UTC string, e.g. `"2026-04-07T10:00:00+00:00"`) is added to the adapter's file discovery pipeline.

Behavior:

[Table: outcome by whether `starting_timestamp` is set; three cases raise `DbtRuntimeError`]

Key implementation details:
- `_parse_starting_timestamp()` validates the timestamp string and ensures it has timezone info, converting it to UTC
- When `starting_timestamp` is provided, a synthetic `Watermark` is created from it and used for filtering, so no changes to `file_tracker.py` or `checkpoint.py` were needed
- The `table.sql` and `incremental.sql` macros read the config and pass it through to `adapter.discover_files()`

Files changed:
- `dbt/adapters/scope/impl.py`: core logic (`_parse_starting_timestamp()`, extended `discover_files()` and `has_unprocessed_files()`)
- `dbt/include/scope/macros/materializations/table.sql`: reads and passes `starting_timestamp`
- `dbt/include/scope/macros/materializations/incremental.sql`: reads and passes `starting_timestamp`
- `tests/integration/dbt_project/models/append_no_delete.sql`: uses `1900-01-01` (process all)
- `tests/integration/dbt_project/models/filtered_edition.sql`: uses `2026-01-01` (skip pre-2026)

Test
Unit tests are in `tests/unit/test_starting_timestamp.py` (run with `uv run pytest tests/unit/ -q`).
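
For reviewers who want a quick mental model, the validation and checkpoint-precedence rules described in this PR can be sketched roughly as below. This is an illustrative sketch, not the code in `impl.py`: the `DbtRuntimeError` and `Watermark` classes are local stand-ins so the snippet runs without dbt installed, and `parse_starting_timestamp`, `effective_watermark`, the `modified_at` field, and the `now` parameter are hypothetical names. Rejecting a timestamp without timezone info, and skipping validation when a checkpoint exists, are assumptions; the actual adapter may behave differently in both cases.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


class DbtRuntimeError(Exception):
    """Stand-in for dbt's DbtRuntimeError so the sketch runs without dbt."""


@dataclass(frozen=True)
class Watermark:
    """Hypothetical stand-in for the adapter's Watermark; the real class likely has more fields."""
    modified_at: datetime


def parse_starting_timestamp(value: str, now: Optional[datetime] = None) -> datetime:
    """Validate an ISO-8601 string and return it as a timezone-aware UTC datetime."""
    now = now or datetime.now(timezone.utc)
    try:
        parsed = datetime.fromisoformat(value)
    except (TypeError, ValueError) as exc:
        raise DbtRuntimeError(f"Invalid starting_timestamp: {value!r}") from exc
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are rejected rather than treated as UTC.
        raise DbtRuntimeError(f"starting_timestamp needs a UTC offset: {value!r}")
    parsed = parsed.astimezone(timezone.utc)
    if parsed > now:
        raise DbtRuntimeError(f"starting_timestamp is in the future: {value!r}")
    return parsed


def effective_watermark(
    checkpoint: Optional[Watermark],
    starting_timestamp: Optional[str],
    now: Optional[datetime] = None,
) -> Optional[Watermark]:
    """Pick the watermark used for file filtering."""
    if checkpoint is not None:
        # An existing checkpoint wins; starting_timestamp is ignored
        # (assumption: it is not even validated in this case).
        return checkpoint
    if starting_timestamp is None:
        # No checkpoint and no offset: process every source file.
        return None
    # No checkpoint yet: build a synthetic watermark from the config value.
    return Watermark(parse_starting_timestamp(starting_timestamp, now))
```

Under these assumptions, `effective_watermark(None, "2026-04-07T10:00:00+00:00")` yields a synthetic watermark at that instant, while passing any existing checkpoint returns the checkpoint unchanged. Whether files stamped exactly at the watermark are included is left to the filtering code and is not modeled here.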