Skip to content

feat: structured load statistics from insert_into_target_tables#71

Merged
martinv13 merged 4 commits into
cre-dev:mainfrom
martinv13:feat/load-stats
Jun 22, 2026
Merged

feat: structured load statistics from insert_into_target_tables#71
martinv13 merged 4 commits into
cre-dev:mainfrom
martinv13:feat/load-stats

Conversation

@martinv13

Copy link
Copy Markdown
Collaborator

Summary

insert_into_target_tables previously returned a bare int (total inserted rows).
It now returns a LoadStats dataclass with inserted/existing row counts and
per-phase timing, giving callers visibility into both deduplication effectiveness
and load performance without any extra instrumentation.

Changes

Document.insert_into_target_tablesLoadStats

stats = doc.insert_into_target_tables()
print(stats.inserted)            # rows written to target tables
print(stats.existing)            # rows skipped (hash-deduplicated or parent was existing)
print(stats.duration_temp_insert)  # seconds — staging phase
print(stats.duration_merge)        # seconds — merge phase
print(stats.duration_cleanup)      # seconds — temp-table cleanup

Document.merge_into_target_tablesMergeStats(inserted, existing, duration)

Document.insert_into_temp_tablesfloat (phase duration)


**`Document.insert_into_target_tables`**`LoadStats`

```python
stats = doc.insert_into_target_tables()
print(stats.inserted)            # rows written to target tables
print(stats.existing)            # rows skipped (hash-deduplicated or parent was existing)
print(stats.duration_temp_insert)  # seconds — staging phase
print(stats.duration_merge)        # seconds — merge phase
print(stats.duration_cleanup)      # seconds — temp-table cleanup

Document.merge_into_target_tablesMergeStats(inserted, existing, duration)

Document.insert_into_temp_tablesfloat (phase duration)

martinv13 and others added 4 commits June 22, 2026 16:04
…dataclass

insert_into_target_tables now returns LoadStats(inserted, existing,
duration_temp_insert, duration_merge, duration_cleanup).
merge_into_target_tables returns MergeStats(inserted, existing, duration).
insert_into_temp_tables returns the phase duration as a float.

Row counts use the first INSERT rowcount per table (the data-table INSERT;
join-table INSERTs are ignored). Backends that do not report rowcount for
INSERT…FROM SELECT (e.g. DuckDB) silently yield 0 for inserted/existing.
Both types are exported from the package root.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@martinv13 martinv13 merged commit ab748b6 into cre-dev:main Jun 22, 2026
10 checks passed
@martinv13 martinv13 deleted the feat/load-stats branch June 22, 2026 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant