feat: Validate overrides key consistency for schema sampling #165

MoritzPotthoffQC · 2025-10-10T18:40:32Z

Motivation

When sampling for a schema by providing a sequence of mappings (rather than a mapping of iterables), the keys of the first mapping in the overrides determine the schema of the data frame that is initialized to start sampling. This is currently not clearly documented.

Specifying different keys for different entries can work (e.g., when later entries leave out certain keys, which are then imputed as None). However, it is easy to add override keys to entries at indices > 0 that are not in the first entry, and these are then silently ignored. This can cause the sampling operation to fail (as it would need the overrides to find a compliant data frame) or, even worse, yield a sampled data frame that does not meet the user's expectations.
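The failure mode described above can be illustrated with a small, library-independent sketch. The helper `rows_from_first_keys` is hypothetical and only mimics the described behavior (columns taken from the first entry, missing values imputed as None, extra keys dropped); it is not the library's actual implementation.

```python
from collections.abc import Mapping, Sequence
from typing import Any


def rows_from_first_keys(
    overrides: Sequence[Mapping[str, Any]],
) -> list[dict[str, Any]]:
    """Hypothetical stand-in: build rows using only the first entry's keys."""
    if not overrides:
        return []
    columns = list(overrides[0].keys())
    # Keys missing from a later entry become None; keys that are absent
    # from the first entry are dropped without any warning.
    return [{col: entry.get(col) for col in columns} for entry in overrides]


rows = rows_from_first_keys([{"a": 1}, {"a": 2, "b": 5}, {}])
# The override for "b" in the second entry never reaches the result.
print(rows)
```

In this sketch, the user-supplied value for "b" is lost even though no error is raised, which is the situation the validation is meant to prevent.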

Changes

Explicitly validate that all entries in the overrides sequence specify the same keys
Add a test
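A minimal sketch of what such a key-consistency check could look like is given below. The function name `validate_override_keys`, the use of `ValueError`, and the error message are assumptions made for illustration; the actual check in this PR may differ.

```python
from collections.abc import Mapping, Sequence
from typing import Any


def validate_override_keys(overrides: Sequence[Mapping[str, Any]]) -> None:
    """Raise if any entry's keys differ from those of the first entry.

    Hypothetical helper; name, exception type and message are assumptions.
    """
    if not overrides:
        return
    expected = set(overrides[0].keys())
    for index, entry in enumerate(overrides[1:], start=1):
        actual = set(entry.keys())
        if actual != expected:
            missing = sorted(expected - actual)
            unexpected = sorted(actual - expected)
            raise ValueError(
                f"Override at index {index} has inconsistent keys: "
                f"missing {missing}, unexpected {unexpected}."
            )


# Consistent keys pass silently.
validate_override_keys([{"a": 1, "b": 2}, {"a": 3, "b": 4}])
```

Comparing every entry against the first one mirrors the fact that the first mapping determines the initial data frame's columns, and it turns a silent drop into an explicit error.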

codecov · 2025-10-10T18:41:38Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (b711ac3) to head (3c9702c).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##              main      #165   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           51        51           
  Lines         2903      2907    +4     
=========================================
+ Hits          2903      2907    +4


borchero

Nice!

tests/schema/test_sample.py

MoritzPotthoffQC added 2 commits October 10, 2025 20:32

validate overrides keys

66d49cb

docs

664bfa7

MoritzPotthoffQC self-assigned this Oct 10, 2025

github-actions bot added the enhancement New feature or request label Oct 10, 2025

MoritzPotthoffQC marked this pull request as ready for review October 10, 2025 19:02

MoritzPotthoffQC requested review from AndreasAlbertQC, borchero and delsner as code owners October 10, 2025 19:02

borchero approved these changes Oct 10, 2025


tests/schema/test_sample.py (outdated, resolved)

formatting

3c9702c

MoritzPotthoffQC enabled auto-merge (squash) October 13, 2025 06:58

MoritzPotthoffQC merged commit 7b3be2d into main Oct 13, 2025
22 checks passed

MoritzPotthoffQC deleted the schema-sample-override-keys branch October 13, 2025 07:04
