@MoritzPotthoffQC MoritzPotthoffQC commented Oct 10, 2025

Motivation

When sampling for a schema by providing overrides as a sequence of mappings (rather than a mapping of iterables), the keys of the first mapping dictate the schema of the data frame that is initialized to start sampling. This is currently not clearly documented.

While specifying different keys for different entries can work (e.g., by leaving out certain keys in later entries, which are then imputed as None), it is easy to add override keys to entries at indices > 0 that are not present in the first entry. These keys are then silently ignored. This can cause the sampling operation to fail (because it would need the overrides to find a compliant data frame) or, even worse, produce a sampled data frame that does not meet the user's expectations.
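
To make the failure mode concrete, here is a minimal pure-Python sketch. It is not the library's code: `rows_to_columns` is a hypothetical helper that mimics the described behavior by taking its column names from the first entry only.

```python
from collections.abc import Mapping, Sequence
from typing import Any


def rows_to_columns(rows: Sequence[Mapping[str, Any]]) -> dict[str, list[Any]]:
    """Hypothetical helper: pivot row-wise overrides into columns.

    Assumption for illustration: only the keys of the first row define
    the columns, mirroring the behavior described above.
    """
    if not rows:
        return {}
    column_names = list(rows[0].keys())
    # Keys missing from a later row become None; keys that appear only in
    # later rows are never looked up, so they are dropped without warning.
    return {name: [row.get(name) for row in rows] for name in column_names}


overrides = [
    {"a": 1, "b": "x"},
    {"a": 2},             # "b" is left out and imputed as None
    {"a": 3, "c": True},  # "c" is not in the first entry and is ignored
]
columns = rows_to_columns(overrides)
```

In this sketch, `columns` has no `"c"` entry at all, and nothing tells the caller that the override was discarded.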

Changes

  • Explicitly validate that all entries in the overrides sequence specify the same keys
  • Add test
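
A check of this kind could look roughly like the sketch below. The function name `validate_override_keys`, the choice of `ValueError`, and the message format are assumptions made for illustration; the actual implementation in this PR may differ.

```python
from collections.abc import Mapping, Sequence
from typing import Any


def validate_override_keys(overrides: Sequence[Mapping[str, Any]]) -> None:
    """Hypothetical validator: require identical keys in every entry.

    Raises ValueError for the first entry whose key set differs from
    the key set of the entry at index 0.
    """
    if not overrides:
        return
    expected = set(overrides[0].keys())
    for index, entry in enumerate(overrides[1:], start=1):
        actual = set(entry.keys())
        if actual != expected:
            missing = sorted(expected - actual)
            unexpected = sorted(actual - expected)
            raise ValueError(
                f"Override entry at index {index} does not match the keys of "
                f"the first entry (missing: {missing}, unexpected: {unexpected})."
            )


# Key order does not matter, only the set of keys.
validate_override_keys([{"a": 1, "b": 2}, {"b": 3, "a": 4}])
```

Note that a strict equality check like this one also rejects entries that merely omit a key, so callers would have to spell out `None` explicitly instead of relying on imputation.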

@MoritzPotthoffQC MoritzPotthoffQC self-assigned this Oct 10, 2025
@github-actions github-actions bot added the enhancement New feature or request label Oct 10, 2025

codecov bot commented Oct 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (b711ac3) to head (3c9702c).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #165   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           51        51           
  Lines         2903      2907    +4     
=========================================
+ Hits          2903      2907    +4     

@MoritzPotthoffQC MoritzPotthoffQC marked this pull request as ready for review October 10, 2025 19:02

@borchero borchero left a comment

Nice!

@MoritzPotthoffQC MoritzPotthoffQC enabled auto-merge (squash) October 13, 2025 06:58
@MoritzPotthoffQC MoritzPotthoffQC merged commit 7b3be2d into main Oct 13, 2025
22 checks passed
@MoritzPotthoffQC MoritzPotthoffQC deleted the schema-sample-override-keys branch October 13, 2025 07:04