-
Notifications
You must be signed in to change notification settings - Fork 292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metadata auto-detection should ensure primary keys are unique (special sdtypes are not exempt from this rule!) #1876
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
sdv/metadata/single_table.py
Outdated
@@ -525,6 +525,8 @@ def _detect_columns(self, data): | |||
for field in data: | |||
column_data = data[field] | |||
has_nan = column_data.isna().any() | |||
is_unique = column_data.nunique() == len(column_data) | |||
valid_potential_primary_key = is_unique and not has_nan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use column_data.is_unique
here instead of using nunique
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point, done in bbaeac1
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #1876 +/- ##
=======================================
Coverage 97.49% 97.49%
=======================================
Files 51 51
Lines 4907 4908 +1
=======================================
+ Hits 4784 4785 +1
Misses 123 123 ☔ View full report in Codecov by Sentry. |
CU-86azuyq0t
Resolve #1871