Support null foreign keys in get_random_subset #2082

Merged
merged 4 commits into from
Jun 24, 2024

R-Palazzo
Contributor

CU-86b0vyyrv
Resolve #2056

@R-Palazzo R-Palazzo requested review from rwedge and gsheni June 20, 2024 07:57
@R-Palazzo R-Palazzo self-assigned this Jun 20, 2024
@R-Palazzo R-Palazzo requested a review from a team as a code owner June 20, 2024 07:57
@sdv-team
Contributor

@R-Palazzo R-Palazzo removed the request for review from a team June 20, 2024 07:57
@R-Palazzo R-Palazzo changed the base branch from main to nullable_foreign_keys June 20, 2024 08:02
'The data contains null values in foreign key columns. '
'We recommend using ``drop_unknown_foreign_keys`` method from sdv.utils'
' to drop these rows before using ``get_random_subset``.'
)

try:
    _subsample_disconnected_roots(result, metadata, main_table_name, ratio_to_keep)
    _subsample_table_and_descendants(result, metadata, main_table_name, num_rows)
    _subsample_ancestors(result, metadata, main_table_name, primary_keys_referenced)
Contributor

Rather than removing the code, should we make it controllable via a parameter on _subsample_data?

Contributor Author

Alright, done in a4f8295
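
For readers following the thread, the general pattern being suggested (gating a check behind a parameter instead of deleting it) could look roughly like the sketch below. This is not the code from a4f8295 and not SDV's API: the thread does not show what was removed, so the sketch assumes it was a null foreign key check. The names `subsample_rows`, `has_null_foreign_keys`, and the flag `check_null_foreign_keys` are hypothetical, and tables are modeled as plain lists of dicts to keep the example self-contained.

```python
# Hypothetical sketch: function names and the flag are illustrative, not SDV's API.


def has_null_foreign_keys(rows, foreign_key_columns):
    """Return True if any row has None in one of the foreign key columns."""
    return any(row.get(col) is None for row in rows for col in foreign_key_columns)


def subsample_rows(rows, foreign_key_columns, num_rows, check_null_foreign_keys=True):
    """Keep the first ``num_rows`` rows, optionally rejecting null foreign keys.

    ``check_null_foreign_keys`` is the assumed opt-out flag: callers that can
    handle null foreign keys pass False, and everyone else keeps the check.
    """
    if check_null_foreign_keys and has_null_foreign_keys(rows, foreign_key_columns):
        raise ValueError('The data contains null values in foreign key columns.')
    return rows[:num_rows]


child_rows = [
    {'id': 1, 'parent_id': 10},
    {'id': 2, 'parent_id': None},
    {'id': 3, 'parent_id': 11},
]

# With the check disabled, rows with a null foreign key pass through.
lenient = subsample_rows(child_rows, ['parent_id'], 2, check_null_foreign_keys=False)

# With the default, the same data is rejected.
try:
    subsample_rows(child_rows, ['parent_id'], 2)
    strict_error = None
except ValueError as error:
    strict_error = str(error)
```

Defaulting the flag to the existing behavior keeps current callers unchanged, while a caller that supports null foreign keys can opt out explicitly.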

@R-Palazzo R-Palazzo requested a review from gsheni June 21, 2024 10:31
@R-Palazzo R-Palazzo merged commit 49213ca into nullable_foreign_keys Jun 24, 2024
39 checks passed
@R-Palazzo R-Palazzo deleted the issue-2056-get-random-subset branch June 24, 2024 12:17
Development

Successfully merging this pull request may close these issues.

Support null foreign keys in get_random_subset
4 participants