Improve error handling for GaussianCopulaSynthesizer: numerical_distributions #1211

Closed
npatki opened this issue Jan 27, 2023 · 0 comments
Labels
feature request Request for a new feature

npatki commented Jan 27, 2023

Problem Description

In SDV 1.0, I will be able to specify numerical distributions using the API below:

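from sdv.single_table import GaussianCopulaSynthesizer

# `metadata` is the metadata object describing the table's columns
synthesizer = GaussianCopulaSynthesizer(
  metadata,
  numerical_distributions={
    'age': 'uniform',
    'weight': 'norm',
    'bmi': 'beta'
  }
)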

The numerical_distributions parameter is only valid for columns that exist in my metadata. Currently, the synthesizer lets me pass in a column name that does not exist, such as:

numerical_distributions = {
  'totally_fake_column_name': 'beta'
}

Expected behavior

Check that the dictionary keys in numerical_distributions correspond to column names (as specified by the metadata object). If any do not, raise an error:

SynthesizerInputError: Invalid column names found in the numerical_distributions dictionary (<list>).
The column names you provide must be present in the metadata.

For example:
SynthesizerInputError: Invalid column names found in the numerical_distributions dictionary ('fake', 'fake2').
The column names you provide must be present in the metadata.
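
A minimal sketch of the check described above, not the actual SDV implementation. The helper name validate_numerical_distributions and its metadata_columns argument are hypothetical, and SynthesizerInputError is defined locally as a stand-in so the snippet runs without SDV installed:

```python
class SynthesizerInputError(Exception):
    """Local stand-in for the error type named in this issue."""


def validate_numerical_distributions(numerical_distributions, metadata_columns):
    """Raise if any key of numerical_distributions is not a metadata column.

    `metadata_columns` is assumed to be an iterable of the column names
    defined in the metadata object.
    """
    known = set(metadata_columns)
    # Keep the invalid names in the order the user supplied them.
    invalid = [name for name in numerical_distributions if name not in known]
    if invalid:
        listed = ', '.join(repr(name) for name in invalid)
        raise SynthesizerInputError(
            'Invalid column names found in the numerical_distributions '
            f'dictionary ({listed}).\n'
            'The column names you provide must be present in the metadata.'
        )
```

Calling it with {'fake': 'beta', 'fake2': 'beta'} against metadata columns ['age', 'weight', 'bmi'] would raise the error with the message shown in the example above, while a dictionary containing only known columns passes silently.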
@npatki npatki added the feature request Request for a new feature label Jan 27, 2023
@npatki npatki added this to the 1.0.0 milestone Jan 27, 2023