Check if QualityReport needs the synthetic data to match the metadata #509

Closed
frances-h opened this issue Nov 9, 2023 · 0 comments · Fixed by #499
Environment Details

Please indicate the following details about the environment in which you found the bug:

  • SDMetrics version:
  • Python version:
  • Operating System:

Error Description

Similar to #508, we'd like to verify that the QualityReport still runs correctly when the synthetic data does not exactly match the metadata. The QualityReport should be tested with multiple datasets that have missing or extra columns. If it runs as expected, the requirement that the synthetic data match the metadata should be relaxed for both the QualityReport and the DiagnosticReport, and the error message should be updated accordingly.

Steps to reproduce

import pandas as pd
from sdmetrics.reports.single_table import QualityReport

# Real data: its columns match the metadata exactly.
data = pd.DataFrame({
    'id': [0, 1, 2],
    'val1': ['a', 'a', 'b'],
    'val2': [0.1, 2.4, 5.7]
})

# Synthetic data: 'val2' is missing and 'extra_col' is not in the metadata.
synthetic_data = pd.DataFrame({
    'id': [1, 2, 3],
    'extra_col': ['x', 'y', 'z'],
    'val1': ['c', 'd', 'd']
})

metadata = {
    'columns': {
        'id': {'sdtype': 'id'},
        'val1': {'sdtype': 'categorical'},
        'val2': {'sdtype': 'numerical'}
    },
    'primary_key': 'id'
}

report = QualityReport()
report.generate(data, synthetic_data, metadata)
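
When building the additional test datasets, it may help to state explicitly which columns are missing or extra relative to the metadata. The sketch below is only an illustration: `column_mismatch` is a hypothetical helper, not part of SDMetrics, and it assumes the metadata is a plain dict with a `columns` key as in the reproduction steps. It takes any iterable of column names, so `synthetic_data.columns` could be passed directly.

```python
def column_mismatch(data_columns, metadata):
    """Return (missing, extra) column names relative to the metadata.

    Hypothetical helper for illustration; not an SDMetrics function.
    missing: columns listed in the metadata but absent from the data
    extra:   columns present in the data but not listed in the metadata
    """
    expected = set(metadata['columns'])
    actual = set(data_columns)
    return sorted(expected - actual), sorted(actual - expected)


# Metadata and synthetic column names copied from the reproduction steps.
metadata = {
    'columns': {
        'id': {'sdtype': 'id'},
        'val1': {'sdtype': 'categorical'},
        'val2': {'sdtype': 'numerical'},
    },
    'primary_key': 'id',
}
synthetic_columns = ['id', 'extra_col', 'val1']  # i.e. synthetic_data.columns

missing, extra = column_mismatch(synthetic_columns, metadata)
print(missing, extra)  # ['val2'] ['extra_col']
```

For the reproduction data this reports `val2` as missing and `extra_col` as extra, which are the two kinds of mismatch the report should be tested against.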

@frances-h added the bug and new labels on Nov 9, 2023
@npatki removed the new label on Nov 13, 2023
@amontanez24 added this to the 0.13.0 milestone on Nov 30, 2023