Add `CategoryAdherence` metric #462

frances-h · 2023-10-09T21:44:32Z

Problem Description

As a user, I would like a metric that verifies I do not have invalid discrete values.

Expected behavior

Add a new single_column metric that calculates the proportion of synthetic values that match at least one value from the real data.
This metric accepts columns with the categorical or boolean sdtype.

Attributes

The metric should have the following attributes:

name: 'CategoryAdherence'
goal: Goal.MAXIMIZE
min_value: 0.0
max_value: 1.0
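
As a rough illustration of the attributes listed above, a class skeleton might look like the following. This is only a sketch: the `Goal` enum here is a hypothetical stand-in defined locally so the snippet runs on its own, and the class does not inherit from any library base class.

```python
from enum import Enum


class Goal(Enum):
    """Hypothetical stand-in for the library's goal enum (assumption)."""

    MAXIMIZE = 'maximize'
    MINIMIZE = 'minimize'


class CategoryAdherence:
    """Skeleton that only carries the attributes requested in this issue."""

    name = 'CategoryAdherence'
    goal = Goal.MAXIMIZE  # higher scores are better
    min_value = 0.0       # no synthetic value matches a real category
    max_value = 1.0       # every synthetic value matches a real category
```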

Methods

The metric should also define the following methods:

compute(real_data, synthetic_data): Compute the score for the metric. The returned score should be the proportion of synthetic values (between 0.0 and 1.0) that match at least one value in the real data. Null values should be counted as a separate category.
- Parameters:
  - (required) real_data: A pandas.Series object with the column of real data
  - (required) synthetic_data: A pandas.Series object with the column of synthetic data
- Returns: The score for this metric

>>> from sdmetrics.single_column import CategoryAdherence
>>> CategoryAdherence.compute(
...     real_data=real_table['ethnicity'],
...     synthetic_data=synthetic_table['ethnicity'])
1.0
>>> CategoryAdherence.compute_breakdown(
...     real_data=real_table['ethnicity'],
...     synthetic_data=synthetic_table['ethnicity'])
{'score': 1.0}
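
To make the requested behavior concrete, here is a minimal sketch of how the score could be computed with pandas. It is not the library's implementation: the function name `category_adherence` is hypothetical, and treating a synthetic null as valid only when the real column also contains a null is one reading of "null values should be counted as a separate category".

```python
import pandas as pd


def category_adherence(real_data: pd.Series, synthetic_data: pd.Series) -> float:
    """Proportion of synthetic values whose category appears in the real data."""
    # Distinct non-null categories observed in the real column.
    real_categories = set(real_data.dropna().unique())
    # Null is treated as its own category (assumption: it is valid in the
    # synthetic data only if the real data contains at least one null).
    real_has_null = bool(real_data.isna().any())

    matches_category = synthetic_data.isin(real_categories)
    matches_null = synthetic_data.isna() & real_has_null
    valid = matches_category | matches_null

    # The mean of a boolean Series is the fraction of True entries.
    return float(valid.mean())


real = pd.Series(['A', 'B', None, 'A'])
synthetic = pd.Series(['A', 'C', None, 'B'])
score = category_adherence(real, synthetic)  # 0.75: only 'C' is unseen
```

In this example three of the four synthetic values ('A', the null, and 'B') occur in the real column, so the score is 0.75. If the real column had no nulls, the synthetic null would count as invalid under the assumption stated above.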


frances-h added the feature request and new labels Oct 9, 2023

frances-h mentioned this issue Oct 12, 2023

Add DataValidity property #467

Closed

R-Palazzo mentioned this issue Oct 23, 2023

Add CategoryAdherence metric #475

Merged

amontanez24 removed the new label Oct 23, 2023

amontanez24 added this to the 0.13.0 milestone Oct 23, 2023

R-Palazzo mentioned this issue Nov 7, 2023

New Diagnostic Reports #499

Merged

R-Palazzo closed this as completed in #499 Nov 27, 2023

amontanez24 assigned R-Palazzo Nov 30, 2023
