Add `ReferentialIntegrity` metric #461

frances-h · 2023-10-09T21:41:51Z

Problem Description

As a user, I would like a metric that checks the integrity of my inter-table relationships.

Expected behavior

Add a new column_pairs metric that calculates the fraction of foreign key values that reference a real parent value.
The metric takes a primary key column and a foreign key column as its column pair.

Attributes

The metric should have the following attributes:

name: 'ReferentialIntegrity'
goal: Goal.MAXIMIZE
min_value: 0.0
max_value: 1.0

Methods

The metric should also define the following methods:

compute(real_data, synthetic_data): Compute the score for the metric. The returned score should be the fraction (between 0.0 and 1.0) of foreign key values that reference a value in the primary key column.
- Parameters:
  - (required) real_data: a tuple of 2 pandas.Series objects. The first is the primary key column and the second is the foreign key column from the real data. (Note that this differs from other column_pairs metrics.)
  - (required) synthetic_data: a tuple of 2 pandas.Series objects. The first is the primary key column and the second is the foreign key column from the synthetic data. (Note that this differs from other column_pairs metrics.)
- Returns: The score for the metric

>>> from sdmetrics.column_pairs import ReferentialIntegrity
>>> ReferentialIntegrity.compute(
...     real_data=(real['users']['id'], real['sessions']['user_id']),
...     synthetic_data=(synth['users']['id'], synth['sessions']['user_id'])
... )
1.0
>>> ReferentialIntegrity.compute_breakdown(
...     real_data=(real['users']['id'], real['sessions']['user_id']),
...     synthetic_data=(synth['users']['id'], synth['sessions']['user_id'])
... )
{'score': 1.0}
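
For illustration, the core calculation described above could be sketched with plain pandas as follows. This is not the sdmetrics implementation: `referential_integrity_score` is a hypothetical helper name, it scores a single (primary key, foreign key) pair because the issue does not spell out how the real and synthetic tuples are combined, and returning NaN for an empty foreign key column is an assumption.

```python
import pandas as pd


def referential_integrity_score(primary_key: pd.Series, foreign_key: pd.Series) -> float:
    """Return the fraction of foreign key values found in the primary key column.

    Hypothetical helper for illustration only; not part of sdmetrics.
    """
    if len(foreign_key) == 0:
        # Assumption: with no foreign key values there is nothing to score.
        return float('nan')
    # isin() marks each foreign key value that matches some primary key value;
    # the mean of that boolean mask is the fraction of valid references.
    return float(foreign_key.isin(primary_key).mean())


# Toy data: the session with user_id 4 has no matching user.
users_id = pd.Series([1, 2, 3])
sessions_user_id = pd.Series([1, 1, 2, 4])
score = referential_integrity_score(users_id, sessions_user_id)
print(score)  # 0.75
```

Here three of the four foreign key values have a parent, so the score is 0.75. Missing values are not treated specially in this sketch.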


frances-h added the feature request and new labels Oct 9, 2023

frances-h mentioned this issue Oct 12, 2023

Add Relationship Validity property #469

Closed

amontanez24 removed the new label Oct 23, 2023

amontanez24 added this to the 0.13.0 milestone Oct 23, 2023

R-Palazzo mentioned this issue Oct 24, 2023

Add ReferentialIntegrity metric #480

Merged

R-Palazzo mentioned this issue Nov 7, 2023

New Diagnostic Reports #499

Merged

R-Palazzo closed this as completed in #499 Nov 27, 2023

amontanez24 assigned R-Palazzo Nov 30, 2023
