
The DCRBaselineProtection metric crashes when the distance between random data and real data is 0 #738

@npatki

Description

Environment Details

  • SDMetrics version: 0.19.1 (DCR Branch)
  • Python version: Python 3.11
  • Operating System: Linux Colab

Error Description

The new DCRBaselineProtection metric measures the privacy of synthetic data. It asks the question: If I were to use random data instead of synthetic data, how much more private would it be?

The metric is based on the distance to closest record. It measures:

  • random_data_median: The median distance between random data and the real data
  • synthetic_data_median: The median distance between synthetic data and the real data

The final score is: synthetic_data_median / random_data_median

However, in some cases, random_data_median=0. This happens when a dataset is capable of very little diversity. For example, a dataset with only 2 columns, each of which can only contain 2 possible discrete values (= 4 possible rows):

| is_active | response |
|-----------|----------|
| True      | "YES"    |
| True      | "NO"     |
| False     | "YES"    |
| False     | "NO"     |

If random_data_median=0, this metric currently crashes with a ZeroDivisionError.
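A minimal sketch of how this failure mode arises. This is illustrative only, not the SDMetrics implementation: the `dcr` helper and the Hamming distance over discrete columns are assumptions made for the example.

```python
from statistics import median

def dcr(row, real_rows):
    # Distance to closest record: the smallest distance from one row to
    # any real record. Hamming distance is used here since the example
    # columns are purely discrete.
    return min(sum(a != b for a, b in zip(row, r)) for r in real_rows)

# A low-diversity dataset: random sampling easily reproduces real rows.
real = [(True, "YES"), (False, "NO")]
synthetic = [(True, "NO"), (False, "YES")]
random_rows = [(True, "YES"), (False, "NO")]

synthetic_data_median = median(dcr(s, real) for s in synthetic)
random_data_median = median(dcr(r, real) for r in random_rows)

try:
    # This division is where the metric currently crashes.
    score = synthetic_data_median / random_data_median
except ZeroDivisionError:
    score = None
```

Because every randomly sampled row can exactly match a real row, `random_data_median` comes out as 0 and the final division raises `ZeroDivisionError`.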

Expected Behavior

Rather than crashing, the final metric score should be NaN, indicating that computing privacy on such a dataset is not recommended anyway.

The compute_breakdown should still return the individual median scores so the user can understand more about what's happening.

```python
>>> DCRBaselineProtection.compute_breakdown(
...     real_data=real_df,
...     synthetic_data=synthetic_df,
...     metadata=my_metadata)
{
    'score': nan,
    'median_DCR_to_real_data': {
        'synthetic_data': 0.25,
        'random_data_baseline': 0.0
    }
}
```
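One way the guard could look. This is a hedged sketch, not the actual fix: `safe_score` is a hypothetical helper, and the real change would live inside the metric's scoring code.

```python
import math

def safe_score(synthetic_median, random_median):
    # Hypothetical guard: when the random-data baseline median is 0,
    # the ratio is undefined, so return NaN instead of letting the
    # division raise ZeroDivisionError.
    if random_median == 0:
        return math.nan
    return synthetic_median / random_median
```

With this guard, the breakdown can still report both medians while the top-level score is NaN, matching the expected behavior above.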

Labels

bug · Something isn't working