Allow subsampling when computing the ContingencySimilarity metric #716

@npatki

Description

Problem Description

The ContingencySimilarity metric computes a full contingency table for both the real and the synthetic data in order to compare the two discrete, 2D distributions. Our experiments have shown that this metric is not very performant for large datasets with high cardinality (i.e., a large number of category values).
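For context, the cost comes from building a table with one cell for every observed pair of category values, so two high-cardinality columns can produce a very large table. Below is a rough sketch of the kind of computation involved, assuming the score is 1 minus the total variation distance between the two normalized contingency tables (this function is illustrative, not the library's actual code):

import pandas as pd

def contingency_similarity(real, synthetic):
    # Illustrative sketch, not the SDMetrics implementation.
    cols = list(real.columns)
    # Normalized frequency of every observed (column_1, column_2) pair
    real_freq = real.groupby(cols).size() / len(real)
    synth_freq = synthetic.groupby(cols).size() / len(synthetic)
    # Align the tables so pairs missing from one side count as 0
    real_freq, synth_freq = real_freq.align(synth_freq, fill_value=0)
    # 1 minus the total variation distance between the two distributions
    return 1 - 0.5 * (real_freq - synth_freq).abs().sum()

# Tiny demo with made-up data
real = pd.DataFrame({'column_1': ['a', 'a', 'b'], 'column_2': ['x', 'y', 'x']})
synthetic = pd.DataFrame({'column_1': ['a', 'b', 'b'], 'column_2': ['x', 'x', 'y']})
print(contingency_similarity(real, synthetic))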

Our experiments have also shown that a simple approach of subsampling both the real and the synthetic datasets yields much faster performance while changing the final score only slightly (within 5%). Based on this, we should add a parameter to this metric to allow for subsampling.

Expected behavior

Add an optional parameter to ContingencySimilarity called num_rows_subsample:

  • (default) None: Do not subsample the rows
  • <integer>: Randomly subsample the provided number of rows for both the real and the synthetic datasets before computing the metric
For example:

from sdmetrics.column_pairs import ContingencySimilarity

ContingencySimilarity.compute(
    real_data=real_table[['column_1', 'column_2']],
    synthetic_data=synthetic_table[['column_1', 'column_2']],
    num_rows_subsample=1000
)
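One plausible way to wire the parameter in is a single pandas .sample call on each table before the contingency tables are built. The helper below is a hypothetical sketch (the name _maybe_subsample and the random_state argument are assumptions, not part of this request):

def _maybe_subsample(data, num_rows_subsample=None, random_state=None):
    # Hypothetical helper: subsample the rows of a pandas DataFrame only
    # when requested and when it has more rows than the target size.
    if num_rows_subsample is None or num_rows_subsample >= len(data):
        return data
    return data.sample(n=num_rows_subsample, random_state=random_state)

# Both tables would be reduced before the metric runs, e.g.:
# real_sample = _maybe_subsample(real_table[['column_1', 'column_2']], 1000)
# synthetic_sample = _maybe_subsample(synthetic_table[['column_1', 'column_2']], 1000)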

Additional context

Our experiments have shown that multiple iterations are not needed when subsampling, as the overall score barely changes between runs. So we are not adding an iterations parameter.

Labels

feature request (Request for a new feature), feature:metrics (Related to any of the individual metrics)
