Add metric for inter-row MSAS #640

### Problem Description
In [this paper](https://arxiv.org/pdf/2207.14406), we introduced a new methodology for calculating multi-sequence metrics called MSAS. We should add the MSAS-related metrics to SDMetrics so that users with sequential data can use them for evaluation.

### Expected behavior
Add a metric called InterRowMSAS that performs the MSAS algorithm for inter-row differences in a sequence.

**Data compatibility**: 1 ID column (representing the sequence key), and 1 continuous column (datetime or numerical)

**Parameters**:
- (required) `real_data`: A tuple of 2 pandas.Series objects. The first represents the sequence key of the real data and the second represents a continuous column of data.
- (required) `synthetic_data`: A tuple of 2 pandas.Series objects. The first represents the sequence key of the synthetic data and the second represents a continuous column of data.
- `n_rows_diff`: An integer representing the number of rows between the two rows being compared when taking the difference
    - (default) `1`: Take the difference between each row and the one right before it
    - Any integer > 0: Take the difference between row `n` and row `n + n_rows_diff`
- `apply_log`: Whether to apply a natural log before taking the difference
    - (default) `False`: Do not apply a log. This results in the absolute difference, which is useful when you expect the data to grow or shrink linearly
    - `True`: Apply a log before taking the difference. This is recommended when you expect the data to grow or shrink exponentially
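
To illustrate why `apply_log` matters, here is a small sketch using plain NumPy (not the proposed metric itself). The array `values` is made-up example data that doubles at every step.

```python
import numpy as np

# Hypothetical example data: an exponentially growing sequence
values = np.array([1.0, 2.0, 4.0, 8.0])

# Without a log, the inter-row differences grow along with the data
raw_diffs = np.diff(values)

# With a natural log applied first, every difference equals ln(2),
# so the per-sequence average reflects the growth rate
log_diffs = np.diff(np.log(values))
```

For data that grows linearly, the raw differences are already constant, which is why `False` is the suggested default in that case.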

**Output**: A score in range [0, 1] -- 0 being the worst and 1 being the best

```python
from sdmetrics.column_pairs import InterRowMSAS

score = InterRowMSAS.compute(
  real_data=(real_table['patient_id'], real_table['heart_rate']),
  synthetic_data=(synthetic_table['patient_id'], synthetic_table['heart_rate']),
  n_rows_diff=100,
  apply_log=False
)
```

**How does it work?** The sequence key determines which continuous values belong to which sequence. This metric computes a statistic for all sequences in the real and synthetic data, and then compares those distributions.

1. For each row `r` in the real data, calculate the difference between row `r` and row `r + x` of the same sequence, where `x` is `n_rows_diff` (applying the natural log first if `apply_log` is `True`). Then average the differences within each sequence; the per-sequence averages form a distribution D_r.
2. Do the same for the synthetic data to form a second distribution D_s.
3. Apply the [KSComplement metric](https://docs.sdv.dev/sdmetrics/metrics/metrics-glossary/kscomplement) to compare the distributions (D_r, D_s), and return the resulting score.
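
The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the final SDMetrics implementation: the helper names `mean_inter_row_diffs` and `inter_row_msas_sketch` are hypothetical, only numerical columns are handled (datetime columns would need converting first), the sign convention of the difference is assumed to be "later row minus earlier row", and KSComplement is approximated as 1 minus the two-sample KS statistic from SciPy.

```python
import numpy as np
import pandas as pd
from scipy.stats import ks_2samp


def mean_inter_row_diffs(keys, values, n_rows_diff=1, apply_log=False):
    """Hypothetical helper: one mean inter-row difference per sequence."""
    # Rebuild both inputs on a fresh, shared index so they align row by row
    keys = pd.Series(np.asarray(keys))
    values = pd.Series(np.asarray(values, dtype=float))
    if apply_log:
        values = np.log(values)
    # Difference between each row and the row n_rows_diff earlier,
    # computed within each sequence (assumed sign convention)
    diffs = values.groupby(keys).diff(periods=n_rows_diff)
    # Average per sequence; sequences too short to yield a difference are dropped
    return diffs.groupby(keys).mean().dropna()


def inter_row_msas_sketch(real_data, synthetic_data, n_rows_diff=1, apply_log=False):
    """Hypothetical sketch of the three steps described above."""
    d_r = mean_inter_row_diffs(*real_data, n_rows_diff, apply_log)
    d_s = mean_inter_row_diffs(*synthetic_data, n_rows_diff, apply_log)
    # Assumption: KSComplement is taken as 1 minus the KS statistic
    statistic, _ = ks_2samp(d_r, d_s)
    return 1 - statistic
```

One open design question this sketch surfaces is what to do with sequences that have `n_rows_diff` rows or fewer: here they are silently dropped, but the metric could also warn or return `NaN`.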
