Problem Description
Currently, the DisclosureProtection metric warns about poor performance when the input data has more than 50,000 rows. This number was chosen without investigating the metric's actual performance. It'd be helpful to know how the metric's runtime changes with the size of the input, so that we can warn the user at the point where performance actually degrades and suggest an alternative metric.
Expected behavior
Investigate the performance of the DisclosureProtection metric, considering input data length, number of known/sensitive columns, and number of unique discrete values in those columns. Also test across the different CAP methods.
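One possible shape for the investigation is a small grid benchmark over the dimensions listed above. The sketch below is only a starting point: `make_rows`, `placeholder_metric`, and `benchmark` are hypothetical helpers, the data is random integers rather than a real dataset, and the placeholder workload should be swapped for the actual DisclosureProtection call. The method labels are passed through as plain strings, so whichever CAP method names the metric accepts can be supplied by the caller.

```python
import itertools
import random
import time


def make_rows(num_rows, num_columns, num_unique, seed):
    """Build random discrete rows: each cell is one of `num_unique` integer categories."""
    rng = random.Random(seed)
    return [
        tuple(rng.randrange(num_unique) for _ in range(num_columns))
        for _ in range(num_rows)
    ]


def placeholder_metric(real_rows, synthetic_rows, num_known, method):
    """Stand-in workload (NOT the real metric): counts synthetic rows whose
    known-column values also appear in the real data. Replace with the real call."""
    known_keys = {row[:num_known] for row in real_rows}
    return sum(1 for row in synthetic_rows if row[:num_known] in known_keys)


def benchmark(metric, row_counts, known_counts, unique_counts, methods, repeats=3):
    """Time `metric` on every combination of the grid; keep the best of `repeats` runs."""
    results = []
    grid = itertools.product(row_counts, known_counts, unique_counts, methods)
    for num_rows, num_known, num_unique, method in grid:
        # One extra column plays the role of the sensitive column (an assumption).
        real = make_rows(num_rows, num_known + 1, num_unique, seed=0)
        synthetic = make_rows(num_rows, num_known + 1, num_unique, seed=1)
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            metric(real, synthetic, num_known, method)
            timings.append(time.perf_counter() - start)
        results.append({
            "num_rows": num_rows,
            "num_known": num_known,
            "num_unique": num_unique,
            "method": method,
            "seconds": min(timings),
        })
    return results
```

Calling `benchmark(placeholder_metric, [1_000, 10_000], [1, 3], [5, 50], ["method_a", "method_b"])` returns one timing record per grid cell. Taking the minimum over repeats is a deliberate choice to reduce noise from other processes; the median would be a reasonable alternative.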
Once we have a good understanding of the performance, we should update the warning in DisclosureProtection based on the results of the investigation.
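As a rough illustration of how timing results could feed the updated warning, the sketch below picks the largest row count that stays within a runtime budget. It assumes the measurements are available as a list of dicts with `num_rows` and `seconds` keys; `pick_row_threshold` and the `max_seconds` budget are hypothetical, and the real warning may well need to account for column count and cardinality too.

```python
def pick_row_threshold(results, max_seconds):
    """Return the largest row count whose slowest measured run is within
    `max_seconds`, or None if even the smallest size exceeds the budget."""
    # Worst-case time per row count across all other grid dimensions.
    slowest = {}
    for record in results:
        rows = record["num_rows"]
        slowest[rows] = max(slowest.get(rows, 0.0), record["seconds"])

    threshold = None
    for rows in sorted(slowest):
        if slowest[rows] > max_seconds:
            break  # stop at the first size that is too slow
        threshold = rows
    return threshold
```

Using the worst case per row count is a conservative choice: the warning then fires before any tested configuration exceeds the budget.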