Add the ability to visualize and select a threshold for binary postprocessing #2158

geoffreyangus · 2022-06-16T18:43:19Z

The binary classification threshold postprocesses probabilities as either negatives (< threshold) or positives (>= threshold). This value is currently set prior to training and is often left at its default of 0.5. If the user wants to change the threshold of their model after discovering an ideal operating point, they have to edit the threshold in the config after training. However, it is currently somewhat difficult to find that operating point.

In class-imbalanced settings, it is often desirable to set the threshold to some value != 0.5. For example, if the user wants high precision at the expense of some recall on a rare positive class, they may want to set a higher threshold. We want to provide two new functionalities: (1) a visualization of threshold vs. metric, where the metrics plotted are those that are threshold-dependent (e.g. accuracy, precision, recall, f1), and (2) a lightweight way to experiment with different thresholds from the LudwigModel.predict function.

Such functionality would enable users to analyze and ultimately select the optimal threshold for their use case.
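
To make the request concrete, here is a rough sketch of the threshold vs. metric computation described above. It is plain Python and not part of Ludwig: `threshold_sweep` and `best_threshold` are hypothetical helper names, and it assumes the positive-class probabilities and ground-truth labels have already been pulled out of the prediction output.

```python
def threshold_sweep(y_true, probs, thresholds):
    """Compute threshold-dependent metrics at each candidate threshold.

    Hypothetical helper (not a Ludwig API). `y_true` holds 0/1 labels and
    `probs` holds positive-class probabilities of the same length.
    """
    rows = []
    for t in thresholds:
        # Same rule as in the issue: positive if prob >= threshold.
        y_pred = [p >= t for p in probs]
        tp = sum(1 for yp, yt in zip(y_pred, y_true) if yp and yt)
        fp = sum(1 for yp, yt in zip(y_pred, y_true) if yp and not yt)
        fn = sum(1 for yp, yt in zip(y_pred, y_true) if not yp and yt)
        tn = sum(1 for yp, yt in zip(y_pred, y_true) if not yp and not yt)
        # Guard against empty denominators by reporting 0.0.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        accuracy = (tp + tn) / len(y_true) if len(y_true) else 0.0
        rows.append(
            {
                "threshold": t,
                "precision": precision,
                "recall": recall,
                "f1": f1,
                "accuracy": accuracy,
            }
        )
    return rows


def best_threshold(rows, metric="f1"):
    """Return the threshold with the highest value of `metric` (first wins on ties)."""
    return max(rows, key=lambda r: r[metric])["threshold"]


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    sweep = threshold_sweep(labels, scores, [0.3, 0.5])
    for row in sweep:
        print(row)
    print("best by f1:", best_threshold(sweep, "f1"))
```

The rows returned by the sweep are what a threshold vs. metric plot would draw, one curve per metric. Choosing by `precision` rather than `f1` corresponds to the high-precision, lower-recall use case mentioned above.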


geoffreyangus · 2023-09-14T16:33:24Z

From the visualization standpoint, you can take inspiration from some of our existing visualization functionality:

ludwig/ludwig/visualize.py, line 2388 in 49b4c79:

def confidence_thresholding_data_vs_acc(

In order to change the threshold in predict, you can build off of functionality added by @justinxzhao in this PR: #3520

Hopefully that is helpful!

geoffreyangus added the feature New feature or request label Jun 16, 2022

justinxzhao added this to To do in AutoML Jun 23, 2022

justinxzhao assigned geoffreyangus Jun 23, 2022

justinxzhao mentioned this issue Jun 23, 2022

Add in-training tooling to find a more optimal threshold for binary classification. #2181

Open
