Add the ability to visualize and select a threshold for binary postprocessing #2158

Open
geoffreyangus opened this issue Jun 16, 2022 · 1 comment
Assignees
Labels
feature New feature or request
Projects

Comments

@geoffreyangus
Collaborator

The binary classification threshold (the threshold config parameter) postprocesses probabilities into negatives (< threshold) or positives (>= threshold). It is currently set prior to training and is usually left at its default of 0.5. A user who wants to change the threshold after discovering a better operating point has to edit the threshold in the config after training. However, it is currently difficult to find that operating point.

In class-imbalanced settings, it is often desirable to set the threshold to some value != 0.5. For example, a user who wants high precision on a rare positive class, at the expense of some recall, may want to set a higher threshold. We want to provide two new functionalities: (1) a visualization of threshold vs. metric, where the metrics plotted are those that are threshold-dependent (e.g. accuracy, precision, recall, f1), and (2) a lightweight way to experiment with different thresholds from the LudwigModel.predict function.

Such functionality would enable users to analyze and ultimately select the optimal threshold for their use case.
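
To make the request concrete, here is a rough sketch of the threshold vs. metric computation described above. It is not existing Ludwig code: `threshold_metrics`, `best_threshold`, and `plot_threshold_metrics` are hypothetical names, and the inputs are assumed to be ground-truth labels plus predicted positive-class probabilities.

```python
import numpy as np


def threshold_metrics(y_true, probs, thresholds):
    """Compute accuracy, precision, recall and F1 at each candidate threshold."""
    y_true = np.asarray(y_true, dtype=bool)
    probs = np.asarray(probs, dtype=float)
    rows = []
    for t in thresholds:
        pred = probs >= t  # positives are >= threshold, negatives are < threshold
        tp = int(np.sum(pred & y_true))
        fp = int(np.sum(pred & ~y_true))
        fn = int(np.sum(~pred & y_true))
        tn = int(np.sum(~pred & ~y_true))
        # Define undefined ratios (zero denominators) as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        accuracy = (tp + tn) / len(y_true)
        rows.append({
            "threshold": float(t),
            "accuracy": accuracy,
            "precision": precision,
            "recall": recall,
            "f1": f1,
        })
    return rows


def best_threshold(rows, metric="f1"):
    """Return the threshold that maximizes the chosen metric (first one on ties)."""
    return max(rows, key=lambda r: r[metric])["threshold"]


def plot_threshold_metrics(rows, metrics=("accuracy", "precision", "recall", "f1")):
    """Plot each metric against the threshold (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily so the rest works without it

    fig, ax = plt.subplots()
    xs = [r["threshold"] for r in rows]
    for name in metrics:
        ax.plot(xs, [r[name] for r in rows], label=name)
    ax.set_xlabel("threshold")
    ax.set_ylabel("metric value")
    ax.legend()
    return fig
```

For example, `rows = threshold_metrics(labels, probs, np.linspace(0.05, 0.95, 19))` followed by `best_threshold(rows, "f1")` picks the F1-maximizing threshold on held-out data. A precision-oriented user could filter `rows` to those meeting a minimum precision and then take the one with the highest recall.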

@geoffreyangus
Collaborator Author

From the visualization standpoint, you can take inspiration from some of our existing visualization functionality:

def confidence_thresholding_data_vs_acc(

To change the threshold in predict, you can build on the functionality added by @justinxzhao in this PR: #3520
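
Independently of that PR (its API is not reproduced here), a threshold can also be tried out after the fact on a predictions table. This is only a sketch: `rethreshold` is a hypothetical helper, and the probability column name used in the example is an assumption about how the predictions DataFrame is laid out.

```python
import pandas as pd


def rethreshold(predictions: pd.DataFrame, prob_col: str, threshold: float,
                out_col: str = "prediction_at_threshold") -> pd.DataFrame:
    """Return a copy of `predictions` with a boolean column re-derived at `threshold`.

    `prob_col` must hold the predicted probability of the positive class.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"threshold must be in [0, 1], got {threshold}")
    out = predictions.copy()  # leave the caller's DataFrame untouched
    out[out_col] = out[prob_col] >= threshold  # same rule as above: >= is positive
    return out


# Hypothetical column name, used purely for illustration.
example = pd.DataFrame({"label_probabilities_True": [0.2, 0.5, 0.9]})
high_precision = rethreshold(example, "label_probabilities_True", 0.8)
```

This avoids re-running inference for each candidate threshold, since only the stored probabilities are compared again.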

Hopefully that is helpful!

Projects
AutoML
To do