UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection
👽 Out-of-Distribution Detection with PyTorch
Model Agnostic Confidence Estimator (MACEST) - A Python library for calibrating Machine Learning models' confidence scores
A list of papers that study out-of-distribution (OOD) detection and misclassification detection (MisD)
📈 SiRE (Simulation-Informed Revenue Extrapolation with Confidence Estimate for Scaleup Companies Using Scarce Time-Series Data), accepted by CIKM'2022 🗽
PyTorch implementation of our ECCV 2022 paper "Rethinking Confidence Calibration for Failure Prediction"
Learning from scratch a confidence measure
Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks https://arxiv.org/abs/1910.11933 or https://ieeexplore.ieee.org/document/9053264
Official PyTorch implementation of the paper [Adaptive confidence thresholding for monocular depth estimation]
Demo code for GACE: Geometry Aware Confidence Enhancement
Free WordPress Plugin: This sample size calculator enables you to calculate the minimum sample size and the margin of error. Learn about sample size, the margin of error, & confidence interval. www.calculator.io/sample-size-calculator/
This repo contains code to perform Bootstrap Confidence Interval estimation (a.k.a. Monte Carlo Confidence Interval or Empirical Confidence Interval estimation) for Machine Learning models.
[ACL 2025] Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models' Uncertainty?
Benchmark for "Offline Policy Comparison with Confidence"
KBS 2024 Paper, A Confidence-based Knowledge Integration Framework for Cross-Domain Table Question Answering
Project of ACL 2025 MlingConf: A Comprehensive Investigation of Multilingual Confidence Estimation for Large Language Models
Source code for predicting confidence scores for the samples in t-SNE embeddings.
Code for "Confidence-Driven Hierarchical Classification of Cultivated Plant Stresses"
Number of times an experiment should be repeated for a 95% probability