STAC-USC/NNK-Means-OOD


Out-of-Distribution Detection Through Soft Clustering with NNK-Means

Official repository for the paper "Out-of-Distribution Detection Through Soft Clustering with NNK-Means"

Abstract

As language models become more general purpose, increased attention needs to be paid to detecting out-of-distribution (OOD) instances, i.e., those not belonging to any of the distributions seen during training. Existing methods for detecting OOD data are computationally complex and storage-intensive. We propose a novel soft clustering approach for OOD detection based on non-negative kernel regression. Our approach greatly reduces computational and space complexities (up to 11x improvement in inference time and 87% reduction in storage requirements) and outperforms existing approaches by up to 4 AUROC points on four different benchmarks. We also introduce an entropy-constrained version of our algorithm, which leads to further reductions in storage requirements (up to 97% lower than comparable approaches) while retaining competitive performance. Our soft clustering approach for OOD detection highlights its potential for detecting tail-end phenomena in extreme-scale data settings.
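To make the idea of scoring with non-negative kernel regression concrete, here is a minimal sketch. It is not the code in this repository. It assumes a Gaussian kernel, a small fixed set of atoms (cluster centers) that are all used as neighbors, a plain projected-gradient solver, and the kernel-space reconstruction error as the OOD score. The names `gaussian_kernel`, `nnk_weights`, and `reconstruction_error` are hypothetical and chosen for illustration only.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def nnk_weights(query, atoms, sigma=1.0, n_iter=500):
    """Non-negative weights that reconstruct `query` from `atoms` in kernel space.

    Solves  min_theta 0.5 * theta^T K theta - theta^T k_q  subject to theta >= 0
    with projected gradient descent (an assumed stand-in solver).
    """
    k = len(atoms)
    K = [[gaussian_kernel(a, b, sigma) for b in atoms] for a in atoms]
    k_q = [gaussian_kernel(query, a, sigma) for a in atoms]
    # trace(K) = k bounds the largest eigenvalue of K, so 1/k is a safe step size.
    step = 1.0 / k
    theta = [0.0] * k
    for _ in range(n_iter):
        grad = [sum(K[i][j] * theta[j] for j in range(k)) - k_q[i] for i in range(k)]
        # Gradient step followed by projection onto the non-negative orthant.
        theta = [max(0.0, t - step * g) for t, g in zip(theta, grad)]
    return theta, K, k_q


def reconstruction_error(query, atoms, sigma=1.0):
    """Squared kernel-space distance between the query and its reconstruction."""
    theta, K, k_q = nnk_weights(query, atoms, sigma)
    k = len(atoms)
    quad = sum(theta[i] * K[i][j] * theta[j] for i in range(k) for j in range(k))
    lin = sum(t * v for t, v in zip(theta, k_q))
    # k(q, q) = 1 for the Gaussian kernel.
    return 1.0 - 2.0 * lin + quad


# Toy atoms and queries, made up for illustration.
atoms = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
in_score = reconstruction_error([0.3, 0.3], atoms)
ood_score = reconstruction_error([5.0, 5.0], atoms)
```

In this toy setup, a query far from every atom gets near-zero weights and a reconstruction error close to 1, while a query near the atoms is reconstructed with a lower error. That gap is what a threshold on the score would exploit.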

Contents

  1. src/ec_nnk_means.py: Core logic and implementation of the entropy-constrained NNK-Means algorithm. Also defines NNKMU, a wrapper class for training and inference using the algorithm.

  2. src/data.py: Utility functions for loading and preprocessing data.

  3. src/ood_detection.py: Utility functions for fitting, inference, and evaluation of OOD detection methods.

  4. src/main.py: OOD Detection driver script that benchmarks our approach against various others.
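The abstract reports results in AUROC points. As a reference for how that metric can be computed from OOD scores, here is a minimal pure-Python sketch. It is not taken from `src/ood_detection.py`. The function name `auroc`, the convention that a higher score means "more likely OOD", and the example scores are all assumptions made for illustration.

```python
def auroc(in_scores, ood_scores):
    """Area under the ROC curve for separating OOD from in-distribution samples.

    Computed as the probability that a randomly chosen OOD sample scores higher
    than a randomly chosen in-distribution sample, with ties counted as half.
    Assumes higher score = more likely OOD.
    """
    if not in_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for o in ood_scores:
        for i in in_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(in_scores) * len(ood_scores))


# Made-up scores, not results from the paper.
perfect = auroc([0.1, 0.2, 0.3], [0.4, 0.5])   # every OOD score is higher
partial = auroc([0.1, 0.4], [0.2, 0.5])        # one of four pairs is mis-ordered
```

The pairwise form is quadratic in the number of samples, which is fine for a sanity check. For large benchmarks, a rank-based implementation such as the one in scikit-learn would be the more practical choice.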
