# Anomaly Detection

Detect rare and anomalous samples using TabPFN’s unsupervised extension.

The TabPFN Unsupervised Extension applies TabPFN’s foundation model to unsupervised anomaly detection. It estimates the likelihood of each sample under the distribution it learns from the data; samples with a low joint probability are treated as anomalies (outliers).
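
For intuition, any joint probability over $d$ features can be factorized with the chain rule. The sketch below assumes the extension scores a sample this way and averages over several random feature orderings, which is what the `n_permutations` argument used in the code further down suggests. The symbols $\pi_k$, $K$, and $d$ are notation introduced here for illustration and are not taken from the library.

```latex
% Chain-rule factorization of the joint probability for one feature ordering \pi
p_{\pi}(x) = \prod_{j=1}^{d} p\left(x_{\pi(j)} \mid x_{\pi(1)}, \dots, x_{\pi(j-1)}\right)

% Assumed score: average over K random orderings \pi_1, \dots, \pi_K
\text{score}(x) \approx \frac{1}{K} \sum_{k=1}^{K} p_{\pi_k}(x)
```

Because the score is a product of many per-feature terms, its value shrinks quickly as the number of features grows, so the relative ordering of samples is more informative than the absolute magnitude.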

![alt text](https://mintcdn.com/priorlabs/3G1G1o7Jr-FF4nX2/public/outliers.png?w%3D1100%26fit%3Dmax%26auto%3Dformat%26n%3D3G1G1o7Jr-FF4nX2%26q%3D85%26s%3D13bf69b9f6af61c3f7106a6d64659fca)

In [1]:
# !uv pip install "tabpfn-extensions[unsupervised]"

In [2]:
import torch
from sklearn.datasets import load_breast_cancer
from tabpfn import TabPFNClassifier, TabPFNRegressor
from tabpfn_extensions.unsupervised import TabPFNUnsupervisedModel


In [3]:
# Load the breast cancer dataset (a scikit-learn Bunch, not a DataFrame)
data = load_breast_cancer(return_X_y=False)
X = data["data"]  # feature matrix as a NumPy array
attribute_names = data["feature_names"]


In [4]:
# device = "cuda" if torch.cuda.is_available() else "cpu"
device = "cpu"
device


'cpu'

In [5]:
# Initialize models
clf = TabPFNClassifier(device=device, n_estimators=4)
reg = TabPFNRegressor(device=device, n_estimators=4)
model_unsupervised = TabPFNUnsupervisedModel(
    tabpfn_clf=clf, tabpfn_reg=reg
)


In [6]:
# Compute anomaly scores
# Fit with numpy array (for categorical feature inference)
model_unsupervised.fit(X)

# Convert X to tensor for outliers() method
X_tensor = torch.tensor(X, dtype=torch.float32).to(device)

# Convert the stored X_ to tensor to avoid errors in outliers()
# The outliers() method uses self.X_ internally and needs it to be a tensor
if hasattr(model_unsupervised, 'X_'):
    if not isinstance(model_unsupervised.X_, torch.Tensor):
        model_unsupervised.X_ = torch.tensor(model_unsupervised.X_, dtype=torch.float32).to(device)
    else:
        # Ensure it's on the correct device and contiguous
        model_unsupervised.X_ = model_unsupervised.X_.to(device).contiguous()

scores = model_unsupervised.outliers(
    X_tensor,
    n_permutations=10,
)


Consider using a GPU or the tabpfn-client API: https://github.com/PriorLabs/tabpfn-client
  _validate_num_samples_for_cpu(
  pred = pred["criterion"].pdf(pred["logits"], torch.tensor(y_predict))
  x_inv[pos] = np.expm1(np.log(x[pos] * lmbda + 1) / lmbda)

In [7]:
print("Outlier scores:", scores[:10])


Outlier scores: tensor([7.3454e-12, 5.4470e-21, 3.9196e-18, 1.8406e-14, 6.9053e-18, 6.1096e-23,
        3.1487e-23, 5.3450e-20, 1.8536e-21, 5.8473e-16])
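
The raw scores span many orders of magnitude, so they are easier to compare on a log scale. The following is a minimal sketch, assuming (as the introduction states) that a lower score means a lower estimated joint probability and therefore a more anomalous sample. `rank_outliers` is a hypothetical helper written for this example, not part of `tabpfn_extensions`, and the hard-coded list simply repeats the ten values printed above so the snippet runs on its own.

```python
import math

# First ten scores, copied from the printed output above.
# With the real model you could pass `scores` directly, since the
# helper only needs an iterable of values convertible to float.
scores = [7.3454e-12, 5.4470e-21, 3.9196e-18, 1.8406e-14, 6.9053e-18,
          6.1096e-23, 3.1487e-23, 5.3450e-20, 1.8536e-21, 5.8473e-16]


def rank_outliers(scores, top_k=3):
    """Return (index, log10 score) pairs for the top_k lowest scores.

    Hypothetical helper for illustration only.
    """
    # Clamp at a tiny positive value so a score of exactly 0 does not break log10.
    logged = [(i, math.log10(max(float(s), 1e-300))) for i, s in enumerate(scores)]
    # Ascending order: the lowest (most anomalous) scores come first.
    logged.sort(key=lambda pair: pair[1])
    return logged[:top_k]


most_anomalous = rank_outliers(scores, top_k=3)
for idx, log_score in most_anomalous:
    print(f"sample {idx}: log10(score) = {log_score:.2f}")
```

Ranking by score, or flagging a fixed fraction of the lowest-scoring samples, avoids choosing an absolute probability threshold, which would depend on the number of features.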
