Uncovering the Latent Potential of
Deep Intermediate Representations

Layerwise Optimal Embedding Selection for supervised and label-free representation discovery

Arnesh Batra¹ · Arush Gumber¹* · Aniket Khandelwal¹* · Jashn Khemani¹ · Anubha Gupta¹

¹SBILab, Indraprastha Institute of Information Technology Delhi, Delhi, India
*Equal contribution

LOES is the reference implementation for Layerwise Optimal Embedding Selection, a lightweight module for identifying useful intermediate layers in deep models.

LOES supports supervised layer selection with labels and label-free selection when labels are unavailable.

Highlights

Supervised embeddings: pass (n_cal, L, D) embeddings and labels; LOES returns the best layers for classification or regression.
Label-free embeddings: pass embeddings only; LOES ranks layers using isotropy and redundancy.
Hugging Face model-id mode: pass a model id plus either a PyTorch dataloader or a Hugging Face-style dataset.
Progress and logs: use show_progress=True and verbose=True for clean progress bars and structured run logs.

Installation

pip install -e .

For Hugging Face model-id and dataset loading:

pip install -e ".[huggingface]"
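
Before calling the Hugging Face entry points, it can help to confirm that the optional dependencies are importable. The sketch below assumes the "huggingface" extra installs the transformers and datasets packages; that package list and the helper name missing_optional_packages are assumptions, not taken from pyproject.toml.

```python
import importlib.util

# Assumption: these are the packages pulled in by the "huggingface" extra.
OPTIONAL_PACKAGES = ("transformers", "datasets")


def missing_optional_packages(packages=OPTIONAL_PACKAGES):
    """Return the names of packages that cannot be found on the import path."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = missing_optional_packages()
if missing:
    print("Missing optional packages:", ", ".join(missing))
else:
    print("Hugging Face dependencies look available.")
```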

Quick Start

1. Supervised Embeddings

import torch
from loes import select_layers_from_custom_embeddings

embeddings = torch.randn(256, 12, 768)
labels = torch.randint(0, 10, (256,))

result = select_layers_from_custom_embeddings(
    embeddings=embeddings,
    targets=labels,
    k=3,
    task="classification",
    show_progress=True,
    verbose=True,
)

print(result.selected_layers)
print(result.layer_scores)

2. Label-Free Embeddings

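import torch
from loes import select_layers_from_custom_embeddings

# Calibration embeddings shaped (n_cal, L, D); no labels are needed.
embeddings = torch.randn(256, 12, 768)

result = select_layers_from_custom_embeddings(
    embeddings=embeddings,
    k=3,
    task="label_free",
    show_progress=True,
)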

print(result.selected_layers)

Label-free LOES uses only representation isotropy and redundancy, so labels are not required.
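
To make the trade-off between per-layer quality and redundancy concrete, here is a minimal sketch of a greedy selection loop. It is not the LOES algorithm: the function greedy_select, the penalty weight, and the toy scores and redundancy matrix are all hypothetical. It only illustrates the general idea of preferring high-scoring layers that do not duplicate layers already chosen.

```python
from typing import List, Sequence


def greedy_select(
    scores: Sequence[float],
    redundancy: Sequence[Sequence[float]],
    k: int,
    penalty: float = 1.0,
) -> List[int]:
    """Pick up to k layers, discounting overlap with layers already picked.

    scores[i] is a per-layer quality score (higher is better) and
    redundancy[i][j] is a pairwise similarity between layers i and j.
    Both inputs are illustrative stand-ins, not LOES internals.
    """
    selected: List[int] = []
    remaining = set(range(len(scores)))
    while remaining and len(selected) < k:

        def gain(layer: int) -> float:
            # Penalize the worst-case overlap with any selected layer.
            overlap = max((redundancy[layer][s] for s in selected), default=0.0)
            return scores[layer] - penalty * overlap

        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy example: layers 1 and 2 score well but are nearly identical.
scores = [0.2, 0.9, 0.85, 0.5]
redundancy = [
    [1.0, 0.1, 0.1, 0.2],
    [0.1, 1.0, 0.95, 0.3],
    [0.1, 0.95, 1.0, 0.3],
    [0.2, 0.3, 0.3, 1.0],
]
print(greedy_select(scores, redundancy, k=2))
```

With the penalty active, the second pick skips the near-duplicate layer 2 in favor of layer 3; with penalty=0.0 the loop reduces to taking the top-k scores.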

3. Hugging Face Model ID With A PyTorch Dataloader

from loes import select_layers_from_hf_id

# train_loader is an existing torch.utils.data.DataLoader built by the caller.
result = select_layers_from_hf_id(
    "bert-base-uncased",
    dataset=train_loader,  # dataloader=train_loader also works
    task="classification",
    k=4,
    max_calibration_samples=512,
    show_progress=True,
    verbose=True,
)

The dataloader can yield (inputs, targets) tuples or dictionaries with model inputs and a target key such as labels, label, targets, target, or y.
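
The accepted batch formats can be sketched as a small splitting helper. The target key names come from the sentence above; the helper split_batch, the key lookup order, and the fallback for dictionaries without a target key are assumptions about how such handling might look, not the library's internals.

```python
from collections.abc import Mapping
from typing import Any, Tuple

# Target keys listed in the README; the lookup order is an assumption.
TARGET_KEYS = ("labels", "label", "targets", "target", "y")


def split_batch(batch: Any) -> Tuple[Any, Any]:
    """Split one dataloader batch into (model_inputs, targets)."""
    if isinstance(batch, Mapping):
        for key in TARGET_KEYS:
            if key in batch:
                inputs = {k: v for k, v in batch.items() if k != key}
                return inputs, batch[key]
        # No target key found: treat everything as model inputs.
        return dict(batch), None
    if isinstance(batch, (tuple, list)) and len(batch) == 2:
        inputs, targets = batch
        return inputs, targets
    raise TypeError(f"Unsupported batch type: {type(batch).__name__}")


inputs, targets = split_batch({"input_ids": [[101, 102]], "labels": [1]})
print(sorted(inputs), targets)
```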

4. Hugging Face Model ID With A Hugging Face Dataset

from loes import select_layers_from_hf_id

result = select_layers_from_hf_id(
    "bert-base-uncased",
    dataset="ag_news",
    split="train",
    text_key="text",
    target_key="label",
    task="classification",
    k=4,
    max_calibration_samples=512,
)

The dataset argument accepts a Hugging Face dataset id, a loaded Dataset, or a DatasetDict; for a DatasetDict, LOES uses the requested split. If the dataset is already tokenized, LOES collates keys such as input_ids, attention_mask, pixel_values, or input_values. For raw text datasets, it applies the model tokenizer to a text column such as text, sentence, query, or document.
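
The column handling described above can be sketched as two small checks over a dataset's column names. The key names are the ones listed in the paragraph; the helpers is_tokenized and find_text_column, and the search order, are hypothetical.

```python
from typing import Optional, Sequence

# Column names taken from the README; everything else is illustrative.
TOKENIZED_KEYS = ("input_ids", "attention_mask", "pixel_values", "input_values")
TEXT_KEYS = ("text", "sentence", "query", "document")


def is_tokenized(columns: Sequence[str]) -> bool:
    """True if any model-ready input column is present."""
    return any(key in columns for key in TOKENIZED_KEYS)


def find_text_column(
    columns: Sequence[str], text_key: Optional[str] = None
) -> Optional[str]:
    """Return an explicit text_key if given, else the first known text column."""
    if text_key is not None:
        if text_key not in columns:
            raise KeyError(f"text_key {text_key!r} not found in {list(columns)}")
        return text_key
    for key in TEXT_KEYS:
        if key in columns:
            return key
    return None


columns = ["text", "label"]  # the two columns passed as text_key/target_key above
print(is_tokenized(columns), find_text_column(columns))
```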

Public API

from loes import (
    select_layers_from_custom_embeddings,
    select_layers_from_hf,
    select_layers_from_hf_id,
)

All selectors return:

LOESResult(
    selected_layers=[...],
    layer_scores=[...],
    task="classification",
    num_layers_seen=12,
    num_calibration_samples=256,
    pooling="cls",
    dataset="...",
    model_name="...",
)
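
For readers who want to see the shape of the result in isolation, the fields above can be mirrored in a stand-in dataclass. LOESResultSketch is not the library's class: the field types, the optional defaults, and the example values are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class LOESResultSketch:
    """Stand-in mirroring the documented fields; types are assumed."""

    selected_layers: List[int]
    layer_scores: List[float]
    task: str
    num_layers_seen: int
    num_calibration_samples: int
    pooling: Optional[str] = None
    dataset: Optional[str] = None
    model_name: Optional[str] = None


# Made-up values, only to show how the fields might be read.
result = LOESResultSketch(
    selected_layers=[3, 7, 11],
    layer_scores=[0.0] * 12,
    task="classification",
    num_layers_seen=12,
    num_calibration_samples=256,
    pooling="cls",
)
for layer in result.selected_layers:
    print(f"layer {layer}: score {result.layer_scores[layer]:.3f}")
print(asdict(result)["task"])
```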

Progress Bars And Logs

LOES uses tqdm progress bars and Python's standard logging module.

result = select_layers_from_custom_embeddings(
    embeddings,
    targets=labels,
    k=4,
    task="classification",
    show_progress="auto",
    verbose=True,
)

show_progress="auto" is the default and displays bars only in interactive terminals. Use show_progress=True or "on" to force bars, and show_progress=False or "off" for quiet scripts.
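
The accepted show_progress values can be summarized as a small resolver. This is a sketch of the documented behavior, not the library's code: the helper name resolve_show_progress is hypothetical, and approximating "interactive terminal" by checking whether stderr (where tqdm writes by default) is a TTY is an assumption.

```python
import sys
from typing import Union


def resolve_show_progress(show_progress: Union[bool, str] = "auto") -> bool:
    """Map True/False/"on"/"off"/"auto" to a plain boolean."""
    if isinstance(show_progress, bool):
        return show_progress
    value = str(show_progress).lower()
    if value == "on":
        return True
    if value == "off":
        return False
    if value == "auto":
        # Assumption: a TTY on stderr stands in for an interactive terminal.
        isatty = getattr(sys.stderr, "isatty", None)
        return bool(isatty()) if callable(isatty) else False
    raise ValueError(f"Unsupported show_progress value: {show_progress!r}")


print(resolve_show_progress("off"), resolve_show_progress(True))
```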

verbose=True emits logs for calibration collection, layer scoring, greedy layer selection, and final selected layers.
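
Because the logs go through the standard logging module, the usual configuration applies. The sketch below assumes LOES logs under a logger named "loes" (inferred from the package name, not confirmed by the source); adjust the name if the library uses a different one.

```python
import logging

# Route log records to the console with timestamps.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Assumption: the package logs under the "loes" logger name.
loes_logger = logging.getLogger("loes")
loes_logger.setLevel(logging.INFO)

# Set the level to WARNING instead to silence routine progress logs.
loes_logger.info("logging configured")
```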

Pooling Rules

When pooling="auto":

Text encoder families such as BERT, RoBERTa, DeBERTa, DistilBERT, and ModernBERT use cls.
Vision transformer families such as ViT, DeiT, BEiT, DINOv2, and DINOv3 use cls.
Audio encoders such as Wav2Vec2, HuBERT, WavLM, Whisper, and AST use mean.
Unknown model types fall back to mean.

You can also force pooling="cls", pooling="mean", or pooling="masked_mean".
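
The rules above can be sketched as a lookup plus three pooling functions over a single sequence of token vectors. This is an illustration under assumptions: the family labels are lowercase versions of the names listed above (not necessarily the strings LOES matches on), resolve_pooling is a hypothetical helper, and the pooling definitions follow the conventional meanings (first token, mean over all tokens, mean over unmasked tokens).

```python
from typing import List, Sequence

CLS_FAMILIES = {"bert", "roberta", "deberta", "distilbert", "modernbert",
                "vit", "deit", "beit", "dinov2", "dinov3"}
MEAN_FAMILIES = {"wav2vec2", "hubert", "wavlm", "whisper", "ast"}


def resolve_pooling(family: str, pooling: str = "auto") -> str:
    """Return the pooling mode for a model family; unknown families use mean."""
    if pooling != "auto":
        return pooling
    return "cls" if family.lower() in CLS_FAMILIES else "mean"


def cls_pool(tokens: Sequence[Sequence[float]]) -> List[float]:
    # Use the first token's vector as the sequence representation.
    return list(tokens[0])


def mean_pool(tokens: Sequence[Sequence[float]]) -> List[float]:
    # Average every token vector, dimension by dimension.
    return [sum(col) / len(tokens) for col in zip(*tokens)]


def masked_mean_pool(tokens, mask: Sequence[int]) -> List[float]:
    # Average only the tokens whose mask entry is nonzero (ignores padding).
    kept = [tok for tok, m in zip(tokens, mask) if m]
    if not kept:
        raise ValueError("mask removes every token")
    return mean_pool(kept)


tokens = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]  # last token is padding
print(resolve_pooling("bert"), resolve_pooling("hubert"))
print(masked_mean_pool(tokens, mask=[1, 1, 0]))
```

Here masked_mean differs from mean only when padding is present: the padded third token is excluded from the average.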

Task Semantics

task="classification": labels can be integer class ids or one-hot floating targets.
task="regression": targets can be shaped as (n_cal,) or (n_cal, out_dim).
task="label_free": targets are optional and ignored; selection uses isotropy and redundancy only.
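
The two accepted target layouts for each supervised task can be illustrated with plain Python lists. The helpers to_class_ids and to_2d_targets are hypothetical normalizers, shown only to clarify how one-hot and integer labels, or (n_cal,) and (n_cal, out_dim) regression targets, relate to each other; they are not part of the LOES API.

```python
from typing import List, Sequence, Union

Number = Union[int, float]


def to_class_ids(targets: Sequence[Union[Number, Sequence[Number]]]) -> List[int]:
    """Convert one-hot rows or integer ids to a flat list of class ids."""
    ids = []
    for row in targets:
        if isinstance(row, (list, tuple)):
            # One-hot (or score) row: take the index of the largest entry.
            ids.append(max(range(len(row)), key=row.__getitem__))
        else:
            ids.append(int(row))
    return ids


def to_2d_targets(targets) -> List[List[float]]:
    """Reshape (n_cal,) regression targets to (n_cal, 1); keep 2-D as is."""
    return [
        [float(v) for v in row] if isinstance(row, (list, tuple)) else [float(row)]
        for row in targets
    ]


print(to_class_ids([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]))
print(to_2d_targets([0.5, 1.5]))
```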

Citation

@inproceedings{anonymous2026uncovering,
  title={Uncovering the Latent Potential of Deep Intermediate Representations},
  author={Anonymous},
  booktitle={Forty-third International Conference on Machine Learning},
  year={2026},
  url={https://openreview.net/forum?id=6up1qGJwYZ}
}

License

This project is released under the MIT License. See LICENSE.
