Skip to content

uiuctml/LESSViT

Repository files navigation

LESSViT: Robust Hyperspectral Representation Learning under Spectral Configuration Shift

arXiv Project Page GitHub

This is the official repository for the paper "LESSViT: Robust Hyperspectral Representation Learning under Spectral Configuration Shift".

Authors: Haozhe Si, Yuxuan Wan, Yuqing Wang, Minh Do, Han Zhao.

Overview

LESSViT Architecture

Modeling hyperspectral imagery (HSI) across different sensors presents a fundamental challenge due to variations in wavelength coverage, band sampling, and channel dimensionality. We introduce LESSViT (Low-rank Efficient Spatial–Spectral Vision Transformer), a sensor-flexible architecture for cross-spectral generalization.

Our contributions are:

  1. LESS Attention — a structured low-rank factorization of spatial–spectral attention that reduces complexity from O(N²C²) to O(rNC), where N is the number of spatial tokens, C is the number of spectral channels, and r is the approximation rank.
  2. LESSViT — a channel-agnostic ViT with wavelength-aware positional encoding (SSRoPE) that enables consistent modeling under varying spectral configurations.
  3. HyperMAE — a hyperspectral masked autoencoder pre-training strategy with decoupled spatial–spectral masking and hierarchical channel sampling for scalable and robust learning.

Pre-training

We pre-train LESSViT using HyperMAE on the SpectralEarth benchmark (EnMAP hyperspectral data) for 200 epochs.

To launch pre-training, run:

bash launch_train.sh

See GeospatialFM/scripts/args.py for full argument descriptions.

Evaluation: Cross-Spectral Generalization

We evaluate LESSViT under a cross-spectral generalization setting on the SpectralEarth benchmark. Models are pre-trained and fine-tuned on a fixed channel configuration (C120_VNIR+) and evaluated across four spectral settings:

Setting Description
id In-distribution (C120_VNIR+)
ood_a Spectral shift (C120_SWIR+)
ood_complement Unseen wavelengths (C82, disjoint from training)
ood_full Channel expansion (C202, all channels)

Downstream datasets: enmap_cdl, enmap_corine, enmap_eurocrops, enmap_bdforet, enmap_bnetd.

To launch fine-tuning on a SpectralEarth dataset:

python3 GeospatialFM/finetune/finetune.py \
    --dataset_name ${DATASET_NAME} \
    --task_type ${TASK_TYPE} \
    --data_dir ${DATA_DIR} \
    --gen_task ${GEN_TASK} \
    --pretrained_model_path ${CHECKPOINT_PATH} \
    --run_name ${RUN_NAME} \
    --output_dir ${OUTPUT_DIR}
  • --dataset_name: One of enmap_cdl, enmap_corine, enmap_eurocrops, enmap_bdforet, enmap_bnetd.
  • --gen_task: One of id, ood_a, ood_complement, ood_full.

See GeospatialFM/finetune/args.py for full argument descriptions.

Model Weights

Pre-trained model checkpoints will be released soon. Stay tuned!

Citation

If you find our work helpful, please cite our paper:

@misc{si2026lessvitrobusthyperspectralrepresentation,
      title={LESSViT: Robust Hyperspectral Representation Learning under Spectral Configuration Shift}, 
      author={Haozhe Si and Yuxuan Wan and Yuqing Wang and Minh Do and Han Zhao},
      year={2026},
      eprint={2605.18541},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.18541}, 
}

Contact

Haozhe Si, Han Zhao

About

LESSViT: Robust Hyperspectral Representation Learning under Spectral Configuration Shift

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors