Multiscale Spatial–spectral Transformer Network for Hyperspectral and Multispectral Image Fusion (INFFUS)

The code in this toolbox implements the "Multiscale Spatial–spectral Transformer Network for Hyperspectral and Multispectral Image Fusion". More specifically, it is detailed as follow.

Abstract: Fusing hyperspectral images (HSIs) and multispectral images (MSIs) is an economic and feasible way to obtain images with both high spectral resolution and spatial resolution. Due to the limited receptive field of convolution kernels, fusion methods based on convolutional neural networks (CNNs) fail to take advantage of the global relationship in a feature map. In this paper, to exploit the powerful capability of Transformer to extract global information from the whole feature map for fusion, we propose a novel Multiscale Spatial–spectral Transformer Network (MSST-Net). The proposed network is a two-branch network that integrates the self-attention mechanism of the Transformer to extract spectral features from HSI and spatial features from MSI, respectively. Before feature extraction, cross-modality concatenations are performed to achieve cross-modality information interaction between the two branches. Then, we propose a spectral Transformer (SpeT) to extract spectral features and introduce multiscale band/patch embeddings to obtain multiscale features through SpeTs and spatial Transformers (SpaTs). To further improve the network’s performance and generalization, we proposed a self-supervised pre-training strategy, in which a masked bands autoencoder (MBAE) and a masked patches autoencoder (MPAE) are specially designed for self-supervised pre-training of the SpeTs and SpaTs. Extensive experiments on simulated and real datasets illustrate that the proposed network can achieve better performance when compared to other state-of-the-art fusion methods. The code of MSST-Net will be available at http://www.jiasen.tech/papers/ for the sake of reproducibility.

Network Architecture

The overall architecture diagram of our proposed multiscale spatial–spectral Transformer network.

The architecture diagram of the masked patches autoencoder.

The architecture diagram of the masked bands autoencoder.

1. Create Envirement:

Python 3 (Recommend to use Anaconda)
NVIDIA GPU + CUDA

2. Data Preparation:

Download the CAVE dataset from here.
Download the Harvard dataset from here.
Download the WDCM dataset .mat files from here (code: gtgr) for a quick start and place them in MSST-Net/data/.

3. Pre-training

To pre-train MBAE, run

# Training on CAVE dataset
python main_hsi.py --save_dir ./train_hsi/cave/8/1 --dataset cave --ratio 8 --hsi_channel 31 --hsi_embed_dim 32 --hsi_mask_ratio 0.75 --device cuda:0

To pre-train MPAE, run

# Training on CAVE dataset
python main_msi.py --save_dir ./train_msi/cave/8/16 --dataset cave --ratio 8 --msi_channel 3 --msi_embed_dim 256 --hsi_mask_ratio 0.5 --patch_size 16 --device cuda:0

4. Fine-tuning

To fine-tune MSST-Net, run

# Training on CAVE dataset
python main.py --save_dir ./train/cave/8 --dataset cave --ratio 8 --hsi_channel 31 --msi_channel 3 --hsi_embed_dim 32 --msi_embed_dim 256 --n_feats 64 --patch_size 16  --hsi_model_path_1 ./train_hsi/8/1/model/model_05000.pt --hsi_model_path_2 ./train_hsi/8/2/model/model_05000.pt --hsi_model_path_3 ./train_hsi/8/3/model/model_05000.pt  --msi_model_path_16 ./train_msi/8/16/model/model_05000.pt --msi_model_path_8 ./train_msi/8/8/model/model_05000.pt --msi_model_path_32 ./train_msi/8/32/model/model_05000.pt --device cuda:0

5. Testing:

To test a trained model, run

# Testing on CAVE dataset
python test.py --save_dir ./test/cave/8 --dataset cave --ratio 8 --hsi_channel 31 --msi_channel 3 --hsi_embed_dim 32 --msi_embed_dim 256 --n_feats 64 --patch_size 16  --model_path ./train/cave/8/model/model_05000.pt --device cuda:0

Citation

If this repo helps you, please consider citing our works:

@ARTICLE{jia2023multiscale,
  title={Multiscale spatial-spectral transformer network for hyperspectral and multispectral image fusion},
  author={Jia, Sen and Min, Zhichao and Fu, Xiyou},
  journal={Information Fusion}, 
  year={2023},
  volume={96},
  pages={117-129}
}

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
data		data
dataset		dataset
figure		figure
model		model
README.md		README.md
argsParser.py		argsParser.py
argsParser_hsi.py		argsParser_hsi.py
argsParser_msi.py		argsParser_msi.py
main.py		main.py
main_hsi.py		main_hsi.py
main_msi.py		main_msi.py
metrics.py		metrics.py
test.py		test.py
trainer.py		trainer.py
trainer_hsi.py		trainer_hsi.py
trainer_msi.py		trainer_msi.py
utils.py		utils.py

jx-mzc/MSST-Net

Folders and files

Latest commit

History

Repository files navigation

Multiscale Spatial–spectral Transformer Network for Hyperspectral and Multispectral Image Fusion (INFFUS)

Network Architecture

1. Create Envirement:

2. Data Preparation:

3. Pre-training

4. Fine-tuning

5. Testing:

Citation

About

Resources

Stars

Watchers

Forks

Languages