FWDNNet: Cross-Heterogeneous Encoder Fusion via Feature-Level TensorDot Operations for Land-Cover Mapping
✨ Official PyTorch Implementation ✨
🎉 31-December-2025: Accepted at IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS) 🎉
Key Features • Getting Started • Datasets • Results • Citation • Contact
+ 🎉 31-December-2025: FWDNNet accepted for publication in IEEE TGRS!
+ 📝 October-November 2025: Manuscript submitted to IEEE TGRS
+ 📂 October 2023: sKwanda_V1,2 datasets publicly released
+ 💻 March 2023: Code and datasets developed

| Lead Authors |
|---|
| Boaz Mwubahimana¹ · Graduate Student Member, IEEE |

| Co-Authors |
|---|
| Swalpa Kumar Roy³ · Senior Member, IEEE |
1. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University, China
2. Xinjiang Astronomical Observatory, Chinese Academy of Sciences, China
3. Department of Computer Science and Engineering, Tezpur University, India
4. Nicholas School of the Environment, Duke University, USA
5. Center for Geographic Information Systems and Remote Sensing (CGIS), University of Rwanda
6. Department of Geosciences and Natural Resource Management, University of Copenhagen, Denmark
7. College of Geography and Remote Sensing, Hohai University, China
8. AIMS Research and Innovation Centre & African Centre of Excellence in Data Science, University of Rwanda
9. Rwanda Environment Management Authority (REMA), Rwanda
10. WaterAid Rwanda, Kigali, Rwanda
11. Water for People Rwanda, Kigali, Rwanda
12. College of Engineering, Carnegie Mellon University, Rwanda
13. Department of Environmental Sciences, Emory University, USA
📧 Corresponding Authors:
- Yan Jianguo (jgyan@whu.edu.cn)
- Dingruibo Miao (miaodrb@whu.edu.cn)
We present FWDNNet, a novel encoder-decoder architecture that integrates heterogeneous deep learning backbones through innovative TensorDot fusion modules for high-resolution land cover mapping. Unlike traditional fusion approaches that rely on simple concatenation or averaging, FWDNNet preserves tensor structures while enabling adaptive, probabilistic feature weighting across five specialized backbone encoders.
- TensorDot Fusion: High-order multilinear transformations that capture complex inter-architectural dependencies
- Probabilistic Attention: Variational inference-based adaptive backbone weighting
- Heterogeneous Integration: Seamless fusion of CNNs (ResNet34, InceptionV3, VGG16, EfficientNet-B3) and Transformers (Swin-T)
| State-of-the-Art Accuracy | Superior Segmentation | Inference Efficiency | Resource Efficient |
|---|---|---|---|
| +2.2% over best baseline | +1.7% mIoU improvement | 58.2 ms per image | 12.85 GB GPU memory |
| Metric | FWDNNet | Best Baseline | Improvement |
|---|---|---|---|
| 🎯 Overall Accuracy | 95.3% | 93.1% | +2.2% ⬆️ |
| 📊 mean IoU (mIoU) | 91.8% | 90.1% | +1.7% ⬆️ |
| ⚡ Inference Time | 58.2 ms | 73.8 ms | -21.1% ⬇️ |
| 💾 Memory Usage | 12.85 GB | 95.74 GB | -86.6% ⬇️ |
| 🔢 Parameters | 35.0M | 41.0M | -14.6% ⬇️ |
| 🔄 Transfer Score | 97.1% | 92.3% | +4.8% ⬆️ |
```mermaid
graph TB
    A[Input Image<br/>H×W×C] --> B1[ResNet34]
    A --> B2[InceptionV3]
    A --> B3[VGG16]
    A --> B4[EfficientNet-B3]
    A --> B5[Swin Transformer]
    B1 --> C[TensorDot<br/>Fusion Module]
    B2 --> C
    B3 --> C
    B4 --> C
    B5 --> C
    C --> D[Probabilistic<br/>Attention]
    D --> E[Tucker<br/>Decomposer]
    E --> F[Unified<br/>Decoder]
    F --> G[Segmentation<br/>Output]
    style C fill:#ff9999
    style D fill:#99ccff
    style E fill:#99ff99
    style F fill:#ffcc99
```
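For orientation, the sketch below mirrors the dataflow in the diagram as a minimal PyTorch module. All sub-module classes (`fusion`, `attention`, `decomposer`, `decoder`) are hypothetical stand-ins, not the repository's actual components or signatures.

```python
import torch.nn as nn

class FWDNNetSketch(nn.Module):
    """Dataflow sketch of the diagram above; illustrative only."""

    def __init__(self, encoders, fusion, attention, decomposer, decoder):
        super().__init__()
        self.encoders = nn.ModuleList(encoders)  # five heterogeneous backbones
        self.fusion = fusion                     # TensorDot fusion module
        self.attention = attention               # probabilistic attention
        self.decomposer = decomposer             # Tucker decomposer
        self.decoder = decoder                   # unified decoder

    def forward(self, x):
        feats = [enc(x) for enc in self.encoders]  # parallel feature extraction
        fused = self.fusion(feats)                 # high-order feature fusion
        weighted = self.attention(fused)           # adaptive re-weighting
        compact = self.decomposer(weighted)        # low-rank compression
        return self.decoder(compact)               # segmentation logits
```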
1️⃣ Heterogeneous Encoders (Click to expand)
Five specialized backbone networks for parallel feature extraction:
| Encoder | Purpose | Key Feature |
|---|---|---|
| 🔴 ResNet34 | Residual Learning | Deep feature extraction |
| 🟠 InceptionV3 | Multi-scale | Multiple receptive fields |
| 🔵 VGG16 | Hierarchical | Layer-wise features |
| 🟢 EfficientNet-B3 | Efficiency | Compound scaling |
| 🟣 Swin Transformer | Global Context | Shifted window attention |
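As one way to realize these five backbones, the sketch below instantiates them as multi-scale feature extractors with `timm`. This is an assumption for illustration: the repository may construct its encoders differently, and with recent `timm` releases Swin needs its input size declared when it differs from 224.

```python
import torch
import timm  # assumption: a recent timm release with features_only support

BACKBONES = ['resnet34', 'inception_v3', 'vgg16', 'efficientnet_b3']

encoders = [timm.create_model(name, pretrained=True, features_only=True)
            for name in BACKBONES]
# Swin Transformer: declare the 512×512 input size explicitly.
encoders.append(timm.create_model('swin_tiny_patch4_window7_224',
                                  pretrained=True, features_only=True,
                                  img_size=512))

x = torch.randn(1, 3, 512, 512)
features = [enc(x) for enc in encoders]  # each: a list of pyramid feature maps
```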
2️⃣ TensorDot Fusion Module (Click to expand)
Mathematical Formulation:
$$\mathcal{T}_{\text{fused}} = \mathcal{G} \times_1 \mathcal{T}_1 \times_2 \mathcal{T}_2 \cdots \times_M \mathcal{T}_M$$
- Preserves tensor structure
- Captures high-order interactions
- Learnable core tensor $\mathcal{G}$ (a minimal PyTorch sketch follows below)
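A minimal sketch of the contraction above, assuming each backbone contributes a pooled feature vector and fixing $M = 3$ for brevity; the class name, dimensions, and `einsum` realization are illustrative, not the repository's API.

```python
import torch
import torch.nn as nn

class TensorDotFusionSketch(nn.Module):
    """Tucker-style fusion for M = 3 backbones:
    T_fused = G x1 f1 x2 f2 x3 f3, with a learnable core tensor G."""

    def __init__(self, d1, d2, d3, d_out):
        super().__init__()
        # learnable core tensor G: one mode per backbone plus an output mode
        self.core = nn.Parameter(0.02 * torch.randn(d1, d2, d3, d_out))

    def forward(self, f1, f2, f3):
        # f1: (B, d1), f2: (B, d2), f3: (B, d3); contract every mode of G
        return torch.einsum('bi,bj,bk,ijko->bo', f1, f2, f3, self.core)

fusion = TensorDotFusionSketch(64, 64, 64, 256)
out = fusion(torch.randn(2, 64), torch.randn(2, 64), torch.randn(2, 64))
print(out.shape)  # torch.Size([2, 256])
```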
3️⃣ Probabilistic Attention (Click to expand)
Variational Inference Weighting:
$$\mathbf{w}_{\text{att}} = \mathrm{softmax}\big(f_{\theta}(\mathcal{T}_1, \mathcal{T}_2, \ldots, \mathcal{T}_M)\big)$$
- Adaptive backbone selection
- Scene-dependent weighting
- Reduces feature redundancy (see the sketch below)
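The sketch below is a deterministic simplification of this weighting: a small MLP $f_\theta$ scores each backbone's pooled features and a temperature softmax yields the weights. The paper's variational machinery is reduced here to its mean behavior, and all names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProbabilisticAttentionSketch(nn.Module):
    """Softmax backbone weighting: w_att = softmax(f_theta(...) / tau)."""

    def __init__(self, channels, temperature=0.1):
        super().__init__()
        self.tau = temperature
        self.score = nn.Sequential(          # f_theta: (B, C) -> (B, 1)
            nn.Linear(channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, 1),
        )

    def forward(self, feats):
        # feats: list of M channel-aligned maps, each of shape (B, C, H, W)
        pooled = [f.mean(dim=(2, 3)) for f in feats]             # M x (B, C)
        logits = torch.cat([self.score(p) for p in pooled], 1)   # (B, M)
        w = F.softmax(logits / self.tau, dim=1)                  # (B, M)
        stacked = torch.stack(feats, dim=1)                      # (B, M, C, H, W)
        return (w[:, :, None, None, None] * stacked).sum(dim=1)  # (B, C, H, W)
```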
4️⃣ Multi-Objective Loss (Click to expand)
Comprehensive Loss Function:
$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{focal}} + \lambda_1 \mathcal{L}_{\text{consist}} + \lambda_2 \mathcal{L}_{\text{uncert}} + \lambda_3 \mathcal{L}_{\text{div}} + \lambda_4 \mathcal{L}_{\text{sparse}} + \lambda_5 \mathcal{L}_{\text{bound}}$$
- Focal loss for class imbalance
- Consistency regularization
- Uncertainty estimation
- Diversity promotion
- Boundary preservation (a combined-loss sketch follows below)
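A sketch of how such a weighted sum could be assembled, assuming the individual regularizers are computed elsewhere; the focal term uses the standard multi-class formulation (γ = 2 is a common default, not necessarily the paper's setting).

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, target, gamma=2.0):
    """Standard multi-class focal loss: mean of (1 - p_t)^gamma * CE."""
    ce = F.cross_entropy(logits, target, reduction='none')
    pt = torch.exp(-ce)  # probability the model assigns to the true class
    return ((1.0 - pt) ** gamma * ce).mean()

def total_loss(terms, lambdas):
    """Weighted sum mirroring L_total; `terms` holds the precomputed losses
    and `lambdas` the weights for consist/uncert/div/sparse/bound."""
    return terms['focal'] + sum(lambdas[k] * terms[k] for k in lambdas)
```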
| Urban Landscapes | Agricultural Lands | Great Plains |
|---|---|---|
| Dubai | Nyagatare | Oklahoma |
```
data/
├── Dubai/
│   ├── train/
│   │   ├── images/   # 512×512 RGB patches
│   │   └── labels/   # Ground truth masks
│   ├── val/
│   │   ├── images/
│   │   └── labels/
│   └── test/
│       ├── images/
│       └── labels/
├── Nyagatare/
│   └── [same structure]
└── Oklahoma/
    └── [same structure]
```
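A minimal `Dataset` sketch matching this layout; the class and argument names are illustrative, and the repository's own loader (and its augmentation pipeline) may differ.

```python
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset

class LandCoverPatches(Dataset):
    """Reads paired image/label patches from data/<region>/<split>/."""

    def __init__(self, root='data', region='Dubai', split='train',
                 transform=None):
        base = Path(root) / region / split
        self.images = sorted((base / 'images').iterdir())
        self.labels = sorted((base / 'labels').iterdir())
        assert len(self.images) == len(self.labels)
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = Image.open(self.images[idx]).convert('RGB')
        label = Image.open(self.labels[idx])
        if self.transform is not None:
            image, label = self.transform(image, label)
        return image, label
```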
| Core Dependencies | Additional Packages |
|---|---|
```bash
# 1️⃣ Clone the repository
git clone https://github.com/YourUsername/FWDNNet.git
cd FWDNNet

# 2️⃣ Create conda environment
conda create -n fwdnnet python=3.7 -y
conda activate fwdnnet

# 3️⃣ Install PyTorch (CUDA 10.1)
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch

# 4️⃣ Install other dependencies
pip install -r requirements.txt

# 5️⃣ Verify installation
python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.cuda.is_available()}')"
```

```bash
# Download datasets and models
bash scripts/download_data.sh
# Prepare dataset
python utils/prepare_dataset.py --dataset all
# Run quick test
python test_installation.py
```

```bash
# Train on Dubai dataset (default config)
python train.py --dataset Dubai
# Train with custom config
python train.py --config configs/fwdnnet_dubai.yaml
# Multi-GPU training
python -m torch.distributed.launch --nproc_per_node=4 train.py --dataset Dubai
```

📋 Configuration File Example (Click to expand)
```yaml
# configs/fwdnnet_dubai.yaml
model:
  name: FWDNNet
  encoders:
    - resnet34
    - inception_v3
    - vgg16
    - efficientnet_b3
    - swin_t
  fusion:
    type: tensordot
    tucker_rank: [64, 64, 64]
  attention:
    type: probabilistic
    temperature: 0.1

training:
  batch_size: 16
  epochs: 200
  learning_rate: 1e-3
  optimizer:
    name: AdamW
    betas: [0.9, 0.999]
    weight_decay: 0.01
  scheduler:
    name: ExponentialLR
    gamma: 0.95
    step: 10
  early_stopping:
    patience: 20
    monitor: val_miou

loss:
  focal_weight: 1.0
  consistency_weight: 0.5
  uncertainty_weight: 0.3
  diversity_weight: 0.2
  boundary_weight: 0.4

data:
  input_size: [512, 512]
  num_classes: 4
  augmentation:
    flip: 0.5
    rotate: 45
    elastic: true
    gaussian_noise: 0.02
```

```bash
# With Weights & Biases
python train.py --dataset Dubai --use_wandb
# With TensorBoard
python train.py --dataset Dubai --use_tensorboard
tensorboard --logdir=runs/
```

```bash
# Resume from checkpoint
python train.py --dataset Dubai --resume checkpoints/fwdnnet_epoch_50.pth
# Resume with different learning rate
python train.py --resume checkpoints/fwdnnet_epoch_50.pth --lr 1e-4
```

```bash
# Evaluate on test set
python test.py --dataset Dubai --checkpoint checkpoints/fwdnnet_best.pth
# Evaluate with visualization
python test.py --dataset Dubai --checkpoint checkpoints/fwdnnet_best.pth --visualize
# Cross-domain evaluation
python test.py \
--source_dataset Dubai \
--target_dataset Nyagatare \
--checkpoint checkpoints/fwdnnet_dubai.pth
```

```bash
# Single image inference
python inference.py \
--input path/to/image.tif \
--checkpoint checkpoints/fwdnnet_best.pth \
--output results/prediction.png
# Batch inference
python inference.py \
--input_dir path/to/images/ \
--checkpoint checkpoints/fwdnnet_best.pth \
--output_dir results/
# Large-scale inference (tiled processing)
python inference_large.py \
--input large_image.tif \
--checkpoint checkpoints/fwdnnet_best.pth \
--tile_size 512 \
--overlap 50 \
--output result_mosaic.tif
```
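Conceptually, the tiled pass slides an overlapping window over the raster, accumulates class scores, and averages where tiles overlap. The NumPy sketch below illustrates the idea; it is not `inference_large.py` itself, and it assumes the raster is at least one tile in each dimension.

```python
import numpy as np

def predict_tiled(model_fn, image, tile=512, overlap=50):
    """model_fn maps a (tile, tile, C) patch to (tile, tile, K) class scores."""
    H, W = image.shape[:2]
    step = tile - overlap
    scores, counts = None, np.zeros((H, W, 1), dtype=np.float32)
    for y in range(0, max(H - overlap, 1), step):
        for x in range(0, max(W - overlap, 1), step):
            y0, x0 = min(y, H - tile), min(x, W - tile)  # clamp border tiles
            s = model_fn(image[y0:y0 + tile, x0:x0 + tile])
            if scores is None:
                scores = np.zeros((H, W, s.shape[-1]), dtype=np.float32)
            scores[y0:y0 + tile, x0:x0 + tile] += s
            counts[y0:y0 + tile, x0:x0 + tile] += 1.0
    return (scores / counts).argmax(-1)  # per-pixel class map
```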
| Model | Accuracy (%) | mIoU (%) | F1-Score | Inference (ms) | Params (M) | Memory (GB) |
|---|---|---|---|---|---|---|
| ResNet-34 | 93.5 | 89.0 | 0.800 | 45.2 | 24.0 | 93.30 |
| InceptionV3 | 80.1 | 84.0 | 0.832 | 52.7 | 30.0 | 114.19 |
| VGG-16 | 82.0 | 88.0 | 0.729 | 38.4 | 24.0 | 90.61 |
| EfficientNet-B3 | 91.8 | 87.3 | 0.825 | 28.6 | 12.0 | 45.67 |
| Swin-T | 89.2 | 86.1 | 0.847 | 67.3 | 28.0 | 78.32 |
| SegFormer-B2 | 92.4 | 88.7 | 0.859 | 41.2 | 25.0 | 62.48 |
| HRNet-W32 | 94.1 | 90.1 | 0.862 | 73.8 | 41.0 | 95.74 |
| FWDNNet | 95.3 | 91.8 | 0.876 | 58.2 | 35.0 | 12.85 |
| Dubai | Nyagatare | Oklahoma |
|---|---|---|

(per-dataset qualitative result figures)
| Source → Target | Source mIoU | Target mIoU | Transfer Score |
|---|---|---|---|
| Dubai → Nyagatare | 92.4% | 90.1% | 97.5% |
| Dubai → Oklahoma | 92.4% | 89.3% | 96.6% |
| Nyagatare → Oklahoma | 91.1% | 89.8% | 98.6% |
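The transfer scores above are consistent with the simple ratio of target to source performance:

$$\text{Transfer Score} = \frac{\text{mIoU}_{\text{target}}}{\text{mIoU}_{\text{source}}} \times 100\%, \qquad \text{e.g. } \frac{90.1}{92.4} \approx 97.5\%.$$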
| Configuration | mIoU (%) | Ξ mIoU |
|---|---|---|
| Single Encoder (ResNet34) | 89.0 | - |
| Multi-Encoder (Avg) | 90.1 | +1.1% |
| + TensorDot Fusion | 91.3 | +2.3% |
| + Probabilistic Attention | 91.8 | +2.8% |
| Full FWDNNet | 91.8 | +2.8% |
| Metric | FWDNNet | HRNet-W32 | Improvement |
|---|---|---|---|
| ⏱️ Training Time | 6.2 h | 13.4 h | -53.7% ⬇️ |
| 💾 Memory Usage | 12.85 GB | 95.74 GB | -86.6% ⬇️ |
| 🔢 Parameters | 35.0M | 41.0M | -14.6% ⬇️ |
| 🔥 FLOPs | 45.2G | 52.1G | -13.2% ⬇️ |
| 🚀 Throughput | 17.2 img/s | 13.5 img/s | +27.4% ⬆️ |
(qualitative comparison figures)

(regional-scale inference results)
This work was supported by:
- National Natural Science Foundation of China (Grant Nos. 42241116 and 42071332)
- National Key R&D Program of China (Grant Nos. 2022YFF0503202 and 2022YFB3903605)
- Macau Science and Technology Development Fund (SKL-LPS(MUST)-2021-2023)
- Xinjiang Heaven Lake Talent Program (2022)
We acknowledge support from:
- State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University
- National Land Authority of Rwanda
- Mohammed Bin Rashid Space Centre (MBRSC) for satellite imagery
- U.S. Department of Agriculture's National Agriculture Imagery Program (NAIP) for aerial imagery
If you find this work useful in your research, please consider citing:
```bibtex
@ARTICLE{11343844,
author={Mwubahimana, Boaz and Jianguo, Yan and Miao, Dingruibo and Roy, Swalpa Kumar and Li, Zhuohong and Ma, Le and Kagoyire, Clarisse and Guo, Haonan and Mugabowindekwe, Maurice and Nyandwi, Elias and Nzayisenga, Isaac and Athanase, Hafashimana and Maridadi, Eugene and Nsengiyumva, Jean Baptiste and Byukusenge, Elie and Dukundane, Remy and Rwanyiziri, Gaspard and Huang, Xiao},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={FWDNNet: Cross-Heterogeneous Encoder Fusion via Feature-Level TensorDot Operations for Land-Cover Mapping},
year={2026},
volume={},
number={},
pages={1-1},
keywords={Remote sensing;Transformers;Computer architecture;Feature extraction;Semantic segmentation;Computational modeling;Computational efficiency;Semantics;Land surface;Faces;Convolutional neural networks (CNNs);deep learning;CNN-to-token conversion;TensorDot fusion;remote sensing (RS) segmentation},
doi={10.1109/TGRS.2026.3652451}}
```

Our previous works on land cover mapping:

```bibtex
@ARTICLE{11124258,
author={Mwubahimana, Boaz and Jianguo, Yan and Miao, Dingruibo and Li, Zhuohong and Guo, Haonan and Ma, Le and Mugabowindekwe, Maurice and Roy, Swalpa Kumar and Huang, Xiao and Nyandwi, Elias and Joseph, Tuyishimire and Habineza, Eric and Mwizerwa, Fidele and Athanase, Hafashimana and Rwanyiziri, Gaspard},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={C2FNet: Cross-Probabilistic Weak Supervision Learning for High-Resolution Land Cover Enhancement},
year={2025},
volume={63},
number={},
pages={1-30},
keywords={Spatial resolution;Remote sensing;Land surface;Weak supervision;Training;Feature extraction;Annotations;Image resolution;Noise measurement;Earth;Coarse-to-fine networks (C2FNets);cross-resolution learning;deep neural networks;Earth observation;land cover mapping;probabilistic supervision;remote sensing;weakly supervised learning (WSL)},
doi={10.1109/TGRS.2025.3598681}}
@article{Mwubahimana19052025,
author = {Boaz Mwubahimana and Yan Jianguo and Maurice Mugabowindekwe and Xiao Huang and Elias Nyandwi and Joseph Tuyishimire and Eric Habineza and Fidele Mwizerwa and Dingruibo Miao},
title = {Vision transformer-based feature harmonization network for fine-resolution land cover mapping},
journal = {International Journal of Remote Sensing},
volume = {46},
number = {10},
pages = {3736--3769},
year = {2025},
publisher = {Taylor \& Francis},
doi = {10.1080/01431161.2025.2491816}
}
```

- 🔬 VHF-ParaNet: Vision Transformers Feature Harmonization [Code] [Paper]
- 🌍 GLC10 Dataset: Global Land Cover at 10m resolution [Link]
- 🛰️ Google Earth Engine: Satellite imagery access [Link]
- 🗺️ ESRI Land Cover: Global land cover products [Link]
For questions, collaborations, or issues:
📧 Corresponding Authors:
- Yan Jianguo: jgyan@whu.edu.cn
- Dingruibo Miao: miaodrb@whu.edu.cn
📧 Lead Author:
- Boaz Mwubahimana: aiboaz1896@gmail.com | m.boaz@whu.edu.cn
🐛 Issues & Contributions:
- Open an Issue
- Submit a Pull Request
Copyright (c) 2025 Wuhan University, State Key Laboratory of LIESMARS
This code and datasets are released for NON-COMMERCIAL and RESEARCH purposes only.
For commercial applications, please contact the corresponding authors:
- Yan Jianguo (jgyan@whu.edu.cn)
- Dingruibo Miao (miaodrb@whu.edu.cn)
Licensed under the MIT License for research purposes.
IEEE Style:

B. Mwubahimana et al., "FWDNNet: Cross-Heterogeneous Encoder Fusion via Feature-Level TensorDot Operations for Land-Cover Mapping," IEEE Transactions on Geoscience and Remote Sensing, 2026, doi: 10.1109/TGRS.2026.3652451.

APA Style:

Mwubahimana, B., Yan, J., Miao, D., Roy, S. K., Li, Z., Ma, L., ... & Huang, X. (2026). FWDNNet: Cross-Heterogeneous Encoder Fusion via Feature-Level TensorDot Operations for Land-Cover Mapping. IEEE Transactions on Geoscience and Remote Sensing. https://doi.org/10.1109/TGRS.2026.3652451



