Haoyang Chen 1,2, Jing Zhang 1,2 †, Hebaixu Wang 1,2, Shiqin Wang 1, PoHsun Huang 1, Jiayuan Li 2,3, Haonan Guo 1,2, Di Wang 1,2 †, Zheng Wang 1,2 †, Bo Du 1,2 †.
1 Wuhan University, 2 Zhongguancun Academy, 3 Beijing Institute of Technology.
† Corresponding author
News | Abstract | Datasets | Checkpoints | Usage | Statement
2026.3.5
- The paper is posted on arXiv (arXiv: 2603.04114).
Multi-modal remote sensing imagery provides complementary observations of the same geographic scene, yet such observations are frequently incomplete in practice. Existing cross-modal translation methods treat each modality pair as an independent task, resulting in quadratic complexity and limited generalization to unseen modality combinations. We formulate Any-to-Any translation as inference over a shared latent representation of the scene, where different modalities correspond to partial observations of the same underlying semantics. Based on this formulation, we propose Any2Any, a unified latent diffusion framework that projects heterogeneous inputs into a geometrically aligned latent space. This structure performs anchored latent regression with a shared backbone, decoupling modality-specific representation learning from semantic mapping. Moreover, lightweight target-specific residual adapters are used to correct systematic latent mismatches without increasing inference complexity. To support learning under sparse but connected supervision, we introduce RST-1M, the first million-scale remote sensing dataset with paired observations across five sensing modalities, providing supervision anchors for any-to-any translation. Experiments across 14 translation tasks show that Any2Any consistently outperforms pairwise translation methods and exhibits strong zero-shot generalization to unseen modality pairs.
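To make the data flow described above concrete, here is a minimal PyTorch sketch: modality-specific encoders project inputs into a shared latent space, one shared backbone performs the semantic mapping (standing in for the latent diffusion model that performs anchored latent regression in the paper), and lightweight target-specific residual adapters correct latent mismatches before decoding. All module names, dimensions, and the toy linear layers are illustrative assumptions, not the released implementation.

```python
# Illustrative sketch only; module names, dimensions, and layers are assumptions.
import torch
import torch.nn as nn

MODALITIES = ["optical", "sar", "dsm", "map", "infrared"]  # hypothetical modality names
LATENT_DIM = 256


class ResidualAdapter(nn.Module):
    """Lightweight target-specific adapter: adds a learned correction to the latent."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Residual correction: the adapter only refines the shared latent,
        # so inference cost stays essentially unchanged.
        return z + self.net(z)


class Any2AnySketch(nn.Module):
    """Toy stand-in: shared latent space + shared backbone + target-specific adapters."""

    def __init__(self, feat_dim: int = 1024):
        super().__init__()
        # One encoder/decoder per modality (plain linear layers standing in for image encoders).
        self.encoders = nn.ModuleDict({m: nn.Linear(feat_dim, LATENT_DIM) for m in MODALITIES})
        self.decoders = nn.ModuleDict({m: nn.Linear(LATENT_DIM, feat_dim) for m in MODALITIES})
        # Shared backbone reused by every (source, target) pair; in the paper this role
        # is played by the latent diffusion model performing anchored latent regression.
        self.backbone = nn.Sequential(
            nn.Linear(LATENT_DIM, LATENT_DIM), nn.GELU(), nn.Linear(LATENT_DIM, LATENT_DIM)
        )
        # Lightweight target-specific residual adapters.
        self.adapters = nn.ModuleDict({m: ResidualAdapter(LATENT_DIM) for m in MODALITIES})

    def forward(self, x: torch.Tensor, src: str, tgt: str) -> torch.Tensor:
        z = self.encoders[src](x)     # project the source observation into the shared latent space
        z = self.backbone(z)          # shared semantic mapping across all modality pairs
        z = self.adapters[tgt](z)     # correct the systematic latent mismatch of the target modality
        return self.decoders[tgt](z)  # render the target modality


if __name__ == "__main__":
    model = Any2AnySketch()
    x = torch.randn(2, 1024)                # a batch of flattened source-modality features
    y = model(x, src="sar", tgt="optical")  # any-to-any: pick any (source, target) pair
    print(y.shape)                          # torch.Size([2, 1024])
```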
Figure 1. Quantitative comparison between our proposed Any2Any and other compared methods.
Figure 2. Statistics and example images of the RST-1M dataset.
The RST-1M dataset is coming soon.
Figure 3. Overview of the Any2Any framework.
The checkpoints are coming soon.
Please stay tuned for updates.
We will release an inference script soon; please stay tuned for updates.
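Until the official script is released, the placeholder sketch below shows roughly what an any-to-any inference entry point could look like. Every flag name, the modality identifiers, and the checkpoint handling are hypothetical assumptions, not the released interface.

```python
# Placeholder sketch of an inference entry point; the flag names, modality
# identifiers, and checkpoint handling below are hypothetical assumptions,
# not the released interface.
import argparse


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Any2Any inference (placeholder sketch)")
    parser.add_argument("--source-modality", required=True, help="e.g. sar (hypothetical name)")
    parser.add_argument("--target-modality", required=True, help="e.g. optical (hypothetical name)")
    parser.add_argument("--input", required=True, help="path to the source-modality image")
    parser.add_argument("--checkpoint", required=True, help="path to a trained checkpoint")
    parser.add_argument("--output", required=True, help="where to write the translated image")
    return parser.parse_args()


def main() -> None:
    args = parse_args()
    # Model loading and the actual translation call depend on the released
    # checkpoints and API, so they are intentionally omitted from this sketch.
    print(f"Would translate {args.input}: {args.source_modality} -> {args.target_modality}")


if __name__ == "__main__":
    main()
```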
Figure 4. Qualitative comparison between our proposed Any2Any and other compared methods.
Figure 5. Qualitative results of our method on unseen remote sensing modality translation tasks with missing paired training data.
If you find Any2Any helpful, please give this repo a ⭐ and cite it as follows:
@article{chen2026Any2Any,
title={Any2Any: Unified Arbitrary Modality Translation for Remote Sensing},
author={Chen, Haoyang and Zhang, Jing and Wang, Hebaixu and Wang, Shiqin and Huang, Pohsun and Li, Jiayuan and Guo, Haonan and Wang, Di and Wang, Zheng and Du, Bo},
journal={arXiv preprint arXiv:2603.04114},
year={2026}
}
For any other questions, please contact Haoyang Chen at whu.edu.cn.




