MiliLab/Any2Any
Any2Any: Unified Arbitrary Modality Translation for Remote Sensing

Haoyang Chen1,2, Jing Zhang1,2 †, Hebaixu Wang1,2, Shiqin Wang1, PoHsun Huang1, Jiayuan Li2,3, Haonan Guo1,2, Di Wang1,2 †, Zheng Wang1,2 †, Bo Du1,2 †.

1 Wuhan University, 2 Zhongguancun Academy, 3 Beijing Institute of Technology.

† Corresponding author

News | Abstract | Datasets | Checkpoints | Usage | Statement

πŸ”₯ Update

2026.3.5

🌞 Abstract

Multi-modal remote sensing imagery provides complementary observations of the same geographic scene, yet such observations are frequently incomplete in practice. Existing cross-modal translation methods treat each modality pair as an independent task, resulting in quadratic complexity and limited generalization to unseen modality combinations. We formulate any-to-any translation as inference over a shared latent representation of the scene, where different modalities correspond to partial observations of the same underlying semantics. Based on this formulation, we propose Any2Any, a unified latent diffusion framework that projects heterogeneous inputs into a geometrically aligned latent space. This structure performs anchored latent regression with a shared backbone, decoupling modality-specific representation learning from semantic mapping. Moreover, lightweight target-specific residual adapters correct systematic latent mismatches without increasing inference complexity. To support learning under sparse but connected supervision, we introduce RST-1M, the first million-scale remote sensing dataset with paired observations across five sensing modalities, providing supervision anchors for any-to-any translation. Experiments across 14 translation tasks show that Any2Any consistently outperforms pairwise translation methods and exhibits strong zero-shot generalization to unseen modality pairs.
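To make the "shared backbone plus lightweight target-specific residual adapters" idea concrete, here is a minimal NumPy sketch. It is an illustration only, not the released implementation: the latent dimension, function names, and modality names (`sar`, `optical`) are all hypothetical stand-ins, and the diffusion backbone is replaced by a single toy nonlinear map.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 16  # hypothetical latent size for illustration


def shared_backbone(z, W):
    """Modality-agnostic semantic mapping shared by every translation task
    (a toy stand-in for the latent diffusion backbone)."""
    return np.tanh(z @ W)


def residual_adapter(h, A, b):
    """Lightweight target-specific correction added on top of the shared
    prediction; it only nudges the latent, so inference cost barely grows."""
    return h + h @ A + b


# One shared backbone, one small adapter per (hypothetical) target modality.
W = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))
adapters = {
    "sar": (rng.normal(scale=0.01, size=(LATENT_DIM, LATENT_DIM)),
            np.zeros(LATENT_DIM)),
    "optical": (rng.normal(scale=0.01, size=(LATENT_DIM, LATENT_DIM)),
                np.zeros(LATENT_DIM)),
}


def translate(z_src, target):
    h = shared_backbone(z_src, W)     # shared semantic mapping
    A, b = adapters[target]
    return residual_adapter(h, A, b)  # corrects systematic latent mismatch


z = rng.normal(size=(4, LATENT_DIM))  # a batch of aligned source latents
out = translate(z, "sar")
print(out.shape)
```

Because the adapters are small residual corrections, most parameters and computation stay in the shared backbone, which is what lets a single model cover all modality pairs instead of training one network per pair.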

Figure 1. Quantitative comparison between our proposed Any2Any and other comparison methods.

πŸ“– Datasets

Figure 2. Statistics and example images of the RST-1M dataset.

The RST-1M dataset is coming soon.

πŸš€ Model

Figure 3. Overview of the Any2Any framework.

The checkpoints are coming soon.

πŸ”¨ Usage

Training

Training instructions are coming soon.

Inference

The inference script is coming soon.

🍭 Results

Figure 4. Qualitative comparison between our proposed Any2Any and other comparison methods.

Figure 5. Qualitative results of our method on unseen remote sensing modality translation tasks with missing paired training data.

⭐ Citation

If you find Any2Any helpful, please give a ⭐ and cite it as follows:


@article{chen2026Any2Any,
  title={Any2Any: Unified Arbitrary Modality Translation for Remote Sensing},
  author={Chen, Haoyang and Zhang, Jing and Wang, Hebaixu and Wang, Shiqin and Huang, Pohsun and Li, Jiayuan and Guo, Haonan and Wang, Di and Wang, Zheng and Du, Bo},
  journal={arXiv preprint arXiv:2603.04114},
  year={2026}
}

🎺 Statement

For any other questions, please contact Haoyang Chen at whu.edu.cn.

About

Official repo for "Any2Any: Unified Arbitrary Modality Translation for Remote Sensing"
