
[CVPR 2026] Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval

Zhiheng Fu1   Yupeng Hu1✉   Qianyun Yang1   Shiqi Zhang1   Zhiwei Chen1   Zixu Li1

1School of Software, Shandong University
✉ Corresponding author

AirKnow Teaser


📌 Introduction

Welcome to the official repository for Air-Know, which addresses Noisy Correspondence Learning (NCL) in Composed Image Retrieval (CIR): learning robust retrieval models when the training triplets contain mismatched correspondences.

Disclaimer: This codebase is intended for research purposes.

📢 News and Updates

  • [2026-04-02] 🚀 All code is released.
  • [2026-02-21] 🔥 Air-Know is accepted by CVPR 2026. Code is coming soon.

Air-Know Pipeline (based on LAVIS)

airknow architecture

Figure 1. The proposed Air-Know consists of three primary modules: (a) External Prior Arbitration leverages an offline multimodal expert to generate reliable arbitration priors for CIR triplets, bypassing the unreliable small-loss hypothesis. (b) Expert-Knowledge Internalization transfers these priors into a lightweight proxy network, structurally preventing the memorization of ambiguous partial matches. Finally, (c) Dual-Stream Reconciliation dynamically integrates the internalized knowledge to provide robust online feedback, guiding the final representation learning. Figure best viewed in color.


πŸƒβ€β™‚οΈ Experiment-Results

CIR Task Performance

💡 Note for Fully-Supervised CIR Benchmarking:
🎯 The 0% noise setting in the tables below is equivalent to the traditional fully-supervised CIR paradigm. We highlight this 0% block to facilitate direct and fair comparisons for researchers working on conventional supervised methods.

FashionIQ:

Table 1. Performance comparison on FashionIQ validation set in terms of R@K (%). The best result under each noise ratio is highlighted in bold, while the second-best result is underlined.

CIRR:

Table 2. Performance comparison on the CIRR test set in terms of R@K (%) and Rsub@K (%). The best and second-best results are highlighted in bold and underlined, respectively.
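For readers less familiar with the metrics: R@K counts a query as correct when its ground-truth target appears among the top-K retrieved candidates, and CIRR's Rsub@K applies the same rule within a small curated candidate subset. A minimal sketch of the computation (illustrative only, not code from this repository):

```python
def recall_at_k(ranked_lists, targets, k):
    """Percentage of queries whose ground-truth target is in the top-k results.

    ranked_lists: one ranked candidate list per query (best match first).
    targets: the ground-truth target id for each query.
    """
    hits = sum(target in ranked[:k] for ranked, target in zip(ranked_lists, targets))
    return 100.0 * hits / len(targets)

# Toy example with 4 queries; the ground-truth target of each query is "a".
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"], ["a", "c", "b"]]
targets = ["a", "a", "a", "a"]
print(recall_at_k(ranked, targets, 1))  # 50.0 (queries 1 and 4 rank "a" first)
print(recall_at_k(ranked, targets, 2))  # 75.0
print(recall_at_k(ranked, targets, 3))  # 100.0
```

Rsub@K is obtained the same way after restricting each ranked list to the query's designated candidate subset.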

⬆ Back to top


📦 Install

1. Clone the repository

git clone https://github.com/ZhihFu/Air-Know
cd Air-Know

2. Setup Python Environment

The code was evaluated with Python 3.8.10 and CUDA 12.6. We recommend using Anaconda to create an isolated virtual environment:

conda create -n airknow python=3.8
conda activate airknow

# Install PyTorch (The evaluated environment uses Torch 2.1.0 with CUDA 12.1 compatibility)
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121

# Install core dependencies
pip install scikit-learn==1.3.2 transformers==4.25.0 salesforce-lavis==1.0.2 timm==0.9.16

⬆ Back to top


📂 Project Structure

To help you navigate our codebase quickly, here is an overview of the main components:

├── lavis/                 # Core model directory (built upon LAVIS)
│   └── models/
│       └── blip2_models/
│           └── blip2_cir.py   # 🧠 The core model implementation
├── train_BLIP2.py         # 🚂 Main training script
├── test_BLIP2.py          # 🧪 General evaluation script
├── cirr_sub_BLIP2.py      # 📤 Generates submission files for the CIRR dataset
├── datasets.py            # 📊 Data loading and processing utilities
└── utils.py               # 🛠️ Helper functions (logging, metrics, etc.)

💾 Data Preparation

Before training or testing, you need to download and structure the datasets.

Download the CIRR and FashionIQ datasets from the CIRR official repo and the FashionIQ official repo, respectively.

Organize the data as follows:

1) FashionIQ:

├── FashionIQ
│   ├── captions
│   │   ├── cap.dress.[train | val].json
│   │   ├── cap.toptee.[train | val].json
│   │   └── cap.shirt.[train | val].json
│   ├── image_splits
│   │   ├── split.dress.[train | val | test].json
│   │   ├── split.toptee.[train | val | test].json
│   │   └── split.shirt.[train | val | test].json
│   ├── dress
│   │   └── [B000ALGQSY.jpg | B000AY2892.jpg | B000AYI3L4.jpg | ...]
│   ├── shirt
│   │   └── [B00006M009.jpg | B00006M00B.jpg | B00006M6IH.jpg | ...]
│   └── toptee
│       └── [B0000DZQD6.jpg | B000A33FTU.jpg | B000AS2OVA.jpg | ...]

2) CIRR:

├── CIRR
│   ├── train
│   │   └── [0 | 1 | 2 | ...]
│   │       └── [train-10108-0-img0.png | train-10108-0-img1.png | ...]
│   ├── dev
│   │   └── [dev-0-0-img0.png | dev-0-0-img1.png | ...]
│   ├── test1
│   │   └── [test1-0-0-img0.png | test1-0-0-img1.png | ...]
│   └── cirr
│       ├── captions
│       │   └── cap.rc2.[train | val | test1].json
│       └── image_splits
│           └── split.rc2.[train | val | test1].json

(Note: Please modify datasets.py if your local data paths differ from the default setup.)
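Before launching training, it can save time to verify the layout programmatically. A small illustrative helper (`check_fashioniq_layout` is a hypothetical name, not part of this repository; the expected paths follow the FashionIQ tree above):

```python
from pathlib import Path

def check_fashioniq_layout(root):
    """Return the list of expected FashionIQ files/folders missing under root."""
    root = Path(root)
    missing = []
    for cat in ("dress", "shirt", "toptee"):
        # Caption files exist for train/val; split files also cover test.
        for split in ("train", "val"):
            path = root / "captions" / f"cap.{cat}.{split}.json"
            if not path.is_file():
                missing.append(str(path))
        for split in ("train", "val", "test"):
            path = root / "image_splits" / f"split.{cat}.{split}.json"
            if not path.is_file():
                missing.append(str(path))
        if not (root / cat).is_dir():  # per-category image folder
            missing.append(str(root / cat))
    return missing

# An empty return value means every expected file and folder was found:
# missing = check_fashioniq_layout("/path/to/FashionIQ/")
```

The same pattern extends straightforwardly to the CIRR tree.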

⬆ Back to top


🚀 Quick Start

1. Training under Noisy Settings

In our implementation, we introduce the --noise_ratio parameter to simulate varying degrees of Noisy Triplet Correspondence (NTC) interference. You can reproduce the experimental results from the paper by modifying --noise_ratio (the evaluated settings are 0.0, 0.2, 0.5, and 0.8).
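As a rough intuition for what a noise ratio does, the sketch below corrupts a chosen fraction of (reference image, modification text, target image) triplets by rotating their target images among each other, so the selected triplets end up with in-distribution but mismatched targets. This is an illustrative assumption about the mechanism, not the repository's actual implementation:

```python
import random

def inject_triplet_noise(triplets, noise_ratio, seed=0):
    """Corrupt a fraction of CIR triplets by giving them wrong target images.

    Illustrative only: rotates the targets of a randomly sampled subset of
    triplets, so corrupted triplets keep realistic targets that no longer
    match their (reference, text) pair. (For a single picked triplet the
    rotation is a no-op.)
    """
    rng = random.Random(seed)
    triplets = [list(t) for t in triplets]
    n_noisy = int(len(triplets) * noise_ratio)
    picked = rng.sample(range(len(triplets)), n_noisy)
    rotated = [triplets[i][2] for i in picked]
    rotated = rotated[1:] + rotated[:1]  # rotate targets by one position
    for i, new_target in zip(picked, rotated):
        triplets[i][2] = new_target
    return triplets

clean = [(f"ref{i}", f"text{i}", f"tgt{i}") for i in range(10)]
noisy = inject_triplet_noise(clean, noise_ratio=0.5)
print(sum(n[2] != c[2] for n, c in zip(noisy, clean)))  # 5
```

Note that the pool of target images is unchanged; only the correspondence between triplets and targets is broken.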

Training on FashionIQ:

python train_BLIP2.py \
    --dataset fashioniq \
    --fashioniq_path "/path/to/FashionIQ/" \
    --model_dir "./checkpoints/fashioniq_noise0.8" \
    --noise_ratio 0.8 \
    --batch_size 256 \
    --num_epochs 20 \
    --lr 1e-5

Training on CIRR:

python train_BLIP2.py \
    --dataset cirr \
    --cirr_path "/path/to/CIRR/" \
    --model_dir "./checkpoints/cirr_noise0.8" \
    --noise_ratio 0.8 \
    --batch_size 256 \
    --num_epochs 20 \
    --lr 2e-5

2. Testing

To generate the prediction files on the CIRR dataset for submission to the CIRR Evaluation Server, run the following command:

python cirr_sub_BLIP2.py checkpoints/cirr_noise0.8/

(The script automatically writes the .json files for online evaluation, based on the best checkpoints found in the given folder.)

⬆ Back to top


πŸ™ Acknowledgements

This codebase is heavily inspired by and built upon the excellent Salesforce LAVIS, SPRC, and TME libraries. We thank the authors for their open-source contributions.

⬆ Back to top

✉️ Contact

For any questions, issues, or feedback, please open an issue on GitHub or reach out to us at fuzhiheng8@gmail.com.

⬆ Back to top

🔗 Related Projects

Ecosystem & Other Works from our Team

  • TEMA (ACL'26): Web | Code
  • ConeSep (CVPR'26): Web | Code
  • HABIT (AAAI'26): Web | Code | Paper
  • ReTrack (AAAI'26): Web | Code | Paper
  • INTENT (AAAI'26): Web | Code | Paper
  • HUD (ACM MM'25): Web | Code | Paper
  • OFFSET (ACM MM'25): Web | Code | Paper
  • ENCODER (AAAI'25): Web | Code | Paper

πŸ“β­οΈ Citation

If you find our work or this code useful in your research, please consider leaving a Star ⭐️ or citing 📝 our paper 🥰. Your support is our greatest motivation!

@InProceedings{Air-Know,
    title     = {Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval},
    author    = {Fu, Zhiheng and Hu, Yupeng and Yang, Qianyun and Zhang, Shiqi and Chen, Zhiwei and Li, Zixu},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    year      = {2026}
}

⬆ Back to top


📄 License

This project is released under the terms of the LICENSE file included in this repository.


If this project helps you, please leave a Star!

