[CVPR 2026] Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval
Zhiheng Fu¹   Yupeng Hu¹†   Qianyun Yang¹   Shiqi Zhang¹   Zhiwei Chen¹   Zixu Li¹
¹School of Software, Shandong University   †Corresponding author
Welcome to the official repository for Air-Know, our work on Noisy Correspondence Learning (NCL) for Composed Image Retrieval (CIR).
Disclaimer: This codebase is intended for research purposes.
- [2026-04-02] 🎉 All code is released.
- [2026-02-21] 🔥 Air-Know is accepted by CVPR 2026. Code is coming soon.
Air-Know Pipeline (based on LAVIS)
- Experiment Results
- Install
- Project Structure
- Data Preparation
- Quick Start
- Acknowledgement
- Contact
- Related Projects
- Citation
Table 1. Performance comparison on the FashionIQ validation set in terms of R@K (%). The best result under each noise ratio is highlighted in bold, while the second-best is underlined.

Table 2. Performance comparison on the CIRR test set in terms of R@K (%) and Rsub@K (%). The best and second-best results are highlighted in bold and underlined, respectively.

💡 Note for fully-supervised CIR benchmarking:
🎯 The 0% noise setting in the tables below is equivalent to the traditional fully-supervised CIR paradigm. We highlight this 0% block to facilitate direct and fair comparisons for researchers working on conventional supervised methods.
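For readers new to the metric: R@K (Recall at K) counts a query as correct when its ground-truth target appears among the top-K retrieved gallery images. The sketch below is a minimal illustration of that computation, not the repository's own `utils.py` implementation:

```python
import numpy as np

def recall_at_k(similarity: np.ndarray, target_idx: np.ndarray, k: int) -> float:
    """Percentage of queries whose ground-truth target appears in the
    top-k candidates of a (num_queries, num_gallery) similarity matrix."""
    # Indices of the k highest-scoring gallery items per query.
    topk = np.argsort(-similarity, axis=1)[:, :k]
    hits = (topk == target_idx[:, None]).any(axis=1)
    return float(hits.mean()) * 100.0  # reported as a percentage

# Toy example: 3 queries over a gallery of 4 images.
sim = np.array([
    [0.9, 0.1, 0.2, 0.3],   # target 0 ranked 1st
    [0.2, 0.8, 0.7, 0.1],   # target 2 ranked 2nd
    [0.1, 0.2, 0.3, 0.4],   # target 0 ranked 4th
])
targets = np.array([0, 2, 0])
print(recall_at_k(sim, targets, k=1))  # 1 of 3 queries hit at rank 1
print(recall_at_k(sim, targets, k=2))  # 2 of 3 queries hit within rank 2
```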
1. Clone the repository

```shell
git clone https://github.com/ZhihFu/Air-Know
cd Air-Know
```

2. Setup the Python environment
The code was tested with Python 3.8.10 on a machine with a CUDA 12.x driver; the PyTorch build installed below targets CUDA 12.1. We recommend using Anaconda to create an isolated virtual environment:
```shell
conda create -n conesep python=3.8
conda activate conesep

# Install PyTorch (the evaluated environment uses torch 2.1.0 built for CUDA 12.1)
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121

# Install core dependencies
pip install scikit-learn==1.3.2 transformers==4.25.0 salesforce-lavis==1.0.2 timm==0.9.16
```

To help you navigate our codebase quickly, here is an overview of the main components:
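After installation, a quick sanity check can confirm that the pinned versions resolved correctly. This is a convenience sketch, not part of the repository:

```python
from importlib import metadata

def check_pin(pkg: str, want: str):
    """Return (installed_version, matches_pin) for a pip-installed package."""
    try:
        have = metadata.version(pkg)
    except metadata.PackageNotFoundError:
        return "not installed", False
    return have, have.startswith(want)

# Versions pinned in the install step above.
pinned = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "transformers": "4.25.0",
    "scikit-learn": "1.3.2",
    "salesforce-lavis": "1.0.2",
    "timm": "0.9.16",
}

for pkg, want in pinned.items():
    have, ok = check_pin(pkg, want)
    print(f"{pkg:<16} want {want:<8} have {have:<14} {'OK' if ok else 'MISMATCH'}")
```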
```
├── lavis/                    # Core model directory (built upon LAVIS)
│   └── models/
│       └── blip2_models/
│           └── blip2_cir.py  # 🧠 The core model implementation
├── train_BLIP2.py            # 🚀 Main training script
├── test_BLIP2.py             # 🧪 General evaluation script
├── cirr_sub_BLIP2.py         # 🤖 Generates submission files for the CIRR dataset
├── datasets.py               # 📚 Data loading and processing utilities
└── utils.py                  # 🛠️ Helper functions (logging, metrics, etc.)
```
Before training or testing, you need to download and structure the datasets.
Download the CIRR and FashionIQ datasets from the CIRR official repo and the FashionIQ official repo.
Organize the data as follows:
```
├── FashionIQ
│   ├── captions
│   │   ├── cap.dress.[train | val].json
│   │   ├── cap.toptee.[train | val].json
│   │   └── cap.shirt.[train | val].json
│   ├── image_splits
│   │   ├── split.dress.[train | val | test].json
│   │   ├── split.toptee.[train | val | test].json
│   │   └── split.shirt.[train | val | test].json
│   ├── dress
│   │   └── [B000ALGQSY.jpg | B000AY2892.jpg | B000AYI3L4.jpg | ...]
│   ├── shirt
│   │   └── [B00006M009.jpg | B00006M00B.jpg | B00006M6IH.jpg | ...]
│   └── toptee
│       └── [B0000DZQD6.jpg | B000A33FTU.jpg | B000AS2OVA.jpg | ...]
└── CIRR
    ├── train
    │   └── [0 | 1 | 2 | ...]
    │       └── [train-10108-0-img0.png | train-10108-0-img1.png | ...]
    ├── dev
    │   └── [dev-0-0-img0.png | dev-0-0-img1.png | ...]
    ├── test1
    │   └── [test1-0-0-img0.png | test1-0-0-img1.png | ...]
    └── cirr
        ├── captions
        │   └── cap.rc2.[train | val | test1].json
        └── image_splits
            └── split.rc2.[train | val | test1].json
```
(Note: Please modify datasets.py if your local data paths differ from the default setup.)
In our implementation, we introduce the noise_ratio parameter to simulate varying degrees of Noisy Triplet Correspondence (NTC) interference. You can reproduce the experimental results from the paper by modifying the --noise_ratio parameter (the evaluated settings are 0.0, 0.2, 0.5, and 0.8).
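For intuition, NTC at a given noise_ratio is commonly simulated by mismatching targets within a randomly chosen subset of training triplets. The sketch below is one such scheme; the repository's own corruption logic lives in the released code and may differ in detail:

```python
import random

def inject_triplet_noise(triplets, noise_ratio, seed=42):
    """Corrupt a fraction of (reference, text, target) triplets by
    permuting targets among a randomly chosen subset.

    With noise_ratio selecting at least two triplets, every chosen
    triplet receives a target from a different triplet.
    """
    rng = random.Random(seed)
    triplets = list(triplets)
    n_noisy = int(len(triplets) * noise_ratio)
    idx = rng.sample(range(len(triplets)), n_noisy)
    # Cyclically shift targets within the chosen subset.
    shifted = [triplets[i][2] for i in idx]
    shifted = shifted[1:] + shifted[:1]
    for i, tgt in zip(idx, shifted):
        ref, txt, _ = triplets[i]
        triplets[i] = (ref, txt, tgt)
    return triplets
```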
Training on FashionIQ:
```shell
python train_BLIP2.py \
    --dataset fashioniq \
    --fashioniq_path "/path/to/FashionIQ/" \
    --model_dir "./checkpoints/fashioniq_noise0.8" \
    --noise_ratio 0.8 \
    --batch_size 256 \
    --num_epochs 20 \
    --lr 1e-5
```

Training on CIRR:
```shell
python train_BLIP2.py \
    --dataset cirr \
    --cirr_path "/path/to/CIRR/" \
    --model_dir "./checkpoints/cirr_noise0.8" \
    --noise_ratio 0.8 \
    --batch_size 256 \
    --num_epochs 20 \
    --lr 2e-5
```

To generate the prediction files for submission to the CIRR Evaluation Server, run the following command:
```shell
python cirr_sub_BLIP2.py checkpoints/cirr_noise0.8/
```

(The script automatically writes the .json files for online evaluation, based on the best checkpoints found in the given folder.)
This codebase is heavily inspired by and built upon the excellent Salesforce LAVIS, SPRC, and TME libraries. We thank the authors for their open-source contributions.
For any questions, issues, or feedback, please open an issue on GitHub or reach out to us at fuzhiheng8@gmail.com
Ecosystem & Other Works from our Team
- TEMA (ACL'26): Web | Code
- ConeSep (CVPR'26): Web | Code
- HABIT (AAAI'26): Web | Code | Paper
- ReTrack (AAAI'26): Web | Code | Paper
- INTENT (AAAI'26): Web | Code | Paper
- HUD (ACM MM'25): Web | Code | Paper
- OFFSET (ACM MM'25): Web | Code | Paper
- ENCODER (AAAI'25): Web | Code | Paper
If you find our work or this code useful in your research, please consider leaving a Star ⭐ or citing 📝 our paper 🥰. Your support is our greatest motivation!
```bibtex
@InProceedings{Air-Know,
    title     = {Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval},
    author    = {Fu, Zhiheng and Hu, Yupeng and Yang, Qianyun and Zhang, Shiqi and Chen, Zhiwei and Li, Zixu},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    year      = {2026}
}
```

This project is released under the terms of the LICENSE file included in this repository.