BMC-SDNU/Cross-Modal-Retrieval

Cross-modal Retrieval

1. Introduction

This open-source repository collects cross-modal retrieval methods together with links to their papers and code.

2. Supported Methods

The currently supported algorithms include:

2.1 Unsupervised cross-modal hashing retrieval

2.1.1 Unsupervised shallow cross-modal hashing retrieval

2.1.1.1 Matrix Factorization

2017
  • RFDH:Robust and Flexible Discrete Hashing for Cross-Modal Similarity Search(TCSVT) [PDF] [Code]
2015
  • STMH:Semantic Topic Multimodal Hashing for Cross-Media Retrieval(IJCAI)[PDF]
2014
  • LSSH:Latent Semantic Sparse Hashing for Cross-Modal Similarity Search(SIGIR)[PDF]

  • CMFH:Collective Matrix Factorization Hashing for Multimodal Data(CVPR)[PDF]

2.1.1.2 Graph Theory

2018
  • HMR:Hetero-Manifold Regularisation for Cross-Modal Hashing(TPAMI)[PDF]
2017
  • FSH:Cross-Modality Binary Code Learning via Fusion Similarity Hashing(CVPR)[PDF][Code]
2014
  • SM2H:Sparse Multi-Modal Hashing(TMM)[PDF]
2013
  • IMH:Inter-Media Hashing for Large-scale Retrieval from Heterogeneous Data Sources(SIGMOD)[PDF]

  • LCMH:Linear Cross-Modal Hashing for Efficient Multimedia Search(MM)[PDF]

2011
  • CVH:Learning Hash Functions for Cross-View Similarity Search(IJCAI)[PDF]

2.1.1.3 Other Shallow

2019
  • CRE:Collective Reconstructive Embeddings for Cross-Modal Hashing(TIP)[PDF]
2018
  • HMR:Hetero-Manifold Regularisation for Cross-Modal Hashing(TPAMI)[PDF]
2015
  • FS-LTE:Full-Space Local Topology Extraction for Cross-Modal Retrieval(TIP)[PDF]
2014
  • IMVH:Iterative Multi-View Hashing for Cross Media Indexing(MM)[PDF]
2013
  • PDH:Predictable Dual-View Hashing(ICML)[PDF]

2.1.1.4 Quantization

2016
  • CCQ:Composite Correlation Quantization for Efficient Multimodal Retrieval(SIGIR)[PDF]

  • CMCQ:Collaborative Quantization for Cross-Modal Similarity Search(CVPR)[PDF]

2015
  • ACQ:Alternating Co-Quantization for Cross-modal Hashing(ICCV)[PDF]

2.1.2 Unsupervised deep cross-modal hashing retrieval

2.1.2.1 Naive Network

2019
  • UDFCH:Unsupervised Deep Fusion Cross-modal Hashing(ICMI)[PDF]
2018
  • UDCMH:Unsupervised Deep Hashing via Binary Latent Factor Models for Large-scale Cross-modal Retrieval(IJCAI)[PDF]
2017
  • DBRC:Deep Binary Reconstruction for Cross-modal Hashing(MM)[PDF]
2015
  • DMHOR:Learning Compact Hash Codes for Multimodal Representations Using Orthogonal Deep Structure(TMM)[PDF]

2.1.2.2 GAN

2020
  • MGAH:Multi-Pathway Generative Adversarial Hashing for Unsupervised Cross-Modal Retrieval(TMM)[PDF]
2019
  • CYC-DGH:Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval(TIP)[PDF]

  • UCH:Coupled CycleGAN: Unsupervised Hashing Network for Cross-Modal Retrieval(AAAI)[PDF]

2018
  • UGACH:Unsupervised Generative Adversarial Cross-modal Hashing(AAAI)[PDF][Code]

2.1.2.3 Graph Model

2022
  • ASSPH:Adaptive Structural Similarity Preserving for Unsupervised Cross Modal Hashing(MM)[PDF]
2021
  • AGCH:Aggregation-based Graph Convolutional Hashing for Unsupervised Cross-modal Retrieval(TMM)[PDF]

  • DGCPN:Deep Graph-neighbor Coherence Preserving Network for Unsupervised Cross-modal Hashing(AAAI)[PDF][Code]

2020
  • DCSH:Unsupervised Deep Cross-modality Spectral Hashing(TIP)[PDF]

  • SRCH:Set and Rebase: Determining the Semantic Graph Connectivity for Unsupervised Cross-Modal Hashing(IJCAI)[PDF]

  • JDSH:Joint-modal Distribution-based Similarity Hashing for Large-scale Unsupervised Deep Cross-modal Retrieval(SIGIR)[PDF][Code]

  • DSAH:Deep Semantic-Alignment Hashing for Unsupervised Cross-Modal Retrieval(ICMR)[PDF][Code]

2019
  • DJSRH:Deep Joint-Semantics Reconstructing Hashing for Large-Scale Unsupervised Cross-Modal Retrieval(ICCV)[PDF][Code]

2.1.2.4 Knowledge Distillation

2022
  • DAEH:Deep Adaptively-Enhanced Hashing With Discriminative Similarity Guidance for Unsupervised Cross-Modal Retrieval(TCSVT)[PDF]
2021
  • KDCMH:Unsupervised Deep Cross-Modal Hashing by Knowledge Distillation for Large-scale Cross-modal Retrieval(ICMR)[PDF]

  • JOG:Joint-teaching: Learning to Refine Knowledge for Resource-constrained Unsupervised Cross-modal Retrieval(MM)[PDF]

2020
  • UKD:Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing(CVPR)[PDF]

2.2 Supervised cross-modal hashing retrieval

2.2.1 Supervised shallow cross-modal hashing retrieval

2.2.1.1 Matrix Factorization

2022
  • SCLCH: Joint Specifics and Consistency Hash Learning for Large-Scale Cross-Modal Retrieval(TIP) [PDF]
2020
  • BATCH: A Scalable Asymmetric Discrete Cross-Modal Hashing(TKDE) [PDF] [Code]
2019
  • LCMFH: Label Consistent Matrix Factorization Hashing for Large-Scale Cross-Modal Similarity Search(TPAMI) [PDF]

  • TECH: A Two-Step Cross-Modal Hashing by Exploiting Label Correlations and Preserving Similarity in Both Steps(MM) [PDF]

2018
  • SCRATCH: A Scalable Discrete Matrix Factorization Hashing for Cross-Modal Retrieval(MM) [PDF]
2017
  • DCH: Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval(TIP) [PDF]
2016
  • SMFH: Supervised Matrix Factorization for Cross-Modality Hashing(IJCAI) [PDF]

  • SMFH: Supervised Matrix Factorization Hashing for Cross-Modal Retrieval(TIP) [PDF]

2.2.1.2 Dictionary Learning

2016
  • DCDH: Discriminative Coupled Dictionary Hashing for Fast Cross-Media Retrieval(MM) [PDF]
2014
  • DLCMH: Dictionary Learning Based Hashing for Cross-Modal Retrieval(SIGIR) [PDF]

2.2.1.3 Feature Mapping-Sample-Constraint-Label-Constraint

2022
  • DJSAH: Discrete Joint Semantic Alignment Hashing for Cross-Modal Image-Text Search(TCSVT) (PDF)
2020
  • FUH: Fast Unmediated Hashing for Cross-Modal Retrieval(TCSVT) (PDF)
2016
  • MDBE: Multimodal Discriminative Binary Embedding for Large-Scale Cross-Modal Retrieval(TIP) (PDF) [Code]

2.2.1.4 Feature Mapping-Sample-Constraint-Separate-Hamming

2017
  • CSDH: Sequential Discrete Hashing for Scalable Cross-Modality Similarity Retrieval(TIP) (PDF)
2016
  • DASH: Frustratingly Easy Cross-Modal Hashing(MM) (PDF)
2015
  • QCH: Quantized Correlation Hashing for Fast Cross-Modal Search(IJCAI) (PDF)

2.2.1.5 Feature Mapping-Sample-Constraint-Common Hamming

2021
  • ASCSH: Asymmetric Supervised Consistent and Specific Hashing for Cross-Modal Retrieval(TIP) (PDF) [Code]
2019
  • SRDMH: Supervised Robust Discrete Multimodal Hashing for Cross-Media Retrieval(TMM) (PDF)
2018
  • FDCH: Fast Discrete Cross-modal Hashing With Regressing From Semantic Labels(MM) (PDF)
2017
  • SRSH: Semi-Relaxation Supervised Hashing for Cross-Modal Retrieval(MM) (PDF) [Code]

  • RoPH: Cross-Modal Hashing via Rank-Order Preserving(TMM) (PDF) [Code]

2016
  • SRDMH: Supervised Robust Discrete Multimodal Hashing for Cross-Media Retrieval(CIKM) (PDF)

2.2.1.6 Feature Mapping-Relation-Constraint

2017
  • LSRH: Linear Subspace Ranking Hashing for Cross-Modal Retrieval(TPAMI) (PDF)
2014
  • SCM: Large-Scale Supervised Multimodal Hashing with Semantic Correlation Maximization(AAAI) (PDF)

  • HTH: Scalable Heterogeneous Translated Hashing(KDD) (PDF)

2013
  • PLMH: Parametric Local Multimodal Hashing for Cross-View Similarity Search(IJCAI) (PDF)

  • RaHH: Comparing Apples to Oranges: A Scalable Solution with Heterogeneous Hashing(KDD) (PDF) [Code]

2012
  • CRH: Co-Regularized Hashing for Multimodal Data(NIPS) (PDF)

2.2.1.7 Other Shallow

2019
  • DLFH: Discrete Latent Factor Model for Cross-Modal Hashing(TIP) (PDF) [Code]
2018
  • SDMCH: Supervised Discrete Manifold-Embedded Cross-Modal Hashing(IJCAI) (PDF)
2015
  • SePH: Semantics-Preserving Hashing for Cross-View Retrieval(CVPR) (PDF)
2012
  • MLBE: A Probabilistic Model for Multimodal Hash Function Learning(KDD) (PDF)
2010
  • CMSSH: Data Fusion through Cross-modality Metric Learning using Similarity-Sensitive Hashing(CVPR) (PDF)

2.2.2 Supervised deep cross-modal hashing retrieval

2.2.2.1 Naive Network-Distance-Constraint

2019
  • MCITR: Cross-modal Image-Text Retrieval with Multitask Learning(CIKM) (PDF)
2016
  • CAH: Correlation Autoencoder Hashing for Supervised Cross-Modal Search(ICMR) (PDF)
2014
  • CMNNH: Cross-Media Hashing with Neural Networks(MM) (PDF)

  • MMNN: Multimodal Similarity-Preserving Hashing(TPAMI) (PDF)

2.2.2.2 Naive Network-Similarity-Constraint

2022
  • Bi-CMR: Bidirectional Reinforcement Guided Hashing for Effective Cross-Modal Retrieval(AAAI) (PDF) [Code]

  • Bi-NCMH: Deep Normalized Cross-Modal Hashing with Bi-Direction Relation Reasoning(CVPR) (PDF)

2021
  • OTCMR: Bridging Heterogeneity Gap with Optimal Transport for Cross-modal Retrieval(CIKM) (PDF)

  • DUCMH: Deep Unified Cross-Modality Hashing by Pairwise Data Alignment(IJCAI) (PDF)

2020
  • NRDH: Nonlinear Robust Discrete Hashing for Cross-Modal Retrieval(SIGIR) (PDF)

  • DCHUC: Deep Cross-Modal Hashing with Hashing Functions and Unified Hash Codes Jointly Learning(TKDE) (PDF) [Code]

2017
  • CHN: Correlation Hashing Network for Efficient Cross-Modal Retrieval(BMVC) (PDF)
2016
  • DVSH: Deep Visual-Semantic Hashing for Cross-Modal Retrieval(KDD) (PDF)

2.2.2.3 Naive Network-Negative-Log-Likelihood

2022
  • MSSPQ: Multiple Semantic Structure-Preserving Quantization for Cross-Modal Retrieval(ICMR) (PDF)
2021
  • DMFH: Deep Multiscale Fusion Hashing for Cross-Modal Retrieval(TCSVT) (PDF)

  • TEACH: Attention-Aware Deep Cross-Modal Hashing(ICMR) (PDF)

2020
  • MDCH: Mask Cross-modal Hashing Networks(TMM) (PDF)
2019
  • EGDH: Equally-Guided Discriminative Hashing for Cross-modal Retrieval(IJCAI) (PDF)
2018
  • DDCMH: Dual Deep Neural Networks Cross-Modal Hashing(AAAI) (PDF)

  • CMHH: Cross-Modal Hamming Hashing(ECCV) (PDF)

2017
  • PRDH: Pairwise Relationship Guided Deep Hashing for Cross-Modal Retrieval(AAAI) (PDF)

  • DCMH: Deep Cross-Modal Hashing(CVPR) (PDF) [Code]

2.2.2.4 Naive Network-Triplet-Constraint

2019
  • RDCMH: Ranking-based Deep Cross-modal Hashing(AAAI) (PDF)
2018
  • MCSCH: Multi-Scale Correlation for Sequential Cross-modal Hashing Learning(MM) (PDF)

  • TDH: Triplet-Based Deep Hashing Network for Cross-Modal Retrieval(TIP) (PDF)

2.2.2.5 GAN

2022
  • SCAHN: Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning(MM) (PDF) [Code]
2021
  • TGCR: Multiple Semantic Structure-Preserving Quantization for Cross-Modal Retrieval(TCSVT) (PDF)
2020
  • CPAH: Multi-Task Consistency-Preserving Adversarial Hashing for Cross-Modal Retrieval(TIP) (PDF) [Code]

  • MLCAH: Multi-Level Correlation Adversarial Hashing for Cross-Modal Retrieval(TMM) (PDF)

  • DADH: Deep Adversarial Discrete Hashing for Cross-Modal Retrieval(ICMR) (PDF) [Code]

2019
  • AGAH: Adversary Guided Asymmetric Hashing for Cross-Modal Retrieval(ICMR) (PDF) [Code]
2018
  • SSAH: Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval(CVPR) (PDF) [Code]

2.2.2.6 Graph Model

2022
  • HMAH: Multi-Task Consistency-Preserving Adversarial Hashing for Cross-Modal Retrieval(TMM) (PDF)

  • SCAHN: Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning(MM) (PDF) [Code]

2021
  • LGCNH: Local Graph Convolutional Networks for Cross-Modal Hashing(MM) (PDF) [Code]
2019
  • GCH: Graph Convolutional Network Hashing for Cross-Modal Retrieval(IJCAI) (PDF) [Code]

2.2.2.7 Transformer

2022
  • DCHMT: Differentiable Cross-modal Hashing via Multimodal Transformers(CIKM) (PDF) [Code]

  • UniHash: Contrastive Label Correlation Enhanced Unified Hashing Encoder for Cross-modal Retrieval(MM) (PDF) [Code]

2.2.2.8 Memory Network

2021
  • CMPD: Using Cross Memory Network With Pair Discrimination for Image-Text Retrieval(TCSVT) (PDF)
2019
  • CMMN: Deep Memory Network for Cross-Modal Retrieval(TMM) (PDF)

2.2.2.9 Quantization

2022
  • ACQH: Asymmetric Correlation Quantization Hashing for Cross-Modal Retrieval(TMM) (PDF)
2017
  • CDQ: Collective Deep Quantization for Efficient Cross-Modal Retrieval(AAAI) (PDF) [Code]

2.3 Unsupervised cross-modal real-valued retrieval

2.3.1 Early unsupervised cross-modal real-valued retrieval

2.3.1.1 CCA

2017
  • ICCA:Towards Improving Canonical Correlation Analysis for Cross-modal Retrieval(MM) [PDF]
2015
  • DCMIT:Deep Correlation for Matching Images and Text(CVPR) [PDF]

  • RCCA:Learning Query and Image Similarities with Ranking Canonical Correlation Analysis(ICCV) [PDF]

2014
  • MCCA:A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics(IJCV) [PDF]
2013
  • KCCA:Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics(JAIR) [PDF]

  • DCCA:Deep Canonical Correlation Analysis(ICML) [PDF] [Code]

2012
  • CR:Continuum Regression for Cross-modal Multimedia Retrieval(ICIP) [PDF]
2010
  • CCA:A New Approach to Cross-Modal Multimedia Retrieval(MM) [PDF][Code]

2.3.1.2 Topic Model

2011
  • MDRF:Learning Cross-modality Similarity for Multinomial Data(ICCV) [PDF]
2010
  • tr-mmLDA:Topic Regression Multi-Modal Latent Dirichlet Allocation for Image Annotation(CVPR) [PDF]
2003
  • Corr-LDA:Modeling Annotated Data(SIGIR) [PDF]

2.3.1.3 Other Shallow

2013
  • Bi-CMSRM:Cross-Media Semantic Representation via Bi-directional Learning to Rank(MM) [PDF]

  • CTM:Cross-media Topic Mining on Wikipedia(MM) [PDF]

2012
  • CoCA:Dimensionality Reduction on Heterogeneous Feature Space(ICDM) [PDF]
2011
  • MCU:Maximum Covariance Unfolding: Manifold Learning for Bimodal Data(NIPS) [PDF]
2008
  • PAMIR:A Discriminative Kernel-Based Model to Rank Images from Text Queries(TPAMI) [PDF]
2003
  • CFA:Multimedia Content Processing through Cross-Modal Association(MM) [PDF]

2.3.1.4 Neural Network

2018
  • CDPAE:Comprehensive Distance-Preserving Autoencoders for Cross-Modal Retrieval(MM) [PDF][Code]
2016
  • CMDN:Cross-Media Shared Representation by Hierarchical Learning with Multiple Deep Networks(IJCAI) [PDF][Code]

  • MSAE:Effective deep learning-based multi-modal retrieval(VLDB) [PDF]

2014
  • Corr-AE:Cross-modal Retrieval with Correspondence Autoencoder(MM) [PDF]
2013
  • RGDBN:Latent Feature Learning in Social Media Network(MM) [PDF]
2012
  • MDBM:Multimodal Learning with Deep Boltzmann Machines(NIPS) [PDF]

2.3.2 Image-text matching retrieval

2.3.2.1 Naive Network

2022
  • UWML:Universal Weighting Metric Learning for Cross-Modal Retrieval (TPAMI) [PDF][Code]

  • LESS:Learning to Embed Semantic Similarity for Joint Image-Text Retrieval (TPAMI)[PDF]

  • CMCM:Cross-Modal Coherence for Text-to-Image Retrieval (AAAI) [PDF]

  • P2RM:Point to Rectangle Matching for Image Text Retrieval(MM) [PDF]

2020
  • DPCITE:Dual-path Convolutional Image-Text Embeddings with Instance Loss(TOMM) [PDF] [Code]

  • PSN:Preserving Semantic Neighborhoods for Robust Cross-Modal Retrieval(ECCV) [PDF] [Code]

2019
  • LDR:Learning Disentangled Representation for Cross-Modal Retrieval with Deep Mutual Information Estimation(MM) [PDF]
2018
  • CHAIN-VSE:Bidirectional Retrieval Made Simple(CVPR) [PDF] [Code]
2017
  • CRC:Cross-media Relevance Computation for Multimedia Retrieval(MM) [PDF]

  • VSE++: Improving Visual-Semantic Embeddings with Hard Negatives(Arxiv) [PDF][Code]

  • RRF-Net:Learning a Recurrent Residual Fusion Network for Multimodal Matching(ICCV) [PDF][Code]

2016
  • DBRLM:Cross-Modal Retrieval via Deep and Bidirectional Representation Learning(TMM) [PDF]
2015
  • MSDS:Image-Text Cross-Modal Retrieval via Modality-Specific Feature Learning(ICMR) [PDF]
2014
  • DT-RNN:Grounded Compositional Semantics for Finding and Describing Images with Sentences(TACL) [PDF]

2.3.2.2 Dot-product Attention

2020
  • SMAN: Stacked Multimodal Attention Network for Cross-Modal Image-Text Retrieval(TCYB) [PDF]

  • CAAN:Context-Aware Attention Network for Image-Text Retrieval(CVPR) [PDF]

  • IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval(CVPR) [PDF] [Code]

2019
  • PFAN:Position Focused Attention Network for Image-Text Matching (IJCAI) [PDF][Code]

  • CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval(ICCV) [PDF] [Code]

  • CMRSC:Cross-Modal Image-Text Retrieval with Semantic Consistency(MM) [PDF] [Code]

2018
  • MCSM:Modality-specific Cross-modal Similarity Measurement with Recurrent Attention Network(TIP) [PDF][Code]

  • DSVEL:Finding beans in burgers: Deep semantic-visual embedding with localization(CVPR) [PDF][Code]

  • CRAN:Cross-media Multi-level Alignment with Relation Attention Network(IJCAI)[PDF]

  • SCAN:Stacked Cross Attention for Image-Text Matching(ECCV) [PDF] [Code]

2017
  • sm-LSTM:Instance-aware Image and Sentence Matching with Selective Multimodal LSTM(CVPR) [PDF]

2.3.2.3 Graph Model

2022
  • LHSC:Learning Hierarchical Semantic Correspondences for Cross-Modal Image-Text Retrieval(ICMR) [PDF]

  • IFRFGF:Improving Fusion of Region Features and Grid Features via Two-Step Interaction for Image-Text Retrieval(MM) [PDF]

  • CODER:Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval(ECCV) [PDF]

2021
  • HSGMP: Heterogeneous Scene Graph Message Passing for Cross-modal Retrieval(ICMR) [PDF]

  • WCGL:Wasserstein Coupled Graph Learning for Cross-Modal Retrieval(ICCV)[PDF]

2020
  • DSRAN:Learning Dual Semantic Relations with Graph Attention for Image-Text Matching(TCSVT) [PDF] [Code]

  • VSM:Visual-Semantic Matching by Exploring High-Order Attention and Distraction(CVPR) [PDF]

  • SGM:Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval(WACV) [PDF]

2019
  • KASCE:Knowledge Aware Semantic Concept Expansion for Image-Text Matching(IJCAI) [PDF]

  • VSRN:Visual Semantic Reasoning for Image-Text Matching(ICCV) [PDF] [Code]

2.3.2.4 Transformer

2022
  • DREN:Dual-Level Representation Enhancement on Characteristic and Context for Image-Text Retrieval(TCSVT) [PDF]

  • M2D-BERT:Multi-scale Multi-modal Dictionary BERT For Effective Text-image Retrieval in Multimedia Advertising(CIKM) [PDF]

  • ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval(CVPR) [PDF]

  • COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval(CVPR) [PDF]

  • EI-CLIP: Entity-aware Interventional Contrastive Learning for E-commerce Cross-modal Retrieval(CVPR) [PDF]

  • SSAMT:Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval(ICMR) [PDF]

  • TEAM:Token Embeddings Alignment for Cross-Modal Retrieval(MM) [PDF]

  • CAliC: Accurate and Efficient Image-Text Retrieval via Contrastive Alignment and Visual Contexts Modeling(MM) [PDF]

2021
  • GRAN:Global Relation-Aware Attention Network for Image-Text Retrieval(ICMR) [PDF]

  • PCME:Probabilistic Embeddings for Cross-Modal Retrieval(CVPR) [PDF] [Code]

2020
  • FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval(SIGIR) [PDF]
2019
  • PVSE:Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval(CVPR) [PDF] [Code]

2.3.2.5 Cross-modal Generation

2022
  • PCMDA:Paired Cross-Modal Data Augmentation for Fine-Grained Image-to-Text Retrieval(MM)[PDF]
2021
  • CRGN:Deep Relation Embedding for Cross-Modal Retrieval(TIP) [PDF][Code]

  • X-MRS:Cross-Modal Retrieval and Synthesis (X-MRS): Closing the Modality Gap in Shared Representation Learning(MM) [PDF][Code]

2020
  • AACR:Augmented Adversarial Training for Cross-Modal Retrieval(TMM) [PDF] [Code]
2018
  • LSCO:Learning Semantic Concepts and Order for Image and Sentence Matching(CVPR) [PDF]

  • TCCM:Towards Cycle-Consistent Models for Text and Image Retrieval(CVPR) [PDF]

  • GXN:Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models(CVPR) [PDF]

2017
  • 2WayNet:Linking Image and Text with 2-Way Nets(CVPR) [PDF]
2015
  • DVSA:Deep Visual-Semantic Alignments for Generating Image Descriptions(CVPR) [PDF]

2.4 Supervised cross-modal real-valued retrieval

2.4.1 Supervised shallow cross-modal real-valued retrieval

2.4.1.1 CCA

2022
  • MVMLCCA: Multi-view Multi-label Canonical Correlation Analysis for Cross-modal Matching and Retrieval(CVPRW) [PDF] [Code]
2015
  • ml-CCA: Multi-Label Cross-modal Retrieval(ICCV) [PDF] [Code]
2014
  • cluster-CCA: Cluster Canonical Correlation Analysis(AISTATS) [PDF]
2012
  • GMA: Generalized Multiview Analysis: A Discriminative Latent Space(CVPR) [PDF] [Code]

2.4.1.2 Dictionary Learning

2018
  • JDSLC: Joint Dictionary Learning and Semantic Constrained Latent Subspace Projection for Cross-Modal Retrieval(CIKM) [PDF]
2016
  • DDL: Discriminative Dictionary Learning With Common Label Alignment for Cross-Modal Retrieval(TMM) [PDF]
2014
  • CMSDL: Cross-Modality Submodular Dictionary Learning for Information Retrieval(CIKM) [PDF]
2013
  • SliM2: Supervised Coupled Dictionary Learning with Group Structures for Multi-Modal Retrieval(AAAI) [PDF]

2.4.1.3 Feature Mapping

2017
  • MDSSL: Cross-Modal Retrieval Using Multiordered Discriminative Structured Subspace Learning(TMM) [PDF]

  • JLSLR: Joint Latent Subspace Learning and Regression for Cross-Modal Retrieval(SIGIR) [PDF]

2016
  • JFSSL: Joint Feature Selection and Subspace Learning for Cross-Modal Retrieval(TPAMI) [PDF] [Code]

  • MDCR: Modality-Dependent Cross-Media Retrieval(TIST) [PDF]

  • CRLC: Cross-modal Retrieval with Label Completion(MM) [PDF]

2013
  • JGRHML: Heterogeneous Metric Learning with Joint Graph Regularization for Cross-Media Retrieval(AAAI) [PDF] [Code]

  • LCFS: Learning Coupled Feature Spaces for Cross-modal Matching(ICCV) [PDF]

2011
  • Multi-NPP: Learning Multi-View Neighborhood Preserving Projections(ICML) [PDF]

2.4.1.4 Topic Model

2014
  • M3R: Multi-modal Mutual Topic Reinforce Modeling for Cross-media Retrieval(MM) [PDF]

  • NPBUS: Nonparametric Bayesian Upstream Supervised Multi-Modal Topic Models(WSDM) [PDF]

2.4.1.5 Other Shallow

2019
  • CMOS: Online Asymmetric Metric Learning With Multi-Layer Similarity Aggregation for Cross-Modal Retrieval(TIP) [PDF]
2017
  • CMOS: Online Asymmetric Similarity Learning for Cross-Modal Retrieval(CVPR) [PDF]
2016
  • PL-ranking: A Novel Ranking Method for Cross-Modal Retrieval(MM) [PDF]

  • RL-PLS: Cross-modal Retrieval by Real Label Partial Least Squares(MM) [PDF]

2013
  • PFAR: Parallel Field Alignment for Cross Media Retrieval(MM) [PDF]

2.4.2 Supervised deep cross-modal real-valued retrieval

2.4.2.1 Naive Network

2022
  • C3CMR: Cross-Modality Cross-Instance Contrastive Learning for Cross-Media Retrieval(MM) [PDF]
2020
  • ED-Net: Event-Driven Network for Cross-Modal Retrieval(CIKM) [PDF]
2019
  • DSCMR: Deep Supervised Cross-modal Retrieval(CVPR) [PDF] [Code]

  • SAM: Cross-Modal Subspace Learning with Scheduled Adaptive Margin Constraints(MM) [PDF]

2017
  • deep-SM: Cross-Modal Retrieval With CNN Visual Features: A New Baseline(TCYB) [PDF] [Code]

  • CCL: Cross-modal Correlation Learning With Multigrained Fusion by Hierarchical Network(TMM) [PDF]

  • MSFN: Cross-media Retrieval by Learning Rich Semantic Embeddings of Multimedia(MM) [PDF]

  • MNiL: Multi-Networks Joint Learning for Large-Scale Cross-Modal Retrieval(MM) [PDF] [Code]

2016
  • MDNN: Effective deep learning-based multi-modal retrieval(VLDB) [PDF]
2015
  • RE-DNN: Deep Semantic Mapping for Cross-Modal Retrieval(ICTAI) [PDF]

  • C2MLR: Deep Compositional Cross-modal Learning to Rank via Local-Global Alignment(MM) [PDF]

2.4.2.2 GAN

2022
  • JFSE: Joint Feature Synthesis and Embedding: Adversarial Cross-Modal Retrieval Revisited(TPAMI) [PDF] [Code]
2021
  • AACR: Augmented Adversarial Training for Cross-Modal Retrieval(TMM) [PDF] [Code]
2018
  • CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning(TMM) [PDF] [Code]
2017
  • ACMR: Adversarial Cross-Modal Retrieval(MM) [PDF] [Code]

2.4.2.3 Graph Model

2022
  • AGCN: Adversarial Graph Convolutional Network for Cross-Modal Retrieval(TCSVT) [PDF]

  • ALGCN: Adaptive Label-Aware Graph Convolutional Networks for Cross-Modal Retrieval(TMM) [PDF]

  • HGE: Cross-Modal Retrieval with Heterogeneous Graph Embedding(MM) [PDF]

2021
  • GCR: Exploring Graph-Structured Semantics for Cross-Modal Retrieval(MM) [PDF] [Code]

  • DAGNN: Dual Adversarial Graph Neural Networks for Multi-label Cross-modal Retrieval(AAAI) [PDF]

2018
  • SSPE: Learning Semantic Structure-preserved Embeddings for Cross-modal Retrieval(MM) [PDF]

2.4.2.4 Transformer

2021
  • RLCMR: Rethinking Label-Wise Cross-Modal Retrieval from A Semantic Sharing Perspective(IJCAI) [PDF]

2.5 Cross-modal retrieval under special retrieval scenarios

2.5.1 Semi-Supervised (Real-valued)

2020
  • SSCMR:Semi-Supervised Cross-Modal Retrieval With Label Prediction(TMM) [PDF]
2019
  • A3VSE:Annotation Efficient Cross-Modal Retrieval with Adversarial Attentive Alignment(MM) [PDF]

  • ASFS:Adaptive Semi-Supervised Feature Selection for Cross-Modal Retrieval(TMM) [PDF]

2018
  • GSS-SL:Generalized Semi-supervised and Structured Subspace Learning for Cross-Modal Retrieval(TMM) [PDF]
2017
  • SSDC:Semi-supervised Distance Consistent Cross-modal Retrieval(VSCC)[PDF]
2013
  • JRL:Learning Cross-Media Joint Representation With Sparse and Semisupervised Regularization(TCSVT) [PDF][Code]
2012
  • MVML-GL:Multiview Metric Learning with Global Consistency and Local Smoothness(TIST) [PDF]

2.5.2 Semi-Supervised (Hashing)

2020
  • SCH-GAN:Semi-Supervised Cross-Modal Hashing by Generative Adversarial Network(TCYB) [PDF] [Code]

  • SGCH:Semi-supervised graph convolutional hashing network for large-scale cross-modal retrieval(ICIP) [PDF]

2019
  • SSDQ:Semi-supervised Deep Quantization for Cross-modal Search(MM) [PDF]

  • S3PH:Semi-supervised semantic-preserving hashing for efficient cross-modal retrieval(ICME) [PDF]

2017
  • AUSL:Adaptively Unified Semi-supervised Learning for Cross-Modal Retrieval(IJCAI) [PDF]
2016
  • NPH:Neighborhood-Preserving Hashing for Large-Scale Cross-Modal Search(MM) [PDF]

2.5.3 Imbalance (Real-valued)

2021
  • PAN: Prototype-based Adaptive Network for Robust Cross-modal Retrieval(SIGIR) [PDF]

  • MCCN: Multimodal Coordinated Clustering Network for Large-Scale Cross-modal Retrieval(MM) [PDF]

2020
  • DAVAE:Incomplete Cross-modal Retrieval with Dual-Aligned Variational Autoencoders(MM) [PDF]
2015
  • SCDL:Semi-supervised Coupled Dictionary Learning for Cross-modal Retrieval in Internet Images and Texts(MM) [PDF]

  • LGCFL:Learning Consistent Feature Representation for Cross-Modal Multimedia Retrieval(TMM) [PDF]

2.5.4 Imbalance (Hashing)

2020
  • RUCMH:Robust Unsupervised Cross-modal Hashing for Multimedia Retrieval(TOIS) [PDF]

  • ATFH-N:Adversarial Tri-Fusion Hashing Network for Imbalanced Cross-Modal Retrieval(TETCI) [PDF]

  • FlexCMH:Flexible Cross-Modal Hashing(TNNLS) [PDF]

2019
  • TFNH:Triplet Fusion Network Hashing for Unpaired Cross-Modal Retrieval(ICMR) [PDF] [Code]

  • CALM:Collective Affinity Learning for Partial Cross-Modal Hashing(TIP) [PDF]

  • MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval(TIP) [PDF] [Code]

  • GSPH:Generalized Semantic Preserving Hashing for Cross-Modal Retrieval(TIP) [PDF]

2018
  • DAH:Dense Auto-Encoder Hashing for Robust Cross-Modality Retrieval(MM) [PDF]
2017
  • GSPH:Generalized Semantic Preserving Hashing for n-Label Cross-Modal Retrieval(CVPR) [PDF] [Code]

2.5.5 Incremental

2021
  • MARS: Learning Modality-Agnostic Representation for Scalable Cross-Media Retrieval(TCSVT) [PDF]

  • CCMR:Continual learning in cross-modal retrieval(CVPR) [PDF]

  • SCML:Real-world Cross-modal Retrieval via Sequential Learning(TMM) [PDF]

2020
  • ATTL-CEL:Adaptive Temporal Triplet-loss for Cross-modal Embedding Learning(MM)[PDF]
2019
  • SVHNs:Separated Variational Hashing Networks for Cross-Modal Retrieval(MM) [PDF]

  • ECMH:Extensible Cross-Modal Hashing(IJCAI) [PDF] [Code]

2018
  • TempXNet:Temporal Cross-Media Retrieval with Soft-Smoothing(MM) [PDF]

2.5.6 Noise

2022
  • DECL: Deep Evidential Learning with Noisy Correspondence for Cross-modal Retrieval(MM) (PDF) [Code]

  • ELRCMR: Early-Learning regularized Contrastive Learning for Cross-Modal Retrieval with Noisy Labels(MM) (PDF)

  • CMMQ: Mutual Quantization for Cross-Modal Search with Noisy Labels(CVPR) (PDF)

2021
  • MRL: Learning Cross-Modal Retrieval with Noisy Labels(CVPR) (PDF) [Code]
2018
  • WSJE: Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval(MM) (PDF)

2.5.7 Cross-Domain

2021
  • M2GUDA: Multi-Metrics Graph-Based Unsupervised Domain Adaptation for Cross-Modal Hashing(ICMR) (PDF)

  • ACP: Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval(CVPR) (PDF)

2020
  • DASG: Unsupervised Cross-Media Retrieval Using Domain Adaptation With Scene Graph(TCSVT) (PDF)

2.5.8 Zero-Shot

2020
  • LCALE: Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval(AAAI) (PDF)

  • CFSA: Correlated Features Synthesis and Alignment for Zero-shot Cross-modal Retrieval(SIGIR) (PDF)

2019
  • ZS-CMR: Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval(TIP) (PDF)

2.5.9 Few-Shot

2021
  • SOCMH: Know Yourself and Know Others: Efficient Common Representation Learning for Few-shot Cross-modal Retrieval(ICMR) (PDF)

2.5.10 Online Learning

2020
  • CMOLRS: Online Fast Adaptive Low-Rank Similarity Learning for Cross-Modal Retrieval(TMM) (PDF) [Code]

  • LEMON: Label Embedding Online Hashing for Cross-Modal Retrieval(MM) (PDF) [Code]

2019
  • FOMH: Flexible Online Multi-modal Hashing for Large-scale Multimedia Retrieval(MM) (PDF) [Code]
2017
  • OCMSR: Online Cross-Modal Scene Retrieval by Binary Representation and Semantic Graph(MM) (PDF)
2016
  • OCMH: Online cross-modal hashing for web image retrieval(AAAI) (PDF)

2.5.11 Hierarchical

2020
  • SHDCH: Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval(MM) (PDF) [Code]
2019
  • HiCHNet: Supervised Hierarchical Cross-Modal Hashing(SIGIR) (PDF) [Code]

2.5.12 Fine-grained

2022
  • PCMDA: Paired Cross-Modal Data Augmentation for Fine-Grained Image-to-Text Retrieval(MM) (PDF)
2019
  • FGCrossNet: A New Benchmark and Approach for Fine-grained Cross-media Retrieval(MM) (PDF) [Code]

3. Usage

3.1 Datasets

  • Graph Model--GCR

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/1YmW8Zz2uK3AgCs6pDEoA8A?pwd=21xh
Code: 21xh
  • Unsupervised cross-modal real-valued

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/1hBNo8gBSyLbik0ka1POhiQ
Code: cc53
  • Quantization--CDQ

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/1mO1hdsJR2FN5xEAv2e7eaw?pwd=us9v
Code: us9v
  • GAN--CPAH

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/145Zool0FUb3758EeSxtHBw?pwd=mxt7
Code: mxt7
  • Transformer--DCHMT

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/1UHr2NVjFkTjLXXQ8Izy5WA?pwd=qfsj
Code: qfsj
  • Feature Mapping(Sample Constraint)(Label Constraint)--MDBE

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/15BtQ_Zz7UihZBW6KXTTodA?pwd=ir7g
Code: ir7g
  • Feature Mapping(Sample Constraint)(Common Hamming)--RoPH

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/1_uIulkuxcIcubvl5u3zsOA?pwd=46c4
Code: 46c4
  • Hierarchical--SHDCH

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/1-CsIJbvz3IFsmDgYk9BwYg?pwd=7hd8
Code: 7hd8
  • Noise--MRL

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/1FIrB-gXJa9VHKzLRQZf30Q?pwd=g3qt
Code: g3qt
  • Online learning--LEMON

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/1s5SnnAXo5wK7cmRs3zNq4w?pwd=jxjo
Code: jxjo
  • Fine-grained--FGCrossNet

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/1OYxCLmNKvPzwLIs5snTOlA?pwd=r80g
Code: r80g
  • Noise--DECL

Dataset Link:

Baidu Yun Link: https://pan.baidu.com/s/1FcxkwOuuiUXnIl1LAatDLA?pwd=nl2z
Code: nl2z
