Code implementation of paper "SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval" (ACM TOMM 2024).
[IROS 2024] This repository contains the implementation of our paper: ModaLink: Unifying Modalities for Efficient Image-to-PointCloud Place Recognition
Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text matching/retrieval models.
[ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives
Research project at AI·Robotics Institute, KIST
[KDD 2024] Improving the Consistency in Cross-Lingual Cross-Modal Retrieval with 1-to-K Contrastive Learning
[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval
[TIP2024] The code of “Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching”
Official PyTorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024)
The Unified Code of Image-Text Retrieval for Further Exploration.
[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”
[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”
[IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment
[ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
[CVPR 2023 Highlight] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)
Official PyTorch implementation of "Probabilistic Cross-Modal Embedding" (CVPR 2021)
Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)
Unofficial implementation of the paper "Objects that Sound" (ECCV 2018)