cDCGAN model for audio-to-image generation: a cross-modal analysis using deep-learning techniques
Updated Jan 10, 2024 · Python
Source code of our KDD 2024 paper "Improving the Consistency in Cross-Lingual Cross-Modal Retrieval with 1-to-K Contrastive Learning"
Search for targeted pedestrians using text queries.
Implementation of the "Objects that Sound" and "Look, Listen, and Learn" papers by Relja Arandjelović and Andrew Zisserman
AlignCLIP: Improving Cross-Modal Alignment in CLIP
Create Disco Diffusion artworks in one line
Implementation of "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives" in TensorFlow.
Code release of "Collective Deep Quantization for Efficient Cross-Modal Retrieval" (AAAI 17)
Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval [ECCV 2020]
Cross-modal convolutional neural networks
Code for COBRA: Contrastive Bi-Modal Representation Algorithm (https://arxiv.org/abs/2005.03687)
Official implementation of the paper "ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval"
DSCNet Visible-Infrared Person ReID (TIFS 2022)
[IEEE T-IP 2021] Semantics-aware Adaptive Knowledge Distillation for Cross-modal Action Recognition
Python implementation of cross-modal hashing algorithms
Generalized cross-modal NNs; new audiovisual benchmark (IEEE TNNLS 2019)
Code for the paper "Direct Speech-to-Image Translation"
The implementation of the AAAI-17 paper "Collective Deep Quantization for Efficient Cross-Modal Retrieval"
Code, dataset and models for our CVPR 2022 publication "Text2Pos"
[Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
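Many of the repositories above (CLIP-style alignment, VSE++, 1-to-K contrastive learning, COBRA) revolve around contrastive cross-modal retrieval: embed both modalities into a shared space and pull matched pairs together while pushing mismatched pairs apart. As a minimal NumPy sketch, not tied to any specific repository above, a symmetric InfoNCE-style loss over paired embeddings might look like:

```python
import numpy as np

def cross_modal_infonce(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over paired embeddings (illustrative sketch).

    img_emb, txt_emb: (N, D) arrays where row i of each forms a matched pair.
    Returns the mean of the image->text and text->image retrieval losses.
    """
    # L2-normalize so dot products are cosine similarities.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature      # (N, N) similarity matrix
    labels = np.arange(len(logits))         # matched pairs sit on the diagonal

    def ce(l):
        # Row-wise cross-entropy with the diagonal as the target class.
        l = l - l.max(axis=1, keepdims=True)            # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average the two retrieval directions (image->text and text->image).
    return 0.5 * (ce(logits) + ce(logits.T))
```

The function name and the use of NumPy rather than a deep-learning framework are assumptions for illustration; the CLIP and 1-to-K papers formulate the same idea with learned encoders and backpropagation.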