Here are 73 public repositories matching this topic...
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ support visual intelligence development!
Updated
Jun 10, 2024
Python
Build high-performance AI models with modular building blocks
Updated
Jun 9, 2024
Python
[ICLR 2024 Spotlight] Deep Symbolic Regression with Multimodal Pretraining
Updated
Jun 8, 2024
Python
[ICLR 2024 Spotlight] This is the official code for the paper "SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training"
Updated
Jun 8, 2024
Python
Public repository of our work in the search for an optimal multi-view crop classifier (considering encoder architectures and fusion strategies)
Updated
Jun 5, 2024
Python
This repository contains code to download data for the preprint "MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning"
Updated
Jun 4, 2024
Python
Public repository of our assessment work in missing views for EO applications
Updated
Jun 4, 2024
Python
Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
Updated
May 31, 2024
Python
[IVS'24] UniBEV: the official implementation of UniBEV
Updated
May 30, 2024
Python
[CVPR 2024] Official PyTorch Code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"
Updated
May 25, 2024
Python
[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Updated
May 24, 2024
Python
Code for the ICML 2024 paper: "EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence"
Updated
May 22, 2024
Python
The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters"
Updated
May 17, 2024
Python
[CVPR 2024] Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
Updated
May 10, 2024
Python
🥂 Gracefully face hCaptcha challenge with MoE(ONNX) embedded solution.
Updated
Apr 20, 2024
Python
Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar
Updated
Apr 4, 2024
Python
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
Updated
Apr 3, 2024
Python
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Updated
Apr 3, 2024
Python
Official PyTorch repository for CG-DETR: "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"
Updated
Apr 2, 2024
Python
Under review. [IROS 2024] PGA: Personalizing Grasping Agents with Single Human-Robot Interaction
Updated
Mar 30, 2024
Python