Recognize Any Regions
Updated Dec 18, 2024 - Python
FSvFM: A Generalizable Face Security vision Foundation Model via Self-Supervised Facial Representation Learning (CVPR25)
A curated collection of resources focused on the Mechanistic Interpretability (MI) of Large Multimodal Models (LMMs). This repository aggregates surveys, blog posts, and research papers that explore how LMMs represent, transform, and align multimodal information internally.
Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"
MonoDINO-DETR: Depth-Enhanced Monocular 3D Object Detection Using a Vision Foundation Model
This repo collects recent research work on generative AI. It provides simple implementations to help understand the ideas, along with follow-up discussions to inspire future work.
A Vision Foundation Model for Cine Cardiac Magnetic Resonance Imaging
Implementation of "CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation"
Implementation of CAST: Contrastive Adaptation and Distillation for Semi-Supervised Instance Segmentation.
Simple Gradio application integrated with Hugging Face multimodal models to support a visual question answering chatbot and other features
"Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model"
Codebase for probing VFMs and Feature Upsamplers using Interactive Segmentation.
Florence-2 quick test
ProRobo3D Benchmark to be released...
CAST is a method for semi-supervised instance segmentation that efficiently trains a compact model using both labeled and unlabeled data. This repository contains the implementation of our three-stage pipeline, showcasing contrastive adaptation and distillation techniques. 🐙🌟