vision-foundation-model

Here are 16 public repositories matching this topic...

Surrey-UP-Lab / RegionSpot

Recognize Any Regions

open-world object-detection zero-shot instance-segmentation auto-labeling vision-language-pretraining open-vocabulary vision-language-model multimodal-representation-learning vision-foundation-model vision-language-foundation-model

Updated Dec 18, 2024
Python

wolo-wolo / FSFM

FSvFM: A Generalizable Face Security vision Foundation Model via Self-Supervised Facial Representation Learning (CVPR25)

face-antispoofing self-supervised-learning deepfake-detection vision-foundation-model

Updated Jun 6, 2025
Python

itsqyh / Awesome-LMMs-Mechanistic-Interpretability

A curated collection of resources focused on the Mechanistic Interpretability (MI) of Large Multimodal Models (LMMs). This repository aggregates surveys, blog posts, and research papers that explore how LMMs represent, transform, and align multimodal information internally.

generative-model generative paperlist vision-models large-language-models mechanistic-interpretability large-vision-language-models large-multimodal-models vision-foundation-model

Updated Jun 18, 2025

tue-mps / benchmark-vfm-ss

benchmark semantic-segmentation vfm vision-transformer foundation-model vision-foundation-model

Updated May 13, 2025
Python

SliMM-X / CoMP-MM

Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"

large-multimodal-models vision-foundation-model continual-pre-training

Updated Apr 3, 2025
Python

JihyeokKim / MonoDINO-DETR

MonoDINO-DETR: Depth-Enhanced Monocular 3D Object Detection Using a Vision Foundation Model

monocular-3d-detection detection-transformer vision-foundation-model

Updated May 27, 2025
Python

tbhou / sigma

This repo collects recent research work on Generative AI. It provides simple implementations to understand the ideas and follow-up discussions to inspire future work.

video-generation generative-ai vision-foundation-model

Updated Apr 24, 2025
Python

mathpluscode / CineMA

A Vision Foundation Model for Cine Cardiac Magnetic Resonance Imaging

cardiac cardiac-segmentation cmr self-supervised-learning disease-detection vision-foundation-model

Updated Jun 16, 2025
Python

zdk258 / CorrCLIP

Implementation of "CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation"

clip open-vocabulary-semantic-segmentation segment-anything-model vision-foundation-model

Updated Nov 27, 2024

PardisTaghavi / contrastive-distillation

Implementation of CAST: Contrastive Adaptation and Distillation for Semi-Supervised Instance Segmentation.

computer-vision knowledge-distillation instance-segmentation semi-supervised contrastive-learning dense-contrastive-learning grounding-dino vision-foundation-model sam2 semi-supervised-knowledge-distillation

Updated Jun 9, 2025

nguyennpa412 / simple-multimodal-ai

Simple Gradio application integrated with Hugging Face multimodal models to support a visual question answering chatbot and more features

docker text-to-speech computer-vision gradio vlm visual-question-answering llm mllm vision-foundation-model image-text-to-text florence-2 xtts-v2 mini-internvl

Updated Aug 16, 2024
Python

jinyang06 / SamGOP

"Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model"

vision-foundation-model gaze-object-prediction dedtection-and-segmentation

Updated May 22, 2025
Python

havrylovv / iSegProbe

Codebase for probing VFMs and Feature Upsamplers using Interactive Segmentation.

upsampling probing interactive-segmentation vision-foundation-model feature-upsampling

Updated May 9, 2025
Python

antonio-f / Florence-2-test

Florence-2 quick test

python tutorial jupyter-notebook image-captioning image-to-text colab-notebook visual-grounding referring-expression-comprehension huggingface-transformers multimodal-large-language-models vision-foundation-model florence-2

Updated Aug 15, 2024
Jupyter Notebook

EricLee0224 / ProRobo3D

ProRobo3D Benchmark to be released...

robotics manipulation imitation-learning 3d vision-foundation-model vision-language-action-model

Updated Feb 11, 2025

dwiaskor99 / contrastive-distillation

CAST is a method for semi-supervised instance segmentation that efficiently trains a compact model using both labeled and unlabeled data. This repository contains the implementation of our three-stage pipeline, showcasing contrastive adaptation and distillation techniques. 🐙🌟

benchmark computer-vision model-compression person-re-identification graph-neural-networks self-supervised-learning 3d-representation-learning data-free dense-contrastive-learning knowledge-dist image-based-person-re-id neurips-2023 grounding-dino vision-foundation-model contrastive-cot-prompting distillation-contrastive-decoding sam2 semi-supervised-knowledge-distillation

Updated Jun 18, 2025
