The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 33,637 4,872 Updated Feb 23, 2025

Dmmm1997 / C3VG

[AAAI2025] - Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints

13 Updated Dec 11, 2024

luogen1996 / SimREC

A lightweight codebase for referring expression comprehension and segmentation

Python 53 4 Updated May 21, 2022

DerrickWang005 / CRIS.pytorch

An official PyTorch implementation of the CRIS paper

Python 269 38 Updated Jun 9, 2024

crmauceri / refer_sunspot

Python3 Referring Expression Datasets API

Jupyter Notebook 7 Updated Jan 20, 2025

Adonis-galaxy / DepthCLIP

Official implementation of "Can Language Understand Depth?"

Python 82 7 Updated Oct 21, 2022

shashankvkt / DoRA_ICLR24

This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video"

Python 84 12 Updated May 17, 2024

kdexd / digit-classifier

A single handwritten digit classifier, using the MNIST dataset. Pure Numpy.

Python 786 83 Updated Oct 12, 2019

lichengunc / refer

Referring Expression Datasets API

Jupyter Notebook 504 80 Updated Aug 27, 2024

sosppxo / 3D-STMN

[AAAI 2024] The official implementation of the paper "3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation"

Python 40 3 Updated Dec 20, 2023

sosppxo / MDIN

[MM2024 Oral] 3D-GRES: Generalized 3D Referring Expression Segmentation

Python 35 Updated Dec 15, 2024

fawnliu / TRIS

[ICCV 2023] Official code release of our paper "Referring Image Segmentation Using Text Supervision"

Python 69 3 Updated Oct 13, 2024

deeptibhegde / CLIP-goes-3D

Official code release of "CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition"

Python 232 13 Updated May 1, 2023

shenyunhang / APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 556 42 Updated May 8, 2024

3dlg-hcvc / DuoduoCLIP

[ICLR 2025] Duoduo CLIP: Efficient 3D Understanding with Multi-View Images

Python 50 3 Updated Mar 20, 2025

Jiho Choi JihoChoi

Lists (22)

🤖 AI

💯 Algorithm

🔍 BigQuery

🔖

📎 CLIP / VLM

Data Mining

👁️‍🗨️ Vision

Game Bot

🧑‍💻 Git

🌐 GNN

👨 Personal Web Templates

💬 NLP

💻 nodesktop

🧊 object-centric learning

📖 Open Vocabulary

🎑 Scene Graph

📜 Templates

⚙️ Setup, dotfile

🎇 Part Segmentation

⭐ Hetero GNN / CL

🖥️ Ubuntu

🎲 Wordle

Starred repositories

self-attention