Stars
Falcon: A Remote Sensing Vision-Language Foundation Model
The Python library for real-time communication
📄 A curated list of awesome .cursorrules files
Real-time pose estimation pipeline with 🤗 Transformers
Inference and fine-tuning examples for vision models from 🤗 Transformers
(CVPR 2025) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models"
GeoPixel: A Pixel Grounding Large Multimodal Model for Remote Sensing is specifically developed for high-resolution remote sensing image analysis, offering advanced multi-target pixel grounding cap…
BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.
Official code for "SRFormer: Permuted Self-Attention for Single Image Super-Resolution" (ICCV 2023) and SRFormerV2
Open and efficient video watermarking
(NeurIPS 2023) CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection
[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
A course on aligning smol models.
A mini-framework for evaluating LLM performance on the Bulls and Cows number guessing game, supporting multiple LLM providers.
D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight]
[NeurIPS 2024] Code release for "Segment Anything without Supervision"
Unbearably fast near-real-time hybrid runtime-static type-checking in pure Python.
[ICLR 2025] Diffusion Feedback Helps CLIP See Better
CoTracker is a model for tracking any point (pixel) on a video.
[ECCV 2024 & NeurIPS 2024] Official implementation of the paper TAPTR & TAPTRv2 & TAPTRv3
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Official repo for VGGHeads: 3D Multi Head Alignment with a Large-Scale Synthetic Dataset.