Stars
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
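As a quick illustration of that task, here is a minimal zero-shot prediction sketch following the usage shown in the openai/CLIP README; the image filename and candidate captions are placeholders for this example.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)  # one of the published CLIP checkpoints

# Placeholder image and candidate captions; CLIP scores each caption against the image.
image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
text = clip.tokenize(["a diagram", "a dog", "a cat"]).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1)  # probability per caption

print(probs)  # the highest-probability caption is the "most relevant text snippet"
```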
V-SWIFT: Training a Small VideoMAE Model on a Single Machine in a Day
[CVPR 2023] An academic alternative to Tesla's occupancy network for autonomous driving.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Implements VAR+CLIP for text-to-image (T2I) generation
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
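Since Depth Anything positions itself as a foundation model for monocular depth, a sketch of single-image inference may help; it assumes the Hugging Face `transformers` depth-estimation pipeline and the `LiheYoung/depth-anything-small-hf` checkpoint hosted on the Hub, with a placeholder input image.

```python
from transformers import pipeline
from PIL import Image

# Model id is an assumption: the small Depth Anything checkpoint on the Hugging Face Hub.
pipe = pipeline(task="depth-estimation", model="LiheYoung/depth-anything-small-hf")

result = pipe(Image.open("example.jpg"))
result["depth"].save("depth.png")  # per-pixel relative depth rendered as a grayscale image
```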
Questions that I ask myself at the end of each year and each decade.
[CVPR 2024 Highlight] GenAD: Generalized Predictive Model for Autonomous Driving & Foundation Models in Autonomous System
Visualizer for neural network, deep learning and machine learning models
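For the visualizer above (Netron), the README also documents a Python entry point; a minimal sketch with a placeholder model file:

```python
import netron

# Opens a local server and renders the model graph in the browser;
# "model.onnx" is a placeholder for any supported model file.
netron.start("model.onnx")
```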
Official PyTorch implementation of FB-BEV & FB-OCC - Forward-backward view transformation for vision-centric autonomous driving perception
Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D (ECCV 2020)
[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving
A concise but complete full-attention transformer with a set of promising experimental features from various papers
[ICCV'23 Workshop] SAM3D: Segment Anything in 3D Scenes
Official Code for DragGAN (SIGGRAPH 2023)
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
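For reference, a minimal point-prompted inference sketch along the lines of the repository's example notebooks; the checkpoint path, image filename, and prompt coordinates are placeholders.

```python
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load a SAM checkpoint (path and model type are placeholders for this sketch).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt with a single foreground point; SAM returns candidate masks with quality scores.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),  # 1 = foreground point
    multimask_output=True,
)
```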
[ICCV 2023] OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception
Use Python 3 to convert a depth image into an HHA image
Lidar Point Cloud Ground Segmentation Using PatchWork++
Download the LaTeX source of multiple arXiv papers with one click
Vision-Centric BEV Perception: A Survey