| Generalized Few-Shot Point Cloud Segmentation via Geometric Words | | | ➖ |
| Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via Geometry-Guided Cross-View Transformer | | | ➖ |
| EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual Localization | | | ➖ |
| Multi-Task View Synthesis with Neural Radiance Fields | | | ➖ |
| Multi-Task Learning with Knowledge Distillation for Dense Prediction | ➖ | | ➖ |
| Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World | | | ➖ |
| CMDA: Cross-Modality Domain Adaptation for Nighttime Semantic Segmentation | | | ➖ |
| VQA-GNN: Reasoning with Multimodal Knowledge via Graph Neural Networks for Visual Question Answering | ➖ | | ➖ |
| Disentangle then Parse: Night-Time Semantic Segmentation with Illumination Disentanglement | | | ➖ |
| Visual Traffic Knowledge Graph Generation from Scene Images | | | ➖ |
| Agglomerative Transformer for Human-Object Interaction Detection | | | ➖ |
| 3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for Robust 6D Pose Estimation | | | ➖ |
| HiLo: Exploiting High Low Frequency Relations for Unbiased Panoptic Scene Graph Generation | | | ➖ |
| RLIPv2: Fast Scaling of Relational Language-Image Pre-Training | | | ➖ |
| UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase | | | ➖ |
| See More and Know More: Zero-Shot Point Cloud Segmentation via Multi-Modal Visual Data | | | ➖ |
| Compositional Feature Augmentation for Unbiased Scene Graph Generation | | | ➖ |
| Multi-Weather Image Restoration via Domain Translation | | | ➖ |
| CLIPTER: Looking at the Bigger Picture in Scene Text Recognition | ➖ | | ➖ |
| Towards Models that Can See and Read | ➖ | | ➖ |
| SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving | | | ➖ |
| DDP: Diffusion Model for Dense Visual Prediction | | | ➖ |
| Understanding 3D Object Interaction from a Single Image | | | |
| ObjectSDF++: Improved Object-Compositional Neural Implicit Surfaces | | | |
| Improving Equivariance in State-of-the-Art Supervised Depth and Normal Predictors | | | ➖ |
| CrossMatch: Source-Free Domain Adaptive Semantic Segmentation via Cross-Modal Consistency Training | ➖ | | ➖ |
| Semantic Attention Flow Fields for Monocular Dynamic Scene Decomposition | | | ➖ |
| Holistic Geometric Feature Learning for Structured Reconstruction | | | ➖ |
| Scalable Multi-Temporal Remote Sensing Change Data Generation via Simulating Stochastic Change Process | | | ➖ |
| TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts | ➖ | | ➖ |
| Thinking Image Color Aesthetics Assessment: Models, Datasets and Benchmarks | | | ➖ |
| STEERER: Resolving Scale Variations for Counting and Localization via Selective Inheritance Learning | | | ➖ |
| Object-Aware Gaze Target Detection | | | ➖ |
| Weakly Supervised Referring Image Segmentation with Intra-Chunk and Inter-Chunk Consistency | ➖ | | ➖ |
| Vision Relation Transformer for Unbiased Scene Graph Generation | | | |
| DDIT: Semantic Scene Completion via Deformable Deep Implicit Templates | ➖ | | ➖ |
| DQS3D: Densely-Matched Quantization-Aware Semi-Supervised 3D Detection | | | ➖ |
| Shape Anchor Guided Holistic Indoor Scene Understanding | | | ➖ |
| SGAligner: 3D Scene Alignment with Scene Graphs | | | |
| Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation | | | |