This repository collects research papers on large foundation models for scenario generation and analysis in autonomous driving. It is continuously updated to track the latest work.

TUM-AVS/FM-AD-Survey

Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis 🚗

Paper Badge Stars Badge Forks Badge Pull Requests Badge Issues Badge License Badge

This repository collects research, implementations, and resources related to Foundation Models for Scenario Generation and Analysis in autonomous driving. It is maintained by TUM-AVS (Professorship of Autonomous Vehicle Systems at Technical University of Munich) and is continuously updated to track the latest work in the community.

🔥 Updates

  • [Jun.2025] Paper uploaded to arXiv
  • [May.2025] Repository initialized

🤝 Citation

For more details, please see our paper, Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis. If you find the paper and this repository helpful, please consider citing it as follows:

@misc{gao2025foundationmodelsautonomousdriving,
  title={Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis},
  author={Yuan Gao and Mattia Piccinini and Yuchen Zhang and Dingrui Wang and Korbinian Moller and Roberto Brusnicki and Baha Zarrouki and Alessio Gambi and Jan Frederik Totz and Kai Storms and Steven Peters and Andrea Stocco and Bassam Alrifaee and Marco Pavone and Johannes Betz},
  journal={TBD},
  year={2025},
  eprint={2506.11526},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2506.11526}, 
}

📃 Introduction

Foundation models are large-scale, pre-trained models that can be adapted to a wide range of downstream tasks. In the context of autonomous driving, foundation models offer a powerful approach to scenario generation and analysis, enabling more comprehensive and realistic testing, validation, and verification of autonomous driving systems. This repository aims to collect and organize research, tools, and resources in this important field.

📈 Publication Timeline

The following figure shows the evolution of foundation model research in autonomous driving scenario generation and analysis over time:

🔍 Search Methodology

The papers in this survey were found by searching the Google Scholar database with the keywords listed below. Each keyword was entered either on its own or in combination with other keywords from the list. The search covers publications up to May 2025.

Keywords:

  • Foundation Model Types: Foundation Models, Large Language Models (LLMs), Vision-Language Models (VLMs), Multimodal Large Language Models (MLLMs), Diffusion Models (DMs), World Models (WMs), Generative Models (GMs)
  • Scenario Generation & Analysis: Scenario Generation, Scenario Simulation, Traffic Simulation, Scenario Testing, Scenario Understanding, Driving Scene Generation, Scene Reasoning, Risk Assessment, Safety-Critical Scenarios, Accident Prediction
  • Application Context: Autonomous Driving, Self-Driving Vehicles, AV Simulation, Driving Video Generation, Traffic Datasets, Closed-Loop Simulation, Safety Assurance
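
As a rough illustration of how keywords can be entered "individually or in combination", the sketch below builds query strings from an abridged copy of the three keyword groups. The quoting style, the `build_queries` helper, and the choice to take exactly one keyword per group are assumptions made for illustration; the survey does not specify the exact query syntax that was used.

```python
from itertools import product

# Abridged keyword groups copied from the list above.
MODEL_TYPES = ["Foundation Models", "Large Language Models", "Diffusion Models"]
TASKS = ["Scenario Generation", "Risk Assessment"]
CONTEXTS = ["Autonomous Driving", "Closed-Loop Simulation"]


def build_queries(*groups):
    """Return one query string per combination, taking one keyword from each group.

    Each keyword is wrapped in double quotes so it is treated as an exact phrase
    (hypothetical formatting, not taken from the survey).
    """
    return [" ".join(f'"{kw}"' for kw in combo) for combo in product(*groups)]


# Keywords used individually: one group, one keyword per query.
single = build_queries(MODEL_TYPES)

# Keywords used in combination: one keyword from each of the three groups.
combined = build_queries(MODEL_TYPES, TASKS, CONTEXTS)

print(len(single), len(combined))  # 3 12
print(combined[0])
```

With the abridged lists this yields 3 single-keyword queries and 3 × 2 × 2 = 12 combined queries; the full keyword lists would produce proportionally more.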

🌟 Large Language Models for Autonomous Driving

Scenario Generation (LLM)
Paper Date Venue Code
TARGET: Automated Scenario Generation from Traffic Rules for Testing Autonomous Vehicles 2023-05 arXiv -
Language Conditioned Traffic Generation 2023-07 CoRL 2023 GitHub
A Generative AI-driven Application: Use of Large Language Models for Traffic Scenario Generation 2023-11 ELECO 2023 -
ChatGPT-Based Scenario Engineer: A New Framework on Scenario Generation for Trajectory Prediction 2024-02 IEEE Transactions on Intelligent Vehicles -
Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation 2024-04 arXiv GitHub
LLMScenario: Large Language Model Driven Scenario Generation 2024-05 IEEE Transactions on Systems, Man, and Cybernetics: Systems -
Automatic Generation Method for Autonomous Driving Simulation Scenarios Based on Large Language Model 2024-05 AIAT 2024 -
ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles 2024-05 CVPR 2024 GitHub
Editable scene simulation for autonomous driving via collaborative llm-agents 2024-06 CVPR 2024 GitHub
Chat2Scenario: Scenario Extraction From Dataset Through Utilization of Large Language Model 2024-06 IV 2024 GitHub
SoVAR: Building Generalizable Scenarios from Accident Reports for Autonomous Driving Testing 2024-09 ASE 2024 -
LeGEND: A Top-Down Approach to Scenario Generation of Autonomous Driving Systems Assisted by Large Language Models 2024-09 ASE 2024 GitHub
Traffic Scene Generation from Natural Language Description for Autonomous Vehicles with Large Language Model 2024-09 arXiv GitHub
Promptable Closed-loop Traffic Simulation 2024-09 CoRL 2024 GitHub
Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles 2024-09 arXiv -
LLM-Driven Testing for Autonomous Driving Scenarios 2024-11 FLLM 2024 -
ChatSUMO: Large Language Model for Automating Traffic Scenario Generation in Simulation of Urban MObility 2024-11 IEEE Transactions on Intelligent Vehicles -
Generating Out-Of-Distribution Scenarios Using Language Models 2024-11 arXiv -
Generating Traffic Scenarios via In-Context Learning to Learn Better Motion Planner 2024-12 AAAI 2025 Oral GitHub
LLM-attacker: Enhancing Closed-loop Adversarial Scenario Generation for Autonomous Driving with Large Language Models 2025-01 arXiv -
Risk-Aware Driving Scenario Analysis with Large Language Models 2025-02 arXiv GitHub
CurricuVLM: Towards Safe Autonomous Driving via Personalized Safety-Critical Curriculum Learning with Vision-Language Models 2025-02 arXiv GitHub
Text2Scenario: Text-Driven Scenario Generation for Autonomous Driving Test 2025-03 arXiv GitHub
Seeking to Collide: Online Safety-Critical Scenario Generation for Autonomous Driving with Retrieval Augmented Large Language Models 2025-05 arXiv -

Scenario Analysis (LLM)
Paper Date Venue Code
Semantic Anomaly Detection with Large Language Models 2023-09 Autonomous Robots -
LLM Multimodal Traffic Accident Forecasting 2023-11 Sensors 2023 MDPI -
Reality Bites: Assessing the Realism of Driving Scenarios with Large Language Models 2024-03 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering (Forge) GitHub
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving 2024-05 ICRA 2024 GitHub
Generating Out-Of-Distribution Scenarios Using Language Models 2024-11 arXiv -
SenseRAG: Constructing Environmental Knowledge Bases with Proactive Querying for LLM-Based Autonomous Driving 2025-01 arXiv -
Risk-Aware Driving Scenario Analysis with Large Language Models 2025-02 arXiv GitHub
CurricuVLM: Towards Safe Autonomous Driving via Personalized Safety-Critical Curriculum Learning with Vision-Language Models 2025-02 arXiv GitHub
A Comprehensive LLM-powered Framework for Driving Intelligence Evaluation 2025-03 arXiv -

🌟 Vision-Language Models for Autonomous Driving

Scenario Generation (VLM)
Paper Date Venue Code
WEDGE: A multi-weather autonomous driving dataset built from generative vision-language models 2023-05 CVPR workshop 2023 -
DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving 2024-08 IAVVC 2024 -
Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles 2024-09 arXiv -
From Dashcam Videos to Driving Simulations: Stress Testing Automated Vehicles against Rare Events 2024-11 arXiv -
Generating Out-Of-Distribution Scenarios Using Language Models 2024-11 arXiv -
From Accidents to Insights: Leveraging Multimodal Data for Scenario-Driven ADS Testing 2025-02 arXiv -
CurricuVLM: Towards Safe Autonomous Driving via Personalized Safety-Critical Curriculum Learning with Vision-Language Models 2025-02 arXiv GitHub

Scenario Analysis (VLM)
Paper Date Venue Code
Unsupervised 3D Perception with 2D Vision-Language Distillation for Autonomous Driving 2023-09 ICCV 2023 -
OpenAnnotate3D: Open-Vocabulary Auto-Labeling System for Multi-modal 3D Data 2023-10 ICRA 2024 -
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving 2023-11 ICLR 2024 Workshop on Large Language Models for Agents GitHub
Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving 2023-11 ICRA 2024 GitHub
LLM Multimodal Traffic Accident Forecasting 2023-11 Sensors 2023 MDPI -
NuScenes-MQA: Integrated Evaluation of Captions and QA for Autonomous Driving Datasets using Markup Annotations 2024-01 WACVW LLVM-AD 2024 GitHub
Is it safe to cross? Interpretable Risk Assessment with GPT-4V for Safety-Aware Street Crossing 2024-02 UR 2024 -
Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving 2024-03 VLADR 2024 GitHub
LATTE: A Real-time Lightweight Attention-based Traffic Accident Anticipation Engine 2024-04 arXiv -
OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning 2024-05 CVPR 2025 GitHub
Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving 2024-06 ECCV 2024 GitHub
Large Language Models Powered Context-aware Motion Prediction in Autonomous Driving 2024-07 IROS 2024 GitHub
DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving 2024-08 IAVVC 2024 -
Think-Driver: From Driving-Scene Understanding to Decision-Making with Vision Language Models 2024-09 ECCV 2024 Workshop -
Generating Out-Of-Distribution Scenarios Using Language Models 2024-11 arXiv -
Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases 2024-12 WACV 2025 GitHub
SFF Rendering-Based Uncertainty Prediction using VisionLLM 2024-12 AAAI 2025 Workshop LM4Plan -
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives 2025-01 arXiv GitHub
Enhancing Large Vision Model in Street Scene Semantic Understanding through Leveraging Posterior Optimization Trajectory 2025-01 arXiv -
DriveLM: Driving with Graph Visual Question Answering 2025-01 ECCV 2024 GitHub
Scenario Understanding of Traffic Scenes Through Large Visual Language Models 2025-01 WACV 2025 -
INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation 2025-02 arXiv -
CurricuVLM: Towards Safe Autonomous Driving via Personalized Safety-Critical Curriculum Learning with Vision-Language Models 2025-02 arXiv GitHub
Evaluating Multimodal Vision-Language Model Prompting Strategies for Visual Question Answering in Road Scene Understanding 2025-02 WACV workshop 2025 -
NuGrounding: A Multi-View 3D Visual Grounding Framework in Autonomous Driving 2025-03 arXiv -
AutoDrive-QA: Automated Generation of Multiple-Choice Questions for Autonomous Driving Datasets Using Large Vision-Language Models 2025-03 arXiv GitHub
DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding 2025-03 arXiv GitHub
Vision Foundation Model Embedding-Based Semantic Anomaly Detection 2025-05 ICRA 2025 Workshop -
OpenLKA: An Open Dataset of Lane Keeping Assist from Recent Car Models under Real-world Driving Conditions 2025-05 arXiv GitHub
Bridging Human Oversight and Black-box Driver Assistance: Vision-Language Models for Predictive Alerting in Lane Keeping Assist systems 2025-05 arXiv -

🌟 Multimodal Large Language Models for Autonomous Driving

Scenario Generation (MLLM)
Paper Date Venue Code
Realistic Corner Case Generation for Autonomous Vehicles with Multimodal Large Language Model 2024-11 arXiv -
LMM-enhanced Safety-Critical Scenario Generation for Autonomous Driving System Testing From Non-Accident Traffic Videos 2025-01 arXiv GitHub

Scenario Analysis (MLLM)
Paper Date Venue Code
DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model 2023-10 IEEE Robotics and Automation Letters 2024 GitHub
Dolphins: Multimodal Language Model for Driving 2023-12 ECCV 2024 GitHub
AccidentGPT: Accident analysis and prevention from V2X Environmental Perception with Multi-modal Large Model 2023-12 IV 2024 GitHub
Lidar-llm: Exploring the potential of large language models for 3d lidar understanding 2023-12 AAAI 2025 GitHub
LingoQA: Visual Question Answering for Autonomous Driving 2023-12 ECCV 2024 GitHub
Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models 2024-01 CVPR 2024 GitHub
MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding 2024-01 CVPR 2024 GitHub
WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-Grained Spatial-Temporal Understanding 2024-06 ECCV 2024 GitHub
Semantic Understanding of Traffic Scenes with Large Vision Language Models 2024-06 IV 2024 GitHub
VLAAD: Vision and Language Assistant for Autonomous Driving 2024-06 WACVW 2024 GitHub
InternDrive: A Multimodal Large Language Model for Autonomous Driving Scenario Understanding 2024-07 AIAHPC 2024 -
LingoQA: Visual Question Answering for Autonomous Driving 2024-09 ECCV 2024 GitHub
Using Multimodal Large Language Models for Automated Detection of Traffic Safety Critical Events 2024-09 Vehicles 2024 MDPI -
MLLM-SUL: Multimodal Large Language Model for Semantic Scene Understanding and Localization in Traffic Scenarios 2024-12 arXiv GitHub
TUMTraffic-VideoQA: A Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes 2025-02 ICML 2025 GitHub
ScVLM: Enhancing Vision-Language Model for Safety-Critical Event Understanding 2025-02 WACV Workshop 2025 GitHub
HiLM-D: Enhancing MLLMs with Multi-Scale High-Resolution Details for Autonomous Driving 2025-03 International Journal of Computer Vision -
NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models 2025-03 arXiv -
Tracking Meets Large Multimodal Models for Driving Scenario Understanding 2025-03 arXiv GitHub
V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving 2025-04 arXiv -
Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding 2025-04 arXiv GitHub

🌟 Diffusion Models for Autonomous Driving

Scenario Generation (Diffusion Models)
Paper Date Venue Code
Guided Conditional Diffusion for Controllable Traffic Simulation 2022-10 ICRA 2023 GitHub
Generating Driving Scenes with Diffusion 2023-05 arXiv -
DiffScene: Guided Diffusion Models for Safety-Critical Scenario Generation 2023-06 AdvML-Frontiers 2023 -
BEVControl: Accurately Controlling Street-view Elements with Multi-perspective Consistency via BEV Sketch Layout 2023-09 arXiv -
DriveSceneGen: Generating Diverse and Realistic Driving Scenarios From Scratch 2023-09 IEEE Robotics and Automation Letters 2024 -
MagicDrive: Street View Generation with Diverse 3D Geometry Control 2023-10 ICLR 2024 GitHub
DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model 2023-10 ECCV 2024 -
Language-guided traffic simulation via scene-level diffusion 2023-11 CoRL 2023 -
Scenario Diffusion: Controllable Driving Scenario Generation With Diffusion 2023-11 NeurIPS 2023 -
Panacea: Panoramic and Controllable Video Generation for Autonomous Driving 2023-11 CVPR 2024 GitHub
SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries 2023-12 ECCV 2024 GitHub
Text2Street: Controllable Text-to-image Generation for Street Views 2024-02 ICPR 2024 -
GEODIFFUSION: Text-Prompted Geometric Control for Object Detection Data Generation 2024-02 ICLR 2024 GitHub
GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model 2024-04 ITSC 2024 -
Versatile Behavior Diffusion for Generalized Traffic Agent Simulation 2024-04 RSS 2024 GitHub
SceneControl: Diffusion for Controllable Traffic Scene Generation 2024-05 ICRA 2024 -
SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic 2024-07 ECCV 2024 GitHub
DrivingGen: Efficient Safety-Critical Driving Video Generation with Latent Diffusion Models 2024-07 ICME 2024 -
AdvDiffuser: Generating Adversarial Safety-Critical Driving Scenarios via Guided Diffusion 2024-10 IROS 2023 -
Data-driven Diffusion Models for Enhancing Safety in Autonomous Vehicle Traffic Simulations 2024-10 arXiv -
DiffRoad: Realistic and Diverse Road Scenario Generation for Autonomous Vehicle Testing 2024-11 arXiv -
SceneDiffuser: Efficient and Controllable Driving Simulation Initialization and Rollout 2024-12 NeurIPS 2024 GitHub
Direct Preference Optimization-Enhanced Multi-Guided Diffusion Model for Traffic Scenario Generation 2025-02 arXiv -
Causal Composition Diffusion Model for Closed-loop Traffic Generation 2025-02 arXiv -
AVD2: Accident Video Diffusion for Accident Video Description 2025-03 ICRA 2025 GitHub
DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance 2025-03 arXiv -
Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments 2025-03 arXiv -
DriveGen: Towards Infinite Diverse Traffic Scenarios with Large Models 2025-03 arXiv -
DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer 2025-04 arXiv -
DualDiff: Dual-branch Diffusion Model for Autonomous Driving with Semantic Fusion 2025-05 arXiv -
LD-Scene: LLM-Guided Diffusion for Controllable Generation of Adversarial Safety-Critical Driving Scenarios 2025-05 arXiv -
Dual-Conditioned Temporal Diffusion Modeling for Driving Scene Generation 2025-05 ICAR 2025 GitHub

Scenario Analysis (Diffusion Models)
Paper Date Venue Code
AVD2: Accident Video Diffusion for Accident Video Description 2025-03 ICRA 2025 GitHub

🌟 World Models for Autonomous Driving

World Models for Autonomous Driving
Paper Date Venue Code Application
DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving 2023-09 ECCV 2024 GitHub Scenario Generation
GAIA-1: A Generative World Model for Autonomous Driving 2023-09 arXiv Wayve - Scenario Generation
Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion 2023-11 ICLR 2024 - Scenario Generation
MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations 2023-11 IV 2025 - Scenario Generation
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving 2023-11 CVPR 2024 GitHub Scenario Generation
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability 2024-03 NeurIPS 2024 - Scenario Generation
MagicDrive: Street View Generation with Diverse 3D Geometry Control 2024-05 arXiv - Scenario Generation
DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation 2024-05 AAAI 2025 GitHub Scenario Generation
UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction for Autonomous Driving 2024-08 RAL 2024 - Scenario Generation
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation 2024-08 ECCV 2024 - Scenario Generation
Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving 2024-08 CVPR 2024 GitHub Scenario Generation
DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving 2024-08 arXiv GitHub Scenario Generation
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment 2024-08 arXiv - Scenario Generation
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation 2024-11 CVPR 2025 GitHub Scenario Generation
ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration 2024-11 arXiv - Scenario Generation
MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes 2024-11 arXiv GitHub Scenario Generation
MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control 2024-11 arXiv GitHub Scenario Generation
ACT-Bench: Towards Action Controllable World Models for Autonomous Driving 2024-12 arXiv - Scenario Generation
GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control 2024-12 CVPR 2025 GitHub Scenario Generation
SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model 2024-12 CVPR 2025 - Scenario Generation
DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT 2024-12 arXiv GitHub Scenario Generation
Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving 2025-01 AAAI 2025 GitHub Scenario Generation
DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance 2025-03 ICRA 2025 GitHub Scenario Generation
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning 2025-03 arXiv GitHub Scenario Generation
GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving 2025-03 arXiv - Scenario Generation
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control 2025-04 arXiv GitHub Scenario Generation
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving 2025-05 arXiv - Scenario Generation
PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth 2025-05 arXiv - Scenario Generation

📊 Datasets Comparison

The following figure shows the usage distribution of different foundation model types across autonomous driving datasets:

Datasets Comparison
Dataset Year Img View Real Lidar Radar Traj 3D 2D Lane Weather Time Region Company
CamVid 2009 RGB FPV ✖️ ✖️ ✖️ ✖️ D U -
KITTI 2013 RGB/S FPV ✖️ D U/R/H -
Cyclists 2016 RGB FPV ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ D U -
Cityscapes 2016 RGB/S FPV ✖️ ✖️ ✖️ ✖️ D U -
SYNTHIA 2016 RGB FPV ✖️ ✖️ ✖️ ✖️ D/N U -
Campus 2016 RGB BEV ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ D C -
RobotCar 2016 RGB FPV ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ D/N U -
Mapillary 2017 RGB FPV ✖️ ✖️ ✖️ ✖️ D/N U -
P.F.B. 2017 RGB FPV ✖️ ✖️ ✖️ ✖️ D/N U -
BDD100K 2018 RGB FPV ✖️ ✖️ ✖️ D U/H -
HighD 2018 RGB BEV ✖️ ✖️ ✖️ ✖️ ✖️ D H -
Udacity 2018 RGB FPV ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ D U -
KAIST 2018 RGB/S FPV ✖️ ✖️ ✖️ D/N U -
Argoverse 2019 RGB/S FPV ✖️ ✖️ ✖️ D/N U -
TRAF 2019 RGB FPV ✖️ ✖️ ✖️ ✖️ D U -
ApolloScape 2019 RGB/S FPV ✖️ ✖️ ✖️ D U -
ACFR 2019 RGB BEV ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ D RA -
H3D 2019 RGB FPV ✖️ ✖️ ✖️ ✖️ D U -
INTERACTION 2019 RGB BEV ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ D I/RA -
Comma2k19 2019 RGB FPV ✖️ ✖️ ✖️ ✖️ ✖️ D/N U/S/R/H -
InD 2020 RGB BEV ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ D I -
RounD 2020 RGB BEV ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ D RA -
nuScenes 2020 RGB FPV ✖️ D/N U -
Lyft Level 5 2020 RGB FPV ✖️ D/N U/S -
Waymo Open 2020 RGB FPV D/N U -
A*3D 2020 RGB FPV D/N U -
RobotCar Radar 2020 RGB FPV D/N U -
Toronto3D 2020 RGB BEV ✖️ ✖️ ✖️ D/N U University of Waterloo
A2D2 2020 RGB FPV ✖️ D U/H/S/R
WADS 2020 RGB FPV ✖️ ✖️ D/N U/S/R Michigan Technological University
Argoverse 2 2021 RGB/S FPV ✖️ ✖️ D/N U -
PandaSet 2021 RGB FPV D/N U -
ONCE 2021 RGB FPV D/N U -
Leddar PixSet 2021 RGB FPV ✖️ ✖️ D/N U/S/R Leddar
ZOD 2022 RGB FPV D/N U/R/S/H Zenseact
IDD-3D 2022 RGB FPV ✖️ ✖️ ✖️ ✖️ - R INAI
CODA 2022 RGB FPV D/N U/S/R Huawei
SHIFT 2022 RGB FPV D/N U/S/R/H ETH Zürich
DeepAccident 2023 RGB/S FPV/BEV ✖️ ✖️ ✖️ D/N U/S/R/H HKU, Huawei, CARLA
Dual_Radar 2023 RGB FPV ✖️ D/N U Tsinghua University
V2V4Real 2023 RGB FPV ✖️ ✖️ ✖️ - U/H/S UCLA Mobility Lab
SCaRL 2024 RGB/S FPV/BEV ✖️ D/N U/S/R/H Fraunhofer CARLA
MARS 2024 RGB FPV D/N U/S/H NYU, MAY Mobility
Scenes101 2024 RGB FPV ✖️ ✖️ ✖️ ✖️ D/N U/S/R/H Wayve
TruckScenes 2025 RGB FPV ✖️ D/N H/U MAN

Notes: View: FPV=First-Person, BEV=Bird's-Eye; Time: D=Day, N=Night; Region: U=Urban, R=Rural, H=Highway, S=Suburban, C=Campus, I=Intersection, RA=Road Area; Img: RGB/S=RGB+Stereo
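
For readers who process the table programmatically, the abbreviations in the notes can be expanded with a small lookup. This is a minimal sketch: the dictionaries restate the legend above, while the `expand` helper and its pass-through behavior for unknown codes are assumptions for illustration. The Img column is not handled here, because `RGB/S` denotes a single combined value (RGB+Stereo) rather than two alternatives.

```python
# Legends restated from the notes above.
VIEW = {"FPV": "First-Person", "BEV": "Bird's-Eye"}
TIME = {"D": "Day", "N": "Night"}
REGION = {
    "U": "Urban",
    "R": "Rural",
    "H": "Highway",
    "S": "Suburban",
    "C": "Campus",
    "I": "Intersection",
    "RA": "Road Area",
}


def expand(cell, legend):
    """Expand a slash-separated cell such as 'U/S/R/H' into full labels.

    Codes missing from the legend (for example '-') are returned unchanged.
    """
    return [legend.get(code, code) for code in cell.split("/")]


print(expand("U/S/R/H", REGION))  # ['Urban', 'Suburban', 'Rural', 'Highway']
print(expand("D/N", TIME))        # ['Day', 'Night']
print(expand("FPV/BEV", VIEW))    # ['First-Person', "Bird's-Eye"]
```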

🎮 Simulators

The following figure shows the usage distribution of different foundation model types across autonomous driving simulators:

Simulators
Simulator Year Back-end Open Source Realistic Perception Custom Scenario Real World Map Human Design Map Python API C++ API ROS API Company
TORCS 2000 None ✖️ ✖️ ✖️ ✖️ ✖️ -
Webots 2004 ODE ✖️ ✖️ -
CarRacing 2017 None ✖️ ✖️ ✖️ ✖️ ✖️ -
CARLA 2017 UE4 ✖️ -
SimMobilityST 2017 None ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ -
GTA-V 2017 RAGE ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ -
highway-env 2018 None ✖️ ✖️ ✖️ ✖️ -
Deepdrive 2018 UE4 ✖️ ✖️ -
esmini 2018 Unity ✖️ ✖️ ✖️ ✖️ ✖️ ✖️ -
AutonoViSim 2018 PhysX ✖️ ✖️ ✖️ ✖️ ✖️ -
AirSim 2018 UE4 ✖️ ✖️ -
SUMO 2018 None ✖️ ✖️ ✖️ -
Apollo 2018 Unity ✖️ -
Sim4CV 2018 UE4 ✖️ ✖️ ✖️ -
MATLAB 2018 MATLAB ✖️ Mathworks
Scenic 2019 None ✖️ ✖️ Toyota Research Institute, UC Berkeley
SUMMIT 2020 UE4 ✖️ ✖️ -
MultiCarRacing 2020 None ✖️ ✖️ ✖️ ✖️ -
SMARTS 2020 None ✖️ ✖️ -
LGSVL 2020 Unity -
CausalCity 2020 UE4 ✖️ ✖️ -
Vista 2020 None ✖️ ✖️ ✖️ MIT
MetaDrive 2021 Panda3D ✖️ -
L2R 2021 UE4 ✖️ -
AutoDRIVE 2021 Unity -
Nuplan 2021 None ✖️ ✖️ Motional
AWSIM 2021 Unity ✖️ ✖️ Autoware
InterSim 2022 None ✖️ ✖️ ✖️ Tsinghua
Nocturne 2022 None ✖️ Facebook
BeamNG.tech 2022 Soft-body physics ✖️ ✖️ ✖️ BeamNG GmbH
Waymax 2023 JAX ✖️ ✖️ ✖️ Waymo
UNISim 2023 None ✖️ ✖️ ✖️ ✖️ Waabi
TBSim 2023 None ✖️ ✖️ NVIDIA
Nvidia DriveWorks 2024 Nvidia GPU ✖️ ✖️ ✖️ NVIDIA

🏆 Foundation Model Benchmark Challenges (2022–2025)

Benchmark Challenges

Autonomous Driving

Name Host
CARLA AD Challenge CARLA
DRL4Real ICCV
Waymo Open Dataset Challenge Waymo / CVPR WAD
Argoverse 2: Scenario Mining ArgoAI
Roboflow-20VL Roboflow-VL / CVPR
AVA Challenge AVA Challenge Team

Other Fields Related to Generation and Analysis

Name Host
IGLU Challenge NeurIPS / IGLU Team
LLM Efficiency Challenge NeurIPS
Trojan Detection NeurIPS / CAIS
SMART-101 CVPR
NICE Challenge CVPR / LG Research
SyntaGen CVPR
Habitat Challenge CVPR / FAIR
BIG-bench Google Research
BIG-bench Hard (BBH) Google Research
HELM Stanford CRFM
MMBench OpenCompass
MMMU CVPR / U-Waterloo / OSU
Open LLM Leaderboard VILA-Lab
Text-to-Image Leaderboard Artificial Analysis
Ego4D FAIR
VizWiz Grand Challenge CVPR VizWiz Workshop
MedFM NeurIPS / Shanghai AI Laboratory
3D Scene Understanding CVPR

Contributing

We welcome contributions from the community! If you have research papers, tools, or resources to add, please create a pull request or open an issue.

License

This repository is released under the Apache 2.0 license.
