Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis 🚗

This repository collects research, implementations, and resources related to Foundation Models for Scenario Generation and Analysis in autonomous driving. It is maintained by TUM-AVS (Professorship of Autonomous Vehicle Systems at Technical University of Munich) and is continuously updated to track the latest work in the community.

🔥 Updates

[Jun.2025] Paper uploaded to arXiv
[May.2025] Repository initialized

🤝 Citation

For more details and comprehensive information, please see our paper, Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis. If you find the paper and this repository helpful, please consider citing it as follows:

@misc{gao2025foundationmodelsautonomousdriving,
  title={Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis},
  author={Yuan Gao and Mattia Piccinini and Yuchen Zhang and Dingrui Wang and Korbinian Moller and Roberto Brusnicki and Baha Zarrouki and Alessio Gambi and Jan Frederik Totz and Kai Storms and Steven Peters and Andrea Stocco and Bassam Alrifaee and Marco Pavone and Johannes Betz},
  journal={TBD},
  year={2025},
  eprint={2506.11526},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2506.11526}, 
}

📃 Introduction

Foundation models are large-scale, pre-trained models that can be adapted to a wide range of downstream tasks. In the context of autonomous driving, foundation models offer a powerful approach to scenario generation and analysis, enabling more comprehensive and realistic testing, validation, and verification of autonomous driving systems. This repository aims to collect and organize research, tools, and resources in this important field.

📈 Publication Timeline

The following figure shows the evolution of foundation model research in autonomous driving scenario generation and analysis over time:

🔍 Search Methodology

The following keywords were used to search Google Scholar for the papers covered in this survey. Keywords were entered either individually or in combination with other keywords from the list. The search covers publications up to May 2025.

Keywords:

Foundation Model Types: Foundation Models, Large Language Models (LLMs), Vision-Language Models (VLMs), Multimodal Large Language Models (MLLMs), Diffusion Models (DMs), World Models (WMs), Generative Models (GMs)
Scenario Generation & Analysis: Scenario Generation, Scenario Simulation, Traffic Simulation, Scenario Testing, Scenario Understanding, Driving Scene Generation, Scene Reasoning, Risk Assessment, Safety-Critical Scenarios, Accident Prediction
Application Context: Autonomous Driving, Self-Driving Vehicles, AV Simulation, Driving Video Generation, Traffic Datasets, Closed-Loop Simulation, Safety Assurance
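
The keyword combinations described above could be enumerated programmatically. The following is a minimal sketch, not the authors' actual search procedure: it takes one term from each of the three keyword groups and joins them into a quoted query string. The function name `build_queries`, the shortened keyword lists, and the one-term-per-group pairing are assumptions made for illustration.

```python
from itertools import product

# Shortened keyword groups taken from the list above (illustrative subset).
MODEL_TYPES = ["Large Language Models", "Diffusion Models", "World Models"]
TASKS = ["Scenario Generation", "Risk Assessment"]
CONTEXTS = ["Autonomous Driving", "Closed-Loop Simulation"]


def build_queries(*groups):
    """Return one query string per combination, one term from each group.

    Hypothetical helper: every term is wrapped in double quotes so that a
    search engine treats it as an exact phrase.
    """
    return [" ".join(f'"{term}"' for term in combo) for combo in product(*groups)]


queries = build_queries(MODEL_TYPES, TASKS, CONTEXTS)
print(len(queries))  # 3 * 2 * 2 combinations
print(queries[0])
```

Passing a single group, for example `build_queries(MODEL_TYPES)`, yields the individual-keyword queries.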

🌟 Large Language Models for Autonomous Driving

Scenario Generation (LLM)

Paper	Date	Venue	Code
TARGET: Automated Scenario Generation from Traffic Rules for Testing Autonomous Vehicles	2023-05	arXiv	-
Language Conditioned Traffic Generation	2023-07	CoRL 2023	GitHub
A Generative AI-driven Application: Use of Large Language Models for Traffic Scenario Generation	2023-11	ELECO 2023	-
ChatGPT-Based Scenario Engineer: A New Framework on Scenario Generation for Trajectory Prediction	2024-02	IEEE Transactions on Intelligent Vehicles	-
Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation	2024-04	arXiv	GitHub
LLMScenario: Large Language Model Driven Scenario Generation	2024-05	IEEE Transactions on Systems, Man, and Cybernetics: Systems	-
Automatic Generation Method for Autonomous Driving Simulation Scenarios Based on Large Language Model	2024-05	AIAT 2024	-
ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles	2024-05	CVPR 2024	GitHub
Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents	2024-06	CVPR 2024	GitHub
Chat2Scenario: Scenario Extraction From Dataset Through Utilization of Large Language Model	2024-06	IV 2024	GitHub
SoVAR: Building Generalizable Scenarios from Accident Reports for Autonomous Driving Testing	2024-09	ASE 2024	-
LeGEND: A Top-Down Approach to Scenario Generation of Autonomous Driving Systems Assisted by Large Language Models	2024-09	ASE 2024	GitHub
Traffic Scene Generation from Natural Language Description for Autonomous Vehicles with Large Language Model	2024-09	arXiv	GitHub
Promptable Closed-loop Traffic Simulation	2024-09	CoRL 2024	GitHub
Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles	2024-09	arXiv	-
LLM-Driven Testing for Autonomous Driving Scenarios	2024-11	FLLM 2024	-
ChatSUMO: Large Language Model for Automating Traffic Scenario Generation in Simulation of Urban MObility	2024-11	IEEE Transactions on Intelligent Vehicles	-
Generating Out-Of-Distribution Scenarios Using Language Models	2024-11	arXiv	-
Generating Traffic Scenarios via In-Context Learning to Learn Better Motion Planner	2024-12	AAAI 2025 Oral	GitHub
LLM-attacker: Enhancing Closed-loop Adversarial Scenario Generation for Autonomous Driving with Large Language Models	2025-01	arXiv	-
Risk-Aware Driving Scenario Analysis with Large Language Models	2025-02	arXiv	GitHub
CurricuVLM: Towards Safe Autonomous Driving via Personalized Safety-Critical Curriculum Learning with Vision-Language Models	2025-02	arXiv	GitHub
Text2Scenario: Text-Driven Scenario Generation for Autonomous Driving Test	2025-03	arXiv	GitHub
Seeking to Collide: Online Safety-Critical Scenario Generation for Autonomous Driving with Retrieval Augmented Large Language Models	2025-05	arXiv	-

Scenario Analysis (LLM)

Paper	Date	Venue	Code
Semantic Anomaly Detection with Large Language Models	2023-09	Autonomous Robots	-
LLM Multimodal Traffic Accident Forecasting	2023-11	Sensors 2023 MDPI	-
Reality Bites: Assessing the Realism of Driving Scenarios with Large Language Models	2024-03	IEEE/ACM First International Conference on AI Foundation Models and Software Engineering (Forge)	GitHub
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving	2024-05	ICRA 2024	GitHub
Generating Out-Of-Distribution Scenarios Using Language Models	2024-11	arXiv	-
SenseRAG: Constructing Environmental Knowledge Bases with Proactive Querying for LLM-Based Autonomous Driving	2025-01	arXiv	-
Risk-Aware Driving Scenario Analysis with Large Language Models	2025-02	arXiv	GitHub
CurricuVLM: Towards Safe Autonomous Driving via Personalized Safety-Critical Curriculum Learning with Vision-Language Models	2025-02	arXiv	GitHub
A Comprehensive LLM-powered Framework for Driving Intelligence Evaluation	2025-03	arXiv	-

🌟 Vision-Language Models for Autonomous Driving

Scenario Generation (VLM)

Paper	Date	Venue	Code
WEDGE: A multi-weather autonomous driving dataset built from generative vision-language models	2023-05	CVPR workshop 2023	-
DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving	2024-08	IAVVC 2024	-
Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles	2024-09	arXiv	-
From Dashcam Videos to Driving Simulations: Stress Testing Automated Vehicles against Rare Events	2024-11	arXiv	-
Generating Out-Of-Distribution Scenarios Using Language Models	2024-11	arXiv	-
From Accidents to Insights: Leveraging Multimodal Data for Scenario-Driven ADS Testing	2025-02	arXiv	-
CurricuVLM: Towards Safe Autonomous Driving via Personalized Safety-Critical Curriculum Learning with Vision-Language Models	2025-02	arXiv	GitHub

Scenario Analysis (VLM)

Paper	Date	Venue	Code
Unsupervised 3D Perception with 2D Vision-Language Distillation for Autonomous Driving	2023-09	ICCV 2023	-
OpenAnnotate3D: Open-Vocabulary Auto-Labeling System for Multi-modal 3D Data	2023-10	ICRA 2024	-
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving	2023-11	ICLR 2024 Workshop on Large Language Models for Agents	GitHub
Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving	2023-11	ICRA 2024	GitHub
LLM Multimodal Traffic Accident Forecasting	2023-11	Sensors 2023 MDPI	-
NuScenes-MQA: Integrated Evaluation of Captions and QA for Autonomous Driving Datasets using Markup Annotations	2024-01	WACVW LLVM-AD 2024	GitHub
Is it safe to cross? Interpretable Risk Assessment with GPT-4V for Safety-Aware Street Crossing	2024-02	UR 2024	-
Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving	2024-03	VLADR 2024	GitHub
LATTE: A Real-time Lightweight Attention-based Traffic Accident Anticipation Engine	2024-04	arXiv	-
OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning	2024-05	CVPR 2025	GitHub
Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving	2024-06	ECCV 2024	GitHub
Large Language Models Powered Context-aware Motion Prediction in Autonomous Driving	2024-07	IROS 2024	GitHub
DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving	2024-08	IAVVC 2024	-
Think-Driver: From Driving-Scene Understanding to Decision-Making with Vision Language Models	2024-09	ECCV 2024 Workshop	-
Generating Out-Of-Distribution Scenarios Using Language Models	2024-11	arXiv	-
Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases	2024-12	WACV 2025	GitHub
SFF Rendering-Based Uncertainty Prediction using VisionLLM	2024-12	AAAI 2025 Workshop LM4Plan	-
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives	2025-01	arXiv	GitHub
Enhancing Large Vision Model in Street Scene Semantic Understanding through Leveraging Posterior Optimization Trajectory	2025-01	arXiv	-
DriveLM: Driving with Graph Visual Question Answering	2025-01	ECCV 2024	GitHub
Scenario Understanding of Traffic Scenes Through Large Visual Language Models	2025-01	WACV 2025	-
INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation	2025-02	arXiv	-
CurricuVLM: Towards Safe Autonomous Driving via Personalized Safety-Critical Curriculum Learning with Vision-Language Models	2025-02	arXiv	GitHub
Evaluating Multimodal Vision-Language Model Prompting Strategies for Visual Question Answering in Road Scene Understanding	2025-02	WACV workshop 2025	-
NuGrounding: A Multi-View 3D Visual Grounding Framework in Autonomous Driving	2025-03	arXiv	-
AutoDrive-QA: Automated Generation of Multiple-Choice Questions for Autonomous Driving Datasets Using Large Vision-Language Models	2025-03	arXiv	GitHub
DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding	2025-03	arXiv	GitHub
Vision Foundation Model Embedding-Based Semantic Anomaly Detection	2025-05	ICRA 2025 Workshop	-
OpenLKA: An Open Dataset of Lane Keeping Assist from Recent Car Models under Real-world Driving Conditions	2025-05	arXiv	GitHub
Bridging Human Oversight and Black-box Driver Assistance: Vision-Language Models for Predictive Alerting in Lane Keeping Assist systems	2025-05	arXiv	-

🌟 Multimodal Large Language Models for Autonomous Driving

Scenario Generation (MLLM)

Paper	Date	Venue	Code
Realistic Corner Case Generation for Autonomous Vehicles with Multimodal Large Language Model	2024-11	arXiv	-
LMM-enhanced Safety-Critical Scenario Generation for Autonomous Driving System Testing From Non-Accident Traffic Videos	2025-01	arXiv	GitHub

Scenario Analysis (MLLM)

Paper	Date	Venue	Code
DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model	2023-10	IEEE Robotics and Automation Letters 2024	GitHub
Dolphins: Multimodal Language Model for Driving	2023-12	ECCV 2024	GitHub
AccidentGPT: Accident analysis and prevention from V2X Environmental Perception with Multi-modal Large Model	2023-12	IV 2024	GitHub
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding	2023-12	AAAI 2025	GitHub
LingoQA: Visual Question Answering for Autonomous Driving	2023-12	ECCV 2024	GitHub
Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models	2024-01	CVPR 2024	GitHub
MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding	2024-01	CVPR 2024	GitHub
WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-Grained Spatial-Temporal Understanding	2024-06	ECCV 2024	GitHub
Semantic Understanding of Traffic Scenes with Large Vision Language Models	2024-06	IV 2024	GitHub
VLAAD: Vision and Language Assistant for Autonomous Driving	2024-06	WACVW 2024	GitHub
InternDrive: A Multimodal Large Language Model for Autonomous Driving Scenario Understanding	2024-07	AIAHPC 2024	-
Using Multimodal Large Language Models for Automated Detection of Traffic Safety Critical Events	2024-09	Vehicles 2024 MDPI	-
MLLM-SUL: Multimodal Large Language Model for Semantic Scene Understanding and Localization in Traffic Scenarios	2024-12	arXiv	GitHub
TUMTraffic-VideoQA: A Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes	2025-02	ICML 2025	GitHub
ScVLM: Enhancing Vision-Language Model for Safety-Critical Event Understanding	2025-02	WACV Workshop 2025	GitHub
HiLM-D: Enhancing MLLMs with Multi-Scale High-Resolution Details for Autonomous Driving	2025-03	International Journal of Computer Vision	-
NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models	2025-03	arXiv	-
Tracking Meets Large Multimodal Models for Driving Scenario Understanding	2025-03	arXiv	GitHub
V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving	2025-04	arXiv	-
Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding	2025-04	arXiv	GitHub

🌟 Diffusion Models for Autonomous Driving

Scenario Generation (Diffusion Models)

Paper	Date	Venue	Code
Guided Conditional Diffusion for Controllable Traffic Simulation	2022-10	ICRA 2023	GitHub
Generating Driving Scenes with Diffusion	2023-05	arXiv	-
DiffScene: Guided Diffusion Models for Safety-Critical Scenario Generation	2023-06	AdvML-Frontiers 2023	-
BEVControl: Accurately Controlling Street-view Elements with Multi-perspective Consistency via BEV Sketch Layout	2023-09	arXiv	-
DriveSceneGen: Generating Diverse and Realistic Driving Scenarios From Scratch	2023-09	IEEE Robotics and Automation Letters 2024	-
MagicDrive: Street View Generation with Diverse 3D Geometry Control	2023-10	ICLR 2024	GitHub
DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model	2023-10	ECCV 2024	-
Language-guided traffic simulation via scene-level diffusion	2023-11	CoRL 2023	-
Scenario Diffusion: Controllable Driving Scenario Generation With Diffusion	2023-11	NeurIPS 2023	-
Panacea: Panoramic and Controllable Video Generation for Autonomous Driving	2023-11	CVPR 2024	GitHub
SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries	2023-12	ECCV 2024	GitHub
Text2Street: Controllable Text-to-image Generation for Street Views	2024-02	ICPR 2024	-
GEODIFFUSION: Text-Prompted Geometric Control for Object Detection Data Generation	2024-02	ICLR 2024	GitHub
GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model	2024-04	ITSC 2024	-
Versatile Behavior Diffusion for Generalized Traffic Agent Simulation	2024-04	RSS 2024	GitHub
SceneControl: Diffusion for Controllable Traffic Scene Generation	2024-05	ICRA 2024	-
SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic	2024-07	ECCV 2024	GitHub
DrivingGen: Efficient Safety-Critical Driving Video Generation with Latent Diffusion Models	2024-07	ICME 2024	-
AdvDiffuser: Generating Adversarial Safety-Critical Driving Scenarios via Guided Diffusion	2024-10	IROS 2023	-
Data-driven Diffusion Models for Enhancing Safety in Autonomous Vehicle Traffic Simulations	2024-10	arXiv	-
DiffRoad: Realistic and Diverse Road Scenario Generation for Autonomous Vehicle Testing	2024-11	arXiv	-
SceneDiffuser: Efficient and Controllable Driving Simulation Initialization and Rollout	2024-12	NeurIPS 2024	GitHub
Direct Preference Optimization-Enhanced Multi-Guided Diffusion Model for Traffic Scenario Generation	2025-02	arXiv	-
Causal Composition Diffusion Model for Closed-loop Traffic Generation	2025-02	arXiv	-
AVD2: Accident Video Diffusion for Accident Video Description	2025-03	ICRA 2025	GitHub
DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance	2025-03	arXiv	-
Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments	2025-03	arXiv	-
DriveGen: Towards Infinite Diverse Traffic Scenarios with Large Models	2025-03	arXiv	-
DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer	2025-04	arXiv	-
DualDiff: Dual-branch Diffusion Model for Autonomous Driving with Semantic Fusion	2025-05	arXiv	-
LD-Scene: LLM-Guided Diffusion for Controllable Generation of Adversarial Safety-Critical Driving Scenarios	2025-05	arXiv	-
Dual-Conditioned Temporal Diffusion Modeling for Driving Scene Generation	2025-05	ICAR 2025	GitHub

Scenario Analysis (Diffusion Models)

Paper	Date	Venue	Code
AVD2: Accident Video Diffusion for Accident Video Description	2025-03	ICRA 2025	GitHub

🌟 World Models for Autonomous Driving

World Models for Autonomous Driving

Paper	Date	Venue	Code	Application
DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving	2023-09	ECCV 2024	GitHub	Scenario Generation
GAIA-1: A Generative World Model for Autonomous Driving	2023-09	arXiv Wayve	-	Scenario Generation
Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion	2023-11	ICLR 2024	-	Scenario Generation
MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations	2023-11	IV 2025	-	Scenario Generation
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving	2023-11	CVPR 2024	GitHub	Scenario Generation
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability	2024-03	NeurIPS 2024	-	Scenario Generation
MagicDrive: Street View Generation with Diverse 3D Geometry Control	2024-05	arXiv	-	Scenario Generation
DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation	2024-05	AAAI 2025	GitHub	Scenario Generation
UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction for Autonomous Driving	2024-08	RAL 2024	-	Scenario Generation
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation	2024-08	ECCV 2024	-	Scenario Generation
Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving	2024-08	CVPR 2024	GitHub	Scenario Generation
DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving	2024-08	arXiv	GitHub	Scenario Generation
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment	2024-08	arXiv	-	Scenario Generation
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation	2024-11	CVPR 2025	GitHub	Scenario Generation
ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration	2024-11	arXiv	-	Scenario Generation
MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes	2024-11	arXiv	GitHub	Scenario Generation
MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control	2024-11	arXiv	GitHub	Scenario Generation
ACT-Bench: Towards Action Controllable World Models for Autonomous Driving	2024-12	arXiv	-	Scenario Generation
GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control	2024-12	CVPR 2025	GitHub	Scenario Generation
SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model	2024-12	CVPR 2025	-	Scenario Generation
DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT	2024-12	arXiv	GitHub	Scenario Generation
Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving	2025-01	AAAI 2025	GitHub	Scenario Generation
DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance	2025-03	ICRA 2025	GitHub	Scenario Generation
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning	2025-03	arXiv	GitHub	Scenario Generation
GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving	2025-03	arXiv	-	Scenario Generation
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control	2025-04	arXiv	GitHub	Scenario Generation
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving	2025-05	arXiv	-	Scenario Generation
PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth	2025-05	arXiv	-	Scenario Generation

📊 Datasets Comparison

The following figure shows the usage distribution of different foundation model types across autonomous driving datasets:

Datasets Comparison

Dataset	Year	Img	View	Real	Lidar	Radar	Traj	3D	2D	Lane	Weather	Time	Region	Company
CamVid	2009	RGB	FPV	✔	✖️	✖️	✖️	✖️	✔	✔	✔	D	U	-
KITTI	2013	RGB/S	FPV	✔	✔	✖️	✔	✔	✔	✔	✔	D	U/R/H	-
Cyclists	2016	RGB	FPV	✔	✖️	✖️	✖️	✖️	✖️	✖️	✖️	D	U	-
Cityscapes	2016	RGB/S	FPV	✔	✖️	✖️	✖️	✔	✔	✔	✖️	D	U	-
SYNTHIA	2016	RGB	FPV	✖️	✖️	✖️	✖️	✔	✔	✔	✔	D/N	U	-
Campus	2016	RGB	BEV	✖️	✖️	✖️	✖️	✖️	✖️	✖️	✖️	D	C	-
RobotCar	2016	RGB	FPV	✔	✖️	✖️	✖️	✖️	✖️	✖️	✖️	D/N	U	-
Mapillary	2017	RGB	FPV	✔	✖️	✖️	✖️	✖️	✔	✔	✔	D/N	U	-
P.F.B.	2017	RGB	FPV	✔	✖️	✖️	✖️	✖️	✔	✔	✔	D/N	U	-
BDD100K	2018	RGB	FPV	✔	✖️	✖️	✖️	✔	✔	✔	✔	D	U/H	-
HighD	2018	RGB	BEV	✔	✖️	✖️	✖️	✖️	✔	✔	✖️	D	H	-
Udacity	2018	RGB	FPV	✔	✖️	✖️	✖️	✖️	✖️	✖️	✖️	D	U	-
KAIST	2018	RGB/S	FPV	✔	✔	✖️	✖️	✖️	✔	✔	✔	D/N	U	-
Argoverse	2019	RGB/S	FPV	✔	✔	✖️	✖️	✖️	✔	✔	✔	D/N	U	-
TRAF	2019	RGB	FPV	✔	✖️	✖️	✖️	✖️	✔	✔	✔	D	U	-
ApolloScape	2019	RGB/S	FPV	✔	✖️	✖️	✖️	✔	✔	✔	✔	D	U	-
ACFR	2019	RGB	BEV	✔	✖️	✖️	✖️	✖️	✖️	✖️	✖️	D	RA	-
H3D	2019	RGB	FPV	✔	✖️	✖️	✖️	✖️	✔	✔	✔	D	U	-
INTERACTION	2019	RGB	BEV	✔	✖️	✖️	✖️	✖️	✖️	✖️	✖️	D	I/RA	-
Comma2k19	2019	RGB	FPV	✔	✖️	✖️	✔	✔	✖️	✖️	✖️	D/N	U/S/R/H	-
InD	2020	RGB	BEV	✔	✖️	✖️	✖️	✖️	✖️	✖️	✖️	D	I	-
RounD	2020	RGB	BEV	✔	✖️	✖️	✖️	✖️	✖️	✖️	✖️	D	RA	-
nuScenes	2020	RGB	FPV	✔	✔	✔	✖️	✔	✔	✔	✔	D/N	U	-
Lyft Level 5	2020	RGB	FPV	✔	✔	✔	✖️	✔	✔	✔	✔	D/N	U/S	-
Waymo Open	2020	RGB	FPV	✔	✔	✔	✔	✔	✔	✔	✔	D/N	U	-
A*3D	2020	RGB	FPV	✔	✔	✔	✔	✔	✔	✔	✔	D/N	U	-
RobotCar Radar	2020	RGB	FPV	✔	✔	✔	✔	✔	✔	✔	✔	D/N	U	-
Toronto3D	2020	RGB	BEV	✔	✔	✖️	✔	✔	✖️	✔	✖️	D/N	U	University of Waterloo
A2D2	2020	RGB	FPV	✔	✔	✔	✔	✔	✔	✖️	✔	✔	D	U/H/S/R
WADS	2020	RGB	FPV	✔	✔	✔	✔	✔	✖️	✖️	✔	D/N	U/S/R	Michigan Technological University
Argoverse 2	2021	RGB/S	FPV	✔	✔	✖️	✖️	✔	✔	✔	✔	D/N	U	-
PandaSet	2021	RGB	FPV	✔	✔	✔	✔	✔	✔	✔	✔	D/N	U	-
ONCE	2021	RGB	FPV	✔	✔	✔	✔	✔	✔	✔	✔	D/N	U	-
Leddar PixSet	2021	RGB	FPV	✔	✔	✖️	✔	✔	✔	✖️	✔	D/N	U/S/R	Leddar
ZOD	2022	RGB	FPV	✔	✔	✔	✔	✔	✔	✔	✔	D/N	U/R/S/H	Zenseact
IDD-3D	2022	RGB	FPV	✔	✔	✖️	✖️	✔	✔	✖️	✖️	-	R	INAI
CODA	2022	RGB	FPV	✔	✔	✔	✔	✔	✔	✔	✔	D/N	U/S/R	Huawei
SHIFT	2022	RGB	FPV	✔	✔	✔	✔	✔	✔	✔	✔	D/N	U/S/R/H	ETH Zürich
DeepAccident	2023	RGB/S	FPV/BEV	✖️	✔	✖️	✖️	✔	✔	✔	✔	D/N	U/S/R/H	HKU, Huawei, CARLA
Dual_Radar	2023	RGB	FPV	✔	✔	✔	✔	✔	✖️	✔	✔	D/N	U	Tsinghua University
V2V4Real	2023	RGB	FPV	✔	✔	✖️	✔	✔	✖️	✔	✖️	-	U/H/S	UCLA Mobility Lab
SCaRL	2024	RGB/S	FPV/BEV	✖️	✔	✔	✔	✔	✔	✔	✔	D/N	U/S/R/H	Fraunhofer CARLA
MARS	2024	RGB	FPV	✔	✔	✔	✔	✔	✔	✔	✔	D/N	U/S/H	NYU, MAY Mobility
Scenes101	2024	RGB	FPV	✔	✖️	✖️	✔	✖️	✖️	✔	✔	D/N	U/S/R/H	Wayve
TruckScenes	2025	RGB	FPV	✔	✔	✔	✔	✔	✖️	✔	✔	D/N	H/U	MAN

Notes: View: FPV=First-Person, BEV=Bird's-Eye; Time: D=Day, N=Night; Region: U=Urban, R=Rural, H=Highway, S=Suburban, C=Campus, I=Intersection, RA=Roundabout; Img: RGB/S=RGB+Stereo
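
The abbreviations in the notes can be expanded mechanically when the table is processed by a script. The sketch below assumes that rows are available as tab-separated text, as in the table above. The helper names `expand_codes` and `parse_row` are illustrative and not part of this repository, and the region legend is limited to the codes needed for the two sample rows.

```python
# Column order copied from the table header above.
COLUMNS = ["Dataset", "Year", "Img", "View", "Real", "Lidar", "Radar", "Traj",
           "3D", "2D", "Lane", "Weather", "Time", "Region", "Company"]
FLAG_COLUMNS = COLUMNS[4:12]  # Real ... Weather hold check or cross marks

# Legend entries from the notes (subset of the region codes).
TIME = {"D": "Day", "N": "Night"}
REGION = {"U": "Urban", "R": "Rural", "H": "Highway", "S": "Suburban"}


def expand_codes(cell, legend):
    """Split a cell such as 'U/R/H' and map each code to its long form."""
    return [legend.get(code, code) for code in cell.split("/")]


def parse_row(line):
    """Turn one tab-separated table row into a dictionary with typed values."""
    row = dict(zip(COLUMNS, line.split("\t")))
    row["Year"] = int(row["Year"])
    for name in FLAG_COLUMNS:
        row[name] = row[name].startswith("✔")  # cross marks become False
    row["Time"] = expand_codes(row["Time"], TIME)
    row["Region"] = expand_codes(row["Region"], REGION)
    return row


# Two rows copied from the table above.
SAMPLE = [
    "KITTI\t2013\tRGB/S\tFPV\t✔\t✔\t✖️\t✔\t✔\t✔\t✔\t✔\tD\tU/R/H\t-",
    "nuScenes\t2020\tRGB\tFPV\t✔\t✔\t✔\t✖️\t✔\t✔\t✔\t✔\tD/N\tU\t-",
]
rows = [parse_row(line) for line in SAMPLE]

# Example filter: datasets that provide both lidar and radar.
lidar_and_radar = [r["Dataset"] for r in rows if r["Lidar"] and r["Radar"]]
print(lidar_and_radar)
print(rows[0]["Region"])
```

The same pattern could be applied to the other tab-separated tables in this README by swapping the column list.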

🎮 Simulators

The following figure shows the usage distribution of different foundation model types across autonomous driving simulators:

Simulators

Simulator	Year	Back-end	Open Source	Realistic Perception	Custom Scenario	Real World Map	Human Design Map	Python API	C++ API	ROS API	Company
TORCS	2000	None	✔	✔	✔	✖️	✖️	✖️	✖️	✖️	-
Webots	2004	ODE	✔	✔	✔	✔	✖️	✔	✔	✖️	-
CarRacing	2017	None	✔	✖️	✖️	✖️	✔	✔	✖️	✖️	-
CARLA	2017	UE4	✔	✔	✔	✖️	✔	✔	✔	✔	-
SimMobilityST	2017	None	✔	✖️	✖️	✖️	✖️	✖️	✖️	✖️	-
GTA-V	2017	RAGE	✖️	✔	✖️	✖️	✖️	✖️	✖️	✖️	-
highway-env	2018	None	✔	✖️	✔	✖️	✔	✔	✖️	✖️	-
Deepdrive	2018	UE4	✔	✔	✔	✖️	✔	✔	✔	✖️	-
esmini	2018	Unity	✔	✖️	✖️	✖️	✖️	✔	✖️	✖️	-
AutonoViSim	2018	PhysX	✖️	✔	✔	✖️	✖️	✔	✖️	✖️	-
AirSim	2018	UE4	✔	✔	✔	✖️	✔	✔	✔	✖️	-
SUMO	2018	None	✔	✖️	✔	✔	✔	✖️	✔	✖️	-
Apollo	2018	Unity	✔	✔	✔	✔	✔	✔	✔	✖️	-
Sim4CV	2018	UE4	✔	✔	✔	✖️	✔	✔	✖️	✖️	-
MATLAB	2018	MATLAB	✖️	✔	✔	✔	✔	✔	✔	✔	Mathworks
Scenic	2019	None	✔	✔	✔	✔	✔	✔	✖️	✖️	Toyota Research Institute, UC Berkeley
SUMMIT	2020	UE4	✔	✔	✔	✖️	✔	✔	✔	✖️	-
MultiCarRacing	2020	None	✔	✖️	✔	✖️	✔	✔	✖️	✖️	-
SMARTS	2020	None	✔	✔	✔	✔	✔	✔	✖️	✖️	-
LGSVL	2020	Unity	✔	✔	✔	✔	✔	✔	✔	✔	-
CausalCity	2020	UE4	✔	✔	✔	✔	✔	✔	✖️	✖️	-
Vista	2020	None	✔	✔	✔	✔	✖️	✔	✖️	✖️	MIT
MetaDrive	2021	Panda3D	✔	✔	✔	✔	✔	✔	✔	✖️	-
L2R	2021	UE4	✔	✔	✔	✔	✔	✔	✔	✖️	-
AutoDRIVE	2021	Unity	✔	✔	✔	✔	✔	✔	✔	✔	-
Nuplan	2021	None	✔	✔	✔	✔	✔	✔	✖️	✖️	Motional
AWSIM	2021	Unity	✔	✔	✔	✔	✔	✖️	✖️	✔	Autoware
InterSim	2022	None	✔	✔	✔	✔	✖️	✔	✖️	✖️	Tsinghua
Nocturne	2022	None	✔	✔	✔	✔	✔	✔	✔	✖️	Facebook
BeamNG.tech	2022	Soft-body physics	✖️	✔	✔	✖️	✔	✔	✖️	✔	BeamNG GmbH
Waymax	2023	JAX	✔	✔	✔	✖️	✔	✔	✖️	✖️	Waymo
UNISim	2023	None	✖️	✔	✔	✔	✖️	✖️	✔	✖️	Waabi
TBSim	2023	None	✔	✔	✔	✔	✔	✔	✖️	✖️	NVIDIA
Nvidia DriveWorks	2024	Nvidia GPU	✖️	✔	✔	✔	✖️	✔	✔	✖️	NVIDIA

🏆 Foundation Model Benchmark Challenges (2022–2025)

Benchmark Challenges

Autonomous Driving

Name	Host
CARLA AD Challenge	CARLA
DRL4Real	ICCV
Waymo Open Dataset Challenge	Waymo / CVPR WAD
Argoverse 2: Scenario Mining	ArgoAI
Roboflow-20VL	Roboflow-VL / CVPR
AVA Challenge	AVA Challenge Team

Other Fields Related to Generation and Analysis

Name	Host
IGLU Challenge	NeurIPS / IGLU Team
LLM Efficiency Challenge	NeurIPS
Trojan Detection	NeurIPS / CAIS
SMART-101	CVPR
NICE Challenge	CVPR / LG Research
SyntaGen	CVPR
Habitat Challenge	CVPR / FAIR
BIG-bench	Google Research
BIG-bench Hard (BBH)	Google Research
HELM	Stanford CRFM
MMBench	OpenCompass
MMMU	CVPR / U-Waterloo / OSU
Open LLM Leaderboard	VILA-Lab
Text-to-Image Leaderboard	Artificial Analysis
Ego4D	FAIR
VizWiz Grand Challenge	CVPR VizWiz Workshop
MedFM	NeurIPS / Shanghai AI Laboratory
3D Scene Understanding	CVPR

Contributing

We welcome contributions from the community! If you have research papers, tools, or resources to add, please create a pull request or open an issue.

License

This repository is released under the Apache 2.0 license.
