Merged
9 changes: 9 additions & 0 deletions _posts/2025-08-11-cuda-debugging.md
@@ -5,6 +5,15 @@ author: "Kaichao You"
image: /assets/logos/vllm-logo-text-light.png
---

TL;DR: If you hit an `an illegal memory access was encountered` error, you can enable CUDA core dumps to debug it. Set the following environment variables, run your program again to collect the core dump file, then open that file with `cuda-gdb`.

```bash
# Export the variables so that a program launched from this shell inherits them.
export CUDA_ENABLE_COREDUMP_ON_EXCEPTION=1
export CUDA_COREDUMP_SHOW_PROGRESS=1
export CUDA_COREDUMP_GENERATION_FLAGS='skip_nonrelocated_elf_images,skip_global_memory,skip_shared_memory,skip_local_memory,skip_constbank_memory'
export CUDA_COREDUMP_FILE="/tmp/cuda_coredump_%h.%p.%t"
```
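
If you would rather not leave these settings active in your whole shell session, one option is to scope them to a single command. The sketch below is not from the original post: `run_with_cuda_coredump` is a hypothetical helper name, and it simply prefixes whatever command you pass with the same four variables.

```bash
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not part of vLLM or the CUDA toolkit):
# run the given command with the CUDA core dump variables set for that
# command only, leaving the calling shell's environment unchanged.
run_with_cuda_coredump() {
  CUDA_ENABLE_COREDUMP_ON_EXCEPTION=1 \
  CUDA_COREDUMP_SHOW_PROGRESS=1 \
  CUDA_COREDUMP_GENERATION_FLAGS='skip_nonrelocated_elf_images,skip_global_memory,skip_shared_memory,skip_local_memory,skip_constbank_memory' \
  CUDA_COREDUMP_FILE="/tmp/cuda_coredump_%h.%p.%t" \
  "$@"
}
```

You would then launch your program as `run_with_cuda_coredump <your command>`. Once a core dump has been written, start `cuda-gdb` and load it with `target cudacore <path to the core dump file>`; in the file name, `%h`, `%p`, and `%t` are expanded to the host name, process ID, and timestamp.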

# Introduction

Have you ever been developing CUDA kernels, watched your tests keep running into an illegal memory access (IMA for short), and had no idea how to debug it? We felt this pain again and again while working on vLLM, a high-performance inference engine for LLMs.