To ensure responsible use, access to the dataset is granted after agreeing to our terms of use.
Please complete the user consent form at the following link.
Upon submission of the form, you will be automatically redirected to a page containing the download links for the dataset.
The Trauma THOMPSON Dataset is designed for multimodal research in egocentric video analysis of life-saving medical procedures. It supports multiple tasks including action recognition, hand tracking, object detection, and visual question answering.
- Action recognition: 177 videos of standard (regular) emergency procedures and 43 videos of just-in-time (JIT) procedures
- Hand tracking: 30 videos (a subset of the regular-procedure videos), annotated with:
  - Hand bounding boxes in COCO format
  - Left/right hand identification
- Object detection: 25,000 frames covering 12 common surgical tools, with bounding-box annotations for tool presence and location in YOLO format
- Visual question answering: 600,000 frames with corresponding VQA annotations
Input: Pre-extracted video frames
Folder Structure:
actions/
├── videos/
│ ├── P01_01_00/
│ │ ├── img_00001.jpg
│ │ ├── img_00002.jpg
│ ├── P01_01_01/
│ │ ├── img_00001.jpg
│ │ ├── img_00002.jpg
│ ├── ...
├── annotations.csv
Annotations:
`annotations.csv` contains video-level labels (e.g., procedure type, timestamps).
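Assuming the layout above, clips and their labels can be paired with a minimal sketch like the following. The CSV column names `clip_id` and `label` are placeholders, since the exact header is not reproduced here; adjust them to the real file.

```python
import csv
from pathlib import Path

def load_action_clips(root):
    """Pair each clip folder under actions/videos/ with its row in
    annotations.csv. Column names 'clip_id' and 'label' are assumed,
    not documented -- adapt to the actual CSV header."""
    root = Path(root)
    labels = {}
    with open(root / "annotations.csv", newline="") as f:
        for row in csv.DictReader(f):
            labels[row["clip_id"]] = row["label"]
    clips = []
    for clip_dir in sorted((root / "videos").iterdir()):
        frames = sorted(clip_dir.glob("img_*.jpg"))  # img_00001.jpg, ...
        clips.append((clip_dir.name, frames, labels.get(clip_dir.name)))
    return clips
```

Sorting both the clip directories and the frame files keeps temporal order, since the zero-padded names sort lexicographically.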
Input: Full videos + per-frame bounding boxes
Folder Structure:
hands/
├── videos/
│ ├── P01_01.mp4
│ ├── P01_02.mp4
│ ├── ...
├── bbx/
│ ├── P01_01/
│ │ ├── P01_01_00001.json
│ │ ├── P01_01_00002.json
│ ├── P01_02/
│ │ ├── P01_02_00001.json
│ │ ├── P01_02_00002.json
│ ├── ...
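Following the tree above, each video in `videos/` has a matching directory of per-frame JSON files under `bbx/`. A small sketch for iterating over one video's annotations (`frame_annotations` is a hypothetical helper name):

```python
import json
from pathlib import Path

def frame_annotations(root, video_id):
    """Yield (json_filename, parsed_dict) for every per-frame annotation
    of one video, e.g. root/bbx/P01_01/P01_01_00001.json. Assumes the
    hands/ layout shown above."""
    root = Path(root)
    video = root / "videos" / f"{video_id}.mp4"
    if not video.exists():
        raise FileNotFoundError(video)
    for jf in sorted((root / "bbx" / video_id).glob(f"{video_id}_*.json")):
        with open(jf) as f:
            yield jf.name, json.load(f)
```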
Annotation Format (COCO-style JSON):
{
"categories": [
{"id": 0, "name": "left hand", "supercategory": "hand"},
{"id": 1, "name": "right hand", "supercategory": "hand"}
],
"images": [
{
"id": "frame_number",
"file_name": "P01_01_00001.jpg",
"height": "h",
"width": "w",
"channel": 3
}
],
"annotations": [
{
"id": "annotation_id",
"image_id": "frame_number",
"category_id": 0,
"bbox": ["x", "y", "w", "h"],
"area": "w * h",
"iscrowd": 0
}
]
}

Input: Single frames + tool annotations
Folder Structure:
objects/
├── images/
│ ├── P01_01_00001.jpg
│ ├── P01_01_00002.jpg
│ ├── ...
├── labels/
│ ├── P01_01_00001.txt
│ ├── P01_01_00002.txt
│ ├── ...
Annotation Format (YOLO-style):
- Each `.txt` file contains one line per object: `<object-class> <x-center> <y-center> <width> <height>`, with coordinates normalized to the image dimensions.
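Assuming the standard YOLO convention (center coordinates and sizes normalized to [0, 1]), a label file can be converted to pixel-space boxes like this:

```python
def parse_yolo_labels(txt_path, img_w, img_h):
    """Convert normalized YOLO rows (<class> <x-center> <y-center> <w> <h>)
    into pixel-space (class_id, x_min, y_min, width, height) tuples."""
    boxes = []
    with open(txt_path) as f:
        for line in f:
            if not line.strip():
                continue
            cls, xc, yc, w, h = line.split()
            w_px, h_px = float(w) * img_w, float(h) * img_h
            # YOLO stores the box center; shift to the top-left corner.
            x_min = float(xc) * img_w - w_px / 2
            y_min = float(yc) * img_h - h_px / 2
            boxes.append((int(cls), x_min, y_min, w_px, h_px))
    return boxes
```

For example, the row `0 0.5 0.5 0.5 0.5` on a 100×100 image yields a 50×50 box whose top-left corner is at (25, 25).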
Input: Frames + question-answer pairs
Folder Structure:
vqa/
├── images/
│ ├── P01_01_00001.jpg
│ ├── P01_01_00002.jpg
│ ├── ...
├── questions.json
├── annotations.json
Question Format (questions.json):
{
"info": {
"description": "Trauma THOMPSON VQA dataset",
"version": "1.0",
"year": 2025
},
"data_type" : "mscoco",
"license" : {
"url": "https://creativecommons.org/licenses/by-nc-sa/4.0/",
"name": "CC BY-NC-SA 4.0"
},
"questions": [
{
"image_name": "P01_01_00001.jpg",
"question": "Is the patient bleeding?",
"question_id": 10100001001
}
]
}

Answer Format (annotations.json):
{
"info": {
"description": "Trauma THOMPSON VQA dataset",
"version": "1.0",
"year": 2025
},
"data_type" : "mscoco",
"license" : {
"url": "https://creativecommons.org/licenses/by-nc-sa/4.0/",
"name": "CC BY-NC-SA 4.0"
},
"annotations": [
{
"question_id": 10100001001,
"image_name": "P01_01_00001.jpg",
"answers": [
{"answer": "Yes", "confidence": "high", "answer_id": 1}
]
}
]
}

✅ Multi-task annotations (action, hands, tools, VQA)
✅ Structured folder hierarchy for easy data loading
✅ Standard formats (COCO for hands, YOLO for tools, VQA-JSON)
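Since `questions.json` and `annotations.json` share `question_id` (the fields shown in the schemas above), questions and answers can be joined with a sketch like the following:

```python
import json

def join_vqa(questions_path, annotations_path):
    """Join questions.json and annotations.json on question_id,
    returning (image_name, question, answers) triples."""
    with open(questions_path) as f:
        questions = {q["question_id"]: q for q in json.load(f)["questions"]}
    pairs = []
    with open(annotations_path) as f:
        for ann in json.load(f)["annotations"]:
            q = questions[ann["question_id"]]
            pairs.append((q["image_name"], q["question"], ann["answers"]))
    return pairs
```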
License: CC BY 4.0