Release v1.1.0 - Performance & Indexing Improvements · KevinLADLee/carla_dataset_tools

🚀 CARLA Dataset Tools v1.1.0

Performance optimization release with faster initialization, reduced memory footprint, and automatic dataset indexing.

🎯 Major Improvements

⚡ 30% Faster Initialization

Actor spawning has been redesigned around a batch command strategy:

Before: Sequential spawning (spawn → autopilot → spawn → autopilot...)
After: Batch workflow (spawn all → stabilize → enable autopilot)

10-Phase Initialization Process:

1. Load Map
2. Configure Synchronous Mode
3. Configure Traffic Manager
4. Set Weather & Spectator
5. Configure Traffic Lights
6. Batch Spawn All Actors (new)
7. Vehicle Stabilization (5 ticks for physics settling)
8. Batch Enable Autopilot (after stabilization)
9. Clear Sensor Queues (memory optimization)
10. Ready to Record

Benefits:

Single batch command vs multiple individual spawns
Prevents physics glitches from premature autopilot
Better error reporting with detailed spawn logs
Cleaner separation of concerns
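
The spawn, stabilize, autopilot ordering described above can be sketched without a simulator. This is a minimal illustration, not the project's implementation: `initialize_actors` and the three callables it takes are hypothetical names. In a real CARLA client they would presumably wrap `client.apply_batch_sync(...)` with `carla.command.SpawnActor` and `carla.command.SetAutopilot`, plus `world.tick()`, but the release notes do not say which calls are used.

```python
from typing import Callable, List, Sequence


def initialize_actors(
    spawn_batch: Callable[[Sequence[str]], List[int]],
    tick: Callable[[], None],
    enable_autopilot_batch: Callable[[Sequence[int]], None],
    blueprints: Sequence[str],
    stabilization_ticks: int = 5,
) -> List[int]:
    """Spawn everything, let physics settle, then enable autopilot."""
    # Batch Spawn All Actors: one request instead of one call per actor.
    actor_ids = spawn_batch(blueprints)

    # Vehicle Stabilization: advance the simulation before any control input.
    for _ in range(stabilization_ticks):
        tick()

    # Batch Enable Autopilot: only after the vehicles have settled.
    enable_autopilot_batch(actor_ids)
    return actor_ids


# Stand-in callables that only record the call order (hypothetical, for demo).
events: List[str] = []


def fake_spawn(blueprints: Sequence[str]) -> List[int]:
    events.append(f"spawn x{len(blueprints)}")
    return list(range(len(blueprints)))


def fake_tick() -> None:
    events.append("tick")


def fake_autopilot(actor_ids: Sequence[int]) -> None:
    events.append(f"autopilot x{len(actor_ids)}")


ids = initialize_actors(
    fake_spawn, fake_tick, fake_autopilot,
    ["vehicle.a", "vehicle.b", "vehicle.c"],
)
print(events)
```

Passing the simulator calls in as callables keeps the ordering logic testable on its own, which is one way to get the "separation of concerns" mentioned above.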

💾 100-500MB Memory Savings

New Phase 8.5: Clear Initialization Sensor Data (the "Clear Sensor Queues" step of the initialization process above)

Problem Identified: During initialization (spawn → stabilize → autopilot), sensors accumulate ~7 frames of unused data in queues, wasting 100-500MB depending on configuration.

Solution: Automatic sensor queue clearing before recording starts.

# Automatically called after initialization
cleared_frames = actor_tree.clear_sensor_queues()
# Frees all queued data from initialization phase

Impact per session:

Simple config: ~100MB saved
KITTI config: ~200-300MB saved
Multi-vehicle config: ~400-500MB saved
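
As a rough illustration of what queue clearing involves, the sketch below drains a set of standard-library `queue.Queue` objects and counts the discarded items. It assumes one queue per sensor; `drain_sensor_queues` is a hypothetical stand-in and not the body of `actor_tree.clear_sensor_queues()`, whose internals are not shown in these notes.

```python
import queue
from typing import Dict


def drain_sensor_queues(sensor_queues: Dict[str, queue.Queue]) -> int:
    """Discard every queued item and return how many were dropped."""
    cleared = 0
    for q in sensor_queues.values():
        while True:
            try:
                q.get_nowait()  # non-blocking: raises queue.Empty when drained
            except queue.Empty:
                break
            cleared += 1
    return cleared


# Simulate ~7 frames accumulating on two sensors during initialization.
queues = {"camera": queue.Queue(), "lidar": queue.Queue()}
for frame in range(7):
    queues["camera"].put(f"image-{frame}")
    queues["lidar"].put(f"points-{frame}")

cleared_frames = drain_sensor_queues(queues)
print(cleared_frames)  # 14
```

Once the queued items are no longer referenced, Python can release the memory they held, which is where savings of this kind would come from.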

📊 Automatic Dataset Indexing (New Feature)

Introducing IndexManager for effortless dataset navigation and querying.

Auto-generated files:

1. dataset_info.json - Dataset metadata

{
  "dataset_name": "record_2025_1117_1234",
  "map": "Town02",
  "weather": "ClearNoon",
  "frame_rate": 10.0,
  "total_frames": 1000,
  "actors": {...},
  "sensors": {...}
}

2. master_index.csv - Global frame index

frame,timestamp,has_ego_vehicle,has_image_2,has_velodyne,...
145,14.5,true,true,true,...
146,14.6,true,true,true,...

3. {vehicle_name}/sensor_index.csv - Per-vehicle sensor index

frame,timestamp,image_2_file,velodyne_file,velodyne_points_count
145,14.5,0000000145.png,0000000145.npy,125847
146,14.6,0000000146.png,0000000146.npy,126102

4. others.world_X/objects_index.csv - World objects index

frame,timestamp,pkl_file,object_count,vehicle_count,pedestrian_count
145,14.5,0000000145.pkl,27,15,12

5. {sensor_name}/poses.csv - Individual sensor poses

frame,timestamp,x,y,z,roll,pitch,yaw
145,14.5,107.5,-133.2,0.3,0.0,0.0,-91.44

Use Cases:

Quick frame lookup by timestamp
Check data availability before processing
Filter frames with specific sensor data
Query object statistics per frame
Build custom data loaders
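
Two of these use cases, filtering by sensor availability and looking up a frame by timestamp, can be sketched with the standard library alone. The inline sample uses only the `master_index.csv` columns named above (a real file has more), and the second and third rows are made-up values for illustration. `load_index`, `frames_with`, and `nearest_frame` are hypothetical helpers, not part of IndexManager.

```python
import csv
import io
from bisect import bisect_left

# Illustrative sample; rows 146 and 147 are invented for the demo.
MASTER_INDEX = """frame,timestamp,has_ego_vehicle,has_image_2,has_velodyne
145,14.5,true,true,true
146,14.6,true,true,false
147,14.7,true,true,true
"""


def load_index(text):
    """Parse index CSV text into dicts with typed values."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {"frame": int(raw["frame"]), "timestamp": float(raw["timestamp"])}
        for key, value in raw.items():
            if key.startswith("has_"):
                row[key] = value.strip().lower() == "true"
        rows.append(row)
    return rows


def frames_with(rows, *flags):
    """Return frame IDs for which every requested has_* flag is true."""
    return [r["frame"] for r in rows if all(r.get(f, False) for f in flags)]


def nearest_frame(rows, timestamp):
    """Return the frame closest to a timestamp (rows sorted by timestamp)."""
    times = [r["timestamp"] for r in rows]
    i = bisect_left(times, timestamp)
    candidates = rows[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda r: abs(r["timestamp"] - timestamp))["frame"]


rows = load_index(MASTER_INDEX)
print(frames_with(rows, "has_image_2", "has_velodyne"))  # [145, 147]
print(nearest_frame(rows, 14.62))  # 146
```

For a recording on disk, the same functions would take the text of the actual `master_index.csv`, for example via `open(path).read()`.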

Performance: Zero impact on recording speed (lightweight collection).

📋 User Notes

✅ No Breaking Changes

Raw data format unchanged (fully compatible with v1.0.0)
Existing label tools work without modification
Configuration files require no updates
Index files are optional additions (can be ignored)

🔄 Upgrade Path

From v1.0.0 → v1.1.0: Drop-in replacement

# Just pull and use
git pull origin dev
git checkout v1.1.0

# No migration needed
python3 data_recorder.py --profile kitti  # Works as before

📈 What You'll Notice

Faster Startup:

Initialization completes ~30% faster
Clearer phase-by-phase progress logs
Better error messages if spawning fails

Lower Memory Usage:

100-500MB less RAM consumption
More headroom for complex scenarios
Reduced risk of OOM on large recordings

New Index Files:

Automatically created in recording directory
Zero configuration required
Can be used immediately or ignored

⚠️ Important Notes

Sensor Queue Clearing:

Only affects initialization data (first ~7 frames)
Recording data is unaffected
Queues cleared before main recording loop starts
Frame IDs remain consistent (absolute CARLA frame IDs)

Index File Generation:

Happens during recording (no post-processing needed)
Finalized when recording ends (Ctrl+C or completion)
If recording crashes, index files may be incomplete
Raw data is always complete regardless of index status
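
After an interrupted recording, one way to see whether an index is incomplete is to compare the file names it lists against the files actually present. The sketch below assumes the `sensor_index.csv` layout shown earlier. `unindexed_files` is a hypothetical helper, and the third file on disk is an invented example of a frame saved after the last index update.

```python
import csv
import io


def unindexed_files(index_csv_text, file_column, files_on_disk):
    """Return files present on disk but not listed in the given index column."""
    reader = csv.DictReader(io.StringIO(index_csv_text))
    indexed = {row[file_column] for row in reader if row.get(file_column)}
    return sorted(set(files_on_disk) - indexed)


SENSOR_INDEX = """frame,timestamp,image_2_file,velodyne_file,velodyne_points_count
145,14.5,0000000145.png,0000000145.npy,125847
146,14.6,0000000146.png,0000000146.npy,126102
"""

# In practice this list would come from listing the sensor's output directory.
on_disk = ["0000000145.png", "0000000146.png", "0000000147.png"]

print(unindexed_files(SENSOR_INDEX, "image_2_file", on_disk))  # ['0000000147.png']
```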

🐛 Bug Fixes

Fixed sensor data alignment issues
Improved thread safety in data saving
Better timeout handling for slow simulations
Enhanced error recovery in batch spawning

🛠️ Technical Details

Files Modified

Core System:

data_recorder.py - 10-phase initialization, IndexManager integration
recorder/actor_tree.py - Batch spawning, queue clearing, autopilot enabling
recorder/index_manager.py - New file for index generation

Sensor Classes:

recorder/sensor.py - Return structured save info
recorder/camera.py, recorder/lidar.py, recorder/radar.py - Enhanced metadata

Actor Classes:

recorder/vehicle.py, recorder/world.py, recorder/infrastructure.py - Save info updates

Documentation:

docs/USER_GUIDE.md, docs/USER_GUIDE_CN.md - Document index files and new features

Frame ID Strategy

Uses absolute CARLA frame IDs for all file naming and synchronization:

Consistent with dev branch behavior
Ensures temporal alignment
Compatible with existing label tools
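
Because files are named after the absolute frame ID, converting between a frame number and a file name is straightforward. The sketch below assumes a 10-digit zero-padded name, inferred from sample names such as `0000000145.png` in the index examples. `FRAME_DIGITS` and both helpers are hypothetical.

```python
from pathlib import Path

# Assumption: width inferred from sample file names like 0000000145.png.
FRAME_DIGITS = 10


def frame_to_filename(frame: int, suffix: str) -> str:
    """Build a zero-padded file name from an absolute frame ID."""
    return f"{frame:0{FRAME_DIGITS}d}{suffix}"


def filename_to_frame(name: str) -> int:
    """Recover the frame ID from a file name, ignoring the extension."""
    return int(Path(name).stem)


print(frame_to_filename(145, ".png"))        # 0000000145.png
print(filename_to_frame("0000000145.npy"))   # 145
```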

📚 Documentation

User Guide - Updated with indexing documentation
Developer Guide - Architecture details
Chinese User Guide (中文用户指南)

🙏 Acknowledgments

Thanks to all users who tested and provided feedback during development.

📄 License

GNU General Public License v3.0 (GPL-3.0)

Full Changelog: v1.0.0...v1.1.0
