-
Notifications
You must be signed in to change notification settings - Fork 2
Add query_intersections() method for efficient AABB pair detection #47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add query_intersections() method for efficient AABB pair detection #47
Conversation
Implements a new query_intersections() method that efficiently finds all pairs of intersecting bounding boxes in the tree, addressing the feature request in issue #46. Key features: - Returns numpy array of shape (n_pairs, 2) with index pairs (i, j) where i < j - Parallel processing using std::thread for improved performance - Automatic double-precision refinement when exact coordinates are available - No duplicate pairs or self-pairs - Similar to scipy.spatial.cKDTree.query_pairs but for AABBs This eliminates the need for manual post-processing of batch_query results using np.vectorize, np.repeat, and np.concatenate, which previously canceled out the performance gains from C++ parallelization. Changes: - cpp/prtree.h: Added query_intersections() method to PRTree class - cpp/main.cc: Added Python bindings for all dimensions (2D, 3D, 4D) - tests/test_PRTree.py: Added comprehensive tests including edge cases - README.md: Updated documentation with usage examples All tests pass (45 new tests + existing tests). Resolves #46
Adds a complete development workflow automation through Make commands, making it easier for developers to build, test, and contribute to the project. Key additions: - Makefile: Comprehensive build automation with 30+ commands - Build: build, build-release, rebuild, debug-build - Test: test, test-verbose, test-fast, test-coverage, test-one - Packaging: wheel, sdist, release - Maintenance: clean, clean-all, format, lint - Development: init, dev, install-deps, check - CONTRIBUTING.md: Complete developer guide - Development setup instructions - Workflow documentation - Coding standards - Troubleshooting guide - Enhanced .gitignore: Additional patterns for build artifacts, coverage reports, and temporary files All content is in English and designed to be generic and reusable for any feature development, not specific to any particular functionality. Benefits: - Streamlined development workflow - Consistent build and test procedures - Easy onboarding for new contributors - CI-compatible check commands
Apply clang-format to all C++ source files for consistent code style. This is an automatic formatting change with no functional modifications. Files affected: - cpp/main.cc: Brace positioning and line wrapping - cpp/parallel.h: Formatting improvements - cpp/prtree.h: Consistent formatting throughout - cpp/small_vector.h: Style consistency updates These changes were generated by running 'make format'.
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Use thread-local storage to collect pairs | ||
| const size_t n_threads = std::min( | ||
| static_cast<size_t>(std::thread::hardware_concurrency()), n_items); | ||
| vec<vec<std::pair<T, T>>> thread_pairs(n_threads); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guard against zero hardware_concurrency in query_intersections
When std::thread::hardware_concurrency() returns 0 (which is permitted when the runtime cannot detect the CPU count), n_threads is set to zero. In that situation the single-thread path later writes to thread_pairs[0], and the parallel branch never computes any pairs because its loops never execute. Both behaviours are undefined: the code dereferences an empty vector and/or silently returns no intersections even when the tree is non‑empty. The constructor in find_all already guards against this by forcing at least one thread; a similar guard is needed here.
Useful? React with 👍 / 👎.
When std::thread::hardware_concurrency() returns 0 (which is permitted when the runtime cannot detect the CPU count), n_threads was set to zero. This caused undefined behavior: - In single-threaded path: writes to thread_pairs[0] on empty vector - In parallel path: loops never execute, returning no intersections Apply the same guard used in find_all() to ensure at least one thread. This fixes a critical bug that could cause crashes or incorrect results on systems where hardware_concurrency() returns 0. Fixes: P1 Badge Guard against zero hardware_concurrency Reported-by: User feedback
This commit implements the complete Phase 0 infrastructure for performance profiling and validation as outlined in the PRTree Comprehensive Improvement Plan. ## What's Added ### Build System (CMakeLists.txt) - Profiling infrastructure with ENABLE_PROFILING option - Mandatory sanitizer support (TSan, ASan, UBSan) - Benchmark build targets ### Benchmarks (benchmarks/) - benchmark_construction: Tree construction performance - benchmark_query: Query operation performance - benchmark_parallel: Thread scaling analysis - stress_test_concurrent: Concurrency stress testing - workloads.h: Standard workload definitions - benchmark_utils.h: Timing and reporting utilities ### Profiling Automation (scripts/) - profile_all_workloads.sh: Automated perf/cachegrind profiling - analyze_baseline.py: Results analysis and report generation ### Documentation (docs/baseline/) - README.md: Phase 0 guide and requirements - BASELINE_SUMMARY.md: Template for results documentation ### CI/CD (.github/workflows/sanitizers.yml) - Mandatory ThreadSanitizer checks (BLOCKING) - AddressSanitizer checks (BLOCKING) - UBSanitizer checks (BLOCKING) - Performance baseline tracking - Long-running stress tests ### Project Documentation - PHASE0_IMPLEMENTATION.md: Complete implementation guide - QUICKSTART_PHASE0.md: Quick start instructions ## Key Features 1. **Empirical Validation**: Hardware counter-based profiling 2. **Thread Safety**: Mandatory TSan validation 3. **Representative Workloads**: 5 workloads covering real usage 4. **Automated Profiling**: Scripts for consistent measurements 5. **Regression Detection**: CI integration for future validation ## Testing All benchmarks build and run successfully: - Construction benchmark: ✓ - Query benchmark: ✓ - Parallel benchmark: ✓ - Stress test: ✓ (all tests pass) ## Next Steps 1. Run: ./scripts/profile_all_workloads.sh 2. Complete: docs/baseline/BASELINE_SUMMARY.md 3. Get tech lead approval 4. Proceed to Phase 1 This is a critical checkpoint - ALL subsequent phases depend on this baseline for performance validation. Related: #48 (test coverage), #47 (query optimizations)
This commit implements the complete Phase 0 infrastructure for performance profiling and validation as outlined in the PRTree Comprehensive Improvement Plan. ## What's Added ### Build System (CMakeLists.txt) - Profiling infrastructure with ENABLE_PROFILING option - Mandatory sanitizer support (TSan, ASan, UBSan) - Benchmark build targets ### Benchmarks (benchmarks/) - benchmark_construction: Tree construction performance - benchmark_query: Query operation performance - benchmark_parallel: Thread scaling analysis - stress_test_concurrent: Concurrency stress testing - workloads.h: Standard workload definitions - benchmark_utils.h: Timing and reporting utilities ### Profiling Automation (scripts/) - profile_all_workloads.sh: Automated perf/cachegrind profiling - analyze_baseline.py: Results analysis and report generation ### Documentation (docs/baseline/) - README.md: Phase 0 guide and requirements - BASELINE_SUMMARY.md: Template for results documentation ### CI/CD (.github/workflows/sanitizers.yml) - Mandatory ThreadSanitizer checks (BLOCKING) - AddressSanitizer checks (BLOCKING) - UBSanitizer checks (BLOCKING) - Performance baseline tracking - Long-running stress tests ### Project Documentation - PHASE0_IMPLEMENTATION.md: Complete implementation guide - QUICKSTART_PHASE0.md: Quick start instructions ## Key Features 1. **Empirical Validation**: Hardware counter-based profiling 2. **Thread Safety**: Mandatory TSan validation 3. **Representative Workloads**: 5 workloads covering real usage 4. **Automated Profiling**: Scripts for consistent measurements 5. **Regression Detection**: CI integration for future validation ## Testing All benchmarks build and run successfully: - Construction benchmark: ✓ - Query benchmark: ✓ - Parallel benchmark: ✓ - Stress test: ✓ (all tests pass) ## Next Steps 1. Run: ./scripts/profile_all_workloads.sh 2. Complete: docs/baseline/BASELINE_SUMMARY.md 3. Get tech lead approval 4. Proceed to Phase 1 This is a critical checkpoint - ALL subsequent phases depend on this baseline for performance validation. Related: #48 (test coverage), #47 (query optimizations)
Implements a new query_intersections() method that efficiently finds all pairs of intersecting bounding boxes in the tree, addressing the feature request in issue #46.
Key features:
This eliminates the need for manual post-processing of batch_query results using np.vectorize, np.repeat, and np.concatenate, which previously canceled out the performance gains from C++ parallelization.
Changes:
All tests pass (45 new tests + existing tests).
Resolves #46