perf(vehicle): SCM HIP E2E — direct hits map, buffer reserve, dense body reduce#756
Closed
amd-pratmish wants to merge 3 commits into
### Summary

Adds an **opt-in HIP path** for the SCM Bekker / Mohr-Coulomb / Janosi contact-force loop in `SCMLoader::ComputeInternalForces()`. Ray casting and contact-patch BFS stay on CPU.

- CMake: `CH_ENABLE_VEHICLE_SCM_GPU=ON` + `CHRONO_SCM_GPU_INCLUDE_DIR` / `CHRONO_SCM_GPU_LIB_DIR`
- Runtime: `CHRONO_SCM_GPU=1` (uniform soil, rigid `ChBody` contactables in v1)
- Auto-fallback to CPU when `hits.size() < CHRONO_SCM_GPU_MIN_HITS` (default **8192**)
- Host compiler stays **g++**; HIP device code compiles via CMake HIP language (`CMAKE_HIP_COMPILER` auto-detected). Set `-DCMAKE_HIP_ARCHITECTURES=gfx942` (MI300X) or `gfx90a` (MI210).

### OpenMP → HIP split (v1)

Porting pattern for CPU/OpenMP Chrono modules: only the dense inner loop moves to HIP.

```text
[CPU] Update active domains
[CPU] OpenMP ray cast → hits + patch_oob
[CPU] BFS contact patches
[CPU] Pack batch (OpenMP parallel for)
[GPU] scm_compute_forces_kernel (Bekker + Mohr-Coulomb + Janosi per hit)
[CPU] Reduce per-body forces → ChLoadBodyForce
```

Ship kit: amd-chronos contrib/upstream_ready/phase2c/v1
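For readers unfamiliar with the per-hit work done in the `[GPU]` stage, the three soil laws named there can be sketched on the host as below. This is a minimal illustration using the textbook forms of the Bekker pressure-sinkage law, the Mohr-Coulomb shear limit, and the Janosi-Hanamoto shear curve; it is not the code of `scm_compute_forces_kernel`. The `SoilParams` and `HitStress` structs, their field names, and `ComputeHitStress` are hypothetical, and any elastic, plastic, or damping terms the real loop may apply are omitted.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical parameter bundle for uniform soil; names are illustrative only.
struct SoilParams {
    double bekker_kphi;   // Bekker frictional modulus K_phi
    double bekker_kc;     // Bekker cohesive modulus K_c
    double bekker_n;      // Bekker sinkage exponent n
    double cohesion;      // Mohr-Coulomb cohesion c [Pa]
    double friction_rad;  // Mohr-Coulomb friction angle phi [rad]
    double janosi_k;      // Janosi shear modulus K [m]
};

// Hypothetical per-hit result: normal pressure and shear stress.
struct HitStress {
    double normal_pressure;  // p [Pa]
    double shear_stress;     // tau [Pa]
};

// Evaluate the three laws for one hit.
//   sinkage      z >= 0 [m]  penetration of the contactable into the soil
//   patch_length b > 0  [m]  characteristic length of the contact patch
//   shear_disp   j >= 0 [m]  accumulated tangential displacement
inline HitStress ComputeHitStress(const SoilParams& s, double sinkage,
                                  double patch_length, double shear_disp) {
    assert(sinkage >= 0.0 && patch_length > 0.0 && shear_disp >= 0.0);
    assert(s.janosi_k > 0.0);

    // Bekker: p = (K_c / b + K_phi) * z^n
    const double p =
        (s.bekker_kc / patch_length + s.bekker_kphi) * std::pow(sinkage, s.bekker_n);

    // Mohr-Coulomb: tau_max = c + p * tan(phi)
    const double tau_max = s.cohesion + p * std::tan(s.friction_rad);

    // Janosi-Hanamoto: tau = tau_max * (1 - exp(-j / K))
    const double tau = tau_max * (1.0 - std::exp(-shear_disp / s.janosi_k));

    return {p, tau};
}
```

Each hit is independent of every other hit in this formulation, which is what makes the loop a natural candidate for one GPU thread per hit while the ray cast and BFS stay on the CPU.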
Upstream-facing playbook (no private integration references).

Co-authored-by: Cursor <cursoragent@cursor.com>
…ody reduce

### Summary

Follow-up to the optional SCM HIP backend. Improves end-to-end Chrono performance:

- Pass ray-cast `hits` map directly to `ComputeContactForcesGpu` (no intermediate vector copy)
- Call `scm_gpu::PrimeBuffers()` from all `SCMTerrain::Initialize` overloads
- Dense per-body force accumulation in `SCMTerrainGpu.cpp`

### Depends on

- **Merged:** `feat(vehicle): optional HIP SCM contact-force backend` (v1)

### Files in this PR

- `src/chrono_vehicle/terrain/SCMTerrain.h`
- `src/chrono_vehicle/terrain/SCMTerrain.cpp`
- `src/chrono_vehicle/terrain/SCMTerrainGpu.cpp`

Rebuild external `scm_gpu` from matching `amd-chronos` ref (see `SCM_GPU_EXTERNAL.md`).

Ship kit: amd-chronos contrib/upstream_ready/phase2c/v2
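The "dense per-body force accumulation" item can be pictured roughly as follows. This is a sketch under stated assumptions, not the code in `SCMTerrainGpu.cpp`: it assumes each contactable body has been assigned a small integer index during packing, so the reduce step can accumulate into a flat array instead of a keyed lookup per hit. `Vec3`, `HitForce`, `BodyWrench`, and `ReducePerBody` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 {
    double x, y, z;
};

// Hypothetical per-hit output as it might come back from the GPU stage.
struct HitForce {
    int body_index;  // dense index in [0, number of bodies)
    Vec3 force;      // force on the body, world frame
    Vec3 point;      // application point, world frame
};

// Hypothetical per-body accumulator: net force and torque about the body position.
struct BodyWrench {
    Vec3 force{0.0, 0.0, 0.0};
    Vec3 torque{0.0, 0.0, 0.0};
};

// Sum all hit forces into one wrench per body.
// body_pos[i] is the reference position of body i; its size fixes the body count.
inline std::vector<BodyWrench> ReducePerBody(const std::vector<HitForce>& hits,
                                             const std::vector<Vec3>& body_pos) {
    std::vector<BodyWrench> out(body_pos.size());
    for (const HitForce& h : hits) {
        assert(h.body_index >= 0 &&
               static_cast<std::size_t>(h.body_index) < body_pos.size());
        BodyWrench& w = out[static_cast<std::size_t>(h.body_index)];
        const Vec3& c = body_pos[static_cast<std::size_t>(h.body_index)];

        // Lever arm from the body position to the application point.
        const Vec3 r{h.point.x - c.x, h.point.y - c.y, h.point.z - c.z};

        w.force.x += h.force.x;
        w.force.y += h.force.y;
        w.force.z += h.force.z;

        // torque += r x f
        w.torque.x += r.y * h.force.z - r.z * h.force.y;
        w.torque.y += r.z * h.force.x - r.x * h.force.z;
        w.torque.z += r.x * h.force.y - r.y * h.force.x;
    }
    return out;
}
```

With only a handful of bodies and tens of thousands of hits, the inner loop touches a few contiguous accumulators, which is the usual motivation for a dense layout over a map keyed by body pointer.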
c243acb to da729b6
Superseded by combined SCM HIP PR #755 (maintainer requested single PR).
### Summary

Follow-up to the optional SCM HIP backend. Improves end-to-end Chrono performance:

- Pass ray-cast `hits` map directly to `ComputeContactForcesGpu` (no intermediate vector copy)
- Call `scm_gpu::PrimeBuffers()` from all `SCMTerrain::Initialize` overloads
- Dense per-body force accumulation in `SCMTerrainGpu.cpp`

### Depends on

- `amd-pratmish:perf/scm-vehicle-hip-e2e` based on `feat/scm-vehicle-hip-gfx942`

### Files in this PR

- `src/chrono_vehicle/terrain/SCMTerrain.h`
- `src/chrono_vehicle/terrain/SCMTerrain.cpp`
- `src/chrono_vehicle/terrain/SCMTerrainGpu.cpp`

Rebuild external `scm_gpu` from matching `amd-chronos` ref (see `SCM_GPU_EXTERNAL.md` in amd-chronos ship kit).

### Test plan

- `scm_parity_test --n-hits 65536` PASS
- `CHRONO_SCM_GPU=1` wheel `--load`: contact timer improved vs v1
- `CHRONO_SCM_GPU_PROFILE=1`: steady-state pack/gpu/scatter logged
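The 65536-hit parity run sits well above the v1 default threshold of 8192 hits, so it exercises the GPU path rather than the CPU fallback. A minimal sketch of such a gate is shown below. It assumes that `CHRONO_SCM_GPU` is enabled by the exact value `1` and that `CHRONO_SCM_GPU_MIN_HITS` is read from the environment; the PR text does not say how either is parsed, and the helper names `GpuRequested`, `MinHitsFromEnv`, and `UseGpuPath` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Default taken from the v1 description: fall back to CPU below 8192 hits.
constexpr std::size_t kDefaultMinHits = 8192;

// Assumption: only the exact string "1" opts in to the GPU path.
inline bool GpuRequested() {
    const char* v = std::getenv("CHRONO_SCM_GPU");
    return v != nullptr && std::strcmp(v, "1") == 0;
}

// Assumption: the threshold is an environment override; unset, empty, or
// non-numeric values fall back to the default.
inline std::size_t MinHitsFromEnv() {
    const char* v = std::getenv("CHRONO_SCM_GPU_MIN_HITS");
    if (v == nullptr || *v == '\0') return kDefaultMinHits;
    char* end = nullptr;
    const unsigned long long n = std::strtoull(v, &end, 10);
    return (*end == '\0') ? static_cast<std::size_t>(n) : kDefaultMinHits;
}

// Pure decision: GPU only when requested and the batch is large enough.
// Mirrors the stated rule "fallback to CPU when hits.size() < MIN_HITS".
inline bool UseGpuPath(bool gpu_requested, std::size_t num_hits, std::size_t min_hits) {
    return gpu_requested && num_hits >= min_hits;
}
```

A caller would evaluate `UseGpuPath(GpuRequested(), hits.size(), MinHitsFromEnv())` once per step; keeping the decision in a pure function makes the boundary case (exactly 8192 hits takes the GPU path) easy to check in isolation.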