[arcane, accelerator] Use `malloc` for allocating nvidia GPU memory #1576

cedricchevalier19 · 2024-08-02T10:27:29Z

This PR lets us exploit ATS (Address Translation Services) on NVIDIA Grace-Hopper or HMM-enabled machines.

It adds a new `ARCANE_CUDA_ALLOC_ATS` CMake option, which is disabled by default.

We kept all the prefetch machinery, but most of it should be removed since there is no longer any data "migration".
We do not use `cudaMemAdvise_v2` to choose memory placement and let the system decide instead. If page migration is not enabled, this can lead to performance penalties because the data location is chosen through a first-touch policy.
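For readers unfamiliar with the approach, here is a minimal sketch of what allocating GPU-visible memory through the system allocator can look like. It is not the code from this PR: the names `atsAllocate`, `atsFree`, and `kAtsAlignment` are hypothetical, and the 128-byte alignment is taken from the title of a later commit in this PR. The sketch assumes a Linux toolchain with C++17, where `std::aligned_alloc` is available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Alignment borrowed from the commit title "Enforce ATS alignment of 128 bytes".
constexpr std::size_t kAtsAlignment = 128;

// Hypothetical helper (not the PR's API): returns ordinary system memory
// aligned on 128 bytes. On an ATS or HMM-enabled machine such a pointer can
// be dereferenced from device code; elsewhere it is plain host memory.
inline void* atsAllocate(std::size_t nb_bytes)
{
  if (nb_bytes == 0)
    return nullptr;
  // std::aligned_alloc expects the size to be a multiple of the alignment,
  // so round the request up to the next multiple of 128.
  const std::size_t rounded =
    ((nb_bytes + kAtsAlignment - 1) / kAtsAlignment) * kAtsAlignment;
  void* p = std::aligned_alloc(kAtsAlignment, rounded);
  // A successful allocation must honour the requested alignment.
  assert(p == nullptr || reinterpret_cast<std::uintptr_t>(p) % kAtsAlignment == 0);
  return p;
}

// Memory obtained from std::aligned_alloc is released with std::free.
inline void atsFree(void* p)
{
  std::free(p);
}
```

Because no placement hint is given, the physical location of each page is decided when it is first touched, which is the first-touch behaviour described above.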

codecov · 2024-08-02T11:21:19Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.66%. Comparing base (a9a09e3) to head (0fa2411).
Report is 5 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1576      +/-   ##
==========================================
- Coverage   69.66%   69.66%   -0.01%     
==========================================
  Files        2247     2247              
  Lines      160512   160512              
  Branches    18493    18493              
==========================================
- Hits       111825   111817       -8     
- Misses      42021    42027       +6     
- Partials     6666     6668       +2


grospelliergilles · 2024-08-20T08:11:26Z

Thanks for the PR.

I think it is better to use an environment variable to activate this allocator instead of a compilation flag. This will allow us to change the allocator dynamically without recompiling (see class `CommonUnifiedMemoryAllocatorWrapper` in this file as an example).
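As an illustration of this suggestion, the following sketch selects the allocation mode at run time from an environment variable. Everything in it is an assumption made for the example: the variable name `EXAMPLE_USE_ATS_ALLOCATOR`, the `UnifiedMemoryMode` enum, and both functions are hypothetical, and it does not reproduce `CommonUnifiedMemoryAllocatorWrapper` or the variable actually used in Arcane.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical modes: managed memory (the existing behaviour) or ATS.
enum class UnifiedMemoryMode { Managed, Ats };

// Hypothetical variable name, chosen only for this example.
constexpr const char* kAtsEnvName = "EXAMPLE_USE_ATS_ALLOCATOR";

// Maps the raw variable value to a mode. Only the exact value "1" enables
// ATS, so an unset or unexpected value keeps the default.
inline UnifiedMemoryMode modeFromValue(const char* value)
{
  if (value != nullptr && std::string(value) == "1")
    return UnifiedMemoryMode::Ats;
  return UnifiedMemoryMode::Managed;
}

// Reads the environment once; std::getenv returns nullptr when unset.
inline UnifiedMemoryMode selectModeFromEnvironment()
{
  return modeFromValue(std::getenv(kAtsEnvName));
}
```

With a scheme like this, running the same binary with `EXAMPLE_USE_ATS_ALLOCATOR=1` would switch allocators without rebuilding, which is the benefit described in the comment.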


cedricchevalier19 force-pushed the nvidia-ats branch from 351673a to 30fe40d Compare August 2, 2024 10:36

cedricchevalier19 and others added 3 commits October 2, 2024 20:40

[arcane, accelerator] Add ATS mode for Grace-Hopper superchip

8cfe298

[arcane, accelerator] Enforce ATS alignment of 128 bytes

2776edb

[arcane,cuda] Use an environment variable to select usage of ATS inst…

0fa2411

…ead of managed or device memory.

grospelliergilles force-pushed the nvidia-ats branch from 30fe40d to 0fa2411 Compare October 2, 2024 19:03

grospelliergilles merged commit 7cd6095 into arcaneframework:main Oct 2, 2024
27 checks passed
