Closed
Description
Describe the issue
CreateAndRegisterAllocatorV2 with "CudaPinned" fails if a "Cpu" allocator has already been registered in the environment.
This happens because "mem_info" is not passed into the AllocatorCreationInfo constructor:
    AllocatorCreationInfo alloc_creation_info{[](int) { return std::make_unique<CPUAllocator>(); },
The AllocatorCreationInfo constructor then falls back to "Cpu" memory info.
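The fallback described above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: MemInfo, Allocator, Env and MakeFactoryDroppingMemInfo are hypothetical stand-ins invented for illustration, not ONNX Runtime types, and the rule "reject a second allocator with the same memory-info name" is an assumed simplification of the environment's duplicate check. For brevity the "Cpu" default lives in the stand-in allocator's constructor.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for a memory-info descriptor (not OrtMemoryInfo).
struct MemInfo {
  std::string name;
};

// Hypothetical stand-in allocator: defaults to "Cpu" when no info is given.
class Allocator {
 public:
  explicit Allocator(MemInfo info = MemInfo{"Cpu"}) : info_(std::move(info)) {}
  const MemInfo& Info() const { return info_; }

 private:
  MemInfo info_;
};

using AllocatorFactory = std::function<std::unique_ptr<Allocator>(int)>;

// Hypothetical environment: assumed to reject a second allocator whose
// memory-info name is already registered.
class Env {
 public:
  bool Register(const AllocatorFactory& factory) {
    std::unique_ptr<Allocator> alloc = factory(0);
    return names_.insert(alloc->Info().name).second;
  }

 private:
  std::set<std::string> names_;
};

// Mirrors the reported pattern: the requested memory info never reaches
// the factory, so every allocator it creates reports "Cpu".
AllocatorFactory MakeFactoryDroppingMemInfo(const MemInfo& /*requested*/) {
  return [](int) { return std::make_unique<Allocator>(); };
}

// Returns true when the first registration succeeds and the second one
// ("CudaPinned" requested, "Cpu" produced) is rejected as a duplicate.
bool ReproducesConflict() {
  Env env;
  const bool first = env.Register(MakeFactoryDroppingMemInfo(MemInfo{"Cpu"}));
  const bool second = env.Register(MakeFactoryDroppingMemInfo(MemInfo{"CudaPinned"}));
  return first && !second;
}
```

In this model, ReproducesConflict() returns true: the second registration asks for "CudaPinned" but produces an allocator that identifies itself as "Cpu", so it collides with the first.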
To reproduce
Run the two functions one after another:
void RegisterCpuEnvAllocator(OrtApi& api, OrtEnv* env)
{
    nvtxRangePush("RegisterCpuEnvAllocator");
    OrtMemoryInfo* cpuMemoryInfo;
    ASSERT_ORT_STATUS(api.CreateMemoryInfo("Cpu", OrtDeviceAllocator, 0, OrtMemTypeDefault, &cpuMemoryInfo));
    OrtArenaCfg* cpuArenaConfig;
    ASSERT_ORT_STATUS(api.CreateArenaCfg(0, ArenaExtendStrategy::kNextPowerOfTwo, -1, -1, &cpuArenaConfig));
    // This creates an ORT-internal allocator instance and registers it in the environment for sharing
    vector<const char*> keys, values;
    ASSERT_ORT_STATUS(api.CreateAndRegisterAllocatorV2(env, "CPUExecutionProvider", cpuMemoryInfo, cpuArenaConfig, keys.data(), values.data(), 0));
    api.ReleaseArenaCfg(cpuArenaConfig);
    api.ReleaseMemoryInfo(cpuMemoryInfo);
    nvtxRangePop();
}

void RegisterCudaPinnedEnvAllocator(OrtApi& api, OrtEnv* env)
{
    nvtxRangePush("RegisterCudaPinnedEnvAllocator");
    OrtMemoryInfo* cudaPinnedMemoryInfo;
    ASSERT_ORT_STATUS(api.CreateMemoryInfo("CudaPinned", OrtDeviceAllocator, 0, OrtMemTypeCPUOutput, &cudaPinnedMemoryInfo));
    OrtArenaCfg* cudaPinnedArenaConfig;
    ASSERT_ORT_STATUS(api.CreateArenaCfg(0, ArenaExtendStrategy::kNextPowerOfTwo, -1, -1, &cudaPinnedArenaConfig));
    // This creates an ORT-internal allocator instance and registers it in the environment for sharing
    vector<const char*> keys, values;
    ASSERT_ORT_STATUS(api.CreateAndRegisterAllocatorV2(env, "CPUExecutionProvider", cudaPinnedMemoryInfo, cudaPinnedArenaConfig, keys.data(), values.data(), 0));
    api.ReleaseArenaCfg(cudaPinnedArenaConfig);
    api.ReleaseMemoryInfo(cudaPinnedMemoryInfo);
    nvtxRangePop();
}
The issue occurs only when both the Cpu and CudaPinned allocators are registered with the OrtDeviceAllocator allocator type. The other allocator type (OrtArenaAllocator) is not affected, because that code path captures mem_info and passes it to the CPUAllocator constructor:
    [mem_info](int) { return std::make_unique<CPUAllocator>(mem_info); },
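The difference between the two lambda forms can be shown side by side. The following is a minimal sketch, not ONNX Runtime code: MemInfo, Allocator, WithoutCapture, WithCapture and NameProducedBy are hypothetical names introduced for illustration, and the stand-in allocator is assumed to default to "Cpu" when constructed without arguments.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a memory-info descriptor (not OrtMemoryInfo).
struct MemInfo {
  std::string name;
};

// Hypothetical stand-in allocator with a "Cpu" default, used only to
// make the two factory forms observable.
struct Allocator {
  Allocator() : info{"Cpu"} {}
  explicit Allocator(MemInfo mem_info) : info(std::move(mem_info)) {}
  MemInfo info;
};

using AllocatorFactory = std::function<std::unique_ptr<Allocator>(int)>;

// Form with an empty capture list: the requested memory info is never
// used, so the allocator falls back to its default.
AllocatorFactory WithoutCapture(const MemInfo& /*mem_info*/) {
  return [](int) { return std::make_unique<Allocator>(); };
}

// Form that captures mem_info by value and forwards it to the allocator.
AllocatorFactory WithCapture(const MemInfo& mem_info) {
  return [mem_info](int) { return std::make_unique<Allocator>(mem_info); };
}

// Helper: builds one allocator and reports the name it ended up with.
std::string NameProducedBy(const AllocatorFactory& factory) {
  return factory(0)->info.name;
}
```

With MemInfo{"CudaPinned"} as the request, NameProducedBy(WithoutCapture(...)) yields "Cpu" while NameProducedBy(WithCapture(...)) yields "CudaPinned", which is why only the non-capturing path can collide with an existing "Cpu" registration in this model.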
Urgency
Not urgent
Platform
Windows
OS Version
11
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.22
ONNX Runtime API
C++
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 11.8