
CreateAndRegisterAllocatorV2 fails to create CudaPinned allocator when shared env. allocators are activated #25212

@AndreyOrb

Describe the issue

CreateAndRegisterAllocatorV2 with "CudaPinned" fails if a "Cpu" allocator is already registered in the environment.

This happens because "mem_info" is not forwarded by the allocator factory that is passed into the AllocatorCreationInfo constructor:

AllocatorCreationInfo alloc_creation_info{[](int) { return std::make_unique<CPUAllocator>(); },

CPUAllocator is default-constructed there, so the resulting allocator falls back to "Cpu" memory info regardless of what was requested.
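
To make the fallback concrete, here is a small self-contained model of it. It is a sketch, not ONNX Runtime code: ModelMemInfo, ModelCpuAllocator, MakeIgnoringMemInfo and MakeForwardingMemInfo are hypothetical names, and the only assumption taken from the snippets in this report is that an allocator constructed without arguments reports "Cpu".

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for OrtMemoryInfo; only the name matters for this sketch.
struct ModelMemInfo {
  std::string name;
};

// Hypothetical stand-in for CPUAllocator. The default constructor assumes
// "Cpu" memory info, which models the fallback described in this report.
class ModelCpuAllocator {
 public:
  ModelCpuAllocator() : info_{"Cpu"} {}
  explicit ModelCpuAllocator(ModelMemInfo info) : info_(std::move(info)) {}

  const std::string& Name() const { return info_.name; }

 private:
  ModelMemInfo info_;
};

// Factory shaped like the failing path: the requested memory info is ignored.
std::unique_ptr<ModelCpuAllocator> MakeIgnoringMemInfo(const ModelMemInfo& /*requested*/) {
  return std::make_unique<ModelCpuAllocator>();
}

// Factory shaped like the working path: the requested memory info is forwarded.
std::unique_ptr<ModelCpuAllocator> MakeForwardingMemInfo(const ModelMemInfo& requested) {
  return std::make_unique<ModelCpuAllocator>(requested);
}
```

In this model, asking MakeIgnoringMemInfo for "CudaPinned" still yields an allocator named "Cpu", while MakeForwardingMemInfo keeps the requested name.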

To reproduce

Run the two functions one after another:

void RegisterCpuEnvAllocator(OrtApi& api, OrtEnv* env)
{
	nvtxRangePush("RegisterCpuEnvAllocator");

	OrtMemoryInfo* cpuMemoryInfo;
	ASSERT_ORT_STATUS(api.CreateMemoryInfo("Cpu", OrtDeviceAllocator, 0, OrtMemTypeDefault, &cpuMemoryInfo));

	OrtArenaCfg* cpuArenaConfig;
	ASSERT_ORT_STATUS(api.CreateArenaCfg(0, ArenaExtendStrategy::kNextPowerOfTwo, -1, -1, &cpuArenaConfig));

	// This creates an ORT-internal allocator instance and registers it in the environment for sharing
	vector<const char*> keys, values;
	ASSERT_ORT_STATUS(api.CreateAndRegisterAllocatorV2(env, "CPUExecutionProvider", cpuMemoryInfo, cpuArenaConfig, keys.data(), values.data(), 0));

	api.ReleaseArenaCfg(cpuArenaConfig);
	api.ReleaseMemoryInfo(cpuMemoryInfo);
	nvtxRangePop();
}
void RegisterCudaPinnedEnvAllocator(OrtApi& api, OrtEnv* env)
{
	nvtxRangePush("RegisterCudaPinnedEnvAllocator");

	OrtMemoryInfo* cudaPinnedMemoryInfo;
	ASSERT_ORT_STATUS(api.CreateMemoryInfo("CudaPinned", OrtDeviceAllocator, 0, OrtMemTypeCPUOutput, &cudaPinnedMemoryInfo));

	OrtArenaCfg* cudaPinnedArenaConfig;
	ASSERT_ORT_STATUS(api.CreateArenaCfg(0, ArenaExtendStrategy::kNextPowerOfTwo, -1, -1, &cudaPinnedArenaConfig));

	// This creates an ORT-internal allocator instance and registers it in the environment for sharing
	vector<const char*> keys, values;
	ASSERT_ORT_STATUS(api.CreateAndRegisterAllocatorV2(env, "CPUExecutionProvider", cudaPinnedMemoryInfo, cudaPinnedArenaConfig, keys.data(), values.data(), 0));

	api.ReleaseArenaCfg(cudaPinnedArenaConfig);
	api.ReleaseMemoryInfo(cudaPinnedMemoryInfo);
	nvtxRangePop();
}

The issue happens only if both the Cpu and CudaPinned allocators are registered with the OrtDeviceAllocator flag, because the other flag (OrtArenaAllocator) takes a code path that forwards mem_info to the allocator:

[mem_info](int) { return std::make_unique<CPUAllocator>(mem_info); },
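
The difference between the two lambda styles can be sketched with a toy registry. This is a hedged illustration only: ModelEnv, DeviceStyleFactory and ArenaStyleFactory are hypothetical names, and the registry rule (reject a second allocator whose memory-info name is already registered) is an assumption about the failure mode, not ONNX Runtime's actual implementation.

```cpp
#include <functional>
#include <set>
#include <string>

// Hypothetical model of an environment that keeps at most one shared
// allocator per memory-info name.
class ModelEnv {
 public:
  // The factory returns the memory-info name of the allocator it would create.
  // Returns false when that name is already registered.
  bool Register(const std::function<std::string(int)>& factory) {
    return names_.insert(factory(0)).second;
  }

 private:
  std::set<std::string> names_;
};

// Shaped like the OrtDeviceAllocator path: nothing is captured, so the
// result is always "Cpu" no matter which name was requested.
std::function<std::string(int)> DeviceStyleFactory(const std::string& /*mem_info_name*/) {
  return [](int) { return std::string("Cpu"); };
}

// Shaped like the OrtArenaAllocator path: the requested name is captured
// by value and carried into the created allocator.
std::function<std::string(int)> ArenaStyleFactory(const std::string& mem_info_name) {
  return [mem_info_name](int) { return mem_info_name; };
}
```

With the device-style factory, registering "Cpu" and then "CudaPinned" in the same ModelEnv makes the second call fail, because both resolve to "Cpu". With the arena-style factory, both registrations succeed.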

Urgency

Not urgent

Platform

Windows

OS Version

11

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.22

ONNX Runtime API

C++

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 11.8
