Skip to content

ORT 1.25.2 Cherry Picks, Round 2#28426

Merged
adrastogi merged 1 commit into
rel-1.25.2from
adrastogi/1.25.2-cherry-pick-2
May 10, 2026
Merged

ORT 1.25.2 Cherry Picks, Round 2#28426
adrastogi merged 1 commit into
rel-1.25.2from
adrastogi/1.25.2-cherry-pick-2

Conversation

@adrastogi
Copy link
Copy Markdown
Contributor

This cherry-picks the following commits for the release:

#28396 Fix loader-refcount leak in ProviderLibrary::Load on GetProvider miss

…#28396)

### Description

Fixes a loader-refcount leak in `ProviderLibrary::Load()` that affects
any caller of `OrtApi::RegisterExecutionProviderLibrary` whose target
DLL does not export `GetProvider` (i.e. plugin EPs that are not legacy
provider-bridge libraries).

In `onnxruntime/core/session/utils.cc`,
[`LoadPluginOrProviderBridge`](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/session/utils.cc#L852)
probes whether the EP DLL is a "provider bridge" by constructing a
temporary `ProviderLibrary` and calling `Load()`:

```cpp
auto provider_library = std::make_unique<ProviderLibrary>(...);
bool is_provider_bridge = provider_library->Load() == Status::OK();   // (+1) loader ref
...
auto ep_library_plugin = std::make_unique<EpLibraryPlugin>(...);
ORT_RETURN_IF_ERROR(ep_library_plugin->Load());                        // (+1) loader ref
```

`ProviderLibrary::Load()` calls `LoadDynamicLibrary` (handle_=1), then
`GetSymbolFromLibrary("GetProvider", ...)`. For a plugin EP DLL that
does not export `GetProvider`, the lookup fails and
`ORT_RETURN_IF_ERROR` returns the `Status` early — without calling
`Unload()`. The surrounding `try/catch` only catches `std::exception`,
so the early return is not covered. `~ProviderLibrary()` is empty
(intentionally, to avoid static-destruction-order issues with the global
`s_library_*` instances), so the temporary's `handle_` is silently
leaked.

`EpLibraryPlugin::Load()` then maps the same DLL again (refcount → 2).
Later, `UnregisterExecutionProviderLibrary` releases only the reference
owned by `EpLibraryPlugin`, leaving the EP DLL mapped (refcount = 1)
after the caller explicitly unregistered it. Static / thread-local
destructors that need the EP DLL outlive `Unregister` and run at process
exit in an inconsistent order — on Windows this often manifests as a
`STATUS_STACK_BUFFER_OVERRUN` (`0xC0000409`) fast-fail during teardown.

### Fix

In `ProviderLibrary::Load()`, when `GetSymbolFromLibrary("GetProvider",
...)` fails, call `Unload()` before returning the error:

```cpp
auto get_provider_status = Env::Default().GetSymbolFromLibrary(handle_, "GetProvider", (void**)&PGetProvider);
if (!get_provider_status.IsOK()) {
  Unload();
  return get_provider_status;
}
provider_ = PGetProvider();
```

This keeps the existing `~ProviderLibrary()` contract (callers are still
expected to call `Unload()` themselves on success paths), but plugs the
early-return-after-`LoadDynamicLibrary` leak.

### Verification

Standalone repro that does only `LoadLibraryExW("onnxruntime.dll")` →
`RegisterExecutionProviderLibrary("VitisAIExecutionProvider",
L"onnxruntime_vitisai_ep.dll")` → `UnregisterExecutionProviderLibrary` →
`ReleaseEnv`, then drains `FreeLibrary` on the EP DLL counting
iterations:

https://github.com/BoarQing/ep_double_load_repro

| build | output |
|---|---|
| `main` (unpatched) | `ep dll leaked refcount after
Unregister+ReleaseEnv = 1` |
| `main` + this PR | `ep dll leaked refcount after Unregister+ReleaseEnv
= 0` |

Tracking issue: #28395

Co-authored-by: BoarQing <boarqing@users.noreply.github.com>
@adrastogi adrastogi merged commit 94bc0cd into rel-1.25.2 May 10, 2026
78 of 90 checks passed
@adrastogi adrastogi deleted the adrastogi/1.25.2-cherry-pick-2 branch May 10, 2026 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants