Multi-hosted endpoint lifecycle, keyed DI isolation, and slot-scoped logging #7633

Merged: danielmarbach merged 52 commits into master from spike/multihost on Mar 6, 2026

@danielmarbach (Contributor) commented Feb 24, 2026

This PR consolidates and hardens the hosting work needed to run multiple NServiceBus endpoints in the same .NET host while preserving endpoint isolation for dependency resolution, message sessions, and logging context.

The branch replaces ad-hoc startup orchestration with an explicit endpoint lifecycle model (Create -> Start -> Stop -> Dispose), introduces keyed endpoint registration via AddNServiceBusEndpoint, and reworks logging to use endpoint-scoped slots that safely bridge into Microsoft.Extensions.Logging scopes.

In addition to feature work, the PR includes concurrency and correctness fixes for startup/shutdown races, logging cache visibility on weakly-ordered hardware, and endpoint stop semantics.

Problem Statement

When multiple endpoints are hosted in one process, the runtime needs to guarantee:

  • Distinct endpoint identity and registration safety
  • Correct IMessageSession binding per endpoint instance
  • Isolation of keyed services (including nested/keyed dependencies)
  • Deterministic startup/shutdown behavior under concurrent calls
  • Logging context that stays endpoint-specific across async flows and receives

The previous structure made these concerns harder to reason about: creation, start, and stop responsibilities were spread across multiple code paths, and logging context attachment was not fully integrated with keyed multi-host scenarios.

Architecture

1) Endpoint lifecycle model

Lifecycle handling is now centralized behind IEndpointLifecycle and used by EndpointHostedService.

  • Create(...) prepares a StartableEndpoint
  • Start(...) starts it and returns IEndpointInstance
  • Stop(...) gracefully shuts down if started
  • DisposeAsync(...) is idempotent and guarantees cleanup

Core flow:

  1. EndpointStartupRunner.Create() performs one-time creation (semaphore + double-check)
  2. EndpointPreparation.Prepare() resolves logging factory for the endpoint slot, runs installers, runs setup
  3. StartableEndpoint.Start() is guarded by a semaphore and cached instance
  4. RunningEndpointInstance.Stop() handles concurrent stop calls and slot unregistration

This removes duplication between internal/external container paths and makes lifecycle state transitions explicit.
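
As an illustration of the "semaphore + double-check" one-time creation mentioned in step 1, here is a minimal sketch using only the BCL. OneTimeAsyncInitializer and GetOrCreate are hypothetical names chosen for this example; they are not types from this PR, and the actual EndpointStartupRunner may be structured differently.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the PR): runs an async factory at most once,
// even when GetOrCreate is called concurrently.
public sealed class OneTimeAsyncInitializer<T> where T : class
{
    readonly Func<CancellationToken, Task<T>> factory;
    readonly SemaphoreSlim gate = new(1, 1);
    volatile T? value;

    public OneTimeAsyncInitializer(Func<CancellationToken, Task<T>> factory) => this.factory = factory;

    public async Task<T> GetOrCreate(CancellationToken cancellationToken = default)
    {
        // First check: lock-free fast path once the value has been published.
        var existing = value;
        if (existing is not null)
        {
            return existing;
        }

        await gate.WaitAsync(cancellationToken).ConfigureAwait(false);
        try
        {
            // Second check: another caller may have created the value while we waited.
            existing = value;
            if (existing is null)
            {
                existing = await factory(cancellationToken).ConfigureAwait(false);
                value = existing; // volatile write publishes the fully created instance
            }

            return existing;
        }
        finally
        {
            gate.Release();
        }
    }
}
```

Concurrent callers all observe the same instance, and the factory runs once. If the factory throws, nothing is cached, so a later call retries; whether the PR's runner behaves that way is not stated here.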

2) Multi-host endpoint registration

ServiceCollectionExtensions.AddNServiceBusEndpoint(...) is introduced as the primary registration surface for host-based scenarios.

It validates and enforces:

  • Unique endpoint identifiers when multiple endpoints are registered
  • Assembly scanning disabled for all multi-hosted endpoints (with clear error)
  • No transport instance reuse across endpoint registrations

Registration behavior:

  • Single endpoint (no identifier): standard singleton registration
  • Multi-endpoint (identifier provided): keyed registrations of host/lifecycle + keyed hosted service wiring

This gives a consistent host integration model while preventing common misconfigurations early.

3) Keyed service isolation layer

The keyed hosting path now uses:

  • KeyedServiceCollectionAdapter
  • KeyedServiceProviderAdapter
  • KeyedServiceScopeFactory
  • public KeyedServiceKey

Notable characteristics:

  • Composite keys (baseKey, optional serviceKey) to isolate per-endpoint services
  • Support for keyed and non-keyed service descriptors
  • IEnumerable<T> behavior aligned with keyed resolution including an explicit Any key path
  • Shared lock state per underlying service collection to avoid adapter-level deadlock patterns

The result is predictable service resolution for each hosted endpoint, including nested factories and scoped dependencies.
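
To make the composite-key idea concrete, the following sketch shows how a (baseKey, optional serviceKey) pair can keep two endpoints' registrations apart even when they use the same inner service key. EndpointScopedKey is a hypothetical stand-in written for this example; the PR's KeyedServiceKey may differ in shape and semantics.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a composite key (not the PR's KeyedServiceKey).
// Record equality compares both components, so identical service keys under
// different base keys never collide.
public sealed record EndpointScopedKey(object BaseKey, object? ServiceKey = null);

public static class EndpointScopedKeyDemo
{
    public static (string Sales, string Billing) Resolve()
    {
        // A dictionary stands in for the container's registration table.
        var registrations = new Dictionary<EndpointScopedKey, string>
        {
            [new EndpointScopedKey("Sales", "primary")] = "sales-primary-store",
            [new EndpointScopedKey("Billing", "primary")] = "billing-primary-store",
        };

        // Lookups use freshly constructed keys; value equality finds the right entry.
        return (
            registrations[new EndpointScopedKey("Sales", "primary")],
            registrations[new EndpointScopedKey("Billing", "primary")]);
    }
}
```

Both endpoints register a service under the inner key "primary", yet each lookup returns only the entry belonging to its own base key.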

4) Slot-based logging and MEL scope bridge

Logging was refactored around endpoint slots to preserve endpoint identity across runtime operations.

  • LogManager now tracks slot contexts and per-slot factories
  • Deferred slot logs are buffered when slot factory is pending and flushed later
  • Slot factories can be marked unavailable, in which case deferred entries are flushed to the default logger
  • Slot state is fully cleaned up on endpoint shutdown (UnregisterSlot)
  • MicrosoftLoggerFactoryAdapter implements slot scope bridging so MEL receives structured scope values

ReceiveComponent now wraps message receivers (LogWrappedMessageReceiver) so message and error callbacks execute within the correct slot scope, including satellite and instance-specific receiver contexts.
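
The deferred-buffer behavior described in the list above can be sketched roughly as follows. DeferredSlotLog, Write, and Attach are hypothetical names for this example, and plain strings stand in for log entries; the real LogManager tracks considerably more state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch (not the PR's LogManager): buffers entries while no sink
// is attached, then flushes them in order once a sink becomes available.
public sealed class DeferredSlotLog
{
    readonly object sync = new();
    readonly List<string> pending = new();
    Action<string>? sink;

    public void Write(string message)
    {
        Action<string> target;
        lock (sync)
        {
            if (sink is null)
            {
                // Slot factory is still pending: keep the entry for later.
                pending.Add(message);
                return;
            }

            target = sink;
        }

        target(message);
    }

    // Attaching the endpoint's own sink models "factory registered";
    // attaching a default sink models "slot marked unavailable".
    public void Attach(Action<string> newSink)
    {
        lock (sync)
        {
            // Flush under the lock so buffered entries precede any later writes.
            foreach (var message in pending)
            {
                newSink(message);
            }

            pending.Clear();
            sink = newSink;
        }
    }
}
```

Entries written before Attach are delivered first and in their original order; entries written afterwards go straight to the attached sink.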

Concurrency and Correctness Improvements

  • Volatile fields and semaphore-guarded initialization/start paths in lifecycle/startup components
  • RunningEndpointInstance.Stop() now handles Stopping state correctly and avoids duplicate stop pipelines
  • Endpoint messaging operations fail with a clear invalid-operation path once stop is triggered
  • MessageSession initialization gating remains async-safe and cancellation-token linking is centralized
  • LogManager cache publication/read ordering was tightened for correctness on weakly-ordered architectures (e.g., ARM)
  • Slot + logger cache is updated atomically via combined cached entry to avoid torn visibility between context/logger references
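
The last point, publishing context and logger together, can be illustrated with a small sketch. CachedEntry and SlotLoggerCache are hypothetical names for this example and strings stand in for the real context and logger objects; the PR's own cached-entry type may look different.

```csharp
using System.Threading;

// Hypothetical immutable pair (not the PR's type): slot and logger travel together.
public sealed class CachedEntry
{
    public CachedEntry(string slot, string logger)
    {
        Slot = slot;
        Logger = logger;
    }

    public string Slot { get; }
    public string Logger { get; }
}

public sealed class SlotLoggerCache
{
    CachedEntry? current;

    // One reference write publishes both values at once. Volatile.Write gives
    // release semantics, so the entry's fields are visible before the reference.
    public void Publish(string slot, string logger) =>
        Volatile.Write(ref current, new CachedEntry(slot, logger));

    public bool TryGet(string slot, out string? logger)
    {
        // One acquire read yields a consistent snapshot of slot and logger.
        var snapshot = Volatile.Read(ref current);
        if (snapshot is not null && snapshot.Slot == slot)
        {
            logger = snapshot.Logger;
            return true;
        }

        logger = null;
        return false;
    }
}
```

With two separate fields, a reader on a weakly-ordered CPU could pair a new slot with a stale logger. A single immutable entry behind one reference rules that combination out.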

Public API Impact

Additions:

  • ServiceCollectionExtensions.AddNServiceBusEndpoint(this IServiceCollection, EndpointConfiguration, object? endpointIdentifier = null)
  • KeyedServiceKey (public) with Any and AnyKey(...)

Behavioral impact:

  • Multi-host endpoint registration now enforces identifier/scanning/transport invariants
  • Host-based startup favors lifecycle orchestration through IHostedLifecycleService
  • The message session is now registered automatically, using the TryAdd pattern so that existing registrations are preserved and behavior stays backward compatible

No behavioral or interface changes were introduced for pipeline/message handling contracts.

Tests and Validation

The PR includes focused tests across hosting, logging, lifecycle, and DI boundaries, including:

  • Distinct IMessageSession resolution for multi-hosted endpoints
  • EndpointHostedService start/stop/dispose behavior
  • KeyedServiceProviderAdapter resolution semantics
  • Endpoint logging scope coverage (including slot flush/unregistration behavior)
  • LogWrappedMessageReceiver slot scoping in receive callbacks
  • Message session and running endpoint stop edge cases

Acceptance coverage and API approvals were updated accordingly.

Migration Notes

  • For host-based registration, use AddNServiceBusEndpoint(...).
  • When hosting more than one endpoint in the same service collection:
    • Provide a unique endpointIdentifier per endpoint
    • Disable assembly scanning for each endpoint and register handlers explicitly
    • Use distinct transport instances per endpoint

These constraints are enforced to keep endpoint isolation deterministic and prevent cross-endpoint bleed-through.

Design Decision: Endpoint Identifier and Keyed Services

During the review of this PR, we revisited the original design decisions around the endpointIdentifier used for keyed service registration. After re-evaluating the tradeoffs and validating the assumptions against real-world scenarios, we have decided to retain the current design without changes.

Decision

No changes are required. The endpointIdentifier remains of type object?, and its usage and semantics stay as originally designed.

Context

This PR introduces a hosting model that enables multiple NServiceBus endpoints to run within the same .NET host while preserving strict isolation of dependency resolution, message sessions, and logging.

Keyed services are fundamental to achieving this isolation when multiple endpoints coexist in a single process.

Rationale

1. Alignment with Microsoft Dependency Injection

The design intentionally mirrors the capabilities of Microsoft.Extensions.DependencyInjection, which supports keyed services using an object as the key. Maintaining this alignment ensures:

  • Consistency with the underlying platform
  • Predictable integration with .NET features
  • Maximum flexibility for advanced scenarios

Restricting the identifier to string would diverge from established DI patterns and unnecessarily limit extensibility.

2. Flexibility for Advanced Scenarios

Allowing an object as the key enables sophisticated use cases such as multi-tenancy and dynamic multi-hosting. For example, tenant context objects can be used directly as keys to resolve tenant-specific dependencies.

This capability integrates naturally with [FromKeyedServices], including inherited key resolution, ensuring the correct services are resolved automatically at runtime.
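
For readers less familiar with keyed services, here is a minimal sketch of object keys and [FromKeyedServices] using only Microsoft.Extensions.DependencyInjection (version 8 or later, referenced as a package). TenantKey, IGreeter, Greeter, and SalesConsumer are hypothetical types invented for this example; no NServiceBus types are involved, and inherited key resolution is not shown.

```csharp
using Microsoft.Extensions.DependencyInjection;

public sealed record TenantKey(string Name);      // hypothetical non-string key
public interface IGreeter { string Greet(); }     // hypothetical service

public sealed class Greeter(string owner) : IGreeter
{
    public string Greet() => $"hello from {owner}";
}

// The attribute selects the registration keyed with "Sales" at resolution time.
public sealed class SalesConsumer([FromKeyedServices("Sales")] IGreeter greeter)
{
    public string Run() => greeter.Greet();
}

public static class KeyedServicesDemo
{
    public static (string ViaAttribute, string ViaObjectKey) Run()
    {
        var services = new ServiceCollection();

        // Common case: a string key such as the endpoint name.
        services.AddKeyedSingleton<IGreeter>("Sales", (_, key) => new Greeter((string)key!));

        // Advanced case: any object can be a key; records give value equality.
        services.AddKeyedSingleton<IGreeter>(new TenantKey("contoso"),
            (_, key) => new Greeter(((TenantKey)key!).Name));

        services.AddTransient<SalesConsumer>();

        using var provider = services.BuildServiceProvider();
        var viaAttribute = provider.GetRequiredService<SalesConsumer>().Run();
        var viaObjectKey = provider.GetRequiredKeyedService<IGreeter>(new TenantKey("contoso")).Greet();
        return (viaAttribute, viaObjectKey);
    }
}
```

The second lookup builds a new TenantKey instance and still resolves the right registration because records compare by value. A key type with reference equality would need the same instance on both sides, which is the equality risk listed under the string-only alternative.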

3. Pit of Success for Typical Users

Although object is supported, the recommended and most common choice remains the endpoint name, which is a string. This provides a simple and predictable default while preserving extensibility for advanced users.

  • Common case: Use the endpoint name as the identifier.
  • Advanced scenarios: Use custom objects when needed.

This approach balances usability with flexibility without imposing unnecessary constraints.

4. Keyed Services as an Advanced Concept

Keyed services are intentionally introduced only when hosting multiple endpoints. This follows a progressive complexity model similar to multi-tenancy, where identifiers become necessary as architectural complexity increases.

  • Single endpoint: No keyed services required; everything works as before.
  • Multiple endpoints: Keyed services ensure isolation and correctness.

This design prevents exposing advanced concepts to users who do not need them.

5. Migration and Backward Compatibility

We revisited whether the endpoint name should always serve as the default identifier. While convenient, enforcing this universally would introduce drawbacks.

Specifically, it would require keyed services even in single-endpoint hosting scenarios, leading to:

  • More Difficult Migrations
    Migration from NServiceBus.Extensions.Hosting would become harder due to the need to adopt keyed dependency resolution.
  • Unnecessary Complexity
    Keyed services would be exposed to users who do not benefit from them.
  • Breaking Changes in Existing Applications
    Applications that inject IMessageSession or ITransactionalSession in generic hosting scenarios would break and require updates to use keyed resolutions.

To avoid these issues, keyed services are introduced only when multiple endpoints are hosted, preserving backward compatibility and ensuring a smooth migration path.

Considered Alternatives

Restricting the Identifier to string

Advantages:

  • Simpler equality semantics
  • Easier discoverability and documentation
  • Reduced risk of incorrectly implemented equality or hash codes

Disadvantages:

  • Diverges from Microsoft DI capabilities
  • Limits advanced scenarios such as multi-tenancy
  • Reduces flexibility for composing modular dependency groups

After careful consideration, these drawbacks were determined to outweigh the benefits.

Conclusion

The existing design provides the best balance between usability, flexibility, and platform alignment.

| Aspect | Decision |
| --- | --- |
| Identifier type | object? |
| Recommended identifier | Endpoint name (string) |
| Alignment with Microsoft DI | Preserved |
| Support for advanced scenarios | Enabled |
| Keyed services required for single endpoint | No |
| Keyed services required for multiple endpoints | Yes |
| Migration impact | Minimal and backward compatible |
| Breaking changes introduced | None |

Comment thread src/NServiceBus.Core/Hosting/NServiceBusHostedService.cs Outdated
@danielmarbach force-pushed the spike/multihost branch 4 times, most recently from 8ccd077 to 6087365 (March 2, 2026 11:10)
@danielmarbach added this to the 10.2.0 milestone Mar 2, 2026
@danielmarbach changed the title from "Multi Hosting Support" to "Multi-hosted endpoint lifecycle, keyed DI isolation, and slot-scoped logging" Mar 2, 2026
- var instancePump = CreateReceiver(consecutiveFailuresConfiguration, instanceSpecificPump);
+ var instanceProcessingLogSlot = CreateReceiverProcessingLogSlot(endpointLogSlot, InstanceSpecificReceiverId);
+ var instancePump = CreateReceiver(consecutiveFailuresConfiguration, instanceSpecificPump, instanceProcessingLogSlot);
  var instancePipelineExecutor = new MainPipelineExecutor(builder, pipelineCache, messageOperations, configuration.PipelineCompletedSubscribers, receivePipeline, activityFactory, pipelineMetrics, envelopeUnwrapper);
@danielmarbach (Contributor, Author) commented:

This is strictly speaking not necessary but felt cleaner.

@danielmarbach

The gap I can think of currently is that if someone uses the extension method outside a generic host with their custom service collection and provider (for example, WPF), then they have no way to start the endpoint.

This could be mitigated by providing a small abstraction like this:

/// <summary>
/// Represents a lifecycle abstraction for endpoints that use an externally managed container.
/// </summary>
public interface IExternallyManagedEndpointLifecycle : IAsyncDisposable
{
    /// <summary>
    /// Creates and initializes the endpoint using the provided service provider.
    /// </summary>
    /// <param name="builder">The <see cref="IServiceProvider"/> instance used to resolve dependencies.</param>
    /// <param name="cancellationToken">A <see cref="CancellationToken"/> to observe.</param>
    Task Create(IServiceProvider builder, CancellationToken cancellationToken = default);

    /// <summary>
    /// Starts the endpoint.
    /// </summary>
    /// <param name="cancellationToken">A <see cref="CancellationToken"/> to observe.</param>
    Task Start(CancellationToken cancellationToken = default);

    /// <summary>
    /// Stops the endpoint.
    /// </summary>
    /// <param name="cancellationToken">A <see cref="CancellationToken"/> to observe.</param>
    Task Stop(CancellationToken cancellationToken = default);
}

sealed class ExternallyManagedEndpointLifecycle(Func<IServiceProvider, IEndpointLifecycle> endpointLifecycleFactory) : IExternallyManagedEndpointLifecycle
{
    public async Task Create(IServiceProvider builder, CancellationToken cancellationToken = default)
    {
        ArgumentNullException.ThrowIfNull(builder);

        if (endpointLifecycle is null)
        {
            await createSemaphore.WaitAsync(cancellationToken).ConfigureAwait(false);
            try
            {
                endpointLifecycle ??= endpointLifecycleFactory(builder);
            }
            finally
            {
                createSemaphore.Release();
            }
        }

        await endpointLifecycle.Create(cancellationToken).ConfigureAwait(false);
    }

    public async Task Start(CancellationToken cancellationToken = default)
    {
        if (endpointLifecycle is null)
        {
            throw new InvalidOperationException("The endpoint must be created before it can be started.");
        }

        _ = await endpointLifecycle.CreateAndStart(cancellationToken).ConfigureAwait(false);
    }

    public async Task Stop(CancellationToken cancellationToken = default)
    {
        if (endpointLifecycle is null)
        {
            return;
        }

        await endpointLifecycle.Stop(cancellationToken).ConfigureAwait(false);
    }

    public async ValueTask DisposeAsync()
    {
        if (Interlocked.Exchange(ref isDisposed, 1) == 1)
        {
            return;
        }

        if (endpointLifecycle is not null)
        {
            await endpointLifecycle.DisposeAsync().ConfigureAwait(false);
        }

        createSemaphore.Dispose();
    }

    readonly SemaphoreSlim createSemaphore = new(1, 1);
    volatile IEndpointLifecycle? endpointLifecycle;
    int isDisposed;
}

This abstraction would be registered consistently in the service collection, for example:

services.AddSingleton<IExternallyManagedEndpointLifecycle>(_ => new ExternallyManagedEndpointLifecycle(provider => new BaseEndpointLifecycle(externallyManagedContainerHost, provider)));

or, when multi-hosting:

services.AddKeyedSingleton<IExternallyManagedEndpointLifecycle>(endpointIdentifier, (_, _) => new ExternallyManagedEndpointLifecycle(provider => new EndpointLifecycle(externallyManagedContainerHost, provider, endpointIdentifier, keyedServices)));

@danielmarbach

@bording and I discussed the above and concluded it is better to not introduce confusing abstractions given that people can pull in the generic host as a library and call the service collection extensions available in this PR.

Comment thread src/NServiceBus.Core.Tests/Pipeline/MainPipelineExecutorTests.cs Outdated
Comment thread src/NServiceBus.Core/Hosting/EndpointHostedService.cs
Comment thread src/NServiceBus.Core/Hosting/ServiceCollectionExtensions.cs
…tralize logic and simplify usage across multiple components
…nsistency in endpoint lifecycle management
…ispose handling

- Introduced unit tests for `EndpointHostedService` to validate stop and dispose behavior.
- Integrated `Stop` method into `IEndpointLifecycle` and updated implementations to support proper endpoint shutdown.
- Improved concurrency safety in `RunningEndpointInstance.Stop`.
…o prevent duplicates and ensure proper source filtering
…ollectionAdapter` validation during acceptance tests
…isses explicit flush during factory registration
…onAdapter` validation during endpoint creation
… hardware, consolidating volatile reads/writes and introducing `CachedSlot` for atomic context+logger updates.
…eters, keyed services support, and advanced scenarios
@danielmarbach marked this pull request as ready for review March 5, 2026 14:57
@danielmarbach merged commit 93c640f into master Mar 6, 2026
4 checks passed
@danielmarbach deleted the spike/multihost branch March 6, 2026 10:16
@danielmarbach restored the spike/multihost branch March 6, 2026 22:41
@danielmarbach deleted the spike/multihost branch March 6, 2026 22:55