Skip to content

Use orchestrator container delay start feature for better WithExplictStart + ContainerLifetime.Persistent support #9471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 3, 2025

Conversation

danegsta
Copy link
Member

Description

The orchestrator now supports explicitly marking a container resource as delayed start. This PR updates the explicit start behavior to create containers with delay start enabled (instead of delaying DCP resource creation until the user manually starts the container). This specifically allows us to better support persistent containers combined with explicit start.

If a container is marked as both persistent and delayed start, DCP will still check for an existing valid container and transition the resource to running if one is found, but won't create a new container if one isn't found.

Fixes #8073

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No
  • Does the change require an update in our Aspire docs?

@github-actions github-actions bot added the area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication label May 22, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enables the orchestrator’s delay-start feature by adding a Start flag to containers and updating snapshot building and execution logic to honor it, improving support for persistent containers with explicit starts.

  • Add Start property to ContainerSpec to mark a resource for delayed start.
  • Treat pending container state as NotStarted when Start is false in snapshots and status checks.
  • Update executor to set Spec.Start on explicit start and adjust creation flow for delayed-start containers.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/Aspire.Hosting/Dcp/ResourceSnapshotBuilder.cs Treat pending container states as NotStarted when Spec.Start is false
src/Aspire.Hosting/Dcp/Model/Container.cs Add nullable Start flag with XML docs to enable delayed-start marking
src/Aspire.Hosting/Dcp/DcpExecutor.cs Honor Spec.Start in status checks, set flag on explicit start, and adjust creation
Comments suppressed due to low confidence (2)

src/Aspire.Hosting/Dcp/ResourceSnapshotBuilder.cs:29

  • Consider adding unit tests for this new delay-start handling to verify that pending states are correctly mapped to NotStarted when Spec.Start is false.
if (container.Spec.Start == false && (state == null || state == ContainerState.Pending))

src/Aspire.Hosting/Dcp/DcpExecutor.cs:1234

  • After setting Spec.Start = false for delayed-start containers, we should restore the continue; to skip scheduling creation (the original logic prevented creating containers until an explicit start). Without it, CreateContainerAsyncCore will still run and may create the container prematurely.
container.Spec.Start = false;

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@@ -1223,8 +1229,10 @@ async Task CreateContainerAsyncCore(AppResource cr, CancellationToken cancellati

if (cr.ModelResource.TryGetLastAnnotation<ExplicitStartupAnnotation>(out _))
{
await _executorEvents.PublishAsync(new OnResourceChangedContext(cancellationToken, KnownResourceTypes.Container, cr.ModelResource, cr.DcpResourceName, new ResourceStatus(KnownResourceStates.NotStarted, null, null), s => s with { State = new ResourceStateSnapshot(KnownResourceStates.NotStarted, null) })).ConfigureAwait(false);
Copy link
Member

@davidfowl davidfowl May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels scary, what else does firing this event do that is being missed now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This event was a virtual status update; the app model was skipping creating the DCP resource and just reporting it as "NotStarted". Now that DCP has support for a resource that should be explicitly started, the resource still ends up with a "NotStarted" status event, but it's in response to the actual DCP resource status (State=pending + Start=false).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the user sets environment variables in before start. Does that fire still?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProcessEnvironmentVariableValuesAsync gets called and the environment gets generated.

@danegsta danegsta closed this May 23, 2025
@danegsta danegsta reopened this May 23, 2025
@danegsta danegsta closed this May 23, 2025
@danegsta danegsta reopened this May 23, 2025
@davidfowl
Copy link
Member

@JamesNK can you review this?

@JamesNK
Copy link
Member

JamesNK commented May 29, 2025

There are no unit tests. Are changes already covered by existing unit tests?

Existing explicit start unit tests:

[Fact]
public async Task ExplicitStart_StartExecutable()
{
const string testName = "explicit-start-executable";
using var testProgram = CreateTestProgram(testName, randomizePorts: false);
SetupXUnitLogging(testProgram.AppBuilder.Services);
var notStartedResourceName = $"{testName}-servicea";
var dependentResourceName = $"{testName}-serviceb";
testProgram.ServiceABuilder.WithExplicitStart();
testProgram.ServiceBBuilder.WaitFor(testProgram.ServiceABuilder);
using var app = testProgram.Build();
var rns = app.Services.GetRequiredService<ResourceNotificationService>();
var orchestrator = app.Services.GetRequiredService<ApplicationOrchestrator>();
var logger = app.Services.GetRequiredService<ILogger<DistributedApplicationTests>>();
var startTask = app.StartAsync();
// On start, one resource won't be started and the other is waiting on it.
var notStartedResourceEvent = await rns.WaitForResourceAsync(notStartedResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.NotStarted).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
var dependentResourceEvent = await rns.WaitForResourceAsync(dependentResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.Waiting).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
// Source should be populated on non-started resources.
Assert.Contains("TestProject.ServiceA.csproj", notStartedResourceEvent.Snapshot.Properties.Single(p => p.Name == "project.path").Value?.ToString());
Assert.Contains("TestProject.ServiceB.csproj", dependentResourceEvent.Snapshot.Properties.Single(p => p.Name == "project.path").Value?.ToString());
Assert.Collection(notStartedResourceEvent.Snapshot.Urls, u =>
{
Assert.Equal("http://localhost:5156", u.Url);
Assert.Equal("http", u.Name);
Assert.True(u.IsInactive);
});
Assert.Collection(dependentResourceEvent.Snapshot.Urls, u =>
{
Assert.Equal("http://localhost:5254", u.Url);
Assert.Equal("http", u.Name);
Assert.True(u.IsInactive);
});
logger.LogInformation("Start explicit start resource.");
await orchestrator.StartResourceAsync(notStartedResourceEvent.ResourceId, CancellationToken.None).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
var runningResourceEvent = await rns.WaitForResourceAsync(notStartedResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.Running).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
Assert.Collection(runningResourceEvent.Snapshot.Urls, u =>
{
Assert.Equal("http://localhost:5156", u.Url);
Assert.Equal("http", u.Name);
});
// Dependent resource should now run.
var dependentResourceRunningEvent = await rns.WaitForResourceAsync(dependentResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.Running).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
Assert.Collection(dependentResourceRunningEvent.Snapshot.Urls, u =>
{
Assert.Equal("http://localhost:5254", u.Url);
Assert.Equal("http", u.Name);
});
logger.LogInformation("Stop resource.");
await orchestrator.StopResourceAsync(notStartedResourceEvent.ResourceId, CancellationToken.None).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
await rns.WaitForResourceAsync(notStartedResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.Finished).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
logger.LogInformation("Start resource again");
await orchestrator.StartResourceAsync(notStartedResourceEvent.ResourceId, CancellationToken.None).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
await rns.WaitForResourceAsync(notStartedResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.Running).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
await startTask.DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
await app.StopAsync().DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
}
[Fact]
[RequiresDocker]
public async Task ExplicitStart_StartContainer()
{
const string testName = "explicit-start-container";
using var testProgram = CreateTestProgram(testName, randomizePorts: false);
SetupXUnitLogging(testProgram.AppBuilder.Services);
var notStartedResourceName = $"{testName}-redis";
var dependentResourceName = $"{testName}-serviceb";
var containerBuilder = AddRedisContainer(testProgram.AppBuilder, notStartedResourceName)
.WithEndpoint(port: 6379, targetPort: 6379, name: "tcp", env: "REDIS_PORT")
.WithExplicitStart();
containerBuilder.WithExplicitStart();
testProgram.ServiceBBuilder.WaitFor(containerBuilder);
using var app = testProgram.Build();
var rns = app.Services.GetRequiredService<ResourceNotificationService>();
var orchestrator = app.Services.GetRequiredService<ApplicationOrchestrator>();
var logger = app.Services.GetRequiredService<ILogger<DistributedApplicationTests>>();
var startTask = app.StartAsync();
// On start, one resource won't be started and the other is waiting on it.
var notStartedResourceEvent = await rns.WaitForResourceAsync(notStartedResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.NotStarted).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
var dependentResourceEvent = await rns.WaitForResourceAsync(dependentResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.Waiting).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
Assert.Collection(notStartedResourceEvent.Snapshot.Urls, u =>
{
Assert.Equal("tcp://localhost:6379", u.Url);
Assert.True(u.IsInactive);
});
Assert.Collection(dependentResourceEvent.Snapshot.Urls, u =>
{
Assert.Equal("http://localhost:5254", u.Url);
Assert.Equal("http", u.Name);
Assert.True(u.IsInactive);
});
// Source should be populated on non-started resources.
Assert.Equal(RedisImageSource, notStartedResourceEvent.Snapshot.Properties.Single(p => p.Name == "container.image").Value?.ToString());
Assert.Contains("TestProject.ServiceB.csproj", dependentResourceEvent.Snapshot.Properties.Single(p => p.Name == "project.path").Value?.ToString());
logger.LogInformation("Start explicit start resource.");
await orchestrator.StartResourceAsync(notStartedResourceEvent.ResourceId, CancellationToken.None).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
var runningResourceEvent = await rns.WaitForResourceAsync(notStartedResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.Running).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
Assert.Collection(runningResourceEvent.Snapshot.Urls, u =>
{
Assert.Equal("tcp://localhost:6379", u.Url);
});
// Dependent resource should now run.
var dependentRunningResourceEvent = await rns.WaitForResourceAsync(dependentResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.Running).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
Assert.Collection(dependentRunningResourceEvent.Snapshot.Urls, u =>
{
Assert.Equal("http://localhost:5254", u.Url);
Assert.Equal("http", u.Name);
});
logger.LogInformation("Stop resource.");
await orchestrator.StopResourceAsync(notStartedResourceEvent.ResourceId, CancellationToken.None).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
await rns.WaitForResourceAsync(notStartedResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.Exited).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
logger.LogInformation("Start resource again");
await orchestrator.StartResourceAsync(notStartedResourceEvent.ResourceId, CancellationToken.None).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
await rns.WaitForResourceAsync(notStartedResourceName, e => e.Snapshot.State?.Text == KnownResourceStates.Running).DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
await startTask.DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
await app.StopAsync().DefaultTimeout(TestConstants.LongTimeoutTimeSpan);
}

@danegsta danegsta requested a review from mitchdenny as a code owner May 31, 2025 00:07
@danegsta
Copy link
Member Author

ExplicitStart_StartExecutable

I've added a new test case ExplicitStart_StartPersistentContainer to cover the expected changes in behavior with regards to persistent containers.

@danegsta danegsta merged commit 3627efa into dotnet:main Jun 3, 2025
746 of 750 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WithExplicitStart clashes with WithLifetime(ContainerLifetime.Persistent)
3 participants