Skip to content

fix(provisioning): remove app from apps map when its provision failed #7070

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

alexandre-daubois
Copy link

@alexandre-daubois alexandre-daubois commented Jun 16, 2025

Fix dunglas/symfony-docker#801

When starting FrankenPHP with Mercure and the default prod config, the publisher and subscriber keys are empty. This causes the Mercure app provision to fail, return an error in Provision and be partially started. It is then in an inconsistent state, and Caddy panics (as shown in the stacks trace of the original issue).

By adding this new cleanup, we ensure the app isn't in the apps map anymore. Caddy doesn't panic anymore and the error message returned by Mercure is now successfully displayed in the console.

cc @dunglas

@CLAassistant
Copy link

CLAassistant commented Jun 16, 2025

CLA assistant check
All committers have signed the CLA.

Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooo, thanks, that's an interesting case. Yeah it's probably best if the failed app is removed...

@mholt mholt enabled auto-merge (squash) June 16, 2025 19:16
@francislavoie
Copy link
Member

francislavoie commented Jun 16, 2025

Hmm. I'm not sure this is a good enough solution. I think what might happen is if the app is referenced by more than one module (e.g. HTTP handler module gets a ref to the app), then it would try to provision the app many times (and probably fail every time), causing a lot of noise. Maybe the context should hold onto a map of failed apps (e.g. ctx.cfg.failedApps or something), skip trying to load that app again if it's in failedApps, and then get rid of failedApps after the whole provision step is done, but before starting the apps.

@mholt mholt disabled auto-merge June 16, 2025 19:29
@mholt
Copy link
Member

mholt commented Jun 16, 2025

@francislavoie Interesting... (I've disabled auto-merge while we figure this out)

Once a module fails to load, it should bubble the error up all the way, and cease trying to provision more modules, yeah? The config will proceed to unload and roll back to its previous state.

@francislavoie
Copy link
Member

I don't understand why Caddy continues to start then 🤔 With a bad config it should just throw it all away, no? Maybe I don't understand how Mercure is being used here.

@alexandre-daubois
Copy link
Author

Maybe I don't understand how Mercure is being used here.

Nothing fancy here actually, Mercure app provision just starts by checking that provided keys are not empty, and it returns an error if they are

@mholt
Copy link
Member

mholt commented Jun 16, 2025

@francislavoie Good question -- the stack trace shows that it's starting the http app, I'll study this a bit more.

I'm not strictly opposed to "noisy" logs though, but I do want to know why a failed module provisioning during config load allows the config to continue loading... I'll dig.

@mholt
Copy link
Member

mholt commented Jun 18, 2025

Okay, I had some time to digest this. I don't fully understand it yet.

Thanks @alexandre-daubois for proposing a fix. Even though I can't explain why, if the mercure module returns an error on Provision(), we would call Start() on any app, I do think it's the right thing to remove the app from the map if it fails to provision.

This does feel like a band-aid (but also proper bookkeeping).

As I don't have Docker to reproduce the issue, can anyone explain how it's possible that we're calling Start() here:

caddy/caddy.go

Lines 431 to 435 in 2f0fc62

// Start
err = func() error {
started := make([]string, 0, len(ctx.cfg.apps))
for name, a := range ctx.cfg.apps {
err := a.Start()

after an error is returned here:

caddy/caddy.go

Lines 401 to 405 in 2f0fc62

ctx, err := provisionContext(newCfg, start)
if err != nil {
globalMetrics.configSuccess.Set(0)
return ctx, err
}

a few lines above?

I pushed a commit to do the cleanup using defer, which is a little more robust and takes care of a failure from Validate() as well. We can probably merge this since it's at least more correct. (Thanks again!)

@mholt mholt added the under review 🧐 Review is pending before merging label Jun 18, 2025
@mholt mholt added this to the v2.10.1 milestone Jun 18, 2025
@mholt
Copy link
Member

mholt commented Jun 19, 2025

@alexandre-daubois -- or anyone else getting the error -- would you be able to confirm that the latest commit still "fixes" the issue?

And if anyone knows why Start() proceeds to be called despite an error being returned, I'd love to be enlightened 😅

@alexandre-daubois
Copy link
Author

I just tested your fix and everything seems fine! Thank you Matt!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
under review 🧐 Review is pending before merging
Projects
None yet
Development

Successfully merging this pull request may close these issues.

panic: runtime error: invalid memory address or nil pointer dereference
5 participants