fix(plugin): keep active plugin on reload failure #680

Merged: intel352 merged 2 commits into main from fix/external-plugin-safe-reload on May 15, 2026

intel352 commented May 15, 2026

Summary

  • start and validate replacement external plugin processes before swapping the active client
  • preserve the existing plugin client when candidate reload fails
  • reject invalid candidate launches that lack a client or adapter
  • document reload as a local try-activate primitive, not artifact trust or fleet rollout

Verification

  • GOWORK=off go test ./plugin/external -count=1
  • GOWORK=off go test ./plugin/... -count=1
  • GOWORK=off go test ./plugin/external -coverprofile=/tmp/external.out -count=1
  • git diff --check

Regression Proof

With adapter validation absent:

GOWORK=off go test ./plugin/external -run 'TestExternalPluginManager(LoadPluginRejectsInvalidCandidate|ReloadWithoutActivePluginRejectsInvalidCandidate|ReloadLoadedRejectsInvalidCandidate)' -count=1

failed because nil-adapter launches were accepted and registered.

With the validation restored:

GOWORK=off go test ./plugin/external -run 'TestExternalPluginManager' -count=1

passes.

Addresses the external plugin process handoff slice of #667. The broader config-engine probe/rollback contract remains tracked by #667.

Copilot AI review requested due to automatic review settings May 15, 2026 00:56

Copilot AI left a comment

Pull request overview

Refactors ExternalPluginManager.ReloadPlugin so the replacement plugin subprocess is started and validated before the previously active client is killed. On candidate failure the old plugin remains registered, addressing the try-activate/rollback requirement from issue #667. Documentation and a comment are updated to reflect the new local-activation semantics.

Changes:

  • Extract subprocess startup into startPluginLocked returning a pluginLaunch, injectable via a startPlugin test seam.
  • Rewrite ReloadPlugin to load+validate the candidate first, only killing the old client on success, and to load directly when no plugin is currently active.
  • Update handleReload comment and docs/PLUGIN_DEVELOPMENT_GUIDE.md to document try-activate semantics and clarify that artifact trust is out of scope.
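
The try-activate flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from plugin/external/manager.go: the `manager`, `launch`, `client`, and `adapter` types, the `start` field, and the `errInvalidCandidate` error are all hypothetical stand-ins for the PR's ExternalPluginManager, pluginLaunch, and startPlugin seam. Killing a half-started candidate on rejection is also an assumption made for the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// client and adapter are hypothetical placeholders for the real plugin
// client and adapter types.
type client struct {
	name   string
	killed bool
}

func (c *client) Kill() { c.killed = true }

type adapter struct{}

// launch is a hypothetical stand-in for the PR's pluginLaunch.
type launch struct {
	client  *client
	adapter *adapter
}

// manager is a simplified, hypothetical manager. The start field plays the
// role of an injectable test seam for starting a plugin process.
type manager struct {
	mu     sync.Mutex
	active map[string]*launch
	start  func(name string) (*launch, error)
}

var errInvalidCandidate = errors.New("candidate launch lacks client or adapter")

// Reload starts and validates the candidate first. The active entry is only
// replaced, and the old client only killed, once the candidate is valid.
func (m *manager) Reload(name string) error {
	m.mu.Lock()
	defer m.mu.Unlock()

	cand, err := m.start(name)
	if err != nil {
		return fmt.Errorf("start candidate %q: %w", name, err)
	}
	if cand == nil || cand.client == nil || cand.adapter == nil {
		// Assumption: clean up a partially started candidate.
		if cand != nil && cand.client != nil {
			cand.client.Kill()
		}
		return fmt.Errorf("reload %q: %w", name, errInvalidCandidate)
	}

	old := m.active[name]
	m.active[name] = cand
	if old != nil {
		old.client.Kill()
	}
	return nil
}

func main() {
	old := &launch{client: &client{name: "v1"}, adapter: &adapter{}}
	m := &manager{
		active: map[string]*launch{"demo": old},
		start: func(string) (*launch, error) {
			// Candidate with a nil adapter, which should be rejected.
			return &launch{client: &client{name: "v2"}}, nil
		},
	}
	err := m.Reload("demo")
	fmt.Println("reload error:", err)
	fmt.Println("active client:", m.active["demo"].client.name)
}
```

In this sketch a failed or invalid candidate leaves the map entry untouched, so callers keep using the previously active plugin. When no plugin is active, `old` is nil and the candidate is simply registered.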

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| plugin/external/manager.go | Splits start logic out of LoadPlugin and reimplements ReloadPlugin to swap clients only after successful candidate launch. |
| plugin/external/manager_test.go | Adds tests covering reload failure preserving the active client and reload success swapping clients only after candidate start. |
| plugin/external/handler.go | Updates the handleReload doc comment to reflect new behavior. |
| docs/PLUGIN_DEVELOPMENT_GUIDE.md | Documents try-activate reload semantics and explicit non-goals (trust/fleet rollout). |

codecov Bot commented May 15, 2026

Codecov Report

❌ Patch coverage is 97.67442% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| plugin/external/manager.go | 97.67% | 1 Missing ⚠️ |

github-actions Bot commented May 15, 2026

⏱ Benchmark Results

No significant performance regressions detected.

benchstat comparison (baseline → PR)
## benchstat: baseline → PR
baseline-bench.txt:276: parsing iteration count: invalid syntax
baseline-bench.txt:320599: parsing iteration count: invalid syntax
baseline-bench.txt:596423: parsing iteration count: invalid syntax
baseline-bench.txt:837117: parsing iteration count: invalid syntax
baseline-bench.txt:1153135: parsing iteration count: invalid syntax
baseline-bench.txt:1437233: parsing iteration count: invalid syntax
benchmark-results.txt:276: parsing iteration count: invalid syntax
benchmark-results.txt:331513: parsing iteration count: invalid syntax
benchmark-results.txt:652718: parsing iteration count: invalid syntax
benchmark-results.txt:925758: parsing iteration count: invalid syntax
benchmark-results.txt:1184602: parsing iteration count: invalid syntax
benchmark-results.txt:1447964: parsing iteration count: invalid syntax
goos: linux
goarch: amd64
pkg: github.com/GoCodeAlone/workflow/dynamic
cpu: AMD EPYC 7763 64-Core Processor                
                            │ baseline-bench.txt │
                            │       sec/op       │
InterpreterCreation-4               6.419m ± 53%
ComponentLoad-4                     3.786m ±  9%
ComponentExecute-4                  1.982µ ±  2%
PoolContention/workers-1-4          1.106µ ±  1%
PoolContention/workers-2-4          1.111µ ±  1%
PoolContention/workers-4-4          1.109µ ±  4%
PoolContention/workers-8-4          1.113µ ±  1%
PoolContention/workers-16-4         1.109µ ±  0%
ComponentLifecycle-4                3.667m ±  1%
SourceValidation-4                  2.330µ ±  0%
RegistryConcurrent-4                792.3n ±  3%
LoaderLoadFromString-4              3.695m ±  1%
geomean                             18.86µ

                            │ baseline-bench.txt │
                            │        B/op        │
InterpreterCreation-4               2.027Mi ± 0%
ComponentLoad-4                     2.180Mi ± 0%
ComponentExecute-4                  1.203Ki ± 0%
PoolContention/workers-1-4          1.203Ki ± 0%
PoolContention/workers-2-4          1.203Ki ± 0%
PoolContention/workers-4-4          1.203Ki ± 0%
PoolContention/workers-8-4          1.203Ki ± 0%
PoolContention/workers-16-4         1.203Ki ± 0%
ComponentLifecycle-4                2.183Mi ± 0%
SourceValidation-4                  1.984Ki ± 0%
RegistryConcurrent-4                1.133Ki ± 0%
LoaderLoadFromString-4              2.182Mi ± 0%
geomean                             15.25Ki

                            │ baseline-bench.txt │
                            │     allocs/op      │
InterpreterCreation-4                15.68k ± 0%
ComponentLoad-4                      18.02k ± 0%
ComponentExecute-4                    25.00 ± 0%
PoolContention/workers-1-4            25.00 ± 0%
PoolContention/workers-2-4            25.00 ± 0%
PoolContention/workers-4-4            25.00 ± 0%
PoolContention/workers-8-4            25.00 ± 0%
PoolContention/workers-16-4           25.00 ± 0%
ComponentLifecycle-4                 18.07k ± 0%
SourceValidation-4                    32.00 ± 0%
RegistryConcurrent-4                  2.000 ± 0%
LoaderLoadFromString-4               18.06k ± 0%
geomean                               183.3

cpu: AMD EPYC 9V74 80-Core Processor                
                            │ benchmark-results.txt │
                            │        sec/op         │
InterpreterCreation-4                  5.305m ± 89%
ComponentLoad-4                        3.525m ±  1%
ComponentExecute-4                     1.856µ ±  0%
PoolContention/workers-1-4             1.025µ ±  2%
PoolContention/workers-2-4             1.020µ ±  1%
PoolContention/workers-4-4             1.021µ ±  1%
PoolContention/workers-8-4             1.019µ ±  1%
PoolContention/workers-16-4            1.033µ ±  2%
ComponentLifecycle-4                   3.637m ±  1%
SourceValidation-4                     2.104µ ±  0%
RegistryConcurrent-4                   771.5n ±  8%
LoaderLoadFromString-4                 3.586m ±  2%
geomean                                17.50µ

                            │ benchmark-results.txt │
                            │         B/op          │
InterpreterCreation-4                  2.027Mi ± 0%
ComponentLoad-4                        2.180Mi ± 0%
ComponentExecute-4                     1.203Ki ± 0%
PoolContention/workers-1-4             1.203Ki ± 0%
PoolContention/workers-2-4             1.203Ki ± 0%
PoolContention/workers-4-4             1.203Ki ± 0%
PoolContention/workers-8-4             1.203Ki ± 0%
PoolContention/workers-16-4            1.203Ki ± 0%
ComponentLifecycle-4                   2.183Mi ± 0%
SourceValidation-4                     1.984Ki ± 0%
RegistryConcurrent-4                   1.133Ki ± 0%
LoaderLoadFromString-4                 2.182Mi ± 0%
geomean                                15.25Ki

                            │ benchmark-results.txt │
                            │       allocs/op       │
InterpreterCreation-4                   15.68k ± 0%
ComponentLoad-4                         18.02k ± 0%
ComponentExecute-4                       25.00 ± 0%
PoolContention/workers-1-4               25.00 ± 0%
PoolContention/workers-2-4               25.00 ± 0%
PoolContention/workers-4-4               25.00 ± 0%
PoolContention/workers-8-4               25.00 ± 0%
PoolContention/workers-16-4              25.00 ± 0%
ComponentLifecycle-4                    18.07k ± 0%
SourceValidation-4                       32.00 ± 0%
RegistryConcurrent-4                     2.000 ± 0%
LoaderLoadFromString-4                  18.06k ± 0%
geomean                                  183.3

pkg: github.com/GoCodeAlone/workflow/middleware
cpu: AMD EPYC 7763 64-Core Processor                
                                  │ baseline-bench.txt │
                                  │       sec/op       │
CircuitBreakerDetection-4                 294.2n ± 20%
CircuitBreakerExecution_Success-4         21.38n ±  0%
CircuitBreakerExecution_Failure-4         66.85n ±  1%
geomean                                   74.92n

                                  │ baseline-bench.txt │
                                  │        B/op        │
CircuitBreakerDetection-4                 144.0 ± 0%
CircuitBreakerExecution_Success-4         0.000 ± 0%
CircuitBreakerExecution_Failure-4         0.000 ± 0%
geomean                                              ¹
¹ summaries must be >0 to compute geomean

                                  │ baseline-bench.txt │
                                  │     allocs/op      │
CircuitBreakerDetection-4                 1.000 ± 0%
CircuitBreakerExecution_Success-4         0.000 ± 0%
CircuitBreakerExecution_Failure-4         0.000 ± 0%
geomean                                              ¹
¹ summaries must be >0 to compute geomean

cpu: AMD EPYC 9V74 80-Core Processor                
                                  │ benchmark-results.txt │
                                  │        sec/op         │
CircuitBreakerDetection-4                    299.3n ± 13%
CircuitBreakerExecution_Success-4            22.66n ±  0%
CircuitBreakerExecution_Failure-4            70.95n ±  0%
geomean                                      78.36n

                                  │ benchmark-results.txt │
                                  │         B/op          │
CircuitBreakerDetection-4                    144.0 ± 0%
CircuitBreakerExecution_Success-4            0.000 ± 0%
CircuitBreakerExecution_Failure-4            0.000 ± 0%
geomean                                                 ¹
¹ summaries must be >0 to compute geomean

                                  │ benchmark-results.txt │
                                  │       allocs/op       │
CircuitBreakerDetection-4                    1.000 ± 0%
CircuitBreakerExecution_Success-4            0.000 ± 0%
CircuitBreakerExecution_Failure-4            0.000 ± 0%
geomean                                                 ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/module
cpu: AMD EPYC 7763 64-Core Processor                
                                 │ baseline-bench.txt │
                                 │       sec/op       │
IaCStateBackend_InProcess-4              299.9n ±  3%
IaCStateBackend_GRPC-4                   10.04m ±  9%
JQTransform_Simple-4                     680.2n ± 30%
JQTransform_ObjectConstruction-4         1.463µ ±  1%
JQTransform_ArraySelect-4                3.423µ ±  1%
JQTransform_Complex-4                    39.13µ ±  1%
JQTransform_Throughput-4                 1.790µ ±  1%
SSEPublishDelivery-4                     67.08n ±  1%
geomean                                  3.849µ

                                 │ baseline-bench.txt │
                                 │        B/op        │
IaCStateBackend_InProcess-4              416.0 ± 0%
IaCStateBackend_GRPC-4                 5.840Mi ± 8%
JQTransform_Simple-4                   1.273Ki ± 0%
JQTransform_ObjectConstruction-4       1.773Ki ± 0%
JQTransform_ArraySelect-4              2.625Ki ± 0%
JQTransform_Complex-4                  16.22Ki ± 0%
JQTransform_Throughput-4               1.984Ki ± 0%
SSEPublishDelivery-4                     0.000 ± 0%
geomean                                             ¹
¹ summaries must be >0 to compute geomean

                                 │ baseline-bench.txt │
                                 │     allocs/op      │
IaCStateBackend_InProcess-4              2.000 ± 0%
IaCStateBackend_GRPC-4                  6.835k ± 1%
JQTransform_Simple-4                     10.00 ± 0%
JQTransform_ObjectConstruction-4         15.00 ± 0%
JQTransform_ArraySelect-4                30.00 ± 0%
JQTransform_Complex-4                    324.0 ± 0%
JQTransform_Throughput-4                 17.00 ± 0%
SSEPublishDelivery-4                     0.000 ± 0%
geomean                                             ¹
¹ summaries must be >0 to compute geomean

cpu: AMD EPYC 9V74 80-Core Processor                
                                 │ benchmark-results.txt │
                                 │        sec/op         │
IaCStateBackend_InProcess-4                 289.6n ± 25%
IaCStateBackend_GRPC-4                      9.843m ±  2%
JQTransform_Simple-4                        668.9n ± 27%
JQTransform_ObjectConstruction-4            1.492µ ±  4%
JQTransform_ArraySelect-4                   3.697µ ±  0%
JQTransform_Complex-4                       43.24µ ±  3%
JQTransform_Throughput-4                    1.763µ ±  1%
SSEPublishDelivery-4                        62.74n ±  0%
geomean                                     3.869µ

                                 │ benchmark-results.txt │
                                 │         B/op          │
IaCStateBackend_InProcess-4                416.0 ±  0%
IaCStateBackend_GRPC-4                   5.758Mi ± 13%
JQTransform_Simple-4                     1.273Ki ±  0%
JQTransform_ObjectConstruction-4         1.773Ki ±  0%
JQTransform_ArraySelect-4                2.625Ki ±  0%
JQTransform_Complex-4                    16.22Ki ±  0%
JQTransform_Throughput-4                 1.984Ki ±  0%
SSEPublishDelivery-4                       0.000 ±  0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

                                 │ benchmark-results.txt │
                                 │       allocs/op       │
IaCStateBackend_InProcess-4                 2.000 ± 0%
IaCStateBackend_GRPC-4                     6.865k ± 1%
JQTransform_Simple-4                        10.00 ± 0%
JQTransform_ObjectConstruction-4            15.00 ± 0%
JQTransform_ArraySelect-4                   30.00 ± 0%
JQTransform_Complex-4                       324.0 ± 0%
JQTransform_Throughput-4                    17.00 ± 0%
SSEPublishDelivery-4                        0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/schema
cpu: AMD EPYC 7763 64-Core Processor                
                                    │ baseline-bench.txt │
                                    │       sec/op       │
SchemaValidation_Simple-4                   1.105µ ± 19%
SchemaValidation_AllFields-4                1.661µ ±  7%
SchemaValidation_FormatValidation-4         1.586µ ±  2%
SchemaValidation_ManySchemas-4              1.838µ ±  3%
geomean                                     1.521µ

                                    │ baseline-bench.txt │
                                    │        B/op        │
SchemaValidation_Simple-4                   0.000 ± 0%
SchemaValidation_AllFields-4                0.000 ± 0%
SchemaValidation_FormatValidation-4         0.000 ± 0%
SchemaValidation_ManySchemas-4              0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

                                    │ baseline-bench.txt │
                                    │     allocs/op      │
SchemaValidation_Simple-4                   0.000 ± 0%
SchemaValidation_AllFields-4                0.000 ± 0%
SchemaValidation_FormatValidation-4         0.000 ± 0%
SchemaValidation_ManySchemas-4              0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

cpu: AMD EPYC 9V74 80-Core Processor                
                                    │ benchmark-results.txt │
                                    │        sec/op         │
SchemaValidation_Simple-4                      1.078µ ± 22%
SchemaValidation_AllFields-4                   1.651µ ±  2%
SchemaValidation_FormatValidation-4            1.568µ ±  1%
SchemaValidation_ManySchemas-4                 1.610µ ±  1%
geomean                                        1.456µ

                                    │ benchmark-results.txt │
                                    │         B/op          │
SchemaValidation_Simple-4                      0.000 ± 0%
SchemaValidation_AllFields-4                   0.000 ± 0%
SchemaValidation_FormatValidation-4            0.000 ± 0%
SchemaValidation_ManySchemas-4                 0.000 ± 0%
geomean                                                   ¹
¹ summaries must be >0 to compute geomean

                                    │ benchmark-results.txt │
                                    │       allocs/op       │
SchemaValidation_Simple-4                      0.000 ± 0%
SchemaValidation_AllFields-4                   0.000 ± 0%
SchemaValidation_FormatValidation-4            0.000 ± 0%
SchemaValidation_ManySchemas-4                 0.000 ± 0%
geomean                                                   ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/store
cpu: AMD EPYC 7763 64-Core Processor                
                                   │ baseline-bench.txt │
                                   │       sec/op       │
EventStoreAppend_InMemory-4                1.202µ ± 17%
EventStoreAppend_SQLite-4                  1.389m ±  7%
GetTimeline_InMemory/events-10-4           14.03µ ±  2%
GetTimeline_InMemory/events-50-4           77.04µ ±  3%
GetTimeline_InMemory/events-100-4          126.3µ ±  2%
GetTimeline_InMemory/events-500-4          675.5µ ±  4%
GetTimeline_InMemory/events-1000-4         1.326m ±  8%
GetTimeline_SQLite/events-10-4             109.1µ ±  1%
GetTimeline_SQLite/events-50-4             250.2µ ±  2%
GetTimeline_SQLite/events-100-4            431.0µ ±  2%
GetTimeline_SQLite/events-500-4            1.826m ±  3%
GetTimeline_SQLite/events-1000-4           3.546m ±  1%
geomean                                    223.5µ

                                   │ baseline-bench.txt │
                                   │        B/op        │
EventStoreAppend_InMemory-4                  831.0 ± 8%
EventStoreAppend_SQLite-4                  1.984Ki ± 2%
GetTimeline_InMemory/events-10-4           7.953Ki ± 0%
GetTimeline_InMemory/events-50-4           46.62Ki ± 0%
GetTimeline_InMemory/events-100-4          94.48Ki ± 0%
GetTimeline_InMemory/events-500-4          472.8Ki ± 0%
GetTimeline_InMemory/events-1000-4         944.3Ki ± 0%
GetTimeline_SQLite/events-10-4             16.74Ki ± 0%
GetTimeline_SQLite/events-50-4             87.14Ki ± 0%
GetTimeline_SQLite/events-100-4            175.4Ki ± 0%
GetTimeline_SQLite/events-500-4            846.1Ki ± 0%
GetTimeline_SQLite/events-1000-4           1.639Mi ± 0%
geomean                                    67.63Ki

                                   │ baseline-bench.txt │
                                   │     allocs/op      │
EventStoreAppend_InMemory-4                  7.000 ± 0%
EventStoreAppend_SQLite-4                    53.00 ± 0%
GetTimeline_InMemory/events-10-4             125.0 ± 0%
GetTimeline_InMemory/events-50-4             653.0 ± 0%
GetTimeline_InMemory/events-100-4           1.306k ± 0%
GetTimeline_InMemory/events-500-4           6.514k ± 0%
GetTimeline_InMemory/events-1000-4          13.02k ± 0%
GetTimeline_SQLite/events-10-4               382.0 ± 0%
GetTimeline_SQLite/events-50-4              1.852k ± 0%
GetTimeline_SQLite/events-100-4             3.681k ± 0%
GetTimeline_SQLite/events-500-4             18.54k ± 0%
GetTimeline_SQLite/events-1000-4            37.29k ± 0%
geomean                                     1.162k

cpu: AMD EPYC 9V74 80-Core Processor                
                                   │ benchmark-results.txt │
                                   │        sec/op         │
EventStoreAppend_InMemory-4                   1.092µ ± 26%
EventStoreAppend_SQLite-4                     1.084m ±  6%
GetTimeline_InMemory/events-10-4              12.62µ ±  4%
GetTimeline_InMemory/events-50-4              55.54µ ± 17%
GetTimeline_InMemory/events-100-4             111.6µ ±  1%
GetTimeline_InMemory/events-500-4             570.1µ ±  1%
GetTimeline_InMemory/events-1000-4            1.167m ±  2%
GetTimeline_SQLite/events-10-4                85.35µ ±  1%
GetTimeline_SQLite/events-50-4                222.0µ ±  2%
GetTimeline_SQLite/events-100-4               386.0µ ±  2%
GetTimeline_SQLite/events-500-4               1.704m ±  2%
GetTimeline_SQLite/events-1000-4              3.346m ±  3%
geomean                                       192.4µ

                                   │ benchmark-results.txt │
                                   │         B/op          │
EventStoreAppend_InMemory-4                     803.5 ± 7%
EventStoreAppend_SQLite-4                     1.983Ki ± 2%
GetTimeline_InMemory/events-10-4              7.953Ki ± 0%
GetTimeline_InMemory/events-50-4              46.62Ki ± 0%
GetTimeline_InMemory/events-100-4             94.48Ki ± 0%
GetTimeline_InMemory/events-500-4             472.8Ki ± 0%
GetTimeline_InMemory/events-1000-4            944.3Ki ± 0%
GetTimeline_SQLite/events-10-4                16.74Ki ± 0%
GetTimeline_SQLite/events-50-4                87.14Ki ± 0%
GetTimeline_SQLite/events-100-4               175.4Ki ± 0%
GetTimeline_SQLite/events-500-4               846.1Ki ± 0%
GetTimeline_SQLite/events-1000-4              1.639Mi ± 0%
geomean                                       67.44Ki

                                   │ benchmark-results.txt │
                                   │       allocs/op       │
EventStoreAppend_InMemory-4                     7.000 ± 0%
EventStoreAppend_SQLite-4                       53.00 ± 0%
GetTimeline_InMemory/events-10-4                125.0 ± 0%
GetTimeline_InMemory/events-50-4                653.0 ± 0%
GetTimeline_InMemory/events-100-4              1.306k ± 0%
GetTimeline_InMemory/events-500-4              6.514k ± 0%
GetTimeline_InMemory/events-1000-4             13.02k ± 0%
GetTimeline_SQLite/events-10-4                  382.0 ± 0%
GetTimeline_SQLite/events-50-4                 1.852k ± 0%
GetTimeline_SQLite/events-100-4                3.681k ± 0%
GetTimeline_SQLite/events-500-4                18.54k ± 0%
GetTimeline_SQLite/events-1000-4               37.29k ± 0%
geomean                                        1.162k

Benchmarks run with go test -bench=. -benchmem -count=6.
Regressions ≥ 20% are flagged. Results compared via benchstat.

Reject nil reload adapters and expand manager reload/load tests for PR #680 patch coverage.
intel352 merged commit 45cf66c into main on May 15, 2026
26 checks passed
intel352 deleted the fix/external-plugin-safe-reload branch on May 15, 2026 01:36