EEBus: fix LPC/LPP CS failsafe-exit path#29705
Merged
Conversation
Per LPP §1.1.8 / LPC equivalents, the CS state machine must: - on heartbeat return: leave Failsafe immediately and apply the freshly received EG limit (transitions 8/9/10, LPP-918/919/920) - on prolonged heartbeat loss: keep applying a self-determined protective default (Unlimited-autonomous, LPP-921) The previous run() loop: - waited for failsafeDurationMinimum to elapse before leaving Failsafe, ignoring fresh limits from a healthy EG - on duration expiry transitioned to StatusNormal with limit=0 even when heartbeat was still missing, dropping all grid protection Restructure run() so heartbeat health, not duration, decides the exit: - HB present in Failsafe -> leave immediately; the LPC-914/1 block applies whatever active limit the EG has sent - HB still missing -> stay applying the failsafe limit indefinitely Also drop the c.statusUpdated = time.Now() writes in events.go: limit updates are not status transitions, so they should not reset the failsafe-duration clock. Add unit tests covering the four state-machine transitions.
Contributor
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- The tests initialise a real in-memory SQLite instance via db.NewInstance for each test case, which introduces global state and potential cross-test interference; consider abstracting the persistence dependency behind an interface or a test helper that reuses/cleans up the DB (e.g. via t.Cleanup) to keep these unit tests isolated and faster.
- The failsafe exit path in run() still calls setConsumptionLimit(0) unconditionally before the LPC-914/1 block re-applies any fresh limit; if this zero-limit transition is not mandated by the spec, consider simplifying the state machine by collapsing these into a single, spec-driven limit update to reduce transient state changes and mock expectations.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The tests initialise a real in-memory SQLite instance via db.NewInstance for each test case, which introduces global state and potential cross-test interference; consider abstracting the persistence dependency behind an interface or a test helper that reuses/cleans up the DB (e.g. via t.Cleanup) to keep these unit tests isolated and faster.
- The failsafe exit path in run() still calls setConsumptionLimit(0) unconditionally before the LPC-914/1 block re-applies any fresh limit; if this zero-limit transition is not mandated by the spec, consider simplifying the state machine by collapsing these into a single, spec-driven limit update to reduce transient state changes and mock expectations.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Follow-up to the EEBus verification surfaced via #29701. The LPC/LPP CS state machine in
hems/eebushad two regulatory-grade bugs on the failsafe-exit path:Failsafe was held for the full
failsafeDurationMinimum(default 2 h) even when the heartbeat returned with an EG limit pending. Spec LPP-918/919/920 (and LPC equivalents) require leaving Failsafe immediately on HB+limit and applying that limit. The old code waited for the duration and heartbeat both to be satisfied, then dropped to a 0 W limit — ignoring whatever value the EG had sent.When
failsafeDurationMinimumexpired with the heartbeat still missing, the CS transitioned toStatusNormalwith limit=0 — i.e. unlimited. After ≥2 h of EG outage the building became completely unprotected. Spec LPP-921 / LPC-921 say the CS shall apply a self-determined protective default in this Unlimited-autonomous state.Restructure
run()so heartbeat health (not duration) decides the exit:c.consumptionLimit/c.productionLimit(or releases ifIsActive=false).Also drop the
c.statusUpdated = time.Now()writes inevents.go. Limit updates aren't status transitions, so they shouldn't have been resetting the failsafe-duration clock — that overload would have trapped evcc in Failsafe indefinitely as long as the EG kept resending limits.Add unit tests for the four state-machine transitions:
TestRun_HeartbeatLost_EntersFailsafeTestRun_FailsafeStaysOnMissingHeartbeatTestRun_HeartbeatReturned_AppliesFreshLimitTestRun_HeartbeatReturned_NoFreshLimitOut of scope (follow-up)
This PR does not enforce LPP-913 ("writes received in Init/Failsafe/Unlimited-autonomous shall only be evaluated within 60 s of a heartbeat"). On HB return evcc applies whatever EG limit is currently stored — including a value that may have been buffered during the outage. A separate change in the WriteApproval path is needed to filter writes by heartbeat age and CS state.
Test plan
go test ./hems/eebus/...passes (4 new tests)go test ./hems/...passesgo vet ./hems/eebus/...cleangolangci-lint run ./hems/eebus/...0 issuesgofmt -lclean