feat: refactor backend state management and send in heartbeat #201
Merged
Go test coverage
Total coverage: 62.8%
jajeffries
commented
Oct 13, 2025
leoparente
approved these changes
Oct 13, 2025
Contributor
leoparente
left a comment
Overall LGTM
🎉 This PR is included in version 2.5.0 🎉 The release is available on GitHub release. Your semantic-release bot 📦🚀
This pull request refactors how backend state is managed and monitored in the agent. The main change is a centralized `StateManager` for backend state, which enables more robust monitoring, error handling, and reporting, including integration with the Fleet heartbeat. Backend state information is now propagated into the heartbeats sent to Fleet, improving observability. Several interfaces and constructors are updated to support the new state management approach.

Key changes include:

Backend State Management Refactor:
* Introduced `backend.StateManager` in `agent/backend/backend_state.go`, which centralizes backend state tracking, periodic monitoring, error registration, and restart logic. The manager uses a dedicated goroutine to monitor each backend and triggers a restart if a backend is unhealthy and the minimum restart interval has elapsed. ([agent/backend/backend_state.goR1-R111](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-6301055a8fa5f25637f457f26fa4d9556ff42cfe18d83199b773f78de9332819R1-R111))
* Replaced the `backendState` map in `agent/agent.go` with the new `StateManager`, updating all relevant methods to go through the manager for backend state, error, and restart registration. ([[1]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bL35-R45), [[2]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bR63-R78), [[3]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bL81), [[4]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bL122-R157), [[5]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bL201-R212), [[6]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bL222-R231))

Fleet Integration and Heartbeats:
* Passed the `StateManager` (as a `StateRetriever` interface) down through the config manager and MQTT connection layers, and updated the heartbeat payload to include backend status, errors, and restart information. ([[1]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-b0dddd7590e09bc17b468db04b48a21f0d508a17eca4c086eb98c87325e5ba98R23-R33), [[2]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-1783db06b73b72c164682aec902cffc352ef29c1eee8cd92929e7637e72f8047L28-R32), [[3]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-796a5a43e9dd714c20b0d1863be358a6f2b4f3d5035e9ab9c4303f61fb2f24f2R9), [[4]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-796a5a43e9dd714c20b0d1863be358a6f2b4f3d5035e9ab9c4303f61fb2f24f2R21-R29), [[5]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-796a5a43e9dd714c20b0d1863be358a6f2b4f3d5035e9ab9c4303f61fb2f24f2R43-R45), [[6]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-796a5a43e9dd714c20b0d1863be358a6f2b4f3d5035e9ab9c4303f61fb2f24f2R61-R76))

Testing and Mocking Updates:
* ([[1]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-e8832cdc90dafca1118e69654b2c59f5c6c3e40fbee158eaafba88b3d7243232R57-R73), [[2]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-e8832cdc90dafca1118e69654b2c59f5c6c3e40fbee158eaafba88b3d7243232L79-R90), [[3]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-61bcdb2570d98db360b12b22aede9c54e0c453085acd5e60d5902c8f92380e72R17), [[4]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-61bcdb2570d98db360b12b22aede9c54e0c453085acd5e60d5902c8f92380e72R38-R48), [[5]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-61bcdb2570d98db360b12b22aede9c54e0c453085acd5e60d5902c8f92380e72L57-R70))

These changes lay the groundwork for more reliable backend lifecycle management and improved system observability.
Backend State Management:
* Introduced `backend.StateManager` to centralize backend state tracking, monitoring, and restart logic, replacing per-backend state maps. ([agent/backend/backend_state.goR1-R111](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-6301055a8fa5f25637f457f26fa4d9556ff42cfe18d83199b773f78de9332819R1-R111))
* Refactored `agent/agent.go` to use `StateManager` for backend lifecycle events, error registration, and restart handling. ([[1]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bL35-R45), [[2]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bR63-R78), [[3]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bL81), [[4]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bL122-R157), [[5]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bL201-R212), [[6]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-62b6ad581fe3a3059ae8c85ef0f31dde4092bfdecfa7d6857c470bcacaa8cc8bL222-R231))

Fleet Heartbeat and Config Integration:
* Passed `StateManager` through the config and MQTT layers so that heartbeats can report backend state to Fleet. ([[1]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-b0dddd7590e09bc17b468db04b48a21f0d508a17eca4c086eb98c87325e5ba98R23-R33), [[2]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-1783db06b73b72c164682aec902cffc352ef29c1eee8cd92929e7637e72f8047L28-R32))
* ([[1]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-796a5a43e9dd714c20b0d1863be358a6f2b4f3d5035e9ab9c4303f61fb2f24f2R9), [[2]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-796a5a43e9dd714c20b0d1863be358a6f2b4f3d5035e9ab9c4303f61fb2f24f2R21-R29), [[3]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-796a5a43e9dd714c20b0d1863be358a6f2b4f3d5035e9ab9c4303f61fb2f24f2R43-R45), [[4]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-796a5a43e9dd714c20b0d1863be358a6f2b4f3d5035e9ab9c4303f61fb2f24f2R61-R76))

Testing Improvements:
* ([[1]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-e8832cdc90dafca1118e69654b2c59f5c6c3e40fbee158eaafba88b3d7243232R57-R73), [[2]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-e8832cdc90dafca1118e69654b2c59f5c6c3e40fbee158eaafba88b3d7243232L79-R90), [[3]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-61bcdb2570d98db360b12b22aede9c54e0c453085acd5e60d5902c8f92380e72R17), [[4]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-61bcdb2570d98db360b12b22aede9c54e0c453085acd5e60d5902c8f92380e72R38-R48), [[5]](https://github.com/netboxlabs/orb-agent/pull/201/files#diff-61bcdb2570d98db360b12b22aede9c54e0c453085acd5e60d5902c8f92380e72L57-R70))

This pull request introduces a new `BackendStateManager` to centralize and improve backend state tracking, error handling, and automated restart logic. It refactors how backend state is managed, replacing direct state maps with a dedicated manager, and enhances the reporting of backend status in fleet heartbeats. The changes also make backend state accessible to other components, including configuration management and fleet communication.

Backend state management and monitoring:
* Introduced `BackendStateManager` in `agent/backend/backend_state.go` to handle backend state, monitor backend health, trigger restarts when necessary, and provide a unified interface for accessing backend state. This includes automated backend monitoring, error registration, and restart logic.
* Refactored `orbAgent` in `agent/agent.go` to use `BackendStateManager` instead of a local `backendState` map, to start backend monitors, and to handle restart requests through a channel. [1] [2] [3] [4] [5] [6] [7]

Integration with configuration management and fleet communication:
* Updated `configmgr` and fleet-related components to accept and use the new `BackendStateManager` for backend state, including changes to constructors and dependency injection in `agent/configmgr/fleet.go` and `agent/configmgr/fleet/connection.go`. [1] [2]
* Updated `agent/configmgr/fleet/heartbeats.go` to include backend state in heartbeat messages, providing richer status information to fleet. [1] [2] [3] [4]

Testing and mocks:
* Updated tests and mocks in `agent/configmgr/fleet/connection_test.go` and `agent/configmgr/fleet/heartbeats_test.go`. [1] [2] [3] [4] [5]