bacalhau node list returns error failed request: invalid node type: nodeTypeUndefined #4024
I am working from the staging cluster, which contains 4 nodes. It appears there is an extra (5th) node somewhere in the store whose type is undefined, causing these errors:
I believe this may be related to previous state in the cluster's node store, since filtering for connected nodes works but filtering for disconnected nodes does not:
I have performed the following operations on each node in the cluster to remove the invalid node from the state of the requester node:
bacalhau-vm-stage-0 (requester+compute)
bacalhau-vm-stage-1 (compute)
bacalhau-vm-stage-2 (compute)
bacalhau-vm-stage-3 (compute)
The node list command is now working as expected:
The cause of this issue relates to changes in the state contained within the NodeStore (NATS KV store) between v1.3.0 and v1.3.1-rc1.
In v1.3.0 the NodeStore operates over, and contains, NodeInfo:
bacalhau/pkg/routing/kvstore/kvstore.go Lines 70 to 82 in b09858f
bacalhau/pkg/model/nodeinfo.go Lines 25 to 31 in b09858f
In v1.3.1-rc1 the NodeStore operates over, and contains, NodeState:
bacalhau/pkg/routing/kvstore/kvstore.go Lines 70 to 82 in 3b3d8a8
bacalhau/pkg/models/node_state.go Lines 5 to 9 in 3b3d8a8
NodeInfo cannot be unmarshaled into a NodeState type, which is why the list shows a node with undefined fields. It is data from v1.3.0, still contained in the store, that no longer meets the requirements of v1.3.1-rc1.
How did we get here?
The problem here is that nobody validated that a v1.3.1-rc requester could open a v1.3.0 requester store. The fix here appears to be one of:
- fixes #4024 - this ensures that each time a compute node is started it attempts to register itself with the requester. This is important because, if a requester loses its state, compute nodes will re-register themselves with it. If they have already registered with a requester node, registering again is idempotent. - pairing this with the parent commit regarding the V3 Migration is required.
- fixes #4024 --------- Co-authored-by: frrist <forrest@expanso.io>
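The re-registration behaviour described in the commit message above can be sketched as an upsert keyed by node ID. This is only an illustration of the idempotency property: `Registry`, `NewRegistry`, `Register`, and `Len` are hypothetical names backed by an in-memory map, not the requester's real NATS-backed store or its API.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a hypothetical in-memory stand-in for the requester's node store.
type Registry struct {
	mu    sync.Mutex
	nodes map[string]string // node ID -> node type
}

// NewRegistry returns an empty registry, as a requester with no state would have.
func NewRegistry() *Registry {
	return &Registry{nodes: make(map[string]string)}
}

// Register upserts the node and reports whether it was previously unknown.
// Calling it repeatedly with the same arguments leaves the store unchanged,
// which is what makes registering on every start safe.
func (r *Registry) Register(id, nodeType string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	_, known := r.nodes[id]
	r.nodes[id] = nodeType
	return !known
}

// Len returns the number of registered nodes.
func (r *Registry) Len() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.nodes)
}

func main() {
	reg := NewRegistry()
	fmt.Println(reg.Register("compute-1", "compute")) // first start: newly added
	fmt.Println(reg.Register("compute-1", "compute")) // restart: already known, no duplicate

	// The requester loses its state; the next compute start repopulates it.
	reg = NewRegistry()
	fmt.Println(reg.Register("compute-1", "compute"))
	fmt.Println(reg.Len())
}
```

Keying on the node ID means a restart never creates a second entry, and an emptied store is refilled the next time each compute node starts.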
Bug Description
See title
Expected Behavior
It lists the nodes
Steps to Reproduce
Bacalhau Versions
v1.3.1-rc1
Host Environment
Provide details about the environment where the bug occurred: