Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/control/api: Improve node registration status clarity #5256

Merged
merged 1 commit into from May 3, 2023

Conversation

abukosek
Copy link
Contributor

@abukosek abukosek commented Apr 26, 2023

This PR adds three new fields to the node's control status output under the registration status section:

  • last_attempt_successful - true if the last registration attempt succeeded.
  • last_attempt_error_message - error message if the last registration attempt failed.
  • last_attempt - time of the last registration attempt.

Also, if the registration descriptor is expired, it is no longer shown in the output.

@abukosek abukosek force-pushed the andrej/feature/node-reg-status branch 3 times, most recently from 56faf51 to d243d97 Compare April 26, 2023 13:11
@codecov
Copy link

codecov bot commented Apr 26, 2023

Codecov Report

Merging #5256 (d243d97) into master (139111d) will increase coverage by 0.22%.
The diff coverage is 30.64%.

❗ Current head d243d97 differs from pull request most recent head b8a572a. Consider uploading reports for the commit b8a572a to get more accurate results

@@            Coverage Diff             @@
##           master    #5256      +/-   ##
==========================================
+ Coverage   66.87%   67.09%   +0.22%     
==========================================
  Files         515      516       +1     
  Lines       54757    54814      +57     
==========================================
+ Hits        36620    36780     +160     
+ Misses      13663    13574      -89     
+ Partials     4474     4460      -14     
Impacted Files Coverage Δ
go/worker/registration/worker.go 61.29% <20.45%> (-3.04%) ⬇️
go/runtime/host/sgx/metrics.go 42.85% <42.85%> (ø)
go/runtime/host/sgx/sgx.go 70.13% <100.00%> (+1.32%) ⬆️

... and 58 files with indirect coverage changes

@abukosek abukosek force-pushed the andrej/feature/node-reg-status branch 2 times, most recently from 0328a4f to 5bab05a Compare April 27, 2023 13:14
@abukosek abukosek marked this pull request as ready for review April 27, 2023 14:03
Copy link
Contributor

@peternose peternose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and works. Left some comments.

  "registration": {
    "successful": false,
    "error": "registry: no such entity",
    "last_registration": "0001-01-01T00:00:00Z"
  },

go/worker/registration/worker.go Show resolved Hide resolved
// Successful is true if the current registration has been successful.
Successful bool `json:"successful"`

// Error contains the error message if the current registration has not been successful.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Error contains the error message if the current registration has not been successful.
// Error contains the error message if the last registration has not been successful.

To be concise with last successful registration. Same for Successful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last registration can be successful, but the current one can fail -- this field only shows the status of the current one.

For example, the node successfully registers and sets Successful to true, LastRegistration contains the timestamp and Descriptor the descriptor. Then before the existing registration expires, the node can try to re-register and fail, in which case, its registration would still be valid (as well as the Descriptor), but Successful would be false.

Successful bool `json:"successful"`

// Error contains the error message if the current registration has not been successful.
Error string `json:"error,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Error string `json:"error,omitempty"`
ErrorMessage string `json:"error_message,omitempty"`

Maybe I would change this, as I expect that Error has error type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially had this as the error type, but apparently it can't be serialized properly, so this is the most straightforward way to do it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type error is an interface and will marshal to {}, so you cannot use it, unfortunately.

@@ -124,6 +124,12 @@ type IdentityStatus struct {

// RegistrationStatus is the node registration status.
type RegistrationStatus struct {
// Successful is true if the current registration has been successful.
Successful bool `json:"successful"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed fillings about this field as it duplicates data (e.g currently a bug could return true and error) and is ambiguous (looking only at the json without reading the comment).

I prefer having the following data: Registered (bool), LastRegistration (timestamp), LastAttempt (timestamp) and ErrorMessage (string) if last attempt failed. I think that when we delete the descriptor we say that we are not registered anymore, so the Registration field could be optional/removed or maybe not, as timestamp is kept.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm not sure... Registered in your case is basically the same as checking if Descriptor is not nil. Adding a timestamp for the last registration attempt is a good idea, I'll do that. And I need to add better comments for the existing fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this, I:

  • add the field LastAttempt (timestamp)
  • rename Successful to LastAttemptSuccessful
  • rename Error to LastAtemptError

This will make it more clear what the new fields actually mean. I think that having the bool and the error separate is still useful for quickly checking if the node has problems (especially when using automation) and once I change the status updating as you've suggested above, there's really no way that the bool and the error could be out of sync :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, have no better suggestions. Putting this under last_attempt with fields timestamp, successful and error_message would probably be an overkill.

@abukosek abukosek force-pushed the andrej/feature/node-reg-status branch from 5bab05a to 2922d29 Compare April 28, 2023 15:05
under the registration status section:

- `successful` - true if the registration succeeded.
- `error` - error message if the registration failed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed! :)

go/control/api/api.go Show resolved Hide resolved
go/worker/registration/worker.go Outdated Show resolved Hide resolved
go/worker/registration/worker.go Outdated Show resolved Hide resolved
go/worker/registration/worker.go Outdated Show resolved Hide resolved
@abukosek abukosek force-pushed the andrej/feature/node-reg-status branch 2 times, most recently from 2ea8ba4 to a8edce8 Compare May 3, 2023 07:38
Copy link
Contributor

@peternose peternose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

"registration": {
    "last_attempt_successful": false,
    "last_attempt_error_message": "registry: no such entity",
    "last_attempt": "2023-05-03T11:37:29+02:00",
    "last_registration": "2023-05-03T11:36:37+02:00",
}

addrs, err := w.gatherConsensusAddresses(sentryConsensusAddrs)
if err != nil {
return fmt.Errorf("error gathering consensus addresses: %w", err)
addrs, grr := w.gatherConsensusAddresses(sentryConsensusAddrs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
addrs, grr := w.gatherConsensusAddresses(sentryConsensusAddrs)
addrs, err := w.gatherConsensusAddresses(sentryConsensusAddrs)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linter complains about this, so I'll revert this change :)

@@ -941,28 +965,22 @@ func (w *Worker) registerNode(epoch beacon.EpochTime, hook RegisterNodeHook) err
nodeSigners = append([]signature.Signer{w.identity.NodeSigner}, nodeSigners...)
}

sigNode, err := node.MultiSignNode(nodeSigners, registry.RegisterNodeSignatureContext, &nodeDesc)
if err != nil {
sigNode, grr := node.MultiSignNode(nodeSigners, registry.RegisterNodeSignatureContext, &nodeDesc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sigNode, grr := node.MultiSignNode(nodeSigners, registry.RegisterNodeSignatureContext, &nodeDesc)
sigNode, err := node.MultiSignNode(nodeSigners, registry.RegisterNodeSignatureContext, &nodeDesc)

w.logger.Error("failed to register node",
"err", err,
)
return err
}

// Update the registration status on successful registration.
w.RLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this has been a bug for 3 years.

@abukosek abukosek force-pushed the andrej/feature/node-reg-status branch from a8edce8 to 97d1487 Compare May 3, 2023 10:46
@abukosek abukosek force-pushed the andrej/feature/node-reg-status branch from 97d1487 to b8a572a Compare May 3, 2023 10:50
@abukosek abukosek merged commit f642890 into master May 3, 2023
3 checks passed
@abukosek abukosek deleted the andrej/feature/node-reg-status branch May 3, 2023 12:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants