Last Finalized Snapshot Detection and Alert System #76

Closed
Seth-Schmidt opened this issue Apr 9, 2024 · 4 comments
@Seth-Schmidt

Is your feature request related to a problem?
There is no system set up to notify us when a master node has stopped processing snapshots or has gone down.

Describe the solution you'd like
The /health endpoint of the core-api should periodically query the protocol state contract for the current epoch and compare its epochId against the last finalized epoch of a sample of active projects, to determine how many epochs have passed since the last finalization. If any sampled project has not finalized within a reasonable window (20-30 epochs?), pooler should send a Slack alert reporting the failure to finalize.
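A minimal sketch of the comparison step, to make the proposal concrete. Everything named here is hypothetical and not taken from pooler: `find_stalled_projects`, `build_alert_text`, the `FinalizationLag` type, and the threshold of 25 epochs (picked from the 20-30 range floated above). It assumes the current epochId and each sampled project's last finalized epochId have already been read from the protocol state contract, and it leaves the contract queries and the Slack call out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumed threshold; the proposal only suggests "20-30 epochs?".
MAX_EPOCHS_BEHIND = 25


@dataclass
class FinalizationLag:
    """How far a single project trails the current epoch (hypothetical type)."""
    project_id: str
    last_finalized_epoch: int
    epochs_behind: int


def find_stalled_projects(
    current_epoch_id: int,
    last_finalized: Dict[str, int],
    max_epochs_behind: int = MAX_EPOCHS_BEHIND,
) -> List[FinalizationLag]:
    """Return the sampled projects whose last finalization is too old.

    `last_finalized` maps project ID -> last finalized epochId, as it would
    be read from the protocol state contract by the caller.
    """
    stalled = []
    for project_id, epoch_id in last_finalized.items():
        lag = current_epoch_id - epoch_id
        if lag > max_epochs_behind:
            stalled.append(FinalizationLag(project_id, epoch_id, lag))
    return stalled


def build_alert_text(stalled: List[FinalizationLag]) -> Optional[str]:
    """Format an alert message, or return None when nothing is stalled."""
    if not stalled:
        return None
    lines = ["Finalization alert: sampled projects are behind the current epoch"]
    for item in stalled:
        lines.append(
            f"- {item.project_id}: last finalized epoch "
            f"{item.last_finalized_epoch}, {item.epochs_behind} epochs behind"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only; these are not real project IDs or epochs.
    sample = {"project-a": 990, "project-b": 960}
    print(build_alert_text(find_stalled_projects(1000, sample)))
```

Keeping the comparison free of I/O means the /health handler can call it with whatever it fetched, and the threshold can be tuned without touching the contract or Slack code.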

Describe alternatives you've considered
It might be worth exploring an external monitoring service that tracks the status independently since finalization requires submissions from multiple master nodes. However, it would be possible to get additional information for specific nodes if the monitoring is done internally.

@Seth-Schmidt Seth-Schmidt added the enhancement New feature or request label Apr 9, 2024
@Seth-Schmidt Seth-Schmidt self-assigned this Apr 9, 2024
@anomit
Member

anomit commented Apr 22, 2024

I have reviewed the associated PR. Ideally it needs another review from one of the @PowerLoom/backend-engineering team members before merging. Moving the completion date to this week, Tuesday EOD.

@Seth-Schmidt
Author

Moving completion date to allow for final review of changes mentioned here: #77 (review)

@SwaroopH
Member

SwaroopH commented May 7, 2024

@Seth-Schmidt update this please ^

@Seth-Schmidt
Author

merged to nms_master: #77

closing
