Emit warnings for re-orgs (Bitcoin + Stacks) #519

Closed
lgalabru opened this issue Mar 8, 2024 · 2 comments · Fixed by #524

@lgalabru
Member

lgalabru commented Mar 8, 2024

I'd like Stacks re-orgs to be babysat for a few weeks, just to make sure that they are being handled correctly.
Stacks re-orgs are much more subtle than Bitcoin re-orgs. We have extensive test vectors in place, but special manual daily attention, just for a few weeks, would be great.

@MicaiahReid
Collaborator

Should be lumped in with the changes for #498

@MicaiahReid MicaiahReid self-assigned this Mar 10, 2024
@smcclellan smcclellan added this to the Production Reliability milestone Mar 12, 2024
MicaiahReid added a commit that referenced this issue Mar 27, 2024
This PR introduces a few fixes in an effort to improve reliability and
make problems easier to debug when running Chainhook as a service:
- Revisits log levels throughout the tool (fixes #498, fixes #521). The
general approach for the logs was:
   - `crit` - fatal errors that will crash a mission-critical component of
   Chainhook. In these cases, Chainhook should automatically kill all main
   threads (not individual scanning threads, which is tracked by #404) to
   crash the service.
   - `erro` - something went wrong that could lead to a critical error, or
   that could impact all users
   - `warn` - something went wrong that could impact an end user (usually
   due to user error)
   - `info` - control flow logging and updates on the state of _all_
   registered predicates
   - `debug` - updates on the state of _a_ predicate
- Crashes the service if a mission-critical thread fails (see
#517 (comment)
for a list of these threads). Previously, if one of these threads
failed, the remaining services would keep running. For example, if the
event observer handler crashed, the event observer API would keep
running. In that case, the Stacks node keeps successfully emitting blocks
that Chainhook acknowledges but does not ingest, which causes gaps in
our database. Fixes #517
- Removes an infinite loop with Bitcoin ingestion, crashing the service
instead. Fixes #506
- Fixes how we delete predicates from our db when one is deregistered.
This should reduce the number of logs we have on startup. Fixes #510
- Warns on all reorgs. Fixes #519
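
The "crash the service if a mission-critical thread fails" behavior described above can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not Chainhook's actual implementation: the names `Worker`, `WorkerFailure`, and `supervise` are hypothetical, and the real service may wire its threads differently.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::mpsc;
use std::thread;

/// Hypothetical type for a mission-critical task (not a Chainhook type).
pub type Worker = Box<dyn FnOnce() -> Result<(), String> + Send>;

/// Hypothetical record of which critical worker failed and why.
#[derive(Debug, PartialEq)]
pub struct WorkerFailure {
    pub worker: &'static str,
    pub reason: String,
}

/// Runs every worker on its own thread and returns as soon as any one of
/// them fails, so the caller can shut the whole service down instead of
/// leaving the surviving threads running.
pub fn supervise(workers: Vec<(&'static str, Worker)>) -> Result<(), WorkerFailure> {
    let (tx, rx) = mpsc::channel();

    for (name, work) in workers {
        let tx = tx.clone();
        thread::spawn(move || {
            // Treat a panic like any other failure so it is still reported.
            let outcome = catch_unwind(AssertUnwindSafe(work))
                .unwrap_or_else(|_| Err("worker panicked".to_string()));
            // The receiver may already be gone if another worker failed first.
            let _ = tx.send((name, outcome));
        });
    }
    // Drop the original sender so the loop below ends once all workers finish.
    drop(tx);

    for (name, outcome) in rx {
        if let Err(reason) = outcome {
            return Err(WorkerFailure { worker: name, reason });
        }
    }
    Ok(())
}
```

A caller would typically log the returned `WorkerFailure` at the `crit` level and then exit the process (for example with `std::process::exit(1)`), so that a service manager can restart it rather than letting an API keep acknowledging blocks that are never ingested.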
github-actions bot pushed a commit that referenced this issue Mar 27, 2024
## [1.4.0](v1.3.1...v1.4.0) (2024-03-27)

### Features

* detect http / rpc errors as early as possible ([ad78669](ad78669))
* use stacks.rocksdb for predicate scan ([#514](#514)) ([a4f1663](a4f1663)), closes [#513](#513) [#485](#485)

### Bug Fixes

* enable debug logs in release mode ([#537](#537)) ([fb49e28](fb49e28))
* improve error handling, and more! ([#524](#524)) ([86b5c78](86b5c78)), closes [#498](#498) [#521](#521) [#404](#404) [#517 (comment)](https://github.com/hirosystems/chainhook/issues/517#issuecomment-1992135101) [#517](#517) [#506](#506) [#510](#510) [#519](#519)
* log errors on block download failure; implement max retries ([#503](#503)) ([0fc38cb](0fc38cb))
* **metrics:** update latest ingested block on reorg ([#515](#515)) ([8f728f7](8f728f7))
* order and filter blocks used to seed forking block pool ([#534](#534)) ([a11bc1c](a11bc1c))
* seed forking handler with unconfirmed blocks to improve startup stability ([#505](#505)) ([485394e](485394e)), closes [#487](#487)
* skip db consolidation if no new dataset was downloaded ([#513](#513)) ([983a165](983a165))
* update scan status for non-triggering predicates ([#511](#511)) ([9073f42](9073f42)), closes [#498](#498)

🎉 This issue has been resolved in version 1.4.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
