-
Notifications
You must be signed in to change notification settings - Fork 861
State Store Metrics PebbleDB #2463
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2463 +/- ##
=======================================
Coverage ? 41.14%
=======================================
Files ? 1602
Lines ? 135020
Branches ? 0
=======================================
Hits ? 55560
Misses ? 74373
Partials ? 5087
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
| } | ||
|
|
||
| func (b *RawBatch) Write() (err error) { | ||
| startTime := time.Now() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be neater to pass this as argument to defer function. It gives it a tight scope at compile time.
sei-db/common/metrics/metrics.go
Outdated
| // PebbleDB State Store Metrics | ||
| PebbleDBMetrics = struct { | ||
| // Operation Latencies | ||
| GetLatency metric.Int64Histogram |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding the get latency might actually add too much overhead if each get would need to update the metric, I'd recommend let's remove this for perf concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed this and we will leave in for now until we leave in for now and observe perf impact later. So far on detached rpc node have not noticed perf impacts
masih
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let me know if I can help expand on any of the comments. Thank you for adding all these metrics @Kbhat1 - they are a great investment in understanding the KV store performance 🍻
| CommitLatency metric.Int64Histogram | ||
| ApplyChangesetLatency metric.Int64Histogram | ||
| CommitLatency metric.Float64Histogram | ||
| ApplyChangesetLatency metric.Float64Histogram |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beyond the scope this PR, but all the fields of this struct also shouldn't be exported? Un-exporting the struct would limit the future scope creep and we can revisit this in a separate refactor PR if that's easier.
|
@masih take a look again when you can |
| time.Since(startTime).Seconds(), | ||
| metric.WithAttributes( | ||
| attribute.Bool("success", _err == nil), | ||
| attribute.String("store", storeKey), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fyi @yzang2019 added attribute for storeKey for both Get and iteration
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
| } | ||
|
|
||
| func (b *Batch) Write() (err error) { | ||
| startTime := time.Now() |
Check warning
Code scanning / CodeQL
Calling the system time Warning
| } | ||
|
|
||
| func (b *RawBatch) Write() (err error) { | ||
| startTime := time.Now() |
Check warning
Code scanning / CodeQL
Calling the system time Warning
|
|
||
| func (db *Database) Get(storeKey string, targetVersion int64, key []byte) ([]byte, error) { | ||
| func (db *Database) Get(storeKey string, targetVersion int64, key []byte) (_ []byte, _err error) { | ||
| startTime := time.Now() |
Check warning
Code scanning / CodeQL
Calling the system time Warning
| // ApplyChangesetSync apply all changesets for a single version in blocking way | ||
| func (db *Database) ApplyChangesetSync(version int64, changeset []*proto.NamedChangeSet) error { | ||
| func (db *Database) ApplyChangesetSync(version int64, changeset []*proto.NamedChangeSet) (_err error) { | ||
| startTime := time.Now() |
Check warning
Code scanning / CodeQL
Calling the system time Warning
|
|
||
| func (db *Database) ApplyChangesetAsync(version int64, changesets []*proto.NamedChangeSet) error { | ||
| func (db *Database) ApplyChangesetAsync(version int64, changesets []*proto.NamedChangeSet) (_err error) { | ||
| startTime := time.Now() |
Check warning
Code scanning / CodeQL
Calling the system time Warning
|
|
||
| func (db *Database) computeHashForRange(beginBlock, endBlock int64) error { | ||
| func (db *Database) computeHashForRange(beginBlock, endBlock int64) (_err error) { | ||
| startTime := time.Now() |
Check warning
Code scanning / CodeQL
Calling the system time Warning
| // This is not a large issue given the next time that module is updated, it will be properly pruned thereafter. | ||
| func (db *Database) Prune(version int64) error { | ||
| func (db *Database) Prune(version int64) (_err error) { | ||
| startTime := time.Now() |
Check warning
Code scanning / CodeQL
Calling the system time Warning
masih
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One blocker on background metric collection context cancel function, and left some questions about cardinality just to double check they are not explosively large as they increase the number of metrics exponentially.
Thank you so much for addressing all the comments @Kbhat1, cannot wait to dig through the data being collected by the metrics 🚀
| // Start background metrics collection | ||
| metricsCtx, metricsCancel := context.WithCancel(context.Background()) | ||
| database.metricsCancel = metricsCancel | ||
| go database.collectMetricsInBackground(metricsCtx) |
Check notice
Code scanning / CodeQL
Spawning a Go routine Note
Describe your changes and provide context
Testing performed to validate your change