Instrument Storage Layer #9958
@@ -14,8 +16,10 @@ type Renewer struct {
}

func NewRenewer(ctx context.Context, tr track.Tracker, name string, ttl time.Duration) *Renewer {
	ctx = pctx.Child(ctx, "trackerRenewer", pctx.WithCounter("renewals", 0))
You only need a `WithCounter` in the context if you want to do aggregation. If you're OK logging at every call to `Inc`, then you don't need to bother with this. (With no options, the value will be written to the logs at most every 10 seconds.)
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff            @@
##           master    #9958    +/-   ##
=======================================
  Coverage   58.21%   58.22%
=======================================
  Files         614      614
  Lines       75461    75527    +66
=======================================
+ Hits        43931    43976    +45
- Misses      30978    30998    +20
- Partials      552      553     +1
lgtm
This is a first pass at instrumenting the storage layer. There are a few high-level changes you'll generally see throughout this PR:

1. Child contexts have been passed throughout. Particular areas of focus are distributed systems like the task service and low-level constructs like file set, index, and chunk readers & writers.
2. Metrics have been added. The metrics vary based on what is relevant, but generally we're reporting bytes read/written as well as indicators like cache misses vs. cache hits, indices skipped vs. indices read, and so forth.

I'm sure I've missed areas that we want to instrument, but that's fine. We can add those in a follow-up PR.

Metrics are visible when running at a 'debug' level.

Here are some snippets of metrics:

```json
{
  "severity": "debug",
  "time": "2024-04-18T19:30:24.935036734Z",
  "logger": "grpc.pfs_v2.API/ModifyFile.fileSetWriter.indexWriter(deletive-index-writer)",
  "caller": "index/writer.go:85",
  "message": "meter: bytes",
  "service": "pfs_v2.API",
  "method": "ModifyFile",
  "command": ["pachctl put file images@master -f liberty.jpg"],
  "peer": "127.0.0.1:48528",
  "type": "int",
  "delta": 35,
  "meter": "bytes"
}
{
  "severity": "debug",
  "time": "2024-04-18T19:30:24.935196695Z",
  "logger": "grpc.pfs_v2.API/ModifyFile.fileSetWriter.batcher(chunk-batcher).taskChain.upload",
  "caller": "chunk/uploader.go:186",
  "message": "meter: tx_bytes",
  "service": "pfs_v2.API",
  "method": "ModifyFile",
  "command": ["pachctl put file images@master -f liberty.jpg"],
  "peer": "127.0.0.1:48528",
  "type": "int",
  "delta": 133858,
  "meter": "tx_bytes"
}
{
  "severity": "debug",
  "time": "2024-04-18T19:30:24.954077375Z",
  "logger": "grpc.pfs_v2.API/ModifyFile.fileSetWriter.indexWriter(additive-batched-index-writer)",
  "caller": "index/writer.go:85",
  "message": "meter: bytes",
  "service": "pfs_v2.API",
  "method": "ModifyFile",
  "command": ["pachctl put file images@master -f liberty.jpg"],
  "peer": "127.0.0.1:48528",
  "type": "int",
  "delta": 155,
  "meter": "bytes"
}
```