-
Notifications
You must be signed in to change notification settings - Fork 322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Telemetry: Record DB creation time #1210
Conversation
e58a8f7
to
4a41610
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks mostly sensible, but I wonder whether the term "database creation time" could be misleading, as it's not clear to me whether it includes tracing or not. "Extraction time" is another option, however this also seems unclear. If it's sufficient to just look at scanned languages, we could go for something like "scanned_languages_database_creation_time"?
Hmm, indeed this is why opened this for review before doing the backend changes so we can decide on precisely what we want to measure and how to call it. As it's set up at the moment, this records (extraction of scanned languages + TRAP import time for all languages). I'd be keen to not add Edit: Hmm, part of that isn't quite true. We actually don't call finalize on the whole cluster, so we could separate out the TRAP import of individual languages if we wanted to. I still think it's a bad idea for the same reasons that separating out individual extractions of scanned languages is a bad idea. |
4a41610
to
37c2cbb
Compare
@henrymercer How about the slightly finer-grained version I've just pushed that splits up extraction of scanned languages from TRAP import? That's as fine-grained as we can go without adding per-language fields. |
That looks reasonable to me. We already need to add DB columns whenever we add a language, so we wouldn't be introducing additional friction, however I can appreciate the desire to avoid adding too much to what is already a long list of columns. |
37c2cbb
to
9defedb
Compare
9defedb
to
8b45ef3
Compare
Now that we've established that TRAP caching seems to be working as expected, I'd like to move on to measuring what performance gains it's giving us. In order to do that, we actually need some telemetry covering the extraction time, which we do not have at the moment. For scanned languages, this happens as part of the DB creation (alongside finalization). We could try and record a more granular time for just the extraction time of individual languages, and separately recording the time for TRAP import. However, I'm hesitant to add more language-specific fields to our telemetry so I think for now just recording the total DB creation time will be enough and is certainly an improvement on what we have now (nothing).
Merge / deployment checklist