Telemetry: Record DB creation time #1210

edoardopirovano · 2022-08-24T11:05:00Z

Now that we've established that TRAP caching seems to be working as expected, I'd like to move on to measuring what performance gains it's giving us. In order to do that, we actually need some telemetry covering the extraction time, which we do not have at the moment. For scanned languages, this happens as part of the DB creation (alongside finalization). We could try and record a more granular time for just the extraction time of individual languages, and separately recording the time for TRAP import. However, I'm hesitant to add more language-specific fields to our telemetry so I think for now just recording the total DB creation time will be enough and is certainly an improvement on what we have now (nothing).

Merge / deployment checklist

Confirm this change is backwards compatible with existing workflows.
Confirm the readme has been updated if necessary.
Confirm the changelog has been updated if necessary.

henrymercer

This looks mostly sensible, but I wonder whether the term "database creation time" could be misleading, as it's not clear to me whether it includes tracing or not. "Extraction time" is another option, however this also seems unclear. If it's sufficient to just look at scanned languages, we could go for something like "scanned_languages_database_creation_time"?

edoardopirovano · 2022-08-24T12:45:35Z

This looks mostly sensible, but I wonder whether the term "database creation time" could be misleading, as it's not clear to me whether it includes tracing or not. "Extraction time" is another option, however this also seems unclear. If it's sufficient to just look at scanned languages, we could go for something like "scanned_languages_database_creation_time"?

Hmm, indeed this is why opened this for review before doing the backend changes so we can decide on precisely what we want to measure and how to call it. As it's set up at the moment, this records (extraction of scanned languages + TRAP import time for all languages). I'd be keen to not add extaction_time_$lang fields for each scanned language since then we have to add DB columns whenever we add a language. We can't easily separate out the TRAP import times by language either because it's all done by one CLI command on the cluster. We could separate out scanned_languages_extraction_time from trap_import_time if we wanted to, by plumbing this down a bit deeper.

Edit: Hmm, part of that isn't quite true. We actually don't call finalize on the whole cluster, so we could separate out the TRAP import of individual languages if we wanted to. I still think it's a bad idea for the same reasons that separating out individual extractions of scanned languages is a bad idea.

edoardopirovano · 2022-08-24T13:00:20Z

@henrymercer How about the slightly finer-grained version I've just pushed that splits up extraction of scanned languages from TRAP import? That's as fine-grained as we can go without adding per-language fields.

src/analyze.ts

henrymercer · 2022-08-24T13:22:25Z

That looks reasonable to me. We already need to add DB columns whenever we add a language, so we wouldn't be introducing additional friction, however I can appreciate the desire to avoid adding too much to what is already a long list of columns.

src/analyze.ts

edoardopirovano requested a review from a team as a code owner August 24, 2022 11:05

edoardopirovano force-pushed the edoardo/record-db-creation-time branch from e58a8f7 to 4a41610 Compare August 24, 2022 12:35

henrymercer reviewed Aug 24, 2022

View reviewed changes

edoardopirovano force-pushed the edoardo/record-db-creation-time branch from 4a41610 to 37c2cbb Compare August 24, 2022 12:59

github-advanced-security bot found potential problems Aug 24, 2022

View reviewed changes

src/analyze.ts Fixed Show resolved Hide resolved

edoardopirovano force-pushed the edoardo/record-db-creation-time branch from 37c2cbb to 9defedb Compare August 24, 2022 13:23

henrymercer approved these changes Aug 24, 2022

View reviewed changes

src/analyze.ts Show resolved Hide resolved

src/analyze.ts Outdated Show resolved Hide resolved

Telemetry: Record DB creation time

8b45ef3

edoardopirovano force-pushed the edoardo/record-db-creation-time branch from 9defedb to 8b45ef3 Compare August 24, 2022 13:31

edoardopirovano enabled auto-merge August 24, 2022 13:35

edoardopirovano merged commit 92c650b into main Aug 24, 2022

edoardopirovano deleted the edoardo/record-db-creation-time branch August 24, 2022 14:14

github-actions bot mentioned this pull request Aug 25, 2022

Merge main into releases/v2 #1212

Merged

6 tasks

henrymercer mentioned this pull request Aug 25, 2022

Merge releases/v2 into releases/v1 #1214

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Telemetry: Record DB creation time #1210

Telemetry: Record DB creation time #1210

edoardopirovano commented Aug 24, 2022

henrymercer left a comment •

edited

Loading

edoardopirovano commented Aug 24, 2022 •

edited

Loading

edoardopirovano commented Aug 24, 2022

henrymercer commented Aug 24, 2022

Telemetry: Record DB creation time #1210

Telemetry: Record DB creation time #1210

Conversation

edoardopirovano commented Aug 24, 2022

Merge / deployment checklist

henrymercer left a comment • edited Loading

Choose a reason for hiding this comment

edoardopirovano commented Aug 24, 2022 • edited Loading

edoardopirovano commented Aug 24, 2022

henrymercer commented Aug 24, 2022

henrymercer left a comment •

edited

Loading

edoardopirovano commented Aug 24, 2022 •

edited

Loading