This repository has been archived by the owner on Aug 13, 2019. It is now read-only.
Warning! This might be a matter of micro-optimization.
When you run MIN_AGE_LAST_MODIFIED_HOURS=24 INVENTORIES=firefox latest-inventory-to-kinto, it loops over almost 900,000 records from multiple CSV files. That's just for inventory=firefox. Each record is extracted and analyzed, and we do a datetime comparison between MIN_AGE_LAST_MODIFIED_HOURS and entry['LastModifiedDate'].
The problem is that we're doing this check quite late. If you run the cron job with a short window like MIN_AGE_LAST_MODIFIED_HOURS=24, you're likely to skip about 99% of the entries. So a lot of work goes into converting each entry (from the CSV reader) to a record (a dict), and 99% of the time that work is done in vain.
I measured the total time spent on everything we could skip for that 99%, and it amounts to ~40 seconds(*). That's not much, but now that we have the benefit of hindsight, it would be nice to do this piece of code more correctly.
(*) That's for just 'firefox' inventory. And in that experiment I used my macOS host and not Docker. In Docker it's likely to be much more.
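A rough sketch of what "checking earlier" could look like, not the project's actual code. It assumes each entry is a dict from csv.DictReader and that LastModifiedDate is a UTC timestamp like 2018-03-01T06:00:00.000Z. The function name iter_recent_records and the to_record parameter are hypothetical stand-ins for the existing loop and the entry-to-record conversion.

```python
import csv
import datetime


def iter_recent_records(csv_file, min_age_hours, now=None, to_record=dict):
    """Yield records only for entries modified within the last min_age_hours.

    Hypothetical helper: `to_record` stands in for the existing (more
    expensive) entry-to-record conversion.
    """
    now = now or datetime.datetime.now(datetime.timezone.utc)
    cutoff = now - datetime.timedelta(hours=min_age_hours)
    for entry in csv.DictReader(csv_file):
        # Do the cheap date check first, before any other work on the entry.
        # The timestamp format here is an assumption about the CSV contents.
        last_modified = datetime.datetime.strptime(
            entry["LastModifiedDate"], "%Y-%m-%dT%H:%M:%S.%fZ"
        ).replace(tzinfo=datetime.timezone.utc)
        if last_modified < cutoff:
            # Too old: skip without ever building the record.
            continue
        yield to_record(entry)
```

With this ordering, the conversion only runs for the ~1% of entries that pass the age check, so the skipped entries cost one date parse and one comparison each.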