Kill bin/update-downloads, update the counters directly at the endpoint
#660
Note: This PR is purposely broken into two commits, and they should be deployed separately. The first commit stops adding data for the update-downloads script to touch; the second commit removes the script entirely. The second commit shouldn't be deployed until `SELECT COUNT(*) FROM version_downloads WHERE downloads != counted` returns 0 (this should happen pretty much immediately after deploying the first commit).

The current process of updating the download counters does some fairly unnecessary punting to a background job. It's possible that this was a premature performance optimization, but based on the comment which was there before, I suspect that it was due to a misunderstanding of how locks work in PG and done this way in an attempt to make it less "racy".
The comment mentioned that the update was non-atomic and racy. Neither of those statements is true; avoiding exactly this problem is why we use an RDBMS in the first place. Each transaction locks the rows it updates until it commits or rolls back, and any other transaction that tries to update the same rows blocks until then.
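To illustrate the point about in-place increments, here is a minimal sketch using Python's built-in `sqlite3` as a stand-in for PG. This is an assumption made only so the example runs without a server: SQLite locks the whole database rather than individual rows, and the `version_id` column and the `increment_concurrently` helper are hypothetical. The behaviour being shown is the same, though: concurrent `downloads = downloads + 1` updates queue behind each other and none are lost.

```python
import os
import sqlite3
import tempfile
import threading


def increment_concurrently(workers=4, hits_per_worker=25):
    """Run concurrent in-place increments and return the final counter value."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "downloads.db")

        # Hypothetical, simplified schema: one counter row for one version.
        setup = sqlite3.connect(db_path)
        setup.execute(
            "CREATE TABLE version_downloads ("
            "version_id INTEGER PRIMARY KEY, downloads INTEGER NOT NULL)"
        )
        setup.execute("INSERT INTO version_downloads VALUES (1, 0)")
        setup.commit()
        setup.close()

        def worker():
            # isolation_level=None lets us manage transactions explicitly.
            conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
            for _ in range(hits_per_worker):
                # Take the write lock up front; other writers wait here.
                conn.execute("BEGIN IMMEDIATE")
                conn.execute(
                    "UPDATE version_downloads "
                    "SET downloads = downloads + 1 WHERE version_id = 1"
                )
                conn.execute("COMMIT")
            conn.close()

        threads = [threading.Thread(target=worker) for _ in range(workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        check = sqlite3.connect(db_path)
        (total,) = check.execute(
            "SELECT downloads FROM version_downloads WHERE version_id = 1"
        ).fetchone()
        check.close()
        return total
```

Calling `increment_concurrently(workers=4, hits_per_worker=25)` should return exactly `workers * hits_per_worker`, since each increment is computed by the database while the lock is held rather than read into the application and written back.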
With this change, all hits to `/download` will block each other on the final query, when they try to update the `metadata` table. However, as this is the final query before the transaction is committed, this shouldn't be a bottleneck until we reach millions of downloads per minute (at which point a periodic background worker which does `SELECT sum(downloads) FROM crates` is probably simpler).
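The shape of the endpoint's transaction, and of the periodic alternative, might look roughly like the sketch below. It is not the code in this PR: it uses Python's `sqlite3` with an in-memory database so it can run standalone, and the column names (`crate_id`, `total_downloads`) and the helpers `make_db`, `record_download`, and `recompute_total` are assumptions for illustration. The detail that matters is the ordering: the single shared `metadata` row is updated last, so the contended lock is held only until the commit that immediately follows.

```python
import sqlite3

# Hypothetical, simplified schema; only the table names come from the PR.
SCHEMA = """
CREATE TABLE crates (
    id INTEGER PRIMARY KEY,
    downloads INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE version_downloads (
    version_id INTEGER PRIMARY KEY,
    crate_id INTEGER NOT NULL,
    downloads INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE metadata (total_downloads INTEGER NOT NULL);
"""


def make_db():
    """Create an in-memory database with one crate and one version."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO crates (id) VALUES (1)")
    conn.execute(
        "INSERT INTO version_downloads (version_id, crate_id) VALUES (10, 1)"
    )
    conn.execute("INSERT INTO metadata VALUES (0)")
    conn.commit()
    return conn


def record_download(conn, version_id):
    """Bump every counter for one download inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT crate_id FROM version_downloads WHERE version_id = ?",
            (version_id,),
        ).fetchone()
        if row is None:
            raise LookupError(f"unknown version {version_id}")
        conn.execute(
            "UPDATE version_downloads SET downloads = downloads + 1 "
            "WHERE version_id = ?",
            (version_id,),
        )
        conn.execute(
            "UPDATE crates SET downloads = downloads + 1 WHERE id = ?",
            (row[0],),
        )
        # The shared row goes last so its lock is held as briefly as possible.
        conn.execute("UPDATE metadata SET total_downloads = total_downloads + 1")


def recompute_total(conn):
    """Periodic alternative: derive the global total from per-crate counters."""
    (total,) = conn.execute(
        "SELECT COALESCE(sum(downloads), 0) FROM crates"
    ).fetchone()
    return total
```

If the global counter ever did become a bottleneck, `record_download` could drop its final `UPDATE` and a scheduled job could store the result of `recompute_total` instead, at the cost of the total being slightly stale between runs.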