Fix our download count infrastructure. #1089

Closed
arthurnn opened this Issue Oct 18, 2015 · 5 comments

Member

arthurnn commented Oct 18, 2015

Problem

We need to save download counts for gems in a fast and reliable way.

Our current setup

  1. A request to https://rubygems.org/downloads/rails-4.2.4.gem hits our nginx.
  2. nginx calls stat-update to increment the download count for that gem version.
  3. nginx sends back a redirect header with the CDN address that hosts the .gem file.

Issues with the current setup

stat-update saves download counts in Redis, which makes it harder for us to combine that data with the data in PostgreSQL on the Rails side of things.

  • For example: https://github.com/rubygems/rubygems.org/blob/master/app/controllers/stats_controller.rb#L6 We sort the top 10 gems using the downloads column from PostgreSQL, then fetch those 10 values from Redis and sort the list again. However, those first 10 gems are not necessarily the most downloaded. For instance, bundler should be on that list, but it is not, because the download count used for that first sort is out of date.
  • The download code is hard to understand, as we update those values from our C nginx module and don't do it from Ruby.
  • We cannot have better stats, as this data is split across two data sources.
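
To make the first issue concrete, here is a minimal sketch of the ordering problem. The numbers and gem set are made up for illustration, and the two hashes are stand-ins for the PostgreSQL downloads column and the Redis counters; this is not the code in stats_controller.rb.

```ruby
# Hypothetical numbers, only to illustrate the ordering problem.
# PG_DOWNLOADS stands in for the (stale) downloads column in PostgreSQL,
# REDIS_DOWNLOADS for the live counters kept by stat-update.
PG_DOWNLOADS = {
  "rake"    => 900,
  "rails"   => 800,
  "rack"    => 700,
  "bundler" => 100    # stale value
}.freeze

REDIS_DOWNLOADS = {
  "rake"    => 950,
  "rails"   => 850,
  "rack"    => 720,
  "bundler" => 5_000  # current value
}.freeze

# Two-step approach: pick candidates by the stale column, then re-sort
# only those candidates by the live counter.
def top_two_step(stale, live, n)
  candidates = stale.sort_by { |_name, count| -count }.first(n).map(&:first)
  candidates.sort_by { |name| -live.fetch(name) }
end

# Ranking by the live counter directly, for comparison.
def top_by_live(live, n)
  live.sort_by { |_name, count| -count }.first(n).map(&:first)
end
```

With these numbers, `top_two_step(PG_DOWNLOADS, REDIS_DOWNLOADS, 3)` never considers bundler, because it is filtered out before the live counts are consulted, while `top_by_live(REDIS_DOWNLOADS, 3)` ranks it first.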

My proposed fix

Use Redis as a way to offload the counts, not as the data store for the counts.

How

  • Make stat-update LPUSH the gem information (e.g. rails-4.2.4) to a Redis queue.
  • LPOP the gem info in a Ruby process and update the PostgreSQL count.
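
The two steps above could be sketched roughly as follows. This is an illustration under assumptions, not the planned implementation: `FakeList` is an in-memory stand-in for Redis, the key name `downloads:queue` is hypothetical, and a Hash stands in for the PostgreSQL counter. Note that LPUSH paired with LPOP pops the newest entry first; for counting, the order does not matter.

```ruby
# In-memory stand-in for the two Redis list commands used above.
# A real client from the redis gem responds to the same method names.
class FakeList
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  # LPUSH: insert at the head of the list, return the new length.
  def lpush(key, value)
    @lists[key].unshift(value)
    @lists[key].length
  end

  # LPOP: remove and return the head of the list, or nil when empty.
  def lpop(key)
    @lists[key].shift
  end
end

QUEUE_KEY = "downloads:queue" # hypothetical key name

# What stat-update would do for each download (sketched in Ruby here).
def record_download(queue, full_name)
  queue.lpush(QUEUE_KEY, full_name)
end

# Worker side: pop entries until the queue is empty and bump a counter
# for each one. `counts` is a Hash standing in for the PostgreSQL column.
def drain(queue, counts)
  while (full_name = queue.lpop(QUEUE_KEY))
    counts[full_name] += 1
  end
  counts
end
```

A quick usage example: after `record_download(queue, "rails-4.2.4")` is called twice on a fresh `FakeList`, `drain(queue, Hash.new(0))` returns a Hash with a count of 2 for `"rails-4.2.4"` and leaves the queue empty.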

Migration steps

  1. Finish the Ruby side to consume the queue and update the counter in PostgreSQL.
  2. Change the stat-update code to do the LPUSH, and remove the INCR code.
  3. Update all Postgres counters using the legacy Redis counters.
  4. Enable the Ruby side to consume the new counts coming in.

With this approach, we won't overload PostgreSQL with updates for download counts, which is not a fast operation for the DB, and we won't make nginx calls slower by waiting on DB I/O. We can also scale the Ruby-side workers if the queue is falling behind.
After that, Rails can trust only the counter from PostgreSQL and won't need to read Redis, which will give us a more reliable site.

review @dwradcliffe @evanphx @qrush @indirect
(I am working on it; I just want a review of the approach before I start rolling things out.)

Member

arthurnn commented Oct 18, 2015

Just for the record, @dwradcliffe and I discussed the option of using the link-redis PostgreSQL feature. But this is off the table, as it would add more setup, and it is also a new feature that, as far as I can tell, is not well tested at scale.

arthurnn self-assigned this Oct 18, 2015

Member

dwradcliffe commented Oct 18, 2015

Doesn't this mean that every download will make a Postgres update?

Member

arthurnn commented Oct 18, 2015

Doesn't this mean that every download will make a Postgres update?

Yes. I am sure Postgres is capable of handling those updates, especially because they will be out-of-band, so we can even throttle them if we need to. The number of workers will dictate the update rate on our DB.
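
One possible way to throttle, which is an assumption and not something proposed in this thread: have each worker pop a batch of entries and coalesce them, so that many downloads of the same gem version turn into a single increment. The table and column names below (`versions`, `downloads_count`, `full_name`) are hypothetical.

```ruby
# Collapse a batch of queued download events into one count per gem
# version, so N downloads of the same version cost one UPDATE, not N.
def coalesce(batch)
  batch.each_with_object(Hash.new(0)) { |full_name, counts| counts[full_name] += 1 }
end

# Build one parameterized UPDATE per distinct gem version.
# Table and column names are hypothetical.
UPDATE_SQL = "UPDATE versions SET downloads_count = downloads_count + $1 " \
             "WHERE full_name = $2"

def update_statements(batch)
  coalesce(batch).map { |full_name, count| [UPDATE_SQL, [count, full_name]] }
end
```

For a batch of three entries where two are `rails-4.2.4`, this yields two statements instead of three, so larger batches trade counter freshness for fewer writes.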

Member

arthurnn commented Oct 18, 2015

Another option I was considering, to make the Ruby side even easier, would be to use something like Sidekiq for the LPOP part.
So the idea would be that the C side still uses LPUSH, but pushes a Sidekiq-compatible job message, and we would use Sidekiq to pop from that queue. That way we could use code (Sidekiq) that is already being used in production by lots of other companies.
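
For reference, a Sidekiq-compatible message is a JSON document. The sketch below builds one in Ruby to show the shape the C side would have to produce. The job class name `DownloadCountJob` is hypothetical, and the exact field set and timestamp format depend on the Sidekiq version, so treat this as an approximation to check against the version in use.

```ruby
require "json"
require "securerandom"

# Build a JSON payload in the shape Sidekiq reads from a queue list.
# "DownloadCountJob" is a hypothetical job class name.
def sidekiq_payload(full_name, now: Time.now.to_f)
  payload = {
    "class"       => "DownloadCountJob",
    "args"        => [full_name],
    "jid"         => SecureRandom.hex(12), # 24-character job id
    "queue"       => "default",
    "retry"       => true,
    "created_at"  => now,
    "enqueued_at" => now
  }
  JSON.generate(payload)
end
```

As far as I know, the Sidekiq client LPUSHes such payloads onto the list `queue:default` and also adds the queue name to the `queues` set, so a C producer would need to do both for workers to pick the jobs up.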

Member

arthurnn commented Oct 18, 2015

Closing this, as it seems like we are moving away from nginx and stat-update altogether.

arthurnn closed this Oct 18, 2015
