Fix stats pipeline #35

Closed
dwradcliffe opened this Issue Nov 12, 2015 · 28 comments

@dwradcliffe
Member

dwradcliffe commented Nov 12, 2015

Summary:

We need to revamp our stats pipeline to accommodate serving all requests from Fastly, which means some requests will never hit nginx.

Currently:

We currently have a slightly complex system for collecting gem download stats. In nginx, for every gem download we fire off an internal subrequest to a local server running stat-update. This small C program parses the request and saves a variety of stats to redis. The application then queries redis for those stats.

Why we're making a change:

Two reasons. First, we are moving to use Fastly for everything, which means gem downloads (among other things) can be served from the Fastly cache in one request and never hit our servers or nginx. Second, the application has legacy code that looks at the database for stats as well as redis, and some pages show invalid data because that legacy code is still used. We want to clean this up.

Inputs

Our incoming data will be in text files on AWS S3. Every x minutes we will flush the access logs from Fastly to S3. There will be multiple files for each timespan.

Outputs

This is the first thing we need to decide: what stats do we want to save? We currently save a lot of data, but we've never used most of it. The obvious things we want are:

  • Total gem downloads per gem (all time)
  • Total gem downloads per gem version (all time)

We can probably stop collecting:

  • Gem downloads per gem per day
  • Gem downloads per gem version per day

Process

Final plan TBD once we decide the outputs. I'm leaning toward a small async background process that pulls files out of S3, processes them, and saves the final stats to ___. This process could be stopped at any time and would catch up when started again. From the application's perspective, having the stats right in the PostgreSQL database would be the simplest final data store.
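
A rough sketch of that kind of catch-up loop, assuming the aws-sdk S3 client (the bucket name, the processed?/mark_processed bookkeeping, and process_log are placeholders, since the final data store is still open):

require 'aws-sdk'

s3 = Aws::S3::Client.new(region: 'us-east-1') # assumed region

# Walk the log objects Fastly has flushed to the bucket (first page only;
# pagination omitted). Anything already handled is skipped, so the process
# can be stopped at any time and will catch up on the backlog when restarted.
s3.list_objects(bucket: 'fastly-logs', prefix: 'rubygems/').contents.each do |object|
  next if processed?(object.key)        # hypothetical bookkeeping lookup

  log = s3.get_object(bucket: 'fastly-logs', key: object.key).body.read
  process_log(log)                      # hypothetical: aggregate and save the stats
  mark_processed(object.key)            # hypothetical bookkeeping write
end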

ping: @evanphx @indirect @arthurnn @qrush @segiddins for comments

@dwradcliffe dwradcliffe referenced this issue Nov 12, 2015

Closed

Fastly #30

16 of 16 tasks complete
@arthurnn

Member

arthurnn commented Nov 12, 2015

For the output, I can think of two solutions.

  1. We only save what we need, straight to postgres.
  2. We create a stats data pipeline: something that pushes all the stats logs to Kafka, and we write consumers on top of it. One consumer would be a process that computes the download counts and updates postgres.

Regardless of the choice above, I think the outcome should be having the final counts in PostgreSQL so Rails doesn't need to do much gymnastics to aggregate that data.
The first solution is way simpler than the second, but the second would give us more flexibility in what we want to do with our stats.

@jnunemaker

jnunemaker commented Nov 13, 2015

@arthurnn pinged me offline and this sounded interesting so I thought I would drop some ideas.

Our incoming data will be in text files on AWS S3. Every x minutes we will flush the access logs from Fastly to S3. There will be multiple files for each timespan.

S3 can push new-file info into SQS. That helps avoid polling for new files. It would also make it easy to store a record of those files in postgres, avoiding the slowness and cost of listing files (optional, but sometimes helpful). From there it would be pretty easy to build a process to pop from SQS, download or stream the file, and aggregate the _____ (i.e. postgres) updates.
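
A minimal sketch of that consumer, assuming the standard S3 event-notification JSON and the aws-sdk SQS client (the queue URL and the process_log_file handler are placeholders):

require 'aws-sdk'
require 'json'

sqs = Aws::SQS::Client.new(region: 'us-east-1')                            # assumed region
queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/fastly-logs' # placeholder

loop do
  resp = sqs.receive_message(queue_url: queue_url, wait_time_seconds: 20)
  resp.messages.each do |msg|
    # S3 "object created" notifications arrive as JSON with a Records array.
    JSON.parse(msg.body).fetch('Records', []).each do |record|
      bucket = record['s3']['bucket']['name']
      key    = record['s3']['object']['key']
      process_log_file(bucket, key)                                        # hypothetical handler
    end
    sqs.delete_message(queue_url: queue_url, receipt_handle: msg.receipt_handle)
  end
end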

I've played with the streaming part of the aws-sdk S3 library a bit. It sounds great because you don't have to download the whole file to disk (not sure how big the files will be; it's also great for Heroku-based environments), but I've had issues with how it chunks things: I kept getting partial lines, if I remember right. Could have been the files I was working with or the chunking; I didn't get a chance to dig in. Definitely something that could be worked around, but worth noting. With streaming you could also easily keep track of how far you are in the file (checkpoints) and the like, to allow reprocessing files idempotently.
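
The partial-line problem can be worked around by carrying a buffer across chunks; a sketch assuming aws-sdk's block form of get_object (the bucket, key, and aggregate helper are illustrative):

require 'aws-sdk'

s3 = Aws::S3::Client.new(region: 'us-east-1')
buffer = ''

# get_object yields the body in chunks as it streams. A chunk can end
# mid-line, so keep the trailing partial line around for the next chunk.
s3.get_object(bucket: 'fastly-logs', key: 'rubygems/2015-11-30T00:00.log') do |chunk|
  buffer << chunk
  lines  = buffer.split("\n", -1)
  buffer = lines.pop || ''               # possibly-incomplete last line
  lines.each { |line| aggregate(line) }  # hypothetical per-line aggregation
end

aggregate(buffer) unless buffer.empty?   # whatever is left is the final line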

I'd recommend only saving what you need. Since you expose downloads total and per version, starting with those sounds great. Since the files are in S3, you can always go back later and add per day or other aggregations and backfill. Since some of that data exists currently, I'd make sure to back it up in S3 or something for historical purposes.

As far as where to store the outputs, postgres would be great. A single counter column could be an issue (contention on one row), but since the counts can be aggregated in memory and flushed to postgres at the end, it would probably be fine. The other option is a slotted counter to spread out the contention.
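
For reference, a slotted counter just spreads one hot row across N rows and sums them on read; a hypothetical sketch (the table and column names are made up):

# Keep N pre-created rows (slots) per gem and bump one at random, so
# concurrent updates rarely fight over the same row.
SLOTS = 10

def increment_downloads(rubygem_id, by)
  slot = rand(SLOTS)
  ActiveRecord::Base.connection.execute(
    "UPDATE download_counts SET downloads = downloads + #{by.to_i} " \
    "WHERE rubygem_id = #{rubygem_id.to_i} AND slot = #{slot}"
  )
end

# Reading the total just sums the slots:
#   SELECT SUM(downloads) FROM download_counts WHERE rubygem_id = 123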

I've actually been playing with an S3 log streaming/aggregation ruby lib a bit in my spare time, but I hit the chunk issue and have not had time to dig in since. Let me know if I wasn't clear about anything. Happy to help talk stuff through and maybe write some code (if I can carve out the time). It would be fun to give back to rubygems a bit after all my use of it.

@qrush

Member

qrush commented Nov 23, 2015

I think starting with what we have for stats already is fine. I don't see how daily/hourly data would be useful and interesting yet, but if storing it is a pain, let's just start small and simple.

Any thoughts about using statsd to collect this data instead of postgres?

@evanphx

Member

evanphx commented Nov 24, 2015

I believe we should not add additional stats beyond what we're currently managing. Specifically:

  • Total gem+version download count
  • Total gem download count (i.e. all versions)

Those can easily live in postgres; there's no big reason to keep them anywhere else. That's on the order of one million rows, which is no biggie.

I'd go with the simple approach, using SNS to get notified about new batches and process them immediately.

Lastly, we should have the processed logs gzip'd and uploaded to an archive bucket. At some future time, those logs can be processed for more info or simply rotated off.

@dwradcliffe

Member

dwradcliffe commented Nov 24, 2015

Cool. So to recap: we're settled on only keeping:

  • Total gem downloads per gem (all time)
  • Total gem downloads per gem version (all time)

And we're going to store the final numbers in our primary postgres database (in the rubygems and versions tables).

@ktheory has some experience and ideas about building the workers to process the stats from S3 into the db using SNS.

@arthurnn

Member

arthurnn commented Nov 24, 2015

👍 this is simple and a pretty good start.

@qrush

Member

qrush commented Nov 25, 2015

Big 👍 for starting small and simple.

@ktheory

Member

ktheory commented Nov 25, 2015

Hi folks. 👋

Looking forward to helping with this. I'll read the docs to get familiar with rubygems infrastructure. Then I'll share a proposal, or ask @dwradcliffe if I have questions.

@ktheory

Member

ktheory commented Nov 30, 2015

Hi folks, here are some more details on how I'd approach this. I'm looking for a 👍 on this, especially from @dwradcliffe, and someone familiar with the rubygems.org codebase.

Assumptions:

Just wanted to make sure there's consensus about this up front:

  • The download count metrics are not mission critical data. I.e., we make a best effort to make them accurate. But a bug in our data pipeline could cause inaccurate download counts that are difficult to reconcile. If download counts are slightly incorrect, it's no big deal, and users likely won't notice.
  • App code is preferable to infrastructure code. There are more devs familiar with maintaining app code than infrastructure. I.e., keep biz logic in the rubygems.org app, and keep rubygems-infrastructure simple.
  • Metric staleness is tolerable. It can take 5min or more for logs to be published, and then some more time to process them. If an author publishes a new gem then immediately downloads it, it may take, say 15min for that download to be reflected on the web page.

Fastly & AWS config

(I'd need @dwradcliffe or someone w/ admin access to Fastly & AWS to set this up.)

  • Fastly publishes access logs to an S3 bucket. (One logfile for each of 9 POPs every 5 minutes: ~2 new files per minute to process.)
  • Create an SQS queue
  • On the S3 bucket, enable notifications to publish to the SQS queue when files are created per this guide
  • Make sure the AWS access key used by rubygems.org has permissions to read the fastly logs.

Shoryuken: a ruby tool to process SQS messages

A shoryuken worker processes jobs off the SQS queue. Shoryuken simply turns each SQS message into a DelayedJob: Delayed::Job.enqueue FastlyLogProcessor.new('s3://path/to/log/file')

Since SQS & shoryuken are unfamiliar tools, I'm trying to use them as trivially as possible while we gain confidence and understanding.

I'd make a pull request to rubygems-infrastructure to keep a few shoryuken processes running. One is sufficient, but run a few for HA.
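
To illustrate how thin that layer could be, a hypothetical sketch of the worker (the queue name and message handling are assumptions; the message body would be the S3 event-notification JSON):

require 'shoryuken'
require 'json'

# Hypothetical sketch: all the Shoryuken layer does is turn an SQS message
# (an S3 "object created" notification) into a DelayedJob.
class FastlyLogDispatcher
  include Shoryuken::Worker
  shoryuken_options queue: 'fastly-log-notifications', auto_delete: true

  def perform(sqs_msg, body)
    # body is the raw S3 event-notification JSON from SQS.
    Array(JSON.parse(body)['Records']).each do |record|
      path = "s3://#{record['s3']['bucket']['name']}/#{record['s3']['object']['key']}"
      Delayed::Job.enqueue FastlyLogProcessor.new(path)
    end
  end
end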

Rubygems.org app changes:

  • Add shoryuken as a dependency with the logic to convert an SQS message to a DJ.
  • Add a FastlyLogProcessor delayed job. It works by:
  1. Downloading a log from S3 (streaming it in chunks using aws-sdk)
  2. Reducing over each line to compute the metrics we want
  3. Inserting the metrics into whatever data store we want.

For now, it sounds like there's consensus to just store the download count per gem version in postgres. So it would reduce to a hash like:

{
  'rails' => {
    '4.0.0' => 1, # 1 download
    '4.1.0' => 2  # 2 downloads
  },
  'rake' => {
    '10.4.2' => 3
  }
}
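
A rough sketch of that reduce, with the caveat that the log line format and the path regex are assumptions about what Fastly will actually log:

# Hypothetical: count downloads per gem and version by pulling the request
# path out of each access-log line. The regex and line format are guesses,
# not the real Fastly log configuration.
def reduce_counts(lines)
  lines.each_with_object(Hash.new { |h, k| h[k] = Hash.new(0) }) do |line, counts|
    next unless line =~ %r{"GET /gems/(.+)-([^-]+)\.gem }
    name, version = Regexp.last_match(1), Regexp.last_match(2)
    counts[name][version] += 1
  end
end

# reduce_counts(['... "GET /gems/rails-4.0.0.gem HTTP/1.1" 200 ...'])
# # => { 'rails' => { '4.0.0' => 1 } }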

To make sure we don't double-count log files (i.e., to keep the job idempotent), set a redis key containing the S3 path. E.g.:

key = "fastly-log-#{log_path}"
redis.setnx(key, Time.now.to_i)
redis.expireat(key, 1.month.from_now.to_i) # Ample time to avoid dupes

If the key is already set (setnx returns false), the data has already been inserted; this job is redundant, so return without inserting data.

If setnx sets the key, then bulk-update the download counts in postgres.

That's it. 😄

@qrush

Member

qrush commented Nov 30, 2015

Those assumptions sound fine to me. I like this approach so far and shoryuken looks great!

@dwradcliffe

Member

dwradcliffe commented Nov 30, 2015

Sounds pretty good! I agree with those assumptions. Staleness is tolerable - I would say even up to a day is fine. 1st priority is gem push/download. Stats are always second.

My only concern with using the same app is performance. A worker that doesn't need to boot the entire rails app would probably perform better. Maybe it's not enough of a difference to matter?

I'll get started on the infrastructure changes mentioned above.

@ktheory

Member

ktheory commented Dec 1, 2015

(summarizing a chat w/ @dwradcliffe)

A worker that doesn't need to boot the entire rails app would probably perform better. Maybe it's not enough of a difference to matter?

It would perform better, but YAGNI to process a few messages per minute. Also, there's the institutional knowledge + process overhead to maintain a separate app. There's plenty of free memory on the app servers to run another app process.

It'd be rather straightforward to extract the shoryuken and delayed job classes to a separate app down the road if we want.

@dwradcliffe

Member

dwradcliffe commented Dec 1, 2015

👍

@ktheory ktheory self-assigned this Dec 29, 2015

@indirect

Member

indirect commented Dec 31, 2015

My single suggestion here is to SETNX the key to expire in, say, 1.hour at the start of the job, and then 1.month at the end of the job, immediately after the postgres UPDATE succeeds. That way if the job crashes in the middle of execution, it will re-run later.
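
A minimal sketch of that two-phase expiry inside the job's perform, assuming redis-rb's set with nx:/ex: options (process_log_file stands in for the parse + bulk UPDATE work):

def perform
  key = "fastly-log-#{log_path}"

  # Claim the file with a short TTL; if the job crashes mid-way, the claim
  # expires and a later run can pick the file up again.
  return unless redis.set(key, Time.now.to_i, nx: true, ex: 1.hour.to_i)

  process_log_file(log_path) # hypothetical: parse the log and bulk-update postgres

  # Only after the UPDATE succeeds, extend the TTL so retries within the
  # next month are treated as duplicates.
  redis.expire(key, 1.month.to_i)
end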

@skottler

Member

skottler commented Dec 31, 2015

This all sounds great! We've talked about it before, but I'd like to bring up the fact that moving off of Redis would be operationally advantageous for us. The predictability of maxmemory has been a challenge, to say the least, and for what we're using it for, creating immutable aggregation keys in some other K/V store would likely be better.

Probably not something that needs to be done in the first pass, but avoiding Redis-specific features as the new stats system gets designed would be ideal.

Thanks for all of your hard work on this! Let me know if you need any help.

@krainboltgreene

Member

krainboltgreene commented Dec 31, 2015

I am super into not being on Redis.

@jnunemaker

jnunemaker commented Jan 2, 2016

I was doing some unrelated spelunking around gem download stuff and noticed that rubygems exposes a top endpoint in the API which uses downloads per day per gem.

https://github.com/rubygems/rubygems.org/blob/a0c5a8caaa34ff91b6623e3122599a28ae76a570/app/controllers/api/v1/downloads_controller.rb#L25-L32

I remembered this issue and figured it was worth dropping here as this issue discusses removing them. Not sure how seriously rubygems takes breaking API changes, but it would require a breaking change to remove per day counts completely.

@dwradcliffe

Member

dwradcliffe commented Jan 2, 2016

@jnunemaker Good catch. I think I'm ok removing this from the API. On the other hand - is there an easy way for us to track the downloads for just one previous day? (not every day forever)

@ktheory

Member

ktheory commented Jan 2, 2016

As I familiarize myself with the rubygems.org code, it seems simpler in the short term to continue using the Download.incr method to update download stats, and use Version.rubygem_name_for to parse the gem name out of download slugs. Both methods use redis.

In the interest of making small, incremental changes, I can implement this in a few stages:

Phase 1: FastlyLogProcessor DJ uses the Download.incr method to update stats in redis. Disable the Hostess middleware currently using Download.incr. I'd modify the method to take a count argument for bulk updating. It wouldn't change what stats are available. At that point, we can enable Fastly caching to reduce origin requests.

Phase 2 (optional): Remove the daily stats from API endpoints. Stop storing daily stats. Or change how we store daily stats to use less redis memory.

Phase 3 (optional): Refactor Download.incr to update the postgres DB columns instead of redis, reducing load on redis.
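
For Phase 1, a hypothetical sketch of what the count argument could look like; the redis key names are illustrative, not necessarily what the Download class actually uses, and $redis stands in for however the app holds its redis connection:

# Hypothetical: Download.incr extended with a count so the FastlyLogProcessor
# can apply many downloads from one log file in a single call.
class Download
  def self.incr(name, full_name, count: 1)
    $redis.incrby("downloads",                      count) # all gems, all time
    $redis.incrby("downloads:rubygem:#{name}",      count) # per gem, all time
    $redis.incrby("downloads:version:#{full_name}", count) # per version, all time
  end
end

# From the log processor, given the reduced counts:
#   Download.incr('rails', 'rails-4.0.0', count: 3)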

Sound chill? 😎

@ktheory

Member

ktheory commented Jan 2, 2016

@indirect:

SETNX the key to expire in, say, 1.hour at the start of the job, and then 1.month at the end of the job

Good idea. 👍

@dwradcliffe

Member

dwradcliffe commented Jan 3, 2016

Download.incr is not currently used (it hasn't been in several years), and the Hostess middleware is never used either. I'm not certain the logic in the Download class is still correct, since it hasn't been exercised in that time.

@jnunemaker

jnunemaker commented Jan 3, 2016

@dwradcliffe where is the code that is incrementing redis in the same way then? Definitely seeing big redis keys in the data dumps. Just curious. :)

@dwradcliffe

Member

dwradcliffe commented Jan 3, 2016

The code that actually updates redis is stat-update.

@jnunemaker

jnunemaker commented Jan 4, 2016

@dwradcliffe thanks. I'll take a peek at that when I get a chance.

@ktheory ktheory referenced this issue Jan 15, 2016

Merged

Add Fastly log processor #1176

3 of 4 tasks complete
@ktheory

Member

ktheory commented Jan 15, 2016

Much progress here: rubygems/rubygems.org#1176

ktheory added a commit to ktheory/rubygems-infrastructure that referenced this issue Jan 15, 2016

Add recipe to run script/shoryuken
Copied from the delayed_job recipe.

We want to run shoryuken as part of #35

@ktheory ktheory referenced this issue Jan 15, 2016

Merged

Add recipe to run script/shoryuken #38

1 of 1 task complete

@ktheory

Member

ktheory commented Feb 18, 2016

Ok, rolling this out gradually involves several deploys, so I wanted to track that here:

  • Deploy app code for shoryuken, fastly log processor. It's disabled though. rubygems/rubygems.org#1176
  • Add chef secrets with staging, production SQS queues (@dwradcliffe)
  • Add chef config to run shoryuken process #38
  • Deploy app change to restart shoryuken process on deploy (rubygems/rubygems.org#1198)
  • Now that FastlyProcessorJobs are logging download stats, check that they look reasonable in the logs
  • Add statsd metrics so we can monitor the stats pipeline (@arthurnn)
  • Deploy chef change to toggle off the stat-update nginx post_action, and change FastlyLogProcessor to update redis.
  • Profit

@dwradcliffe dwradcliffe referenced this issue Feb 27, 2016

Closed

Remove redis #1208

5 of 5 tasks complete
@dwradcliffe

Member

dwradcliffe commented May 9, 2016

I'm closing this since we've rolled this out 100%.

@dwradcliffe dwradcliffe closed this May 9, 2016

@qrush

Member

qrush commented May 9, 2016

👏 Amazing work, folks! I think a blog post about how this has evolved and what the setup looks like now would be a huge win. (It would also be some good documentation and a waypoint for people who want to learn!)
