Add Fastly log processor #1176

Merged
merged 9 commits into from Feb 18, 2016

Conversation

ktheory
Member

@ktheory ktheory commented Jan 15, 2016

👋 Happy Friday 🙆‍♀️

This implements much of rubygems/rubygems-infrastructure#35, allowing us to update Rubygem download counts in bulk using Fastly access logs rather than on each HTTP request.

I still have some cleanup to do, but wanted to get some 👀 earlier.

To Do

  • add a feature flag to control whether to have FastlyLogProcessor actually update stats, or just log results. (It would only log initially.)
  • figure out how to disable the stat-update C library as part of the feature flag
  • Clean up & test shoryuken config, preferably loading SQS queue names from ENV vars
  • Make another PR to run a shoryuken process on deploy once Chef provisions a runit script for it
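The first to-do item (a flag that switches the processor between logging only and actually updating stats) could look roughly like the sketch below. This is not the PR's code: `MemoryStore`, `FlaggedStatsUpdater`, and the `enabled:` argument are hypothetical names, and an in-memory hash stands in for redis.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for the stats store (redis in the real app).
class MemoryStore
  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  def increment(name, by)
    @counts[name] += by
  end
end

# Hypothetical helper: always logs the parsed counts, and writes them to the
# store only when the feature flag is enabled.
class FlaggedStatsUpdater
  def initialize(store:, logger:, enabled:)
    @store = store
    @logger = logger
    @enabled = enabled
  end

  def apply(counts)
    counts.each do |name, count|
      @logger.info("downloads #{name}=#{count} applied=#{@enabled}")
      @store.increment(name, count) if @enabled
    end
  end
end

# With the flag off, results are logged but the store is left untouched.
log_output = StringIO.new
store = MemoryStore.new
updater = FlaggedStatsUpdater.new(store: store, logger: Logger.new(log_output), enabled: false)
updater.apply({ 'rails-4.2.5' => 3 })
```

In a deployment the `enabled:` value would presumably come from configuration or an environment variable; the PR does not say which.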

Deploy plan

NB that this code is disabled right now. We're not currently running a shoryuken worker, and so nothing would enqueue a FastlyLogProcessor job.

Once this is merged, the next step is to start running a shoryuken process, with FastlyLogProcessor only logging results & not updating redis.

After that I'll toggle the feature flag so FastlyLogProcessor updates stats, and disable stat-update per request.

FYI @dwradcliffe

@ktheory
Member Author

ktheory commented Jan 15, 2016

Rubocop is not happy. 😱

I'll amend the commits. 😄

@ktheory ktheory force-pushed the fastly-log-processor branch 2 times, most recently from 4158ccb to 13f1bab Compare Jan 21, 2016
@@ -3,4 +3,4 @@
 Delayed::Worker.max_run_time = 5.minutes
 Delayed::Worker.logger = Logger.new(Rails.root.join('log/delayed_job.log'))

-PRIORITIES = { push: 1, download: 2, web_hook: 3 }.freeze
+PRIORITIES = { push: 1, download: 2, web_hook: 3, download_metrics: 4 }.freeze
Member

@dwradcliffe dwradcliffe Feb 1, 2016

Can we shorten this to stats?

Member

@arthurnn arthurnn Feb 17, 2016

I hate this constant being here :( Eventually I want to move it out. I had that somewhere. Anyway, not in the scope of this PR.

@dwradcliffe
Member

dwradcliffe commented Feb 2, 2016

I added a few comments and it looks like this needs a rebase.


# TODO: set real queue name
# TODO: set auto_delete: true after testing
shoryuken_options queue: 'TODO-add-real-queue', body_parser: :json, auto_delete: false
Member

@qrush qrush Feb 2, 2016

Do we actually need Shoryuken here? What's the advantage to using this when we have Delayed::Job?

Member Author

@ktheory ktheory Feb 17, 2016

@qrush:

Do we actually need Shoryuken here? What's the advantage to using this when we have Delayed::Job?

We need a tool that uses AWS SQS as the queue. See this comment & discussion on rubygems-infrastructure.
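For context, the messages on that SQS queue are S3 ObjectCreated notifications, and the worker only needs the bucket and key from each one. A minimal sketch of that extraction in plain Ruby follows; `s3_objects_from` and the sample bucket and key are hypothetical, and the real worker receives an already parsed body through Shoryuken's `body_parser: :json` option rather than calling `JSON.parse` itself.

```ruby
require 'json'

# Hypothetical helper: returns [bucket, key] pairs for every ObjectCreated
# record in the JSON body of an S3 event notification.
def s3_objects_from(body)
  event = JSON.parse(body)
  event.fetch('Records', []).filter_map do |record|
    # Skip records for other event types (e.g. ObjectRemoved).
    next unless record['eventName'].to_s.start_with?('ObjectCreated')

    s3 = record.fetch('s3')
    [s3.fetch('bucket').fetch('name'), s3.fetch('object').fetch('key')]
  end
end

# Trimmed-down sample message; bucket and key are made up for illustration.
sample_body = JSON.generate(
  'Records' => [
    {
      'eventName' => 'ObjectCreated:Put',
      's3' => {
        'bucket' => { 'name' => 'example-logs' },
        'object' => { 'key' => 'fastly/2016-02-17.log' }
      }
    }
  ]
)

objects = s3_objects_from(sample_body)
```

Each resulting pair would then be handed to a FastlyLogProcessor job, which is the part the Delayed::Job queue continues to handle.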

@ktheory ktheory force-pushed the fastly-log-processor branch 2 times, most recently from b100d0c to 15de30b Compare Feb 17, 2016
ktheory added 8 commits Feb 17, 2016
We’ll use shoryuken for processing SQS messages; and aws-sdk for
downloading S3 files
It takes an S3 bucket & key, parses out the download counts, and bulk
updates download counts in redis
The Shoryuken worker reads S3 ObjectCreated messages from SQS, and
enqueues an associated FastlyLogProcessor job.
- count 304 responses as download attempts
- rename download_metrics queue to stats

Also add a test that 404 responses are not counted as downloads.
Now it relies on an SQS_QUEUE env var
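The counting rule described in these commits (304 responses count as download attempts, 404 responses do not) can be illustrated with a small sketch. It is not the code from this PR: the "path status" line format, the `/gems/<full_name>.gem` path pattern, and the assumption that 200 responses are counted are all simplifications made for the example.

```ruby
# Statuses treated as download attempts: 304 per the commit message,
# 200 is assumed here.
COUNTED_STATUSES = [200, 304].freeze

# Assumed shape of a gem download path; captures e.g. "rails-4.2.5".
GEM_PATH = %r{\A/gems/(?<full_name>.+)\.gem\z}

# Hypothetical helper: tallies downloads per gem from simplified log lines
# of the form "<path> <status>".
def download_counts(lines)
  counts = Hash.new(0)
  lines.each do |line|
    path, status = line.split
    match = GEM_PATH.match(path.to_s)
    next unless match && COUNTED_STATUSES.include?(status.to_i)

    counts[match[:full_name]] += 1
  end
  counts
end

sample_lines = [
  '/gems/rails-4.2.5.gem 200',
  '/gems/rails-4.2.5.gem 304',
  '/gems/rails-4.2.5.gem 404', # not counted
  '/api/v1/gems 200'           # not a gem download
]

counts = download_counts(sample_lines)
```

Aggregating per file like this is what makes a single bulk update possible, instead of one increment per HTTP request.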
@dwradcliffe
Member

dwradcliffe commented Feb 18, 2016

LGTM 🚒

@ktheory ktheory changed the title from "Add Fastly log processor (work-in-progress)" to "Add Fastly log processor" Feb 18, 2016
ktheory added a commit that referenced this issue Feb 18, 2016
@ktheory ktheory merged commit 7f8a32b into rubygems:master Feb 18, 2016
1 check passed
@ktheory ktheory deleted the fastly-log-processor branch Feb 18, 2016
5 participants