New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Postgres CPU burn #573

Merged
merged 2 commits into from Jun 29, 2017

Conversation

Projects
None yet
6 participants
@schneems
Member

schneems commented Jun 29, 2017

According to Heroku, we have a query that accounts for 80% of all time spent in postgres

$ heroku pg:outliers
 total_exec_time  | prop_exec_time |   ncalls    |   sync_io_time   |                                                                                       query
------------------+----------------+-------------+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 3790:50:52.62102 | 80.2%          | 100,727,265 | 727:08:40.969477 | SELECT  ? AS one FROM "repos" WHERE LOWER("repos"."name") = LOWER($1) AND ("repos"."id" != $2) AND "repos"."user_name" = $3 LIMIT $4

Where did it come from? Turns out, it's not in my code. It was added by a rails validation.

But, you might be thinking. We only have 2,000 repos in the whole database, why is it updating so much?

Well, it turns out this validation is called on EVERY update.

For example this sidekiq task:

Jun 29 07:00:32 issuetriage app/scheduler.8183:  [ActiveJob] Enqueued PopulateIssuesJob (Job ID: 9e04e63f-a515-4dcd-947f-0f777e56dd1b) to Sidekiq(default) with arguments: #<GlobalID:0x00000004f98a68 @uri=#<URI::GID gid://code-triage/Repo/1008>>
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b] Performing PopulateIssuesJob (Job ID: 9e04e63f-a515-4dcd-947f-0f777e56dd1b) from Sidekiq(default) with arguments: #<GlobalID:0x007fd098f6bdf8 @uri=#<URI::GID gid://code-triage/Repo/1008>>
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b]   User Load (10.4ms)  SELECT  "users".* FROM "users" WHERE ("users"."github_access_token" IS NOT NULL) ORDER BY RANDOM() LIMIT $1
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b]    (35.4ms)  BEGIN
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b]   Repo Exists (352.9ms)  SELECT  1 AS one FROM "repos" WHERE LOWER("repos"."name") = LOWER($1) AND ("repos"."id" != $2) AND "repos"."user_name" = $3 LIMIT $4
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b]   SQL (3.7ms)  UPDATE "repos" SET "github_error_msg" = $1, "updated_at" = $2 WHERE "repos"."id" = $3
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b]    (4.5ms)  COMMIT
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b] Performed PopulateIssuesJob (Job ID: 9e04e63f-a515-4dcd-947f-0f777e56dd1b) from Sidekiq(default) in 629.22ms

Going to remove the validation and do it in the DB instead, which is better anyway. However it doesn't give us the nice errors of a rails validation (I don't think). Would be nice to integrate this in the future.

Ideally validates would see if there is a unique index and use that instead, give us a warning if no unique index existed, because race conditions.

Fix Postgres CPU burn
According to Heroku, we have a query that accounts for 80% of all time spent in postgres

```
$ heroku pg:outliers
 total_exec_time  | prop_exec_time |   ncalls    |   sync_io_time   |                                                                                       query
------------------+----------------+-------------+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 3790:50:52.62102 | 80.2%          | 100,727,265 | 727:08:40.969477 | SELECT  ? AS one FROM "repos" WHERE LOWER("repos"."name") = LOWER($1) AND ("repos"."id" != $2) AND "repos"."user_name" = $3 LIMIT $4
```

Where did it come from? Turns out, it's not in my code. It was added by a rails validation.

But, you might be thinking. We only have 2,000 repos in the whole database, why is it updating so much?

Well, it turns out this validation is called on EVERY update.

For example this sidekiq task:

```
Jun 29 07:00:32 issuetriage app/scheduler.8183:  [ActiveJob] Enqueued PopulateIssuesJob (Job ID: 9e04e63f-a515-4dcd-947f-0f777e56dd1b) to Sidekiq(default) with arguments: #<GlobalID:0x00000004f98a68 @uri=#<URI::GID gid://code-triage/Repo/1008>>
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b] Performing PopulateIssuesJob (Job ID: 9e04e63f-a515-4dcd-947f-0f777e56dd1b) from Sidekiq(default) with arguments: #<GlobalID:0x007fd098f6bdf8 @uri=#<URI::GID gid://code-triage/Repo/1008>>
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b]   User Load (10.4ms)  SELECT  "users".* FROM "users" WHERE ("users"."github_access_token" IS NOT NULL) ORDER BY RANDOM() LIMIT $1
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b]    (35.4ms)  BEGIN
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b]   Repo Exists (352.9ms)  SELECT  1 AS one FROM "repos" WHERE LOWER("repos"."name") = LOWER($1) AND ("repos"."id" != $2) AND "repos"."user_name" = $3 LIMIT $4
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b]   SQL (3.7ms)  UPDATE "repos" SET "github_error_msg" = $1, "updated_at" = $2 WHERE "repos"."id" = $3
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b]    (4.5ms)  COMMIT
Jun 29 07:11:05 issuetriage app/worker.1:  [ActiveJob] [PopulateIssuesJob] [9e04e63f-a515-4dcd-947f-0f777e56dd1b] Performed PopulateIssuesJob (Job ID: 9e04e63f-a515-4dcd-947f-0f777e56dd1b) from Sidekiq(default) in 629.22ms
```

Going to remove the validation and do it in the DB instead, which is better anyway. However it doesn't give us the nice errors of a rails validation (I don't think). Would be nice to integrate this in the future.

Ideally `validates` would see if there is a unique index and use that instead, give us a warning if no unique index existed, because race conditions.

@schneems schneems temporarily deployed to codetriage-staging-pr-573 Jun 29, 2017 Inactive

@@ -0,0 +1,5 @@
class AddUniqueIndexToRepos < ActiveRecord::Migration[5.1]
def change
add_index :repos, [:name, :user_name], :unique => true

This comment has been minimized.

@tcrayford

tcrayford Jun 29, 2017

this needs to use LOWER, right?

@tcrayford

tcrayford Jun 29, 2017

this needs to use LOWER, right?

This comment has been minimized.

@schneems

schneems Jun 29, 2017

Member

I'm already lowercasing name and user_name in a pre-commit hook in Ruby. That part of the validation was redundant.

@schneems

schneems Jun 29, 2017

Member

I'm already lowercasing name and user_name in a pre-commit hook in Ruby. That part of the validation was redundant.

This comment has been minimized.

@tcrayford
@tcrayford

This comment has been minimized.

@sgrif

sgrif Jun 29, 2017

You should add a separate uniqueness constraint. The fact that indices happen to constrain is an impl detail.

@sgrif

sgrif Jun 29, 2017

You should add a separate uniqueness constraint. The fact that indices happen to constrain is an impl detail.

This comment has been minimized.

@schneems

schneems Jun 29, 2017

Member

Great idea, honestly didn't know you could have a constraint without an index.

Looks like you can alter and add

A possible solution is to simply use DROP IF EXISTS before creating the new constraint.

ALTER TABLE foo ADD CONSTRAINT bar ...;

Can use a unique constraint https://www.postgresql.org/docs/9.4/static/ddl-constraints.html

@schneems

schneems Jun 29, 2017

Member

Great idea, honestly didn't know you could have a constraint without an index.

Looks like you can alter and add

A possible solution is to simply use DROP IF EXISTS before creating the new constraint.

ALTER TABLE foo ADD CONSTRAINT bar ...;

Can use a unique constraint https://www.postgresql.org/docs/9.4/static/ddl-constraints.html

This comment has been minimized.

@tcrayford

tcrayford Jun 29, 2017

note that a CONSTRAINT implies an index, and the other way around, as far as I know

@tcrayford

tcrayford Jun 29, 2017

note that a CONSTRAINT implies an index, and the other way around, as far as I know

This comment has been minimized.

@bf4

bf4 Jun 30, 2017

We use an active record patch that adds errors on save, and raises on save!

It matches a MySQL exception but can easily be adapted for PG

# Adapted from 
#  https://gist.github.com/bf4/5594532#file-validations-rb-L5
#  https://speakerdeck.com/erniemiller/an-intervention-for-activerecord?slide=117
# Per https://github.com/rails/rails/blob/5-1-stable/activerecord/lib/active_record/validations/uniqueness.rb#L140-L147
#   Using this validation method in conjunction with
#   {ActiveRecord::Base#save}[rdoc-ref:Persistence#save]
#   does not guarantee the absence of duplicate record insertions,
module ActiveRecordUniquenessExceptions
  # @example $!.message
  #   Mysql2::Error: Duplicate entry '1-1-Facility' for key 'idx_user_managed_object': INSERT INTO
  # @example usage
  #
  # class User < ActiveRecord::Base
  #   validates_database_uniqueness
  # end
  def validates_database_uniqueness
    define_method(:save) do |*args|
      begin
        super(*args)
      rescue ActiveRecord::RecordNotUnique
        match_data = $!.message.match(/Duplicate entry '(?<value>[^']+)' for key '(?<key>[^']+)'/) # rubocop:disable Style/SpecialGlobalVars
        message = "Duplicate entry for #{match_data[:key]} '#{match_data[:value]}'"
        errors.add(:uniqueness, message)
        false
      end
    end
  end
end

ActiveSupport.on_load(:active_record) do
  extend ActiveRecordUniquenessExceptions
end
@bf4

bf4 Jun 30, 2017

We use an active record patch that adds errors on save, and raises on save!

It matches a MySQL exception but can easily be adapted for PG

# Adapted from 
#  https://gist.github.com/bf4/5594532#file-validations-rb-L5
#  https://speakerdeck.com/erniemiller/an-intervention-for-activerecord?slide=117
# Per https://github.com/rails/rails/blob/5-1-stable/activerecord/lib/active_record/validations/uniqueness.rb#L140-L147
#   Using this validation method in conjunction with
#   {ActiveRecord::Base#save}[rdoc-ref:Persistence#save]
#   does not guarantee the absence of duplicate record insertions,
module ActiveRecordUniquenessExceptions
  # @example $!.message
  #   Mysql2::Error: Duplicate entry '1-1-Facility' for key 'idx_user_managed_object': INSERT INTO
  # @example usage
  #
  # class User < ActiveRecord::Base
  #   validates_database_uniqueness
  # end
  def validates_database_uniqueness
    define_method(:save) do |*args|
      begin
        super(*args)
      rescue ActiveRecord::RecordNotUnique
        match_data = $!.message.match(/Duplicate entry '(?<value>[^']+)' for key '(?<key>[^']+)'/) # rubocop:disable Style/SpecialGlobalVars
        message = "Duplicate entry for #{match_data[:key]} '#{match_data[:value]}'"
        errors.add(:uniqueness, message)
        false
      end
    end
  end
end

ActiveSupport.on_load(:active_record) do
  extend ActiveRecordUniquenessExceptions
end
@@ -2,7 +2,8 @@
class Repo < ActiveRecord::Base
validate :github_url_exists, on: :create
validates :name, uniqueness: {scope: :user_name, case_sensitive: false }
# Now done at the DB level # validates :name, uniqueness: {scope: :user_name, case_sensitive: false }

This comment has been minimized.

@Dorian

Dorian Jun 29, 2017

Space after opening { :) (also dead code in comments, and doesn't need to have a comment if there is a spec for it)

@Dorian

Dorian Jun 29, 2017

Space after opening { :) (also dead code in comments, and doesn't need to have a comment if there is a spec for it)

This comment has been minimized.

@schneems

schneems Jun 29, 2017

Member

Intentional, intentional, and intentional.

@schneems

schneems Jun 29, 2017

Member

Intentional, intentional, and intentional.

@schneems schneems merged commit ed7a961 into master Jun 29, 2017

2 checks passed

continuous-integration/heroku Heroku CI tests passed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@amcaplan

This comment has been minimized.

Show comment
Hide comment
@amcaplan

amcaplan Jun 29, 2017

However it doesn't give us the nice errors of a rails validation (I don't think).

Having tried this route in the past in my own apps and being surprised by the effects, I'll note that the difference goes beyond the nice errors. Unless something has changed recently in ActiveRecord, the failure behavior on #save is different.

Previously, if you tried to save a non-unique record, ActiveRecord would find the conflicting record, add a validation error, and and #save would return false (#save! would, of course, raise an error).

With this change, if you try to save a non-unique record, ActiveRecord just knows that it tried to do a DB operation and it blew up, so even the non-bang version of #save will raise an error. Furthermore, if you rescue the error, the next DB operation (if there is one in the same transaction) will fail because Postgres stops accepting DB operations until the transaction has been closed.

You seem to have accounted for some of this in the changed unit test, but I thought it would be worth noting in case this is something you (or other readers who found this PR) haven't experienced in the past. It can cause unexpected and surprising failures down the line, generally at the whole-request level. It's been bad enough for me that I decided to always match DB constraints with AR validations, but maybe performance effects are more important here. Generally, YMMV.

amcaplan commented Jun 29, 2017

However it doesn't give us the nice errors of a rails validation (I don't think).

Having tried this route in the past in my own apps and being surprised by the effects, I'll note that the difference goes beyond the nice errors. Unless something has changed recently in ActiveRecord, the failure behavior on #save is different.

Previously, if you tried to save a non-unique record, ActiveRecord would find the conflicting record, add a validation error, and and #save would return false (#save! would, of course, raise an error).

With this change, if you try to save a non-unique record, ActiveRecord just knows that it tried to do a DB operation and it blew up, so even the non-bang version of #save will raise an error. Furthermore, if you rescue the error, the next DB operation (if there is one in the same transaction) will fail because Postgres stops accepting DB operations until the transaction has been closed.

You seem to have accounted for some of this in the changed unit test, but I thought it would be worth noting in case this is something you (or other readers who found this PR) haven't experienced in the past. It can cause unexpected and surprising failures down the line, generally at the whole-request level. It's been bad enough for me that I decided to always match DB constraints with AR validations, but maybe performance effects are more important here. Generally, YMMV.

@schneems

This comment has been minimized.

Show comment
Hide comment
@schneems

schneems Jun 29, 2017

Member

@amcaplan that's a great note and a great point.

In the controller action which is the only place we allow for Repo creation, i'm was already doing a search in the DB first to make sure the record didn't already exist. In theory there could be a race condition, between that and the time the record uniqueness check actually happens. In practice there's only 2K records here.

Here's the query

@repo = Repo.search_by(params[:repo][:name], params[:repo][:user_name]).first unless params_blank?
.

So basically i'm counting on the user never seeing that exception.

I actually first wrapped a bunch of begin/rescue-s in that action, but then realized I could avoid the whole thing.

It would be cool if we could tell ActiveRecord when we've enabled a uniqueness constraint (or other constraint) and it could coerce that into behavior that is closer to what we come to expect from validates

Member

schneems commented Jun 29, 2017

@amcaplan that's a great note and a great point.

In the controller action which is the only place we allow for Repo creation, i'm was already doing a search in the DB first to make sure the record didn't already exist. In theory there could be a race condition, between that and the time the record uniqueness check actually happens. In practice there's only 2K records here.

Here's the query

@repo = Repo.search_by(params[:repo][:name], params[:repo][:user_name]).first unless params_blank?
.

So basically i'm counting on the user never seeing that exception.

I actually first wrapped a bunch of begin/rescue-s in that action, but then realized I could avoid the whole thing.

It would be cool if we could tell ActiveRecord when we've enabled a uniqueness constraint (or other constraint) and it could coerce that into behavior that is closer to what we come to expect from validates

@amcaplan

This comment has been minimized.

Show comment
Hide comment
@amcaplan

amcaplan Jun 29, 2017

It would be cool if we could tell ActiveRecord when we've enabled a uniqueness constraint (or other constraint) and it could coerce that into behavior that is closer to what we come to expect from validates

The challenge there is that #valid? returns true but saving fails (and if #save just returned false, the next DB query would error out unexpectedly). I suspect there is a way to make the experience better here, but I don't have a concrete enough suggestion to make it a feature request for AR. If you have ideas, I'd love to hear!

In the meantime, I'm glad this has a single known use case with an extremely unlikely error. Given what I've seen, the race condition would cause a 500 with (hopefully) not much information sent back to the client, but that's a pretty tight race condition, and actually is the same race condition as ActiveModel validations would have anyway (which is why it's always recommended to pair with a DB constraint)!

amcaplan commented Jun 29, 2017

It would be cool if we could tell ActiveRecord when we've enabled a uniqueness constraint (or other constraint) and it could coerce that into behavior that is closer to what we come to expect from validates

The challenge there is that #valid? returns true but saving fails (and if #save just returned false, the next DB query would error out unexpectedly). I suspect there is a way to make the experience better here, but I don't have a concrete enough suggestion to make it a feature request for AR. If you have ideas, I'd love to hear!

In the meantime, I'm glad this has a single known use case with an extremely unlikely error. Given what I've seen, the race condition would cause a 500 with (hopefully) not much information sent back to the client, but that's a pretty tight race condition, and actually is the same race condition as ActiveModel validations would have anyway (which is why it's always recommended to pair with a DB constraint)!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment