Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support custom checks against the TableDefinition object in the create_table block #170

Closed
andylouisqin opened this issue Feb 8, 2022 · 18 comments

Comments

@andylouisqin
Copy link

Hi! There's currently a check against adding foreign key constraints unsafely. This problem is that this doesn't include adding foreign keys using the create_table method, e.g.:

create_table(:menus) do |t|
  t.references :restaurant, null: false, foreign_key: true, type: :uuid
end

This circumvents the check, which can cause locking and downtime. Some options in the solution space here:

  1. Have the foreign key check include looking at the TableDefinition object passed into create_table for foreign key additions
  2. More generally, allow strong_migrations users to configure custom checks against the TableDefinition
@ankane
Copy link
Owner

ankane commented Feb 8, 2022

Hey @andylouisqin, as far as I know, it should be fine to add a foreign key to a new table since there's no potentially expensive validation step.

Are there other use cases you're thinking of for TableDefinition checks?

@bullfight
Copy link

bullfight commented Feb 8, 2022

Hi @ankane, we thought the same thing, and got bitten by a table lock on a busy referenced table.

It turns out that Postgres locks and scans the foreign referenced table column when adding a validated foreign key to a new table.

Reading the Postgres Documentation for CREATE TABLE > REFERENCES

The addition of a foreign key constraint requires a SHARE ROW EXCLUSIVE lock on the referenced table

And now the important bit: a SHARE ROW EXCLUSIVE lock:

protects a table against concurrent data changes, and is self-exclusive so that only one session can hold it at a time.

Once the lock is created Postgres kicks off a table scan of the referenced table foreign key to verify that it is a valid foreign key.

For reference, the following migration with a call to references creates SQL that looks like the following:

create_table(:menus) do |t|
  t.references :restaurant, null: false, foreign_key: true
end
BEGIN;
CREATE TABLE menus (
  id bigserial primary key, 
  restaurants_id bigserial NOT NULL, 
  CONSTRAINT fk_rails_c0fe971f03
    FOREIGN KEY (restaurants_id)  
    REFERENCES restaurants (id)
);
CREATE INDEX index_menus_on_restaurants_id ON menus (restaurants_id);
COMMIT;

@ankane
Copy link
Owner

ankane commented Feb 8, 2022

It requires a lock, but I don't think it scans the table.

CREATE TABLE restaurants (id bigserial primary key);
INSERT INTO restaurants SELECT i FROM generate_series(1, 10000000) i;

\timing on

BEGIN;
CREATE TABLE menus (
  id bigserial primary key, 
  restaurants_id bigserial NOT NULL, 
  CONSTRAINT fk_rails_c0fe971f03
    FOREIGN KEY (restaurants_id)  
    REFERENCES restaurants (id)
);
CREATE INDEX index_menus_on_restaurants_id ON menus (restaurants_id);
COMMIT;

Output

BEGIN
Time: 1.887 ms
CREATE TABLE
Time: 53.575 ms
CREATE INDEX
Time: 3.250 ms
COMMIT
Time: 0.745 ms

Feel free to edit the script to reproduce the issue you saw.

@fatkodima
Copy link
Contributor

It requires a lock on the referenced table, but does not scan it, because there is no need to do so - the new table is empty.
But while it waits for a SHARE ROW EXCLUSIVE lock on a busy table, writes are queued up, waiting for that lock to be released.

Did you configured lock_timeout?

@bullfight
Copy link

But while it waits for a SHARE ROW EXCLUSIVE lock on a busy table, writes are queued up, waiting for that lock to be released.

Ok, just restating to make sure I understand:

  1. CREATE TABLE menus issues a SHARE ROW EXCLUSIVE on restaurants, this hits the database queue where it waits for a lock.
  2. Other sessions continue to INSERT INTO restaurants, these inserts are queued behind the CREATE TABLE lock.
  3. The lock queue grows in size, increasing the database load until no work is able to complete.

Does this sound right?

Did you configured lock_timeout?

I sure have set a lock_timeout now! Thanks.

@bullfight
Copy link

Going back to @andylouisqin's point,

  1. Adding a foreign key and a create_table still seems dangerous to me, it is the kind of thing we would normally want to be warned about with add_reference or add_foreign_key.
  2. Another useful check for TableDefinition could be a warning about the use of disable_ddl_transaction. If ddl transactions are disabled, the TableDefinition should be wrapped in ActiveRecord::Base.transaction

@fatkodima
Copy link
Contributor

Yes, basically the process is as you described. You can read more here https://joinhandshake.com/blog/our-team/postgresql-and-lock-queue/ and https://github.com/tbicr/django-pg-zero-downtime-migrations#transactions-fifo-waiting

  1. Adding a foreign key and a create_table still seems dangerous to me, it is the kind of thing we would normally want to be warned about with add_reference or add_foreign_key.

This is indeed dangerous, if lock_timeout is not set. Once granted, it is instantaneous. But otherwise, there is no safer way to add a foreign key from an empty table that just add it (or at least I do not know it).

Also a useful technique would be to use lock retries - https://postgres.ai/blog/20210923-zero-downtime-postgres-schema-migrations-lock-timeout-and-retries I saw Andrew have plans to support it.

  1. Another useful check for TableDefinition could be a warning about the use of disable_ddl_transaction. If ddl transactions are disabled, the TableDefinition should be wrapped in ActiveRecord::Base.transaction

Can you provide a specific example?

@bullfight
Copy link

bullfight commented Feb 8, 2022

But otherwise, there is no safer way to add a foreign key from an empty table that just add it (or at least I do not know it).

Exactly! It isn't safe. Foreign keys should be added in separate statements to be safe, e.g.

create_table :menus do |t|
  t.references :restaurant, null: false
end

add_foreign_key :menus, :restaurants, validate: false
validate_foreign_key :menus, :restaurants

Also a useful technique would be to use lock retries

Cool! This is an excellent article.

Can you provide a specific example?

This applies to any migration where the wrapping ddl transaction is disabled to support adding an index and a table is created or altered.

Bad

class CreateMenus < ActiveRecord::Migration[6.1]
  disable_ddl_transaction!

  def change
    add_column :menus, :name, :text
    add_index :menus, :restaurant_id, algorithm: :concurrently
  end
end

Good

class CreateMenus < ActiveRecord::Migration[6.1]
  disable_ddl_transaction!

  def change
    ActiveRecord::Base.transaction do
      add_column :menus, :name, :text
    end

    add_index :menus, :restaurant_id, algorithm: :concurrently
  end
end

@fatkodima
Copy link
Contributor

Exactly! It isn't safe. Foreign keys should be added in separate statements to be safe, e.g.

create_table :menus do |t|
t.references :restaurant, null: false
end

add_foreign_key :menus, :restaurants, validate: false
validate_foreign_key :menus, :restaurants

For new tables, there is no difference in this and adding a foreign key in create_table. The same lock (SHARE ROW EXCLUSIVE) will be required and validate_foreign_key will just be a no-op, because the table is empty.

ActiveRecord::Base.transaction do
  add_column :menus, :name, :text
end

Explicitly calling transaction is redundant, because when no explicit transaction via BEGIN is opened, every single statement is executed in its own transaction. So "Bad" and "Good" are the same.

If you have more questions or need clarifications, would be happy to answer.

@bullfight
Copy link

  1. validate_foreign_key uses VALIDATE CONSTRAINT - From the docs this form acquires a SHARE UPDATE EXCLUSIVE, which does not block concurrent data changes, rather it protects a table against concurrent schema changes and VACUUM runs.
  2. My example is for migrations where disable_ddl_transaction! is used, the wrapping transaction is not added in these cases.

P.S. Thanks for taking all the time to provide feedback here.

@fatkodima
Copy link
Contributor

  1. When adding a foreign key to a new table, we can do it 2 ways:
    First:
add_foreign_key :table1, :table2

This will acquire a SHARE UPDATE EXCLUSIVE lock on both tables.

Second:

add_foreign_key :table1, :table2, validate: false # (1)
validate_foreign_key :table1, :table2 # (2)

Both (1) and (2) acquire the same SHARE UPDATE EXCLUSIVE lock on both tables, but 2 times this time. Since table is empty, (2) will be a no-op, but a lock is still required to acquire.

So as you can see it is better and with the same safety level is just run the "First" approach.

  1. For your specific example, they are both equivalent since outside of explicit transaction, postgres runs each statement in a separate transaction.
    For more complex examples, it is assumed that if a user adds disable_statement_timeout!, then he knows what he is doing and we cannot always correctly guess which statements to wrap in a transaction and which not.

@bullfight
Copy link

  1. StrongMigrations already guards against this! Try to use add_foreign_key without validate: false and you will get the following warning:
class AddForeignKey < ActiveRecord::Migration[6.1]
  def change
    add_column :table1, :table2_id, :bigint
    add_foreign_key :table1, :table2
  end
end

bin/rails db:migrate

StandardError: An error has occurred, this and all later migrations canceled:

=== Dangerous operation detected strong_migrations ===

Adding a foreign key blocks writes on both tables. Instead,
add the foreign key without validating existing rows,
then validate them in a separate migration.

class AddForeignKey < ActiveRecord::Migration[6.1]
 def change
    add_foreign_key :table1, :table2, validate: false
  end
end

class ValidateForeignKey < ActiveRecord::Migration[6.1]
  def change
    validate_foreign_key :table1, :table2
  end
end

My point here is that I think that StrongMigrations could generate the same warning when creating a table.

  1. Ok. Maybe we can look into using a custom check or into extending the functionality of custom checks.

@fatkodima
Copy link
Contributor

My point here is that I think that StrongMigrations could generate the same warning when creating a table.

Nope, because it is safe to add a foreign key to new tables if configured a lock_timeout. See all my comments above.

@bullfight
Copy link

bullfight commented Feb 10, 2022

Nope, because it is safe to add a foreign key to new tables if configured a lock_timeout. See all my comments above.

Sure lock_timeout ensures the migration will fail, but how is it any more safe than add_foreign_key? Why would we want strong migrations to warn about one form and not the other?

P.S. I am happy to look into adding this warning and opening a PR. I don't really want to draw this out any further.

@ankane
Copy link
Owner

ankane commented Feb 16, 2022

@bullfight Adding a foreign key is safe on an empty table and potentially unsafe on a non-empty table (writes are blocked while the foreign key is validated), which is why it warns in one situation but not the other.

@andylouisqin If you have other ideas for checks involving the table definition, feel free to create a new issue.

@markbrown4
Copy link

markbrown4 commented Jun 29, 2022

Adding a foreign key is safe on an empty table

I don't know the exact differences but our teams understanding is the same as @bullfight's. That Postgres can lock the referenced tables so it's not safe.

Here's an example migration that brought our db to a halt.

# 1
create_table :x do |t|
  t.references :busy_table_one, null: false, index: true,
    foreign_key: {on_delete: :cascade}
  t.references :from_busy_table_two, null: false, index: true,
    foreign_key: {to_table: :busy_table_two, on_delete: :cascade}
  t.references :to_busy_table_two, null: false, index: true
    foreign_key: {to_table: :busy_table_two, on_delete: :cascade}
  t.timestamps
end

Adding / validating the foreign keys across separately did not.

# 1
create_table :x do |t|
  t.references :busy_table_one, null: false, index: true
  t.references :from_busy_table_two, null: false, index: true
  t.references :to_busy_table_two, null: false, index: true
  t.timestamps
end

# 2
add_foreign_key :x, :busy_table_one, column: :busy_table_one_id,
  validate: false, on_delete: :cascade

# 3
add_foreign_key :x, :busy_table_two, column: :from_busy_table_two_id,
  validate: false, on_delete: :cascade

# 4
add_foreign_key :x, :busy_table_two, column: :to_busy_table_two_id,
  validate: false, on_delete: :cascade

# 5
validate_foreign_key :x, column: :busy_table_one_id
validate_foreign_key :x, column: :from_busy_table_two_id
validate_foreign_key :x, column: :to_busy_table_two_id

Just one example, so not conclusive. But we're now enforcing adding foreign keys separately because we've been bitten more than once by this. Keen to hear if you know what else could be the issue with the first migration.

@fatkodima
Copy link
Contributor

Adding a single foreign key when creating a table is safe, but adding multiple foreign keys is not, as you actually experienced. See https://github.com/fatkodima/online_migrations#adding-multiple-foreign-keys for the details.

@markbrown4
Copy link

markbrown4 commented Jun 29, 2022

Aha! Thanks for the insight. Coming back to the original post, perhaps strong migrations should warn about this or allow custom checks for it.

Allow strong_migrations users to configure custom checks against the TableDefinition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants