Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add foreign key in the same statement as adding a column for PostgreSQL #45305

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

fatkodima
Copy link
Member

When using add_reference, PostgreSQL adds a new column and a foreign key separately. For large tables, adding a foreign key leads to problems, because the dbms needs to validate all the rows which takes a lot of time while holding exclusive locks on both tables.

There are 2 approaches to solve this:

  1. add invalid foreign key and validate separately - safe approach, but cumbersome and takes the same long time
  2. add foreign key in the same statement as a new column, which is blazingly fast. This approach is implemented in this PR; before is was possible only via raw sql.

Example:

\timing on
create table foos (id integer primary key);
create table bars (id integer primary key);
INSERT INTO bars (id) SELECT i FROM generate_series(1, 50000000) AS i;

-- Separately
ALTER TABLE "bars" ADD "foo_id" bigint, ADD CONSTRAINT "fk_rails_9e5a99443d" FOREIGN KEY ("foo_id") REFERENCES "foos" ("id");
ALTER TABLE
Time: 23305.494 ms (00:23.305)

alter table bars drop column foo_id;

-- Combined
ALTER TABLE "bars" ADD "foo_id" bigint CONSTRAINT "fk_rails_9e5a99443d" REFERENCES "foos" ("id");
ALTER TABLE
Time: 2.362 ms

23s vs 2ms.

I added support only for PostgreSQL, because for other dbms' this will not add any benefits.

Based on #41863, added @ccutrer as a co-author.

if current_adapter?(:PostgreSQLAdapter)
test "add_reference combines adding column and foreign key" do
@connection.create_table :testings
assert_queries(1) do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it well, it was one query before as well. Should we check more on query itself for proper format to be used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was 2 queries before:

def add_reference(table_name, ref_name, **options)
ReferenceDefinition.new(ref_name, **options).add(table_name, self)
end

def add(table_name, connection)
columns.each do |name, type, options|
connection.add_column(table_name, name, type, **options)
end
if index
connection.add_index(table_name, column_names, **index_options(table_name))
end
if foreign_key
connection.add_foreign_key(table_name, foreign_table_name, **foreign_key_options)
end
end

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, ok. I got confused by your example code mentioning following.

ALTER TABLE "bars" ADD "foo_id" bigint, ADD CONSTRAINT "fk_rails_9e5a99443d" FOREIGN KEY ("foo_id") REFERENCES "foos" ("id");

@simi
Copy link
Contributor

simi commented Jun 12, 2022

@fatkodima looks good. I did quick check and this syntax is supported for all PostgreSQL versions supported by ActiveRecord (9.3+). 💪

Btw. I did some checks locally with your example script and I got 2s vs 2ms. Are you using some super slow disk? 🤔

@fatkodima
Copy link
Member Author

Rerun again: got 13s this time. I am on a Mac from 2015 😄
But thats a time on a laptop, with bigger tables and more loaded env the time will be bigger.

@fatkodima fatkodima force-pushed the add_reference-foreign_key branch 2 times, most recently from 6ddd3e0 to 353c81c Compare February 20, 2023 22:41
Co-authored-by: Cody Cutrer <cody@instructure.com>
@@ -22,6 +22,7 @@ def accept(o)

private
def visit_AlterTable(o)
set_current_table(o.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you let me know what is the purpose of setting table_name instance variable here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used to set the current table we create or alter and is needed here https://github.com/rails/rails/pull/45305/files#diff-5b1b920fc3bebbb1d4c144b302603e5003a1d9e5a493e3234bf764b9e26fbe30R181-R182 to be able to calculate the foreign key name based on the "from" and "to" table names in the foreign_key_options method.

We can manually calculate the foreign key name and use something like

# We need to get foreign key/ primary key columns here somehow,
# which is also done for us in `foreign_key_options`.
name = @conn.foreign_key_name(...)
options = @conn.foreign_key_options(nil, to_table, options.merge(name: name))
ForeignKeyDefinition.new(nil, to_table, options) <=== note that we set nil as table name

but this is ugly and more work that needed. Wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. I am not very familiar with design patterns, but I am not sure if it is good practice to set instance variables within a method that uses the visitor pattern with something like "set_something".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is also the way how compilers generate the code.

@byroot Can you suggest us here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super well versed in this design, but I don't think it's a big deal?

The visitor pattern for what I know of it, does down a tree to collect information and generate a result, so it doesn't shock me to keep some state when visiting.

But I think @tenderlove is much more well versed than me in Arel.

@@ -42,6 +43,7 @@ def visit_AddColumnDefinition(o)
end

def visit_TableDefinition(o)
set_current_table(o.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a same question here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants