Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thoughts on Foreign Keys? #331

Closed
vandrijevik opened this issue Dec 9, 2016 · 7 comments
Closed

Thoughts on Foreign Keys? #331

vandrijevik opened this issue Dec 9, 2016 · 7 comments

Comments

@vandrijevik
Copy link

Hello,

Thanks for your hard work on gh-ost!

As I familiarize myself with the way it all works, I noted that foreign keys are explicitly not supported, but that they may be to some extent in the future.

Looking through issues in the repo, I also noted this expected feature that the table has neither foreign keys pointing to other tables, nor foreign keys pointing to it.

Could you please expand on your thoughs about foreign keys, and what you mean by “some extent” for their potential support? I am also curious: does this limitation mean that you don’t use foreign keys at all in GitHub’s MySQL databases, or do you use them and manage those tables somehow differently?

Thanks for your time and consideration!

@shlomi-noach
Copy link
Contributor

shlomi-noach commented Dec 9, 2016

At GitHub we do not use foreign keys, ever, anywhere.

Personally, it took me quite a few years to make up my mind about whether foreign keys are good or evil, and for the past 3 years I'm in the unchanging strong opinion that foreign keys should not be used. Main reasons are:

  • FKs are in your way to shard your database. Your app is accustomed to rely on FK to maintain integrity, instead of doing it on its own. It may even rely on FK to cascade deletes (shudder).
    When eventually you want to shard or extract data out, you need to change & test the app to an unknown extent.

  • FKs are a performance impact. The fact they require indexes is likely fine, since those indexes are needed anyhow. But the lookup made for each insert/delete is an overhead.

  • FKs don't work well with online schema migrations.

This last bullet is not a chicken and an egg, as you might think. FKs impose a lot of constraints on what's possible and what's not possible.

Here's an old post of mine, reviewing the first appearance of Facebook's OSC, and which includes some thoughts on foreign keys: http://code.openark.org/blog/mysql/mk-schema-change-check-out-ideas-from-oak-online-alter-table

Let's say you have two tables, P & C, standing for Parent & Child, respectively. There's a foreign key in C such that each row in C points to some "parent" value in P.

Doing schema migration of C is possible. However since foreign keys have unique names, the new (migrated) C table will have a FK with a different name than the original one.

Doing schema migration of P is just not going to work. Recall that gh-ost renames the table at the end. Alas, when renaming a table away, the FK will move with the renamed table. To create a parent-side FK on the ghost table, one would need to migrate C ; and because gh-ost uses async approach, P and P-ghost are never in complete sync at any point in time (except at lock time) which makes it impossible for C to have both a FK to P and to P-ghost. some integrity will be broken.

There's more discussion on the documentation of pt-online-schema-change

@vandrijevik
Copy link
Author

Thanks for the thoughtful and detailed answer, @shlomi-noach! 😄

@chriszrc
Copy link

chriszrc commented Nov 8, 2019

Postgres 11 added support for foreign keys from the sharded table to other tables fwiw-
https://www.percona.com/blog/2019/05/24/an-overview-of-sharding-in-postgresql-and-how-it-relates-to-mongodbs/

@shlomi-noach
Copy link
Contributor

shlomi-noach commented Nov 9, 2019

@chriszrc thank you - interesting to know! Since this issue is 3 years old, I'm assuming you and others came here via a recent HN thread. I thought visitors from HN might benefit from a bit of context:

  • gh-ost is a schema migration tool for MySQL; the jargon used here and most of the problems discussed relate to MySQL - though some of them can apply to other databases.
  • Some comments in the HN thread seem to confuse the logical referential integrity of the relational model, and the FOREIGN KEY constraint, which is a construct where the database enforces the referential integrity. We do use tables with "id"s and tables referencing those "id"s. We do not use FOREIGN KEY constraints in our table definitions.

I'm merely speculating about the feature suggested by @chriszrc above; MySQL does not have such feature at this time, and so obviously I cannot test it in MySQL at this time. So this is just a conjecture: I believe, given our workload at GitHub, that we would not attempt to use cross-server foreign keys. I'd imagine running a DELETE statement, for example, that would then have to synchronously lock and check rows in another server, possibly (likely, in many use cases) cascading yet to other tables in other servers, would impose too much latency and locking for us to be able to serve efficiently.

@wcurrie
Copy link
Contributor

wcurrie commented Jul 29, 2021

In depth explanation https://code.openark.org/blog/mysql/the-problem-with-mysql-foreign-key-constraints-in-online-schema-changes in case google (or other path) leads you here. Thanks shlomi!

@georgedrummond
Copy link
Member

The website referenced appears to be down.

Here is an archive.org cache.

@ramon-garcia
Copy link

I have just come here, with the subject of Github not using foreign keys at all.
But, with regard to the last issue, there is a simple solution: disable the foreign key constraint during the migration (when the child table can point to p in some rows and p-ghost in others) and enable it at the end.

Thanks for your great explanation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants