Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comments for Disabling Postgres constraints for pg_dump #1243

Open
phinjensen opened this issue Nov 11, 2017 · 8 comments
Open

Comments for Disabling Postgres constraints for pg_dump #1243

phinjensen opened this issue Nov 11, 2017 · 8 comments

Comments

@phinjensen
Copy link
Contributor

phinjensen commented Nov 11, 2017

Comments for https://www.endpointdev.com/blog/2016/07/disabling-postgres-constraints-for/
By Greg Sabino Mullane

To enter a comment:

  1. Log in to GitHub
  2. Leave a comment on this issue.
@phinjensen
Copy link
Contributor Author

original author: kustodian
date: 2016-07-14T04:15:55-04:00

After reading this article feels that it would make sense for pg_dump to have an option (or even make it default) to set constraints after all the data finished loading. It would solve this problem, but more importantly should give a significant speed boost of a restore.

@phinjensen
Copy link
Contributor Author

original author: Douglas Hunley
date: 2016-07-14T09:11:06-04:00

I feel like this post is misleading. You're not disabling the constraints for pg_dump as pg_dump works perfectly well w/ the constraints in place. You need to disable the constraint for the purposes of restoring the data through psql (does pg_restore fail similarly?).

@phinjensen
Copy link
Contributor Author

original author: Greg Sabino Mullane
date: 2016-07-14T22:22:31-04:00

kustodian: An option might be nice, but tricky to implement. Check constraints are a little bit of a gray area, as very implicit constraints (e.g. the data type itself, the column being NOT NULL) are always in the pre-data section and thus checked on data load, while other things (esp. triggers) are always in the post-data section, and never checked. A flag that tells pg_dump to put the constraints inside post-data only could work, although this is a potential foot gun, as one could in theory mess up your flags and leave the constraints out of both sections! A better solution may be disabling the constraints per table as the --disable-triggers flag of pg_dump does (via ALTER TABLE), or allow something more global similar to session_replication_role.

@phinjensen
Copy link
Contributor Author

original author: Greg Sabino Mullane
date: 2016-07-14T22:27:43-04:00

Douglas Hunley: Sorry you feel the title is misleading (alternatives welcome), but pg_dump's primary purpose is to create something that will be loaded again into Postgres, and pg_dump is the one with the logic concerning the convalidated column. Yes, pg_restore will fail in exactly the same way as psql will.

@phinjensen
Copy link
Contributor Author

original author: renzo
date: 2016-07-15T02:30:52-04:00

an alternative title might be: "NOT VALID: Adding Constraints for faster and successful pg_dump and restore"

I too think that the current title is misleading because in this discussion constraints do not get disabled at all. They are just changed to "NOT VALID" after creation.

In addition wouldn't it be better to recommend that if changing the semantics of a function which is used for a check constraint and would not validate already existing data, should best be done as (just sketching):

Begin;

Drop constraint on table...;
Alter function...;
Alter table add constraint ... NOT VALID;

commit;

This way anyone with the permissions to change table and functions can do this.
And changing the system catalogue is only needed for legacy cases.

@phinjensen
Copy link
Contributor Author

original author: Andrew Dunstan
date: 2016-07-17T19:17:16-04:00

ISTM the real problem lis in your alteration of the function. Essentially at that point you have lied to Postgres about the data.

@phinjensen
Copy link
Contributor Author

original author: Greg Sabino Mullane
date: 2016-07-18T22:14:30-04:00

Andrew: Yes, the example function is not an ideal example of a real world case; it was written to prove the point that a failing constraint will prevent the data from loading. Actual use cases are check constraints that all functions that access other tables (as a poor replacement for foreign keys - or in cases where a foreign key is not sufficient, in which case they /really/ ought to be using a trigger). In such a case, the order that pg_dump accesses the tables becomes very important, because (unlike foreign keys), pg_dump has no way of knowing that your check constraint function implies some sort of dependency between the two tables. Thus, the output of the pg_dump will not load, despite there being no changes to the function, and despite the fact that the data is quite "valid" in the existing database. Using check constraints in such a fashion is a foot gun, to be sure, but preventing feet from getting shot off is one of the reasons clients call us. :)

Also, the performance boost can be quite significant when the underlying function is expensive.

@phinjensen
Copy link
Contributor Author

original author: Greg Sabino Mullane
date: 2016-07-18T22:21:02-04:00

renzo: Thank you for the feedback. I don't know if "adding constraints" is truly a title improvement, but I will consider modifying some of it. I think stating that the constraints get "disabled" is not misleading - they went from firing, to not firing. While that's not exactly the same as literally disabling the constraint, neither is setting session_replication_role the same as literally disabling a trigger. As for changing the semantics of the function, it was just a quick means to an end - see my answer to Andrew above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant