Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent DDL guard? #8

Open
dvarrazzo opened this issue Oct 16, 2012 · 4 comments
Open

Concurrent DDL guard? #8

dvarrazzo opened this issue Oct 16, 2012 · 4 comments
Milestone

Comments

@dvarrazzo
Copy link
Member

It is generally unsafe to perform DDL on your table while it is being reorg'ed, except for VACUUM or ANALYZE. Although the primary benefit of pg_reorg is that it does not hold a high-level lock on the target table while it is being reorg'ed, this also leaves us with no hard and fast way to prevent unsafe DDL. Perhaps we could add in some sanity checks of pg_class, pg_tablespace, etc. before and after the reorg finishes, and throw an error if any unexpected changes in the table attributes are detected.

@dvarrazzo
Copy link
Member Author

Reading this issue: wouldn't be enough to take a "ACCESS SHARE" lock, which doesn't conflict with anything except the ACCESS EXCLUSIVE taken by ALTER TABLE?

See lock compatibility table.

@schmiddy
Copy link
Member

Yeah, that thought had crossed my mind as well.

The only obstacle is that the LOCK TABLE ... IN ACCESS SHARE MODE would have to be performed in a separate connection, since we want the lock to be held throughout pg_reorg's multiple transactions of work. Then, when it's time to release the ACCESS SHARE lock so that the other connection's ACCESS EXCLUSIVE could go through, we'll have to be careful that there's no one else waiting for a lock on the table who could succeed in altering the table before we acquire our desired ACCESS EXCLUSIVE lock.

It looks like lock_exclusive() is called in two places: first during initial set-up, and lastly before the table swap. We'd want to avoid a race between releasing the first EXCLUSIVE lock and acquiring the ACCESS SHARE lock in the second connection.

@schmiddy
Copy link
Member

I've got a rough implementation of this idea, in my fork at pg_reorg-1.

The basic idea is to set up a second connection which will (asynchronously) try to grab an AccessShare lock while the primary connection still holds AccessExclusive lock during the setup phase. This AccessShare lock is held until step #5, the table swap, when it is bumped to AccessExclusive lock, and the second connection performs the table swapping.

I may well have missed something obvious, and I'll do some more testing and cleanup in the next day or two. If I haven't screwed anything up too badly, perhaps this project could be moved into a branch in reorg master after that.

@schmiddy
Copy link
Member

I tagged this issue with milestone Rel 1.2, since presumably we wouldn't want to shoehorn it into 1.1.x. I'll try to make a branch off the main git repository with these changes. If these changes are OK to merge into master at some point, we could get rid of the forbid_truncate branch and idea, since this feature would of course take care of TRUNCATE as well.

schmiddy added a commit that referenced this issue Oct 28, 2012
…to prevent concurrent DDL.

This is a first pass at Daniele's suggestion in Issue #8, although it is
definitely still buggy -- it is still possible for another transaction
to get in an AccessExclusive lock and perform DDL either before the
ACCESS SHARE lock is acquired or immediately after it is released.
schmiddy added a commit that referenced this issue Oct 28, 2012
Fix table locking so that race conditions don't exist between lock
release in primary conn, and lock acquisition in conn2. Also, have
conn2 be in charge of performing the table swap step, to avoid a
similar race.

Part of work for Issue #8.
schmiddy added a commit that referenced this issue Oct 28, 2012
 * KILL_COMPETING_LOCKS was using pg_cancel_backend() instead of
   pg_terminate_backend()
 * create kill_ddl() function for canceling+terminating any pending
   unsafe concurrent DDL, i.e. anyone hanging out waiting for
   an ACCESS EXCLUSIVE lock on our table.
 * create lock_access_share() function for reliably obtaining an
   ACCESS SHARE lock on the target table, killing off any queued
   ACCESS EXCLUSIVE lockers in the process via kill_ddl()
 * Avoid deadlock possible before we run:
     CREATE TABLE reorg.table_xxx AS SELECT ... FROM ONLY ...
   by using lock_access_share()
 * Fix a few calls in lock_exclusive() which were forgetting to
   specify the passed-in connection.

These fixes are related to Issue #8. The main thing remaining AFAIK
is to review or fix some of the unlikely-error handling bits;
most of these should be marked with XXX now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants