
Consider issuing DDL commands on the master before the worker nodes #357

Closed
4 of 7 tasks
samay-sharma opened this issue Feb 23, 2016 · 3 comments

samay-sharma commented Feb 23, 2016

@anarazel mentioned:

"Executing the DDL on the workers before doing so on the master strikes me as rather fragile. There's always going to be problems without using 2PC, but if we were to do this on the master first, we'd at least have a chance to mark the shards where the statement failed as broken. This way round there's no way of doing that."

The rationale for doing this on the worker nodes first was to verify that the command can go through on at least one shard before we commit to doing it on the master.

However, since we can roll back on the master node if the command fails on the first worker node, we could run it on the master first. There don't seem to be any advantages to doing it on the workers before the master. Executing it on the master first gives us the ability to roll back, which handles certain failure cases better.

Plan

  • Reorganize multi_ProcessUtility to better separate DDL mode from other modes
  • Determine call sites which modify parse tree
  • Refactor modifications into separate methods, as needed
  • Change call order of standard_ProcessUtility to happen first
  • Evaluate locking semantics
  • Verify no unit tests break
  • Run against PostgreSQL tests

lithp commented Mar 29, 2016

I think this is the real way to fix #350. Its PR is just a quick fix for this underlying problem.

jasonmp85 commented

Finally getting around to doing a brain dump here. Basically, as @anarazel has pointed out, it would be nicer to execute the master's DDL command first as this allows some piggybacking on the locking semantics already provided by PostgreSQL.

The existing ways we handle DDL commands in the utility hook are:

  • ProcessIndexStmt
  • ProcessDropIndexStmt
  • ProcessAlterTableStmt
  • ProcessAlterObjectSchemaStmt
  • WorkerProcessAlterTableStmt (runs on worker against distributed table for certain commands)

Each of these returns a possibly-modified parse tree which is subsequently executed by a call to standard_ProcessUtility. Other non-DDL commands are also processed by multi_ProcessUtility:

  • FORMAT = 'transmit' statements
  • ProcessCopyStmt (performs copy, then potentially runs local copy)
  • EXPLAIN EXECUTE
  • ProcessVacuumStmt

After poking around the various Process- functions, the scope of this is:

  1. Refactor multi_ProcessUtility to separate concerns (verification, caveats, utilities, DDL). This function is presently nearly 300 lines, and breaking it up will make the modes it operates in easier to understand (some functions will still need standard_ProcessUtility to run after certain removal operations)
  2. Determine parse modifications performed by each DDL-related Process- function
  3. Break up parse modifications into separate functions
  4. Change flow to: modify, execute (local), execute (remote)
  5. Evaluate removal of any shard-specific locking in favor of PostgreSQL's built-in locking

I'm putting a checklist up top with what I've done and am doing. Estimated time: PR out today.

aamederen commented

@jasonmp85 I think the execute (remote) part can be split into execute (worker distributed table) and execute (shards).
