Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: distsql spec creation in execbuilder plumbing #49348

Merged
merged 3 commits into from Jun 2, 2020

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented May 20, 2020

sql: add stub exec.Factory for opt-driven distsql planning

This commit adds a stub implementation of exec.Factory interface that
will be creating DistSQL processor specs directly from opt.Expr,
sidestepping intermediate planNode phase.

It also introduces a new private cluster setting
"sql.defaults.experimental_distsql_planning" as well as a session
variable "experimental_distsql_planning" which determine whether the new
factory is used, set to off by default (other options are on and
always - the latter is only for the session variable).

Off planning mode means using the old code path, on means attempting
to use the new code path but falling back to the old one if we encounter
an error, and always means using only the new code path and do not
fallback in case of an error. Currently the fallback doesn't occur with
always only for SELECT statements (so that we could run other
statements types, like SET), meaning that whenalways option is used,
if we encounter an unsupported node while planning a SELECT statement,
an error is returned, but if we encounter an unsupported node while
planning a statement other than SELECT, we still fallback to the old
code (in a sense always behaves exactly like on for all statement
types except for SELECTs).

Release note: None

sql: add plumbing for phys plan creation directly in execbuilder

This commit introduces planMaybePhysical utility struct that
represents a plan and uses either planNode ("logical") or DistSQL spec
("physical") representations. It will be removed once we are able to
create all processor specs in execbuilder directly. This struct has been
plumbed in all places that need to look at the plan, but in most of them
only the logical representation is supported. However, the main codepath
for executing a statement supports both, and if physical representation
is used, then we bypass distsql physical planner.

This commit also renames checkSupportForNode to
checkSupportForPlanNode and createPlanForNode to
createPhysPlanForPlanNode to make their purpose more clear.

Addresses: #47474.

Release note: None

sql: operate on pointers to PhysicalPlan

This commit changes createPlanFor* methods to operate on pointers to
PhysicalPlan instead of values. It also renames and cleans up some
explain-related utility methods.

Release note: None

@yuzefovich yuzefovich added the do-not-merge bors won't merge a PR with this label. label May 20, 2020
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich force-pushed the distsql branch 5 times, most recently from 220b9bd to 63ba4b3 Compare May 28, 2020 03:50
@yuzefovich yuzefovich removed the do-not-merge bors won't merge a PR with this label. label May 28, 2020
@yuzefovich yuzefovich changed the title WIP sql: distsql spec creation in execbuilder plumbing sql: distsql spec creation in execbuilder plumbing May 28, 2020
@yuzefovich yuzefovich requested a review from asubiotto May 28, 2020 03:53
Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see this!

The last commit seems to claim something that I don't think is happening yet ("we will use the execbuilder-driven DistSQL spec creationg only for SELECT statements").

Reviewed 7 of 7 files at r1, 2 of 2 files at r2, 15 of 15 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


docs/generated/settings/settings.html, line 48 at r1 (raw file):

<tr><td><code>sql.defaults.results_buffer.size</code></td><td>byte size</td><td><code>16 KiB</code></td><td>default size of the buffer that accumulates results for a statement or a batch of statements before they are sent to the client. This can be overridden on an individual connection with the 'results_buffer_size' parameter. Note that auto-retries generally only happen while no results have been delivered to the client, so reducing this size can increase the number of retriable errors a client receives. On the other hand, increasing the buffer size can increase the delay until the client receives the first result row. Updating the setting only affects new connections. Setting to 0 disables any buffering.</td></tr>
<tr><td><code>sql.defaults.serial_normalization</code></td><td>enumeration</td><td><code>rowid</code></td><td>default handling of SERIAL in table definitions [rowid = 0, virtual_sequence = 1, sql_sequence = 2]</td></tr>
<tr><td><code>sql.distsql.experimental_planning.enabled</code></td><td>boolean</td><td><code>false</code></td><td>set to true to enable experimental DistSQL planning driven by the optimizer.</td></tr>

Why is only this cluster setting removed from the file? I also feel like this commit should probably be its own PR.


pkg/sql/distsql_running.go, line 900 at r3 (raw file):

	} else {
		var err error
		subqueryPhysPlan, err = dsp.createPhysPlanForPlanNode(subqueryPlanCtx, subqueryPlan.plan.planNode)

Instead of having this if else in a lot of these places, could you create a wrapper over createPhysPlanForPlanNode that either unwraps the plan as a planNode, or returns the physical plan?


pkg/sql/exec_util.go, line 257 at r1 (raw file):

// execbuilder-driven DistSQL spec creation that sidesteps intermediate
// planNode transition when going from opt.Tree to DistSQL processor specs.
var execBuilderDrivenDistSQLSpecCreationClusterMode = settings.RegisterBoolSetting(

This feels unnecessarily long. How about execBuilderBuildDistSQLSpecsDirectly?


pkg/sql/exec_util.go, line 919 at r3 (raw file):

	var rec distRecommendation
	if plan.physPlan != nil {
		rec = plan.recommendation

Where is this recommendation coming from?


pkg/sql/execbuilder_distsql_spec_factory.go, line 24 at r1 (raw file):

)

type execBuilderDistSQLSpecFactory struct {

Not sure if it makes sense to have exec builder in this name and the filename. How about distSQLSpecExecFactory and s/execbuilder/opt in the filename?


pkg/sql/execbuilder_distsql_spec_factory.go, line 32 at r1 (raw file):

func makeExecBuilderDistSQLFactory(allowAutoCommit bool) execBuilderDistSQLSpecFactory {
	return execBuilderDistSQLSpecFactory{
		allowAutoCommit: allowAutoCommit,

Why add this if it's not used?


pkg/sql/explain_distsql.go, line 154 at r3 (raw file):

	var physPlan PhysicalPlan
	if n.plan.main.physPlan != nil {
		physPlan = *n.plan.main.physPlan

What happens if we have subqueries in this case?


pkg/sql/plan.go, line 323 at r3 (raw file):

// use either planNode or DistSQL spec representation, but eventually will be
// replaced by the latter representation directly.
type planMaybePhysical struct {

What made you choose to do this instead of creating a new planNode that is a wrapper over a *PhysicalPlan? I think it would probably require less code changes. The distsql planning code could then check for that and unwrap the physical plan.


pkg/sql/plan_opt.go, line 182 at r1 (raw file):

		execFactory = &factory
	} else {
		factory := makeExecFactory(p)

I wonder why we don't just s/make/new and return a pointer directly


pkg/sql/plan_opt.go, line 184 at r3 (raw file):

	// TODO(yuzefovich): update this once we support more than just creating
	// table reader specs in execbuilder.
	isSelectStmt := strings.HasPrefix(strings.ToLower(p.stmt.SQL), "select")

I think it would be better to look at p.stmt.AST


pkg/sql/plan_opt.go, line 185 at r3 (raw file):

	// table reader specs in execbuilder.
	isSelectStmt := strings.HasPrefix(strings.ToLower(p.stmt.SQL), "select")
	if execBuilderDrivenDistSQLSpecCreationClusterMode.Get(&p.execCfg.Settings.SV) && isSelectStmt {

Do we want to add a mode for this cluster setting similar to vectorize_always that uses this factory regardless of support?


pkg/sql/schema_changer.go, line 240 at r3 (raw file):

		// TODO(yuzefovich): update this once we support more than just
		// creating table reader specs in execbuilder.

I think this TODO needs to be reworded (and any similar ones) because "once we support more" is not clear. Does doing this require support for all specs? If so, we should probably link the issue.


pkg/sql/logictest/testdata/logic_test/experimental_distsql_planning, line 1 at r1 (raw file):

# LogicTest: !3node-tenant

What is the failure with 3node-tenant?


pkg/sql/logictest/testdata/logic_test/experimental_distsql_planning, line 8 at r1 (raw file):


statement ok
SET experimental_distsql_planning = true

Maybe also assert that the new factory is used by checking that an error is returned for an unsupported statement


pkg/sql/logictest/testdata/logic_test/experimental_distsql_planning, line 14 at r1 (raw file):


statement ok
SET CLUSTER SETTING sql.defaults.experimental_distsql_planning.enabled = true

weird chars at the end of the line

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was happening (in plan_opt.go), but I moved the claim in the first commit.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)


docs/generated/settings/settings.html, line 48 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Why is only this cluster setting removed from the file? I also feel like this commit should probably be its own PR.

I didn't regenerate the settings on the first commit on its own, fixed.

Extracted the second commit into a separate PR.


pkg/sql/distsql_running.go, line 900 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Instead of having this if else in a lot of these places, could you create a wrapper over createPhysPlanForPlanNode that either unwraps the plan as a planNode, or returns the physical plan?

Good point, done.


pkg/sql/exec_util.go, line 257 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

This feels unnecessarily long. How about execBuilderBuildDistSQLSpecsDirectly?

Renamed it to be inline with the actual setting name.


pkg/sql/exec_util.go, line 919 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Where is this recommendation coming from?

Currently, checkSupportForNode and createPlanForNode return the distribution recommendation and the physical plan respectively, but I envision that the new distsql exec factory will populate both at the same time - it will make the recommendation while constructing distsql specs. This is still TBD.


pkg/sql/explain_distsql.go, line 154 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

What happens if we have subqueries in this case?

The error should occur earlier, in the execbuilder. Left a TODO as a reminder.


pkg/sql/plan.go, line 323 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

What made you choose to do this instead of creating a new planNode that is a wrapper over a *PhysicalPlan? I think it would probably require less code changes. The distsql planning code could then check for that and unwrap the physical plan.

To me such an approach breaks an assumption that a planNode represents a single node in the tree because this new wrapper plan node would contain the whole physical plan at once, i.e. it would be a "meta" node.

I guess we could change the perspective a little and, instead, introduce a "plan-node-that-might-be-distsql-processor-spec" which would be the distsql spec for a single plan node and not the whole tree, but I don't know whether it'll be better.

This is one of those things that I learn by doing, and we're still laying out the plumbing before doing some meaningful work. From my point of view, it should be ok to merge something that moves us forward in the prototyping of this experimental planning (and doesn't break the normal flow), and we can iterate on the representation later.


pkg/sql/plan_opt.go, line 182 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I wonder why we don't just s/make/new and return a pointer directly

Good point, done.


pkg/sql/plan_opt.go, line 184 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I think it would be better to look at p.stmt.AST

Done.


pkg/sql/plan_opt.go, line 185 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Do we want to add a mode for this cluster setting similar to vectorize_always that uses this factory regardless of support?

I also thought about this, but the difficulty is if you set your cluster setting to always or whatever, you won't be able to change that easily. I tried catching SET statements and use the old factory for that, but internally we use INSERT into a system table (I guess) and maybe other statements, so it seemed pretty hard.

That is why (at least for now) I decided to not implement the fallback from the new factory to the old one if the former doesn't support the statement, but apply the heuristic to see whether the statement is a SELECT.


pkg/sql/schema_changer.go, line 240 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I think this TODO needs to be reworded (and any similar ones) because "once we support more" is not clear. Does doing this require support for all specs? If so, we should probably link the issue.

Removed this particular TODO and updated others.


pkg/sql/logictest/testdata/logic_test/experimental_distsql_planning, line 1 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

What is the failure with 3node-tenant?

I don't know, something was hanging, but I don't see those hangs anymore. Removed.


pkg/sql/logictest/testdata/logic_test/experimental_distsql_planning, line 8 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Maybe also assert that the new factory is used by checking that an error is returned for an unsupported statement

Done. I originally added this check in the last commit.


pkg/sql/logictest/testdata/logic_test/experimental_distsql_planning, line 14 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

weird chars at the end of the line

I think it was a newline missing at the end of the file.


pkg/sql/execbuilder_distsql_spec_factory.go, line 24 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Not sure if it makes sense to have exec builder in this name and the filename. How about distSQLSpecExecFactory and s/execbuilder/opt in the filename?

Renamed the struct and the file.

My initial thinking for the filename was that eventually the file should live in execbuilder package, so I used it as a prefix, but I'm not sure why I didn't create the file in that package right away, moved. Possibly we will need to move the file back, but we can do so later.


pkg/sql/execbuilder_distsql_spec_factory.go, line 32 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Why add this if it's not used?

Removed.

@yuzefovich yuzefovich force-pushed the distsql branch 2 times, most recently from 90dbd48 to 744d067 Compare May 28, 2020 22:44
@RaduBerinde
Copy link
Member

CC @rytaft

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)


pkg/sql/execbuilder_distsql_spec_factory.go, line 24 at r1 (raw file):

Previously, yuzefovich wrote…

Renamed the struct and the file.

My initial thinking for the filename was that eventually the file should live in execbuilder package, so I used it as a prefix, but I'm not sure why I didn't create the file in that package right away, moved. Possibly we will need to move the file back, but we can do so later.

I moved the file back into pkg/sql because currently opt_exec_factory is quite strongly coupled with that package, and I'm worried that moving the necessary pieces into a shared "base" package will be a little too much (for the moment).

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)


pkg/sql/exec_util.go, line 257 at r1 (raw file):

Previously, yuzefovich wrote…

Renamed it to be inline with the actual setting name.

I'm still debating on what the naming should be: currently I have "experimental distsql planning", and that implies not only the spec creation but also actual distsql physical planning, probably it's acceptable, but then the scope of the project increases. I don't think we'll get to the "physical planning" part in the near term, but it might make sense to keep this naming as an aspirational goal.


pkg/sql/plan_opt.go, line 185 at r3 (raw file):

Previously, yuzefovich wrote…

I also thought about this, but the difficulty is if you set your cluster setting to always of whatever, you won't be able to change that easily. I tried catching SET statements and use the old factory for that, but internally we use INSERT into a system table (I guess) and maybe other statements, so it seemed pretty hard.

That is why (at least for now) I decided to not implement the fallback from the new factory to the old one if the former doesn't support the statement, but apply the heuristic to see whether the statement is a SELECT.

I've actually just added this always mode and changed the setting to be more like what we have with vectorize after I did more prototyping.

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 20 of 20 files at r4, 14 of 14 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto and @yuzefovich)


pkg/sql/distsql_physical_planner.go, line 2295 at r5 (raw file):

) (physPlan PhysicalPlan, err error) {
	if plan.physPlan != nil {
		return *plan.physPlan, nil

Could we change createPhysPlan to return a pointer?


pkg/sql/exec_util.go, line 257 at r1 (raw file):

Previously, yuzefovich wrote…

I'm still debating on what the naming should be: currently I have "experimental distsql planning", and that implies not only the spec creation but also actual distsql physical planning, probably it's acceptable, but then the scope of the project increases. I don't think we'll get to the "physical planning" part in the near term, but it might make sense to keep this naming as an aspirational goal.

That's fine. What parts of physical planning are you referring to that we won't get to?


pkg/sql/exec_util.go, line 259 at r3 (raw file):

var execBuilderDrivenDistSQLSpecCreationClusterMode = settings.RegisterBoolSetting(
	"sql.defaults.experimental_distsql_planning.enabled",
	"if set to true, enables experimental DistSQL spec creation driven by the execbuilder",

This is useful help text, should we keep it in?


pkg/sql/exec_util.go, line 933 at r5 (raw file):

	} else {
		var err error
		rec, err = checkSupportForPlanNode(plan.planNode)

is it worth it to change checkSupport to take in a planMaybePhysical similar to createPlan?


pkg/sql/explain_distsql.go, line 79 at r5 (raw file):

	plan planMaybePhysical,
) bool {
	if plan.physPlan == nil {

I think it would be nice to abstract this check behind a method (e.g. isPhysicalPlan)


pkg/sql/explain_plan.go, line 299 at r5 (raw file):

	if plan.main.physPlan != nil {
		return errors.AssertionFailedf(
			"EXPLAIN of a query with opt-driven DistSQL planning is not supported",

nit: s/opt-driven/experimental to be consistent, otherwise we should rename experimental everywhere else to opt-driven experimental. Maybe that'd be better since it's more specific.


pkg/sql/plan.go, line 323 at r3 (raw file):

Previously, yuzefovich wrote…

To me such an approach breaks an assumption that a planNode represents a single node in the tree because this new wrapper plan node would contain the whole physical plan at once, i.e. it would be a "meta" node.

I guess we could change the perspective a little and, instead, introduce a "plan-node-that-might-be-distsql-processor-spec" which would be the distsql spec for a single plan node and not the whole tree, but I don't know whether it'll be better.

This is one of those things that I learn by doing, and we're still laying out the plumbing before doing some meaningful work. From my point of view, it should be ok to merge something that moves us forward in the prototyping of this experimental planning (and doesn't break the normal flow), and we can iterate on the representation later.

That makes sense.


pkg/sql/execbuilder_distsql_spec_factory.go, line 24 at r1 (raw file):

Previously, yuzefovich wrote…

I moved the file back into pkg/sql because currently opt_exec_factory is quite strongly coupled with that package, and I'm worried that moving the necessary pieces into a shared "base" package will be a little too much (for the moment).

Yeah, I think having it in the sql package is the right move. Not sure it'll ever belong in execbuilder TBH given the physical planning dependencies.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)


pkg/sql/distsql_physical_planner.go, line 2295 at r5 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Could we change createPhysPlan to return a pointer?

Done.


pkg/sql/exec_util.go, line 257 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

That's fine. What parts of physical planning are you referring to that we won't get to?

I meant making more "optimal" decisions in terms of distribution of work rather than simply transitioning the logic (and the heuristics) we already have in distsql_physical_planner.


pkg/sql/exec_util.go, line 259 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

This is useful help text, should we keep it in?

Done.


pkg/sql/exec_util.go, line 933 at r5 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

is it worth it to change checkSupport to take in a planMaybePhysical similar to createPlan?

I'm still debating about this - I think ideally we should be able to call ConstructPlan that will return planMaybePhysical with the distribution recommendation set, so we would be "checking support" while constructing at the same time, so I'm inclined to keep this as checkSupportForPlanNode as legacy code and attempt to move away from it.


pkg/sql/explain_distsql.go, line 79 at r5 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I think it would be nice to abstract this check behind a method (e.g. isPhysicalPlan)

Done.


pkg/sql/explain_plan.go, line 299 at r5 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

nit: s/opt-driven/experimental to be consistent, otherwise we should rename experimental everywhere else to opt-driven experimental. Maybe that'd be better since it's more specific.

Done.


pkg/sql/plan.go, line 323 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

That makes sense.

Done.


pkg/sql/execbuilder_distsql_spec_factory.go, line 24 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Yeah, I think having it in the sql package is the right move. Not sure it'll ever belong in execbuilder TBH given the physical planning dependencies.

I think eventually we should be able to pull that into execbuilder once the optimizer is aware of the nodes' physical placement.

@yuzefovich yuzefovich force-pushed the distsql branch 3 times, most recently from 2d5235f to ed4ff47 Compare June 2, 2020 03:28
Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 16 of 16 files at r6, 14 of 14 files at r7, 9 of 9 files at r8.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. Just taking a quick look at this now. I've looked at the first commit, checking the others now.

Some nits in the commit/PR message:

  • There's no such thing as opt.Tree.
  • I don't understand this sentence: "Currently the fallback doesn't occur with always only for SELECT statements (so that we could run other statements types, like SET)."
  • "logical -> "logical"
  • all processor spec -> all processor specs

Reviewed 3 of 20 files at r4, 16 of 16 files at r6.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto and @yuzefovich)


pkg/sql/exec_util.go, line 260 at r6 (raw file):

// experimentalDistSQLPlanningClusterMode can be used to enable
// optimizer-driven DistSQL planning that sidesteps intermediate planNode
// transition when going from opt.Tree to DistSQL processor specs.

[nit] opt.Tree does not exist


pkg/sql/exec_util.go, line 267 at r6 (raw file):

	map[int64]string{
		int64(sessiondata.ExperimentalDistSQLPlanningOff): "off",
		int64(sessiondata.ExperimentalDistSQLPlanningOn):  "on",

Seems weird to have a different set of allowed values for the cluster setting than the session setting. Why not include "always" here?


pkg/sql/opt_exec_factory.go, line 46 at r6 (raw file):

var _ exec.Factory = &execFactory{}

func makeExecFactory(p *planner) execFactory {

why did you change this?


pkg/sql/plan_opt.go, line 193 at r6 (raw file):

				// We use a simple heuristic to check whether the statement is
				// a SELECT, and the reasoning behind it is that we want to be
				// able to run certain statement types (e.g. SET) regardless of

This logic is a bit confusing. Why not just check if the statement is SET?

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 14 of 14 files at r7, 9 of 9 files at r8.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @asubiotto and @yuzefovich)

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed the nits in the commit messages.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @asubiotto and @rytaft)


pkg/sql/exec_util.go, line 260 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] opt.Tree does not exist

Done.


pkg/sql/exec_util.go, line 267 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Seems weird to have a different set of allowed values for the cluster setting than the session setting. Why not include "always" here?

I'm worried about the case of not supporting SET statements which would not allow to change the cluster setting from always easily once it is set. We had a similar issue with vectorize cluster setting, and if I remember correctly I had to do some manual system table manipulation to reset the cluster setting once it was set to experimental_always.


pkg/sql/opt_exec_factory.go, line 46 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

why did you change this?

I think there is a convention that if the method returns a value, then make is used, but if it returns a pointer, then new is used. We've changed the signature to return a pointer, so I made the corresponding change to the method name. If the question is why change it to return a pointer - Alfonso asked why not, and I didn't have an explanation. In fact, in two places where this method is called we already where taking the pointer from the return value when using it.


pkg/sql/plan_opt.go, line 193 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

This logic is a bit confusing. Why not just check if the statement is SET?

That was my first attempt, but execution of SET statement behind the scenes involves an INSERT into a system table (and possibly other statement types), and at least for now the goal is supporting kv workload, so restricting to SELECT in particular seems reasonable.

This commit adds a stub implementation of `exec.Factory` interface that
will be creating DistSQL processor specs directly from `opt.Expr`,
sidestepping intermediate `planNode` phase.

It also introduces a new private cluster setting
"sql.defaults.experimental_distsql_planning" as well as a session
variable "experimental_distsql_planning" which determine whether the new
factory is used, set to `off` by default (other options are `on` and
`always` - the latter is only for the session variable).

`Off` planning mode means using the old code path, `on` means attempting
to use the new code path but falling back to the old one if we encounter
an error, and `always` means using only the new code path and do not
fallback in case of an error. Currently the fallback doesn't occur with
`always` only for SELECT statements (so that we could run other
statements types, like SET), meaning that when`always` option is used,
if we encounter an unsupported node while planning a SELECT statement,
an error is returned, but if we encounter an unsupported node while
planning a statement other than SELECT, we still fallback to the old
code (in a sense `always` behaves exactly like `on` for all statement
types except for SELECTs).

Release note: None
This commit introduces `planMaybePhysical` utility struct that
represents a plan and uses either planNode ("logical") or DistSQL spec
("physical") representations. It will be removed once we are able to
create all processor specs in execbuilder directly. This struct has been
plumbed in all places that need to look at the plan, but in most of them
only the logical representation is supported. However, the main codepath
for executing a statement supports both, and if physical representation
is used, then we bypass distsql physical planner.

This commit also renames `checkSupportForNode` to
`checkSupportForPlanNode` and `createPlanForNode` to
`createPhysPlanForPlanNode` to make their purpose more clear.

Release note: None
This commit changes `createPlanFor*` methods to operate on pointers to
`PhysicalPlan` instead of values. It also renames and cleans up some
explain-related utility methods.

Release note: None
@yuzefovich
Copy link
Member Author

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Jun 2, 2020

Build succeeded

@craig craig bot merged commit 464e5c4 into cockroachdb:master Jun 2, 2020
@yuzefovich yuzefovich deleted the distsql branch June 2, 2020 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants