Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colexec: clean up memory infra on panic in NewColOperator #44564

Merged
merged 1 commit into from Feb 4, 2020

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Jan 30, 2020

It is possible that an unexpected error occurs during NewColOperator
call after we have created some memory monitoring infrastructure. In
order to correctly clean up the infra in such scenario we will now have
a separate panic catcher that closes the infra within NewColOperator.
We will also now clean up the infra when an error occurs.

Release note: None

@yuzefovich yuzefovich requested review from asubiotto and a team January 30, 2020 21:46
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense to do this panic catching and cleanup inside NewColOperator (and any cleanup as well), seems like the current abstraction is a bit leaky if the caller is expected to clean up state it doesn't know about.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)

@yuzefovich yuzefovich changed the title colflow: wrap NewColOperator call with a panic catcher colexec: clean up memory infra on panic in NewColOperator Feb 3, 2020
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/colexec/execplan.go, line 362 at r1 (raw file):

) (result NewColOperatorResult, err error) {
	// Make sure that we clean up memory monitoring infrastructure in case of a
	// panic.

It'll also be good to do this on error so that the caller doesn't have to

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)


pkg/sql/colexec/execplan.go, line 362 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

It'll also be good to do this on error so that the caller doesn't have to

Hm, I don't really like this idea but not sure why. I feel like the caller will already have the code to clean up the infra (for the non-error case), so it might as well clean up the infra in the error case.

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/colexec/execplan.go, line 362 at r1 (raw file):

Previously, yuzefovich wrote…

Hm, I don't really like this idea but not sure why. I feel like the caller will already have the code to clean up the infra (for the non-error case), so it might as well clean up the infra in the error case.

I think that the caller should never have to worry about intermediate state that wasn't cleaned up on error when calling a function. Up to you, since we already have code that does that

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)


pkg/sql/colexec/execplan.go, line 362 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I think that the caller should never have to worry about intermediate state that wasn't cleaned up on error when calling a function. Up to you, since we already have code that does that

Alright, this sounds reasonable, done. I think this will work, right?

It is possible that an unexpected error occurs during NewColOperator
call after we have created some memory monitoring infrastructure. In
order to correctly clean up the infra in such scenario we will now have
a separate panic catcher that closes the infra within NewColOperator.
We will also now clean up the infra when an error occurs.

Release note: None
Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/colexec/execplan.go, line 362 at r1 (raw file):

Previously, yuzefovich wrote…

Alright, this sounds reasonable, done. I think this will work, right?

:lgtm:, maybe we can remove the cleanup in the code that calls setupVectorized?

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r+

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)


pkg/sql/colexec/execplan.go, line 362 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

:lgtm:, maybe we can remove the cleanup in the code that calls setupVectorized?

Which code are you referring to? Maybe cleaning up in SupportsVectorized? In any case, I don't think we can remove any code because the caller is still responsible for clean up in non-error case, the only thing I can think of is that we could now add an if err == nil to it, but I don't think it'd help in any way.

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/colexec/execplan.go, line 362 at r1 (raw file):

Previously, yuzefovich wrote…

Which code are you referring to? Maybe cleaning up in SupportsVectorized? In any case, I don't think we can remove any code because the caller is still responsible for clean up in non-error case, the only thing I can think of is that we could now add an if err == nil to it, but I don't think it'd help in any way.

I was thinking of setupVectorized, which still does stuff if err != nil (if err == nil the cleanup happens on flow.Cleanup())

@yuzefovich
Copy link
Member Author

Oh, I see. I think it should be possible.

bors r-

@craig
Copy link
Contributor

craig bot commented Feb 4, 2020

Canceled

@yuzefovich
Copy link
Member Author

Actually, I take that back - we're creating memory infra for non-operators as well (hash router, synchronizers), and that happens outside of NewColOperator, so we still need to clean up all the infra even in non-error case.

bors r+

craig bot pushed a commit that referenced this pull request Feb 4, 2020
44564: colexec: clean up memory infra on panic in NewColOperator r=yuzefovich a=yuzefovich

It is possible that an unexpected error occurs during NewColOperator
call after we have created some memory monitoring infrastructure. In
order to correctly clean up the infra in such scenario we will now have
a separate panic catcher that closes the infra within NewColOperator.
We will also now clean up the infra when an error occurs.

Release note: None

44615: storage: remove preemptive snapshots r=tbg a=ajwerner

This commit adds a migration to remove any remaining on-disk pre-emptive
snapshots and then excises the concept from the codebase.

There's some cruft in replica_init that I'd love to clean up but don't
have a clear vision of how.

Release note: None

44660: colexec: reject CASE operator with Bytes output type r=yuzefovich a=yuzefovich

Currently, there is a contradiction between the way CASE operator
works (which populates its output in arbitrary order) and the flat
bytes implementation of Bytes type (which prohibits sets in arbitrary
order), so we reject such scenario to fall back to row-by-row engine.

Fixes: #44624.

Release note: None

44687: cli: add system.namespace_deprecated to debug zip r=knz a=otan

As we are migrating system.namespace from system.namespace_deprecated in
20.2, it may be useful to have a zip of this old table, in case it comes
in handy for debugging in future.

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Feb 4, 2020

Build succeeded

@craig craig bot merged commit 89834b7 into cockroachdb:master Feb 4, 2020
@yuzefovich yuzefovich deleted the fix-execplan-infra branch February 4, 2020 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants