Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: do not fallback to old namespace table for non-public schema #51359

Merged
merged 1 commit into from Aug 5, 2020

Conversation

arulajmani
Copy link
Collaborator

Previously, GetObjectNames would fallback to the old namespace table
unconditionally. This was incorrect when looking for objects under a
schema which wasn't public, as entries in the old namespace table are
implied to exist under the public schema. This manifested itself in
unintended scenarios, such as #51219, where we would try to delete all
objects constructed before 20.1 upgrade -- erroneously assuming they
are temporary objects.

This PR fixes that by ensuring the fallback logic only applies to
the public schema name case. Additionally this patch adds tests
to guard against the aforementioned bug.

Closes #51358

Release note (bug fix): A bug with temporary object cleaner was stuck
trying to remove objects that it mistakenly thought were temporary is
fixed. Note that no persistent data was deleted -- the temporary
cleaner simply errored out because it thought certain persistent data
was temporary.

@arulajmani arulajmani requested a review from rohany July 13, 2020 00:20
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@@ -272,3 +328,47 @@ func TestTemporarySchemaDropDatabase(t *testing.T) {
assert.Equal(t, 0, tempObjectCount)
}
}

// ensureTemporaryObjectsAreDeleted ensures all the tempNames have been deleted.
// This can take a longer amount of time if the job takes time/ lease doesn't
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use lease.TestingDisableTableLeases() to avoid this

// will only be present in the older system.namespace. To account for this
// scenario, we must do this filtering logic.
// TODO(solon): This complexity can be removed in 20.2.
if scName == tree.PublicSchema {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be clearer to just return early if scName != tree.PublicSchema?

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR. Should be RFAL

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rohany)


pkg/sql/temporary_schema_test.go, line 333 at r1 (raw file):

Previously, rohany (Rohan Yadav) wrote…

I think you can use lease.TestingDisableTableLeases() to avoid this

Is in the function fine or does this need to happen during test setup? I've done it in the function for now, let me know if this needs changing.


pkg/sql/catalog/catalogkv/physical_accessor.go, line 144 at r1 (raw file):

Previously, rohany (Rohan Yadav) wrote…

would it be clearer to just return early if scName != tree.PublicSchema?

Done.

@arulajmani arulajmani force-pushed the name-resolution-scan-objects branch from a3ac044 to ca587ee Compare July 13, 2020 19:54
@arulajmani arulajmani requested a review from rohany July 13, 2020 19:55
func ensureTemporaryObjectsAreDeleted(
ctx context.Context, t *testing.T, conn *sql.Conn, schemaName string, tempNames []string,
) {
lease.TestingDisableTableLeases()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put this in the test, not the helper -- you have to defer lease.TestingDisableTableLeases()() so that it is reset after the test.

) {
lease.TestingDisableTableLeases()
for _, name := range tempNames {
testutils.SucceedsSoon(t, func() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you still need this succeeds soon then?

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @otan, and @rohany)


pkg/sql/temporary_schema_test.go, line 340 at r2 (raw file):

Previously, rohany (Rohan Yadav) wrote…

do you still need this succeeds soon then?

I guess I do because the job might take time to succeed, but I'll cc @otan because he wrote this initially, I just moved it into a function.

@rohany
Copy link
Contributor

rohany commented Jul 14, 2020

If you need the succeeds soon anyway, then we can avoid the lease disabling all together

@otan
Copy link
Contributor

otan commented Jul 14, 2020


pkg/sql/temporary_schema_test.go, line 340 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

I guess I do because the job might take time to succeed, but I'll cc @otan because he wrote this initially, I just moved it into a function.

iirc it was flaking without this, but with leasing disabled maybe not.
(why are we testing with leasing disabled?)

@arulajmani arulajmani force-pushed the name-resolution-scan-objects branch from ca587ee to 70f9146 Compare July 14, 2020 23:04
Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do need the succeeds soon stuff regardless -- removed the lease disabling. RFAL

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rohany)


pkg/sql/temporary_schema_test.go, line 338 at r2 (raw file):

Previously, rohany (Rohan Yadav) wrote…

put this in the test, not the helper -- you have to defer lease.TestingDisableTableLeases()() so that it is reset after the test.

Done.


pkg/sql/temporary_schema_test.go, line 340 at r2 (raw file):

Previously, otan (Oliver Tan) wrote…

iirc it was flaking without this, but with leasing disabled maybe not.
(why are we testing with leasing disabled?)

Tried removing it and I'm getting some flakes because the job hasn't completed -- so I think we do need the succeed soon stuff here regardless of the defer lease.TestingDisableTableLeases()()

@arulajmani arulajmani requested a review from rohany July 14, 2020 23:16
@rohany
Copy link
Contributor

rohany commented Jul 14, 2020

Lgtm

@rohany
Copy link
Contributor

rohany commented Jul 14, 2020

Actually, this seems weird that we need the succeeds soon — the job executes serially and we are just running drop statements in the job.

@otan
Copy link
Contributor

otan commented Jul 14, 2020


pkg/sql/temporary_schema_test.go, line 340 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Tried removing it and I'm getting some flakes because the job hasn't completed -- so I think we do need the succeed soon stuff here regardless of the defer lease.TestingDisableTableLeases()()

i think the JOB to actually run the schema changes isn't run serially before this statement

@arulajmani arulajmani force-pushed the name-resolution-scan-objects branch from 70f9146 to 8702dca Compare July 15, 2020 23:51
Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got TestingDisableTableLeases to pass on stress.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rohany)

@otan
Copy link
Contributor

otan commented Aug 5, 2020

should this be movin'?

@arulajmani
Copy link
Collaborator Author

@rohany had some questions over why we need the succeeds soon/disable leases in the first place. Do we want to block this until then or maybe put a TODO/open an issue and address later?

@rohany
Copy link
Contributor

rohany commented Aug 5, 2020

Putting a TODO and an opening an issue to investigate is fine. I didn't want to hold up on this bugfix + backport.

Previously, `GetObjectNames` would fallback to the old namespace table
unconditionally. This was incorrect when looking for objects under a
schema which wasn't `public`, as entries in the old namespace table are
implied to exist under the `public` schema. This manifested itself in
unintended scenarios, such as cockroachdb#51219, where we would try to delete all
objects constructed before 20.1 upgrade -- erroneously assuming they
are temporary objects.

This PR fixes that by ensuring the fallback logic only applies to
the `public` schema name case. Additionally this patch adds tests
to guard against the aforementioned bug.

Closes cockroachdb#51358

Release note (bug fix): A bug with temporary object cleaner was stuck
trying to remove objects that it mistakenly thought were temporary is
fixed. Note that no persistent data was deleted -- the temporary
cleaner simply errored out because it thought certain persistent data
was temporary.
@arulajmani
Copy link
Collaborator Author

Opened #52412 and added a TODO for myself. RFAL

@arulajmani
Copy link
Collaborator Author

Thanks for the reviews.

bors r=rohany

@craig
Copy link
Contributor

craig bot commented Aug 5, 2020

Build succeeded:

@craig craig bot merged commit 3b193fe into cockroachdb:master Aug 5, 2020
@otan
Copy link
Contributor

otan commented Aug 11, 2020

did we backport this?

@arulajmani
Copy link
Collaborator Author

Thanks for the ping, it totally slipped me. Opened a backport rn

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: GetObjectNames should not fallback to the old namespace table when the schema is not public
4 participants