[YSQL] Keyspace already exists issue caused by OID collision + X #16441

Closed
1 task done
yifanguan opened this issue Mar 15, 2023 · 2 comments
Labels: area/ysql, kind/bug, priority/medium

Comments

yifanguan commented Mar 15, 2023

Jira Link: DB-5849

Description

This is another side effect of the OID collision issue #16130 + X.
First, it looks like we never clean up namespace_ids_map_ after deleting a YSQL database.
In catalog_manager.cc:

// Remove namespace from CatalogManager name mapping.  Will remove ID map after all Tables gone.
  {
    LockGuard lock(mutex_);
    if (namespace_names_mapper_[database->database_type()].erase(database->name()) < 1) {
      LOG(WARNING) << Format("Could not remove namespace from maps, name=$0, id=$1",
                             database->name(), database->id());
    }
  }
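
To make the suspected mechanism concrete, here is a toy model. It is not the real CatalogManager code: `ToyCatalog`, `CreateNamespace`, `DropNamespace`, and both member maps are hypothetical names, and the sketch assumes that a create is rejected whenever the namespace ID derived from the database OID is still present in the ID map.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for the two catalog maps; not the real CatalogManager.
class ToyCatalog {
 public:
  // Returns true on success. Returns false if either the ID or the name is
  // still registered, which the client would see as "already exists".
  bool CreateNamespace(const std::string& name, uint32_t oid) {
    if (ids_.count(oid) > 0 || names_.count(name) > 0) {
      return false;
    }
    ids_[oid] = name;
    names_[name] = oid;
    return true;
  }

  // Mirrors the snippet above: only the name mapping is erased, so the
  // entry in the ID map stays behind after the drop.
  void DropNamespace(const std::string& name) {
    names_.erase(name);
  }

 private:
  std::map<uint32_t, std::string> ids_;    // models namespace_ids_map_
  std::map<std::string, uint32_t> names_;  // models namespace_names_mapper_
};
```

In this model, creating qqq with OID 16384 and ooo with OID 16385 succeeds, dropping qqq erases only the name entry, and a later create of ppp with OID 16384 is rejected even though no database named ppp exists. A create of ppp with an unused OID goes through.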

A small repro on a fresh cluster illustrates the issue (the trailing comments show the OID assigned to each new database):

connection 1 on node1:
\c yugabyte
CREATE DATABASE qqq; -- 16384
CREATE DATABASE ooo; -- 16385
DROP DATABASE qqq;

connection 2 on node2:
\c ooo
CREATE DATABASE ppp; -- 16384
ERROR:  Keyspace 'ppp' already exists

The workaround is to retry the failed CREATE DATABASE statement.
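
The retry can be sketched as a small client-side helper. This is only an illustration: `ExecResult`, `IsKeyspaceExistsError`, `RetryOnKeyspaceExists`, and the `run_statement` callback are hypothetical names, and matching on the error text assumes the message looks like the one in the repro above.

```cpp
#include <functional>
#include <string>

// Outcome of one statement attempt; `error` is empty on success.
struct ExecResult {
  bool ok;
  std::string error;
};

// Assumption: the collision surfaces as "Keyspace '<name>' already exists",
// as in the repro. Other "already exists" errors are not matched.
bool IsKeyspaceExistsError(const std::string& error) {
  return error.find("Keyspace '") != std::string::npos &&
         error.find("already exists") != std::string::npos;
}

// Runs `run_statement` up to `max_attempts` times. Only the keyspace
// collision error is retried; any other failure is returned immediately.
ExecResult RetryOnKeyspaceExists(const std::function<ExecResult()>& run_statement,
                                 int max_attempts) {
  ExecResult result{false, "no attempt made"};
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    result = run_statement();
    if (result.ok || !IsKeyspaceExistsError(result.error)) {
      return result;
    }
  }
  return result;
}
```

The callback would wrap whatever client actually executes the CREATE DATABASE statement. Bounding the attempts keeps the helper from looping forever if the error turns out to have a different cause.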

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.

Arjun-yb commented Apr 18, 2023

@yifanguan
In 2.18.0.0-b14, we observed:

Failed to execute task {"sleepAfterMasterRestartMillis":180000,"sleepAfterTServerRestartMillis":180000,"nodeExporterUser":"prometheus","universeUUID":"c01f90e1-c553-496e-bc74-3572fd536dc6","enableYbc":false,"installYbc":false,"ybcInstalled":false,"encryptionAtRestConfig":{"encryptionAtRestEnabled":false,"opType":"UNDEFINED","type":"DATA_KEY"},"communicationPorts":{"masterHttpPort":7000,"masterRpcPort":7100,"tserverHttpPort":9000,"tserverRpcPort":9100,"ybControllerHttpPort":14000,"ybControllerrRpcPort":18018,"redisS..., hit error:

Task id 0ba4d5ed-0073-4b49-846b-0f7099070afd_PGSQL_TABLE_TYPE_non_colocated_db status: Failed with error COMMAND_FAILED.

in the testysqltspcmvandbr test:
https://jenkins.dev.yugabyte.com/view/Test%20Jobs/job/itest-system-developer/5177/artifact/logs/2.18.0.0_testysqltspcmvandbr-aws-rf3_20230417_111840/testysqltspcmvandbr-aws-rf3pvlzfvfs.log
Raised issue: #16642
cc: @kripasreenivasan @zlareb1-yb

yifanguan added a commit that referenced this issue Jul 20, 2023
Summary:
OID collision is a general issue for YSQL.
In vanilla PG, OIDs are assigned by a cluster-wide counter.
YSQL, however, allocates OIDs differently: OIDs are allocated at the per-database level, and the OIDs allocated on a tserver are shared by all databases.
An OID collision therefore happens when the same range of OIDs is allocated to, and shared by, different tservers.

This diff resolves the OID collision issue for YSQL CREATE DATABASE by retrying CREATE DATABASE when an OID collision happens.
A more general fix for the OID collision issue will be completed in a future diff.
Jira: DB-5849

Test Plan:
./yb_build.sh --cxx-test pg_libpq-test --gtest_filter PgLibPqTest.RetryCreateDatabasePgOidCollisionFromTservers
Jenkins: urgent

Reviewers: tverona, myang, zdrudi

Reviewed By: myang, zdrudi

Subscribers: ybase, yql, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D27004
@yifanguan

Resolved by commit 1c781b7

yifanguan added a commit that referenced this issue Jul 20, 2023
…ion happens

Summary:
OID collision is a general issue for YSQL.
In vanilla PG, OIDs are assigned by a cluster-wide counter.
YSQL, however, allocates OIDs differently: OIDs are allocated at the per-database level, and the OIDs allocated on a tserver are shared by all databases.
An OID collision therefore happens when the same range of OIDs is allocated to, and shared by, different tservers.

This diff resolves the OID collision issue for YSQL CREATE DATABASE by retrying CREATE DATABASE when an OID collision happens.
A more general fix for the OID collision issue will be completed in a future diff.

Original commit: 1c781b7 / D27004
Jira: DB-5849

Test Plan:
./yb_build.sh --cxx-test pg_libpq-test --gtest_filter PgLibPqTest.RetryCreateDatabasePgOidCollisionFromTservers
Jenkins: urgent

Reviewers: tverona, myang, zdrudi

Reviewed By: myang

Subscribers: ybase, bogdan, yql

Differential Revision: https://phorge.dev.yugabyte.com/D27114
yifanguan added a commit that referenced this issue Jul 27, 2023
…n happens

Summary:
OID collision is a general issue for YSQL.
In vanilla PG, OIDs are assigned by a cluster-wide counter.
YSQL, however, allocates OIDs differently: OIDs are allocated at the per-database level, and the OIDs allocated on a tserver are shared by all databases.
An OID collision therefore happens when the same range of OIDs is allocated to, and shared by, different tservers.

This diff resolves the OID collision issue for YSQL CREATE DATABASE by retrying CREATE DATABASE when an OID collision happens.
A more general fix for the OID collision issue will be completed in a future diff.
Jira: DB-5849

Original commit: 1c781b7 / D27004

Test Plan: ./yb_build.sh --cxx-test pg_libpq-test --gtest_filter PgLibPqTest.RetryCreateDatabasePgOidCollisionFromTservers

Reviewers: tverona, myang, zdrudi

Reviewed By: zdrudi

Subscribers: ybase, yql, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D27247