Skip to content

Commit

Permalink
[KYUUBI #3968] Upgrading and migration script for Jdbc
Browse files Browse the repository at this point in the history
### _Why are the changes needed?_

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #4022 from lightning-L/kyuubi-3968.

Closes #3968

8560a2f [lightning_L] Merge pull request #2 from turboFei/kyuubi-3968
5f76107 [fwang12] follow up
7f6cb1b [lightning_L] Merge pull request #1 from turboFei/kyuubi-3968
cc0d6cb [lightning_L] Merge branch 'apache:master' into kyuubi-3968
46ea82e [fwang12] nit
11b1f8c [fwang12] follow up
54fa3df [Tianlin Liao] fix NPE when folder does not exist
0353e70 [Tianlin Liao] update regex
a475f7b [Tianlin Liao] list all metadata store sql files and use the one with the largest version number
043b43b [Tianlin Liao] fix license
871b60e [Tianlin Liao] fix
ece1f60 [Tianlin Liao] fix
40831c5 [Tianlin Liao] [KYUUBI #3968] Upgrading and migration script for Jdbc

Lead-authored-by: Tianlin Liao <tiliao@ebay.com>
Co-authored-by: lightning_L <tianlinliao@163.com>
Co-authored-by: fwang12 <fwang12@ebay.com>
Signed-off-by: fwang12 <fwang12@ebay.com>
  • Loading branch information
3 people committed Jan 7, 2023
1 parent 2a6b8ec commit daa8d45
Show file tree
Hide file tree
Showing 14 changed files with 329 additions and 20 deletions.
2 changes: 2 additions & 0 deletions .rat-excludes
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,8 @@ build/scala-*/**
**/org/apache/kyuubi/ui/swagger/**
**/org.apache.spark.status.AppHistoryServerPlugin
**/metadata-store-schema*.sql
**/*.derby.sql
**/*.mysql.sql
**/pnpm-lock.yaml
**/node_modules/**
**/gen/*
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
ALTER TABLE metadata ALTER COLUMN session_type varchar(32) NOT NULL;
ALTER TABLE metadata ALTER COLUMN real_user varchar(255) NOT NULL;
ALTER TABLE metadata ALTER COLUMN user_name varchar(255) NOT NULL;
ALTER TABLE metadata ALTER COLUMN ip_address varchar(128);
ALTER TABLE metadata ALTER COLUMN engine_type varchar(32) NOT NULL;
ALTER TABLE metadata ALTER COLUMN engine_state varchar(32);

DROP INDEX metadata_kyuubi_instance_index;
98 changes: 98 additions & 0 deletions kyuubi-server/src/main/resources/sql/derby/README
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

Kyuubi MetaStore Upgrade HowTo
============================

This document describes how to upgrade the schema of a Derby backed
Kyuubi MetaStore instance from one release version of Kyuubi to another
release version of Kyuubi. For example, by following the steps listed
below it is possible to upgrade a Kyuubi 1.6.0 MetaStore schema to a
Kyuubi 1.7.0 MetaStore schema. Before attempting this project we
strongly recommend that you read through all of the steps in this
document and familiarize yourself with the required tools.

MetaStore Upgrade Steps
=======================

1) Shutdown your MetaStore instance and restrict access to the
MetaStore's Derby database. It is very important that no one else
accesses or modifies the contents of database while you are
performing the schema upgrade.

2) Create a backup of your Derby metastore database. This will allow
you to revert any changes made during the upgrade process if
something goes wrong. The easiest way of accomplishing this task is
by creating a copy of the directory containing your Derby database.

3) Dump your MetaStore database schema to a file using Derby's dblook utility:

% dblook -d <derby_db_url> -z "APP" > my-schema-x.y.z.derby.sql

Note that "APP" is Derby's default schema for user-created catalog
objects.


4) The schema upgrade scripts assume that the schema you are upgrading
closely matches the official schema for your particular version of
Kyuubi. The files in this directory with names like
"metadata-store-schema-x.y.z.derby.sql" contain dumps of the official schemas
corresponding to each of the released versions of Kyuubi. You can
determine differences between your schema and the official schema
by comparing the contents of the official dump with the schema dump
you created in the previous step. Note that due to a bug in Derby
the order in which the DDL statements appear is non-deterministic,
so simply diffing the two dumps is unlikely to result in useable
results. A simple workaround for this problem is to compare sorted
versions of the two schema dumps, e.g:

% sort metadata-store-schema-x.y.z.derby.sql > metadata-store-schema-x.y.z.derby.sql.sorted
% sort my-schema-x.y.z.derby.sql > my-schema-x.y.z.derby.sql.sorted
% diff metadata-store-schema-x.y.z.derby.sql.sorted my-schema-x.y.z.derby.sql.sorted

Some differences are acceptable and will not interfere
with the upgrade process, but others need to be resolved manually
or the upgrade scripts will fail to complete.

5) You are now ready to run the schema upgrade scripts. If you are
upgrading from Kyuubi 1.6.0 to Kyuubi 1.7.0 you need to run the
upgrade-1.6.0-to-1.7.0.derby.sql script, but if you are upgrading
from 1.6.0 to 1.8.0 you will need to run the 1.6.0 to 1.7.0 upgrade
script followed by the 1.7.0 to 1.8.0 upgrade script.

NOTE: You may need to install the Derby 'ij' utility.
Look here for installation instructions:
http://db.apache.org/derby/docs/10.4/getstart/

% ij
ij version 10.4
ij> CONNECT 'jdbc:derby:/Users/bob/hive/metastore_db';
ij> RUN 'upgrade-1.6.0-to-1.7.0.derby.sql';
ij> RUN 'upgrade-1.7.0-to-1.8.0.derby.sql';
ij> quit;

These scripts should run to completion without any errors. If you
do encounter errors you need to analyze the cause and attempt to
trace it back to one of the preceding steps.

6) The final step of the upgrade process is validating your freshly
upgraded schema against the official schema for your particular
version of Kyuubi. This is accomplished by repeating steps (3) and
(4), but this time comparing against the official version of the
upgraded schema, e.g. if you upgraded the schema to Kyuubi 1.7.0 then
you will want to compare your schema dump against the contents of
metadata-store-schema-1.7.0.derby.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
-- Derby does not support `CREATE TABLE IF NOT EXISTS`

-- the metadata table ddl

CREATE TABLE metadata(
key_id bigint PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY, -- the auto increment key id
identifier varchar(36) NOT NULL, -- the identifier id, which is an UUID
session_type varchar(128) NOT NULL, -- the session type, SQL or BATCH
real_user varchar(1024) NOT NULL, -- the real user
user_name varchar(1024) NOT NULL, -- the user name, might be a proxy user
ip_address varchar(512), -- the client ip address
kyuubi_instance varchar(1024) NOT NULL, -- the kyuubi instance that creates this
state varchar(128) NOT NULL, -- the session state
resource varchar(1024), -- the main resource
class_name varchar(1024), -- the main class name
request_name varchar(1024), -- the request name
request_conf clob, -- the request config map
request_args clob, -- the request arguments
create_time BIGINT NOT NULL, -- the metadata create time
engine_type varchar(1024) NOT NULL, -- the engine type
cluster_manager varchar(128), -- the engine cluster manager
engine_id varchar(128), -- the engine application id
engine_name clob, -- the engine application name
engine_url varchar(1024), -- the engine tracking url
engine_state varchar(128), -- the engine application state
engine_error clob, -- the engine application diagnose
end_time bigint, -- the metadata end time
peer_instance_closed boolean default FALSE -- closed by peer kyuubi instance
);

CREATE INDEX metadata_kyuubi_instance_index ON metadata(kyuubi_instance);

CREATE UNIQUE INDEX metadata_unique_identifier_index ON metadata(identifier);

CREATE INDEX metadata_user_name_index ON metadata(user_name);

CREATE INDEX metadata_engine_type_index ON metadata(engine_type);
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
RUN '001-KYUUBI-3967.derby.sql';
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
SELECT '< KYUUBI-3967: Shorten column varchar length of metadata table >' AS ' ';

ALTER TABLE metadata MODIFY COLUMN session_type varchar(32) NOT NULL COMMENT 'the session type, SQL or BATCH';
ALTER TABLE metadata MODIFY COLUMN real_user varchar(255) NOT NULL COMMENT 'the real user';
ALTER TABLE metadata MODIFY COLUMN user_name varchar(255) NOT NULL COMMENT 'the user name, might be a proxy user';
ALTER TABLE metadata MODIFY COLUMN ip_address varchar(128) COMMENT 'the client ip address';
ALTER TABLE metadata MODIFY COLUMN engine_type varchar(32) NOT NULL COMMENT 'the engine type';
ALTER TABLE metadata MODIFY COLUMN engine_state varchar(32) COMMENT 'the engine application state';

DROP INDEX kyuubi_instance_index on metadata;
87 changes: 87 additions & 0 deletions kyuubi-server/src/main/resources/sql/mysql/README
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

Kyuubi MetaStore Upgrade HowTo
============================

This document describes how to upgrade the schema of a MySQL backed
Kyuubi MetaStore instance from one release version of Kyuubi to another
release version of Kyuubi. For example, by following the steps listed
below it is possible to upgrade a Kyuubi 1.6.0 MetaStore schema to a
Kyuubi 1.7.0 MetaStore schema. Before attempting this project we
strongly recommend that you read through all of the steps in this
document and familiarize yourself with the required tools.

MetaStore Upgrade Steps
=======================

1) Shutdown your MetaStore instance and restrict access to the
MetaStore's MySQL database. It is very important that no one else
accesses or modifies the contents of database while you are
performing the schema upgrade.

2) Create a backup of your MySQL metastore database. This will allow
you to revert any changes made during the upgrade process if
something goes wrong. The mysqldump utility is the easiest way to
create a backup of a MySQL database:

% mysqldump --opt <metastore_db_name> > metastore_backup.sql

Note that you may need also need to specify a hostname and username
using the --host and --user command line switches.

3) Dump your metastore database schema to a file. We use the mysqldump
utility again, but this time with a command line option that
specifies we are only interested in dumping the DDL statements
required to create the schema:

% mysqldump --skip-add-drop-table --no-data <metastore_db_name> > my-schema-x.y.z.mysql.sql

4) The schema upgrade scripts assume that the schema you are upgrading
closely matches the official schema for your particular version of
Kyuubi. The files in this directory with names like
"metadata-store-schema-x.y.z.mysql.sql" contain dumps of the official schemas
corresponding to each of the released versions of Kyuubi. You can
determine differences between your schema and the official schema
by diffing the contents of the official dump with the schema dump
you created in the previous step. Some differences are acceptable
and will not interfere with the upgrade process, but others need to
be resolved manually or the upgrade scripts will fail to complete.

5) You are now ready to run the schema upgrade scripts. If you are
upgrading from Kyuubi 1.6.0 to Kyuubi 1.7.0 you need to run the
upgrade-1.6.0-to-1.7.0.mysql.sql script, but if you are upgrading
from 1.6.0 to 1.8.0 you will need to run the 1.6.0 to 1.7.0 upgrade
script followed by the 1.7.0 to 1.8.0 upgrade script.

% mysql --verbose
mysql> use <metastore_db_name>;
Database changed
mysql> source upgrade-1.6.0-to-1.7.0.mysql.sql
mysql> source upgrade-1.7.0-to-1.8.0.mysql.sql

These scripts should run to completion without any errors. If you
do encounter errors you need to analyze the cause and attempt to
trace it back to one of the preceding steps.

6) The final step of the upgrade process is validating your freshly
upgraded schema against the official schema for your particular
version of Kyuubi. This is accomplished by repeating steps (3) and
(4), but this time comparing against the official version of the
upgraded schema, e.g. if you upgraded the schema to Kyuubi 1.7.0 then
you will want to compare your schema dump against the contents of
metadata-store-schema-1.7.0.mysql.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
-- the metadata table ddl

CREATE TABLE IF NOT EXISTS metadata(
key_id bigint PRIMARY KEY AUTO_INCREMENT COMMENT 'the auto increment key id',
identifier varchar(36) NOT NULL COMMENT 'the identifier id, which is an UUID',
session_type varchar(128) NOT NULL COMMENT 'the session type, SQL or BATCH',
real_user varchar(1024) NOT NULL COMMENT 'the real user',
user_name varchar(1024) NOT NULL COMMENT 'the user name, might be a proxy user',
ip_address varchar(512) COMMENT 'the client ip address',
kyuubi_instance varchar(1024) NOT NULL COMMENT 'the kyuubi instance that creates this',
state varchar(128) NOT NULL COMMENT 'the session state',
resource varchar(1024) COMMENT 'the main resource',
class_name varchar(1024) COMMENT 'the main class name',
request_name varchar(1024) COMMENT 'the request name',
request_conf mediumtext COMMENT 'the request config map',
request_args mediumtext COMMENT 'the request arguments',
create_time BIGINT NOT NULL COMMENT 'the metadata create time',
engine_type varchar(1024) NOT NULL COMMENT 'the engine type',
cluster_manager varchar(128) COMMENT 'the engine cluster manager',
engine_id varchar(128) COMMENT 'the engine application id',
engine_name mediumtext COMMENT 'the engine application name',
engine_url varchar(1024) COMMENT 'the engine tracking url',
engine_state varchar(128) COMMENT 'the engine application state',
engine_error mediumtext COMMENT 'the engine application diagnose',
end_time bigint COMMENT 'the metadata end time',
peer_instance_closed boolean default '0' COMMENT 'closed by peer kyuubi instance',
INDEX kyuubi_instance_index(kyuubi_instance),
UNIQUE INDEX unique_identifier_index(identifier),
INDEX user_name_index(user_name),
INDEX engine_type_index(engine_type)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
SELECT '< Upgrading MetaStore schema from 1.6.0 to 1.7.0 >' AS ' ';
SOURCE 001-KYUUBI-3967.mysql.sql;
SELECT '< Finished upgrading MetaStore schema from 1.6.0 to 1.7.0 >' AS ' ';
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ package org.apache.kyuubi.server.metadata.api
import org.apache.kyuubi.session.SessionType.SessionType

/**
* The metadata to store. It including three parts:
* The metadata store. It includes three parts:
* 1. session related metadata.
* 2. request related metadata.
* 3. engine related metadata.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,11 +17,13 @@

package org.apache.kyuubi.server.metadata.jdbc

import java.io.{BufferedReader, InputStream, InputStreamReader}
import java.io.{BufferedReader, InputStreamReader}
import java.nio.file.{Files, FileSystems, Paths}
import java.sql.{Connection, PreparedStatement, ResultSet, SQLException}
import java.util.Locale
import java.util.stream.Collectors

import scala.collection.JavaConverters._
import scala.collection.mutable.ListBuffer

import com.fasterxml.jackson.databind.ObjectMapper
Expand Down Expand Up @@ -77,26 +79,50 @@ class JDBCMetadataStore(conf: KyuubiConf) extends MetadataStore with Logging {
}

private def initSchema(): Unit = {
val classLoader = Utils.getContextOrKyuubiClassLoader
val initSchemaStream: Option[InputStream] = dbType match {
case DERBY =>
Option(classLoader.getResourceAsStream("sql/derby/metadata-store-schema-derby.sql"))
case MYSQL =>
Option(classLoader.getResourceAsStream("sql/mysql/metadata-store-schema-mysql.sql"))
case CUSTOM => None
}
initSchemaStream.foreach { inputStream =>
try {
val ddlStatements = new BufferedReader(new InputStreamReader(inputStream)).lines()
.collect(Collectors.joining("\n")).trim.split(";")
withConnection() { connection =>
Utils.tryLogNonFatalError {
ddlStatements.foreach { ddlStatement =>
execute(connection, ddlStatement)
info(s"Execute init schema ddl: $ddlStatement successfully.")
}
getInitSchema(dbType).foreach { schema =>
val ddlStatements = schema.trim.split(";")
withConnection() { connection =>
Utils.tryLogNonFatalError {
ddlStatements.foreach { ddlStatement =>
execute(connection, ddlStatement)
info(s"Execute init schema ddl: $ddlStatement successfully.")
}
}
}
}
}

// Visible for testing.
private[jdbc] def getInitSchema(dbType: DatabaseType): Option[String] = {
val classLoader = Utils.getContextOrKyuubiClassLoader
val schemaPackage = s"sql/${dbType.toString.toLowerCase}"
val schemaUrlPattern = """^metadata-store-schema-(\d+)\.(\d+)\.(\d+)\.(.*)\.sql$""".r
val schemaUrls = ListBuffer[String]()

Option(classLoader.getResource(schemaPackage)).map(_.toURI).foreach { uri =>
val pathNames = if (uri.getScheme == "jar") {
val fs = FileSystems.newFileSystem(uri, Map.empty[String, AnyRef].asJava)
try {
Files.walk(fs.getPath(schemaPackage), 1).iterator().asScala.map(
_.getFileName.toString).filter { name =>
schemaUrlPattern.findFirstMatchIn(name).isDefined
}.toArray
} finally {
fs.close()
}
} else {
Paths.get(uri).toFile.listFiles((_, name) => {
schemaUrlPattern.findFirstMatchIn(name).isDefined
}).map(_.getName)
}
pathNames.foreach(name => schemaUrls += s"$schemaPackage/$name")
}

schemaUrls.sorted.lastOption.map { schemaUrl =>
val inputStream = classLoader.getResourceAsStream(schemaUrl)
try {
new BufferedReader(new InputStreamReader(inputStream)).lines()
.collect(Collectors.joining("\n"))
} finally {
inputStream.close()
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,12 @@ class JDBCMetadataStoreSuite extends KyuubiFunSuite {
assert(jdbcMetadataStore.hikariDataSource.getIdleTimeout == 60000)
}

test("test get init schema stream") {
assert(jdbcMetadataStore.getInitSchema(DatabaseType.DERBY).isDefined)
assert(jdbcMetadataStore.getInitSchema(DatabaseType.MYSQL).isDefined)
assert(jdbcMetadataStore.getInitSchema(DatabaseType.CUSTOM).isEmpty)
}

test("jdbc metadata store") {
val batchId = UUID.randomUUID().toString
val kyuubiInstance = "localhost:10099"
Expand Down

0 comments on commit daa8d45

Please sign in to comment.