Skip to content

Add MySQL as a supported database for the relational JDBC persistence backend#4281

Open
Y-Wakuta wants to merge 1 commit into
apache:mainfrom
Y-Wakuta:feat/mysql-support
Open

Add MySQL as a supported database for the relational JDBC persistence backend#4281
Y-Wakuta wants to merge 1 commit into
apache:mainfrom
Y-Wakuta:feat/mysql-support

Conversation

@Y-Wakuta
Copy link
Copy Markdown

@Y-Wakuta Y-Wakuta commented Apr 23, 2026

This PR continues the work @Jayden-Chiu started in #2704 and owes its overall shape to that earlier effort — the DatabaseType enum extension, the MySQL toMap branching pattern, the testcontainer-based MySQL lifecycle management, and the per-subclass integration-test layout are all derived from @Jayden-Chiu's original proposal. Where this PR differs, it is to incorporate the review feedback (from @adutra, @dimas-b, @eric-maynard, @jbonofre, @flyrain) that landed on #2704 and #2491 after that PR went stale. Concretely:

Changes

Persistence layer

  • Added MYSQL to the DatabaseType enum, with product-name detection in inferFromConnection and display-name support for "mysql". (Follows the enum-addition pattern from Add support for MySQL Metastore #2704.)
  • Added MySQL SQLSTATE handling to DatasourceOperations:
    • 23000 (integrity constraint violation) combined with MySQL vendor error 1062 (duplicate entry) to match PostgreSQL's 23505 precision. MySQL does not have a dedicated SQLSTATE for unique-key violation.
    • 42S02 (table not found) — same code as H2.
  • Added mysql/schema-v4.sql (the structure follows the tables Jayden-Chiu drafted in Add support for MySQL Metastore #2704, upgraded to the current v4 layout). Notable MySQL-specific adjustments:
    • JSONBJSON; ON CONFLICTON DUPLICATE KEY UPDATE; JSON columns use DEFAULT ('{}') to match PostgreSQL's default.
    • USE POLARIS_SCHEMA; in place of CREATE SCHEMA ... SET search_path (in MySQL, a schema is a database; the database is provisioned via the JDBC URL / testcontainer).
    • Indexes declared inside CREATE TABLE because MySQL lacks CREATE INDEX IF NOT EXISTS. The bootstrap re-runs this script per realm, and idempotency is covered by CREATE TABLE IF NOT EXISTS.
    • entities.name sized to VARCHAR(256) to match Polaris' MAX_IDENTIFIER_LENGTH while staying within the InnoDB 3072-byte key length for the (realm_id, ..., name) unique key.
    • location_without_scheme indexed with a 400-char prefix for the same key-length reason.
  • Added Converter.MysqlJsonValue, a marker record emitted by Converter#wrapJsonForDatabase for MySQL — mirroring the existing PostgreSQL toJsonbPGobject pattern. QueryGenerator emits column = CAST(? AS JSON) when it sees a MysqlJsonValue parameter, and DatasourceOperations.bindParam unwraps the marker before setObject. This is gated on DatabaseType.MYSQL only, so PostgreSQL / CockroachDB / H2 paths are unaffected. Without this, JDBC has no Types.JSON, Connector/J maps MySQL JSON columns to java.lang.String, and a bound VARCHAR compared against a JSON column always evaluates as type-different-never-equal under MySQL's JSON comparison rules — which was the root cause of MysqlPolicyServiceIT.testPolicyMapping failing on the detach path. (Not caught in Add support for MySQL Metastore #2704; surfaced here because the integration test suite exercises detach/drop end-to-end.)
  • JdbcMetaStoreManagerFactory.produceDatasourceOperations selects between the default datasource and a named MySQL datasource based on the configured database-type (Quarkus named-datasource pattern).

Runtime module

  • The standard runtime/server is the runner used for every relational backend; the MySQL JDBC driver is added to it only when -PincludeMysqlDriver=true is passed to Gradle.

  • The build script also disables the license-report tasks under the same flag (the GPL driver is intentionally absent from LICENSE) and overrides quarkus.datasource.mysql.jdbc=true at Quarkus build time so the named MySQL datasource is wired up.

  • The default release build remains GPL-free:

    ./gradlew :polaris-server:assemble # ASF compliant (no MySQL driver)
    ./gradlew :polaris-server:assemble -PincludeMysqlDriver=true # opt-in custom downstream build

  • The Quarkus runtime distinguishes between the default (unnamed) datasource — used by PostgreSQL, CockroachDB and H2 — and the named MySQL datasource. The named MySQL datasource is declared inactive in runtime/defaults/.../application.properties; users opt in at runtime by setting quarkus.datasource.mysql.active=true and the connection properties on quarkus.datasource.mysql.*.

Shared test infrastructure

  • runtime/test-common gains MysqlRelationalJdbcLifeCycleManagement and MysqlRelationalJdbcProfile, mirroring the existing CockroachDB equivalents and continuing the lifecycle-management pattern from Add support for MySQL Metastore #2704. The MySQL container is started with --lower-case-table-names=1 because Polaris uses a mix of UPPER and lower-case table identifiers; MySQL on Linux is case-sensitive by default. The lifecycle resource also sets quarkus.datasource.mysql.active=true to opt the named datasource in for the integration tests.
  • The MySQL *ApplicationIT / *ManagementServiceIT / *PolicyServiceIT / *RestCatalogIT / *ViewFileIT integration tests live alongside the existing CockroachDB and PostgreSQL counterparts under runtime/service/src/intTest/, sharing the same Polaris integration test base classes via testFixtures.

Documentation

  • runtime/server/README.md gains a "Building a MySQL-capable server (custom downstream build)" section: GPL-driver disclaimer, -PincludeMysqlDriver=true build commands (including the Docker image build), runtime configuration for the named MySQL datasource, the lower_case_table_names=1 requirement, and the auto-bootstrap caveat (The "initial" bootstrap exit 0 with a warning instead of a raw failure instead of failure if realm already exists #2324).
  • site/content/in-dev/unreleased/metastores/relational-jdbc.md: a single paragraph noting that the MySQL path requires a custom downstream build from the official source release and pointing to the source-tree README. The closing paragraph is reworded to clarify that official Polaris release artifacts support PostgreSQL and H2; MySQL is available only via a custom source build.
  • CHANGELOG.md: entry framed as "source-level support", with an explicit statement that official Polaris release artifacts do not include the MySQL JDBC driver. No GitHub raw URLs.

Impact on existing backends

None. PostgreSQL, CockroachDB and H2 users do not need to change any configuration: they continue to use the default (unnamed) datasource via quarkus.datasource.*. There are no MySQL-specific code branches in the shared persistence path; the new MysqlJsonValue marker is gated on DatabaseType.MYSQL inside DatasourceOperations.bindParam, so non-MySQL backends take the existing setObject path unchanged.

How it was tested

  • ./gradlew :polaris-runtime-service:intTest --tests "*JdbcApplicationIT" --tests "*CockroachApplicationIT" --tests "*MysqlApplicationIT" -PincludeMysqlDriver=true — PostgreSQL, CockroachDB and MySQL *ApplicationIT all pass 18/18.
  • ./gradlew :polaris-server:assemble (no opt-in flag) succeeds and :polaris-server:quarkusBuild produces a GPL-free runner; the license-report tasks pass.
  • ./gradlew :polaris-server:assemble -PincludeMysqlDriver=true produces a runner with the MySQL JDBC driver bundled.
  • ./gradlew :polaris-relational-jdbc:test — unit tests continue to pass.
  • Wired into a proprietary environment that exercises the full Iceberg REST API flow against the MySQL-backed catalog (bootstrap, catalog and namespace creation, Iceberg table CRUD, credential-vended reads/writes to S3-compatible storage); further details cannot be shared, but the MySQL backend handled the workload without backend-specific failures.

Acknowledgments

Thanks to @Jayden-Chiu for the original work in #2704 — its DatabaseType extension, schema-file layout, and test-infrastructure patterns are the foundation of this PR. Thanks also to @adutra, @dimas-b, @eric-maynard, @jbonofre, @flyrain, @singhpk234, and @snazy for the review feedback on #2704, #2491 and this PR that shaped the design.

AI-assisted contribution

Parts of this change were drafted with AI assistance (Claude). The author (@Y-Wakuta) reviewed, tested, and takes full responsibility for the final code.

Fixes #2491
Continues #2704

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes Support MySQL as Metastore #2491
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

@github-project-automation github-project-automation Bot moved this to PRs In Progress in Basic Kanban Board Apr 23, 2026
@Y-Wakuta Y-Wakuta force-pushed the feat/mysql-support branch 3 times, most recently from 2b48aeb to 72a72f3 Compare April 23, 2026 11:03
@flyrain flyrain requested a review from singhpk234 April 23, 2026 16:34
@Y-Wakuta Y-Wakuta force-pushed the feat/mysql-support branch 3 times, most recently from e20ebf9 to 0ac5949 Compare April 24, 2026 04:25
@Y-Wakuta Y-Wakuta marked this pull request as ready for review April 24, 2026 04:26
Comment on lines +187 to +189
* Builds a column-to-value map containing exactly the primary-key columns of the
* policy_mapping_record table. Keep this in sync with the {@code PRIMARY KEY (...)} clause of
* each schema-vN.sql file.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand, why there is this restriction ?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @singhpk234 — your comment prompted me to dig in, and the answer turned out to be more general than the v1 javadoc captured.

Background: MysqlPolicyServiceIT.testPolicyMapping was failing with PolicyInUseException because detachPolicy wasn't deleting the mapping row. My initial hypothesis was JSON canonicalization, and toPrimaryKeyMap was a workaround: drop the JSON column from the WHERE and match by PK only.

The actual issue: canonicalization didn't fit (the test uses Map.of(), an empty {} with nothing to canonicalize). The real cause is structural — JDBC has no Types.JSON so Connector/J maps MySQL JSON to
java.lang.String
, the bound VARCHAR doesn't compare as JSON-vs-JSON, and under MySQL's JSON comparison rules the predicate evaluates as JSON OBJECT vs JSON scalar (different types, never equal).

The fix (latest commit): mirror the existing PostgreSQL toJsonbPGobject pattern with a Converter.MysqlJsonValue marker, and have QueryGenerator emit column = CAST(? AS JSON) when it sees one. Converter#wrapJsonForDatabase centralizes the per-database wrapping; DatasourceOperations.bindParam unwraps MysqlJsonValue before setObject, gated on MySQL only so PostgreSQL / CockroachDB / H2 paths are untouched. This covers every JSON column in the schema (entities.*, events.additional_properties, *.metadata), not just parameters, and lets deleteFromPolicyMappingRecords use the shared ALL_COLUMNS path on every backend — toPrimaryKeyMap and deleteFromPolicyMappingRecordsMySql are removed.

) COMMENT = 'the version of the JDBC schema in use';

INSERT INTO version (version_key, version_value)
VALUES ('version', 4)
Copy link
Copy Markdown
Contributor

@flyrain flyrain Apr 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may aim for v5 as 1.4.0 is out with v4. I also expect schema changes in 1.5.0.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bumping to v5 for 1.5.0 sounds reasonable. If there are schema changes or discussions I should align with, please point me at them and I'll incorporate them here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some discussions about metrics schema in the dev mailing list, but it won't affect this PR. We can handle it once we made the metric changes.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the context. Happy to follow up on the MySQL side once the metrics schema changes are in.

@Override
public void deleteFromPolicyMappingRecords(
@Nonnull PolarisCallContext callCtx, @Nonnull PolarisPolicyMappingRecord record) {
if (datasourceOperations.getDatabaseType() == DatabaseType.MYSQL) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branching is unfortunate. The MySQL path deletes by PK only; can't we use this path for all database types?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — the branching is removed in the latest commit. MySQL now uses the same ALL_COLUMNS WHERE path as PostgreSQL/H2 via a new Converter.MysqlJsonValue marker (symmetric to toJsonbPGobject for PostgreSQL). Background and verification under @singhpk234's thread.

@Y-Wakuta Y-Wakuta requested review from adutra and singhpk234 April 30, 2026 08:24
@snazy
Copy link
Copy Markdown
Member

snazy commented Apr 30, 2026

Hi @Y-Wakuta,

Thanks for pushing this forward. I have not reviewed the implementation in technical detail yet, so I am not commenting on the concrete technical approach here. I do want to be fairly firm on one release/docs point, though, so we keep the user-facing release story precise about what Polaris officially ships and what requires a custom downstream build.

Because the MySQL JDBC driver is not part of the official Polaris artifacts, I do not think we should present this in released docs/changelog as ordinary end-user support, and released documentation should not send users to raw GitHub tree / main-branch content for this path.

Concretely, when this lands, I'd like us to make sure that:

  • released docs and release notes do not link to runtime/server-mysql/README.md on GitHub or other raw repository content for this;
  • if we mention where users can find the detailed steps, released docs should direct them to the official Polaris source release, and then to runtime/server-mysql/README.md in the unpacked source tree;
  • detailed custom-build instructions stay in that source-release/developer context, not in versioned end-user docs or release notes;
  • we avoid unqualified wording such as "MySQL is a supported database" unless the official Polaris release artifacts themselves support that path;
  • the wording clearly distinguishes official Polaris release artifacts from a custom downstream build produced from the official source release.

For example, I think wording along these lines would be much safer:

Added source-level support for building a MySQL-specific runtime for the relational JDBC backend. Official Polaris artifacts do not include the MySQL JDBC driver. Users who need this path should download the official Polaris source release and build their own derivative from that source tree; see runtime/server-mysql/README.md in the unpacked source release for details.

I am not trying to turn this into a blocker for the implementation itself, but I do want us to treat the release/docs framing as part of getting this over the line, rather than as an optional follow-up.

* Maven group:artifact IDs: org.graalvm.truffle:truffle-runtime

Home page: https://github.com/oracle/graal/
License: The GNU General Public License (GPL) - https://github.com/oracle/graal/blob/master/LICENSE
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's clearly a no-go here. I don't think we have to provide any LICENSE and NOTICE here as we don't publish the server-mysql distribution as part of our "official" artifacts.

To avoid any confusion about the purpose of the server-mysql, I would just remove the distribution folder containing LICENSE and NOTICE.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that's the cleaner positioning. Removed the distribution folder along with the now-unused distributionElements / licenseNoticeElements Gradle configurations in the latest commit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I wasn't clear. My proposal is to completely remove runtime/server-mysql.
We should not have additional runtime/server in Polaris.

Instead we should:

  1. As suggested by @adutra , we should have (in the regular server) pre-defined db-kind configuration (postgresql, mysql, ...) by default. It's resolved at runtime injecting the corresponding bean.
  2. We should document how users can assembly and build custom server runtime.
  3. During the Community Meeting yesterday, I proposed to work on a "tool" to facilitate custom runtime assembly for our users.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it — agreed on removing runtime/server-mysql. To map to your three points: (1) I'll fold the MySQL configuration into runtime/server using @adutra's approach, (2) I'll add documentation for users to assemble their own custom MySQL-capable runtime, and (3) the assembly tool sounds like a great direction. I'd assume the tool itself is beyond this PR's scope, but happy to lay any groundwork here that would help — please point out anything you'd like included.

grant_records_version INT NOT NULL COMMENT 'the version of grant records change on the entity',
location_without_scheme TEXT,
PRIMARY KEY (realm_id, id),
CONSTRAINT constraint_name UNIQUE (realm_id, catalog_id, parent_id, type_code, name),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constraint_name is a weird name.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adutra
Agreed that constraint_name is an unnatural name. However, this same name is currently used in the postgres, h2, and cockroachdb schemas as well. One option would be to rename it to something like uq_entities_identity, but that would introduce a divergence between the MySQL schema and
the others that isn't driven by any DB-specific behavior.

Long-term I think renaming the constraint across all DB schemas is the right move, but I'd like to avoid changing the non-MySQL schemas in this PR. Which of the following do you think is more appropriate?

  1. Use a better name (e.g. uq_entities_identity) only in the MySQL schema — improves the name, but introduces a temporary divergence from the other DBs.
  2. Keep the same constraint_name as the other DBs in this PR — consistent across all schemas, defers the rename entirely.

Either way, if we don't touch the other schemas here, I'd like to open a follow-up PR to rename the constraint across all DB schemas.

Comment thread gradle/projects.main.properties Outdated
polaris-runtime-defaults=runtime/defaults
polaris-runtime-service=runtime/service
polaris-server=runtime/server
polaris-server-mysql=runtime/server-mysql
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit disappointed that we need to introduced a new server module especially for MySQL.

I suppose this is being done either because of licensing issues, or because quarkus.datasource.db-kind is a build-time property.

The licensing issues are not a problem as long as the category X dependency is gated behind a build property, and is strictly opt-in. The LICENSE and NOTICE files should not reflect these dependencies since they are not released in any of the official Polaris releases.

As for the build-time issue: it is possible to workaround that by leveraging named datasources. The same project can declare multiple datasources, and the selection of the active one happens at runtime.

Project Nessie uses this strategy, see how different datasources are configured there:

https://github.com/projectnessie/nessie/blob/b37627031e4bec547b10890d5e7ed50027481963/servers/quarkus-server/src/main/resources/application.properties#L220-L252

The runtime activation of the datasource is done with a ConfigSourceInterceptor:

https://github.com/projectnessie/nessie/blob/522010a484ddc1cc77148412f137390db9bc15cd/servers/quarkus-common/src/main/java/org/projectnessie/quarkus/config/datasource/DataSourceActivator.java

I would really encourage you to follow this pattern, because otherwise we'll end up with a different server module for each driver that cannot be shipped because of legal issues.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully agree with you @adutra

It's something we discussed in the past, and I proposed to have a tool to create custom distribution (for users), a kind of CLI.

We can imagine something like:

polaris-distribution --add-module mysql --base ...

So, I would not include a server for mysql (see my other comment) but more the module that could be use in a assembly tool.

@Y-Wakuta Y-Wakuta force-pushed the feat/mysql-support branch 5 times, most recently from 75fc387 to 292c9d8 Compare May 7, 2026 02:46
@Y-Wakuta
Copy link
Copy Markdown
Author

Y-Wakuta commented May 7, 2026

Thanks @snazy — agreed on all framing points. A note up front: runtime/server-mysql/ was already removed earlier in this PR (per @jbonofre), so any runtime/server-mysql/... reference was stale. The driver is now opt-in via -PincludeMysqlDriver=true on runtime/server.

Changes pushed:

  1. CHANGELOG.md — rewrote along your suggested wording: "source-level support", official artifacts don't include the GPL driver, point to runtime/server/README.md in the unpacked source release.
  2. site/content/.../relational-jdbc.md — removed detailed build/properties/bootstrap from the versioned end-user docs; left a single paragraph noting it requires a custom downstream build and pointing to the source-tree README. Closing paragraph reworded to "Official artifacts
    support PostgreSQL and H2; MySQL via custom source build only".
  3. runtime/server/README.md — added a "Building a MySQL-capable server (custom downstream build)" section with the full build/config/bootstrap details, in the source-tree/developer context.
  4. No raw GitHub links remain in the versioned docs or CHANGELOG.

Let me know if any framing is still off, or if you'd prefer the source-tree README to live elsewhere (e.g. a separate BUILDING_MYSQL.md).

@Y-Wakuta
Copy link
Copy Markdown
Author

Y-Wakuta commented May 7, 2026

@adutra @jbonofre @snazy @singhpk234 @flyrain — friendly ping for another review pass when you have a moment.

The PR description has been updated to reflect the current state of the changes, and every review thread now has either a code/doc update or a follow-up reply. The constraint_name thread has two options waiting on @adutra's preference. Separately, @jbonofre — I'm keeping the custom-distribution / "polaris-distribution" tool direction you raised as a follow-up (out of scope for this PR); happy to contribute groundwork in subsequent changes if useful.

Thanks for the thorough feedback throughout.

@Y-Wakuta Y-Wakuta requested review from adutra, flyrain and jbonofre May 7, 2026 04:04
Copy link
Copy Markdown
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution, @Y-Wakuta !

Sorry, I'm coming late to the review party. I hope you do not mind 😅

Comment thread runtime/service/build.gradle.kts Outdated

// MySQL is opt-in: the GPL JDBC driver (ASF Category X) is not bundled in the default
// build. Enable with `-PincludeMysqlDriver=true` to run the MySQL integration tests.
if (project.hasProperty("includeMysqlDriver")) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit odd that we have to do this both in runtime/server and runtime/service... Would one of those places be sufficient?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO the MySQL tests should run unconditionally. It's OK to have the driver in the test classpath.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to driver on the test class path.

... but if we have this if in runtime/service, we should not need the same one in .../server... right?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks both — pushed an update: runtime/service now adds the MySQL JDBC driver unconditionally, and runtime/server consolidates into a single if/else that strips the driver by default (GPL-free) or keeps it under -PincludeMysqlDriver=true. This also fixes the CI failure where MysqlApplicationIT was hitting ClassNotFoundException: com.mysql.jdbc.Driver without the flag.

for (Map.Entry<String, Object> entry : whereEquals.entrySet()) {
conditions.add(entry.getKey() + " = ?");
if (entry.getValue() instanceof Converter.MysqlJsonValue) {
conditions.add(entry.getKey() + " = CAST(? AS JSON)");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do we use JSON as a "where" condition?.. just wondering 🤔

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In JdbcBasePersistenceImpl.deleteFromPolicyMappingRecords — the delete's WHERE is built from ModelPolicyMappingRecord.toMap(), which includes the properties JSON column. Without CAST(? AS JSON) the bound VARCHAR doesn't match the JSON column under MySQL's comparison rules, so the delete predicate never matched after a policy detach; this was the actual cause of MysqlPolicyServiceIT.testPolicyMapping failing.

@Y-Wakuta
Copy link
Copy Markdown
Author

Thank you @dimas-b — not at all, your timing was perfect. Your two questions together surfaced an existing CI failure (MysqlApplicationIT was failing with ClassNotFoundException: com.mysql.jdbc.Driver without the flag), which the latest push fixes. I've replied on each thread with the specifics; another pass when you have time would be much appreciated.

@Y-Wakuta Y-Wakuta requested a review from dimas-b May 10, 2026 09:44
Comment thread runtime/service/build.gradle.kts Outdated
// `*Mysql*IT` tests run on every CI build. `runtime/service` is not a distributed artifact,
// so bundling the GPL driver here has no licensing implication for official Polaris
// release artifacts (the production runner in `runtime/server` keeps the driver opt-in).
dependencies { runtimeOnly("io.quarkus:quarkus-jdbc-mysql") }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about moving this to the bigger dependencies block above?

Comment thread runtime/service/build.gradle.kts Outdated
dependencies { runtimeOnly("io.quarkus:quarkus-jdbc-mysql") }
// Override the `jdbc=false` default in `application.properties` so Quarkus wires up
// the MySQL named datasource at build time.
quarkus { quarkusBuildProperties.put("quarkus.datasource.mysql.jdbc", "true") }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean quarkus.datasource.mysql.active?

Comment thread runtime/server/build.gradle.kts Outdated
} else {
// Strip the MySQL JDBC driver that `runtime/service` brings in transitively for its
// own integration tests; the production runner stays GPL-free.
configurations.all { exclude(group = "io.quarkus", module = "quarkus-jdbc-mysql") }
Copy link
Copy Markdown
Contributor

@dimas-b dimas-b May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be a big change, but WDYT about moving tests (MysqlApplicationIT, etc.) to a new module under persistence... e.g. persistence/relational-jdbc-mysql/tests?

This module will build its own Quarkus server for testing (similar to extensions/auth/opa/tests) with the MySQL driver and Quarkus extensions.

runtime/service will remain free from MySQL-related code (which is nice from the overall design POV).

runtime/server will include the MySQL Quarkus plugin under a build flag, but will not have to exclude anything if the flag is not set (cleaner dependency tree).

for (Map.Entry<String, Object> entry : whereEquals.entrySet()) {
conditions.add(entry.getKey() + " = ?");
if (entry.getValue() instanceof Converter.MysqlJsonValue) {
conditions.add(entry.getKey() + " = CAST(? AS JSON)");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having instanceof is not nice on the main call path. How about adding a helper method to DatabaseType? For example asJsonConditionPlaceholder(...)? Postgres and H2 will have a "do nothing" impl. MySQL will return CAST(? AS JSON)

Subsequently, the MysqlJsonValue class will become unnecessary, I think 🤔

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll probably need another help method like .isJson(tableName, columnName) somewhere 🤔

Copy link
Copy Markdown
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working through the licensing constraint here. I think this still needs changes before it can be merged, because the current Gradle wiring makes the MySQL driver visible in places that are not clearly facing production Gradle configurations.

The main blocker is that runtime/service adds io.quarkus:quarkus-jdbc-mysql with plain runtimeOnly, so com.mysql:mysql-connector-j ends up on :polaris-runtime-service:runtimeClasspath unconditionally. Since the MySQL driver is ASF Category X, it must not leak onto a production-facing source set/runtime classpath or to downstream consumers of runtime-service. If this is only needed for MySQL integration tests, it should be isolated to an integration-test-only configuration/source set or otherwise kept out of production classpaths.

Relatedly, runtime/service also unconditionally sets:

quarkus { quarkusBuildProperties.put("quarkus.datasource.mysql.jdbc", "true") }

That appears MySQL-specific and build-time Quarkus-facing. I do not think it should be applied to the normal service build unless the corresponding MySQL support is explicitly enabled in a non-release/downstream build path.

I am also uncomfortable with the runtime/server license handling. When -PincludeMysqlDriver=true is set, the build disables generateLicenseReport, checkLicense, and licenseReportZip wholesale. If we keep an opt-in "build from source with a special option" path, the license handling should be targeted and explicit rather than disabling all license checks for the server module.

Finally, I am not sure the server README is the right place for the MySQL-specific build instructions. Calling out the opt-in source-build path is probably fine, but this feels more like separate JDBC/MySQL documentation than normal server documentation, especially since the official server artifact must remain GPL-free.

Copilot AI review requested due to automatic review settings May 26, 2026 04:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds an opt-in MySQL path for the relational JDBC metastore (source-built only due to GPL driver licensing), including schema v4, database-type awareness in SQL generation, and dedicated MySQL integration tests.

Changes:

  • Introduces MySQL database type support (schema v4, metadata inference, JSON column handling in WHERE clauses).
  • Adds an opt-in Gradle build flag to include the MySQL JDBC driver only for downstream/custom builds.
  • Adds a dedicated MySQL integration-test module and expands tests for JSON equality + case-sensitivity semantics.

Reviewed changes

Copilot reviewed 37 out of 37 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
site/content/in-dev/unreleased/metastores/relational-jdbc.md Documents MySQL as a custom source-build path and clarifies official artifact support.
runtime/server/build.gradle.kts Adds opt-in includeMysqlDriver build behavior and Quarkus build-time property flip.
runtime/server/README.md Documents downstream/custom MySQL-capable server build and runtime config.
runtime/defaults/src/main/resources/application.properties Declares a disabled-by-default named MySQL datasource.
persistence/relational-jdbc/src/test/java/.../models/ModelRegistryTest.java Adds unit tests for JSON column registry behavior.
persistence/relational-jdbc/src/test/java/.../QueryGeneratorTest.java Updates tests for database-type-aware QueryGenerator + adds MySQL JSON placeholder test.
persistence/relational-jdbc/src/main/resources/mysql/schema-v4.sql Adds MySQL schema v4 aligned with PostgreSQL schema v4.
persistence/relational-jdbc/src/main/java/.../models/ModelScanMetricsReport.java Centralizes JSON wrapping and declares JSON columns.
persistence/relational-jdbc/src/main/java/.../models/ModelRegistry.java Adds registry of JSON columns by table for SQL generation.
persistence/relational-jdbc/src/main/java/.../models/ModelPolicyMappingRecord.java Centralizes JSON wrapping and declares JSON columns.
persistence/relational-jdbc/src/main/java/.../models/ModelEvent.java Centralizes JSON wrapping and declares JSON columns.
persistence/relational-jdbc/src/main/java/.../models/ModelEntity.java Centralizes JSON wrapping and declares JSON columns.
persistence/relational-jdbc/src/main/java/.../models/ModelCommitMetricsReport.java Centralizes JSON wrapping and declares JSON columns.
persistence/relational-jdbc/src/main/java/.../models/Converter.java Adds wrapJsonForDatabase helper.
persistence/relational-jdbc/src/main/java/.../idempotency/RelationalJdbcIdempotencyStore.java Passes database type into QueryGenerator to support MySQL JSON placeholder needs.
persistence/relational-jdbc/src/main/java/.../RelationalJdbcConfiguration.java Documents mysql as a supported configured database type.
persistence/relational-jdbc/src/main/java/.../QueryGenerator.java Makes SQL generation database-type aware and consults ModelRegistry for JSON placeholders.
persistence/relational-jdbc/src/main/java/.../JdbcMetaStoreManagerFactory.java Selects named datasource based on configured database type (e.g., mysql).
persistence/relational-jdbc/src/main/java/.../JdbcBasePersistenceImpl.java Threads database type through QueryGenerator calls.
persistence/relational-jdbc/src/main/java/.../DatasourceOperations.java Adds MySQL SQLState handling and exposes database type.
persistence/relational-jdbc/src/main/java/.../DatabaseType.java Adds MYSQL enum, JSON placeholder selection, and MySQL inference.
persistence/relational-jdbc/build.gradle.kts Adds compile-only Quarkus Agroal API for named datasource lookup.
persistence/relational-jdbc-mysql/tests/.../Dockerfile-mysql-version Pins MySQL container image tag used by tests.
persistence/relational-jdbc-mysql/tests/.../META-INF/services/...PolarisServerManager Registers MySQL test server manager via SPI.
persistence/relational-jdbc-mysql/tests/.../MysqlViewFileIT.java Adds MySQL integration coverage for view-file scenarios.
persistence/relational-jdbc-mysql/tests/.../MysqlServerManager.java Provides server manager implementation for MySQL test module.
persistence/relational-jdbc-mysql/tests/.../MysqlRestCatalogIT.java Adds MySQL integration coverage for REST catalog file tests.
persistence/relational-jdbc-mysql/tests/.../MysqlRelationalJdbcProfile.java Configures Quarkus test profile to use MySQL relational JDBC backend.
persistence/relational-jdbc-mysql/tests/.../MysqlRelationalJdbcLifeCycleManagement.java Starts MySQL Testcontainers instance and wires Quarkus datasource config.
persistence/relational-jdbc-mysql/tests/.../MysqlPolicyServiceIT.java Adds MySQL integration coverage for policy service tests.
persistence/relational-jdbc-mysql/tests/.../MysqlManagementServiceIT.java Adds MySQL integration coverage for management service tests.
persistence/relational-jdbc-mysql/tests/.../MysqlEntityCaseSensitivityIT.java Verifies MySQL collation choices preserve case-sensitive identifier semantics.
persistence/relational-jdbc-mysql/tests/.../MysqlApplicationIT.java Adds MySQL integration coverage for application-level tests.
persistence/relational-jdbc-mysql/tests/build.gradle.kts Defines dedicated MySQL integration-test module with MySQL driver dependency.
integration-tests/src/main/java/.../PolarisPolicyServiceIntegrationTest.java Adds regression test for attach/detach policy with non-trivial JSON parameters.
gradle/projects.main.properties Registers the new MySQL tests module as a Gradle project.
CHANGELOG.md Notes downstream/custom MySQL build support and links to build instructions.
Comments suppressed due to low confidence (10)

runtime/server/build.gradle.kts:1

  • project.hasProperty("includeMysqlDriver") will enable the MySQL driver even if the property is provided with a falsey value (e.g., -PincludeMysqlDriver=false). Parse the property value as a boolean (e.g., via findProperty(...)?.toString()?.toBoolean() == true) so the driver is only included when explicitly enabled.
    runtime/server/README.md:1
  • --rerun is not a valid Gradle flag; the supported flag is --rerun-tasks. Using the current command will fail for users copy/pasting these instructions—please update the documentation accordingly.
    persistence/relational-jdbc/src/main/resources/mysql/schema-v4.sql:1
  • MySQL has historically rejected DEFAULT values on JSON columns in many configurations/versions (and even where expression defaults are supported, the canonical form is typically DEFAULT (JSON_OBJECT()), not DEFAULT ('{}')). As written, this schema is likely to fail during bootstrap on some MySQL 8.0.x versions. Consider removing JSON defaults (and letting the application always write {}) or switching to an expression default that is known to be accepted across the supported MySQL 8.0+ range.
    persistence/relational-jdbc/src/main/resources/mysql/schema-v4.sql:1
  • MySQL has historically rejected DEFAULT values on JSON columns in many configurations/versions (and even where expression defaults are supported, the canonical form is typically DEFAULT (JSON_OBJECT()), not DEFAULT ('{}')). As written, this schema is likely to fail during bootstrap on some MySQL 8.0.x versions. Consider removing JSON defaults (and letting the application always write {}) or switching to an expression default that is known to be accepted across the supported MySQL 8.0+ range.
    persistence/relational-jdbc/src/main/resources/mysql/schema-v4.sql:1
  • MySQL has historically rejected DEFAULT values on JSON columns in many configurations/versions (and even where expression defaults are supported, the canonical form is typically DEFAULT (JSON_OBJECT()), not DEFAULT ('{}')). As written, this schema is likely to fail during bootstrap on some MySQL 8.0.x versions. Consider removing JSON defaults (and letting the application always write {}) or switching to an expression default that is known to be accepted across the supported MySQL 8.0+ range.
    persistence/relational-jdbc/src/main/resources/mysql/schema-v4.sql:1
  • MySQL has historically rejected DEFAULT values on JSON columns in many configurations/versions (and even where expression defaults are supported, the canonical form is typically DEFAULT (JSON_OBJECT()), not DEFAULT ('{}')). As written, this schema is likely to fail during bootstrap on some MySQL 8.0.x versions. Consider removing JSON defaults (and letting the application always write {}) or switching to an expression default that is known to be accepted across the supported MySQL 8.0+ range.
    persistence/relational-jdbc/src/main/resources/mysql/schema-v4.sql:1
  • MySQL has historically rejected DEFAULT values on JSON columns in many configurations/versions (and even where expression defaults are supported, the canonical form is typically DEFAULT (JSON_OBJECT()), not DEFAULT ('{}')). As written, this schema is likely to fail during bootstrap on some MySQL 8.0.x versions. Consider removing JSON defaults (and letting the application always write {}) or switching to an expression default that is known to be accepted across the supported MySQL 8.0+ range.
    persistence/relational-jdbc/src/main/resources/mysql/schema-v4.sql:1
  • privilege_code participates in the PRIMARY KEY but is not declared NOT NULL. While MySQL will effectively enforce NOT NULL for PK columns, leaving it implicit makes the schema harder to reason about and can lead to confusing failures if a NULL is ever attempted. Declare privilege_code as INTEGER NOT NULL for clarity and consistency with other key columns.
    persistence/relational-jdbc/src/test/java/org/apache/polaris/persistence/relational/jdbc/QueryGeneratorTest.java:1
  • The table name is hard-coded here even though a constant exists (ModelPolicyMappingRecord.TABLE_NAME). Using the constant reduces drift risk if the identifier ever changes and keeps the test aligned with production usage.
    persistence/relational-jdbc/src/main/resources/mysql/schema-v4.sql:1
  • The ASF header comment appears to have a formatting/typo issue (NOTICE file--), which deviates from the standard header text. Please fix the header to match the canonical ASF license header format.

Comment on lines +102 to +108
DataSource ds =
relationalJdbcConfiguration
.databaseType()
.map(type -> dataSources.select(new DataSourceLiteral(type)))
.filter(Instance::isResolvable)
.map(Instance::get)
.orElseGet(defaultDataSource::get);
*
* <p>Mirrors the {@code ServerManager} that the {@code runtime/service} integration tests register
* via the same SPI; we keep a local copy here so the MySQL test module is self-contained and does
* not depend on {@code :polaris-runtime-service} test fixtures.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but this module does depend on polaris-runtime-service, if I'm not mistaken... so why not reuse the existing ServerManager class?

@dimas-b
Copy link
Copy Markdown
Contributor

dimas-b commented May 29, 2026

@Y-Wakuta : Please resolve conflicts to allow CI to run.

@Y-Wakuta Y-Wakuta force-pushed the feat/mysql-support branch from 4993041 to dd548f9 Compare May 30, 2026 06:48
… backend

This adds MySQL (8.0+) as a fourth option for the relational JDBC
persistence backend, alongside PostgreSQL, CockroachDB, and H2.

This PR continues the work @Jayden-Chiu started in apache#2704 and owes
its overall shape to that earlier effort — the `DatabaseType`
enum extension, the MySQL `toMap` branching pattern, the
testcontainer-based MySQL lifecycle management, and the
per-subclass integration-test layout are all derived from
@Jayden-Chiu's original proposal. Where this PR differs, it is
to incorporate the review feedback (from @adutra, @dimas-b,
@eric-maynard, @jbonofre, @flyrain) that landed on apache#2704 and

  * The MySQL JDBC driver is GPL-licensed and cannot be bundled
    with Apache Polaris under ASF Category X policy. This PR
    ships the Java code changes without the driver and adds an
    opt-in `-PincludeMysqlDriver=true` flag for downstream
    builds. (@adutra, @dimas-b, @jbonofre, @flyrain on apache#2491 and
    apache#2704.)
  * MySQL gets a single schema file aligned with the latest
    Polaris schema version (v4) rather than retroactively
    backfilling v1/v2/v3. (@eric-maynard, @dimas-b on apache#2704.)

- Added `MYSQL` to the `DatabaseType` enum, with product-name
  detection in `inferFromConnection` and display-name support for
  `"mysql"`. (Follows the enum-addition pattern from apache#2704.)
- Added MySQL SQLSTATE handling to `DatasourceOperations`:
  - `23000` (integrity constraint violation) combined with MySQL
    vendor error `1062` (duplicate entry) to match PostgreSQL's
    `23505` precision. MySQL does not have a dedicated SQLSTATE
    for unique-key violation.
  - `42S02` (table not found) — same code as H2.
- Added `mysql/schema-v4.sql` (the structure follows the tables
  Jayden-Chiu drafted in apache#2704, upgraded to the current v4
  layout). Notable MySQL-specific adjustments:
  - `JSONB` → `JSON`; `ON CONFLICT` → `ON DUPLICATE KEY UPDATE`;
    `JSON` columns use `DEFAULT ('{}')` to match PostgreSQL's
    default.
  - `USE POLARIS_SCHEMA;` in place of `CREATE SCHEMA ... SET
    search_path` (in MySQL, a schema is a database; the database
    is provisioned via the JDBC URL / testcontainer).
  - Indexes declared inside `CREATE TABLE` because MySQL lacks
    `CREATE INDEX IF NOT EXISTS`. The bootstrap re-runs this
    script per realm, and idempotency is covered by `CREATE
    TABLE IF NOT EXISTS`.
  - `entities.name` sized to `VARCHAR(256)` to match Polaris'
    `MAX_IDENTIFIER_LENGTH` while staying within the InnoDB
    3072-byte key length for the `(realm_id, ..., name)` unique
    key.
  - `location_without_scheme` indexed with a 400-char prefix for
    the same key-length reason.
- `JdbcBasePersistenceImpl.deleteFromPolicyMappingRecords` adds
  a four-line early-return branch for `DatabaseType.MYSQL` that
  dispatches to a new private
  `deleteFromPolicyMappingRecordsMySql` helper. The helper
  restricts the WHERE clause to the primary-key columns via the
  new `ModelPolicyMappingRecord.toPrimaryKeyMap`. MySQL JSON
  column comparison against a String parameter is subject to the
  server's JSON canonicalization and can produce false-negatives;
  the PK uniquely identifies the row, so equality match on the
  PK is unambiguous. The PostgreSQL / CockroachDB / H2 code path
  is untouched. (This was not caught in apache#2704; it surfaced here
  because the integration test suite exercises the detach/drop
  path end-to-end.)

- Added `runtime/server-mysql`, a MySQL variant of
  `runtime/server`. `quarkus.datasource.db-kind` is fixed at
  Quarkus build time, so a separate module with its own
  `application.properties` is required to host the MySQL
  variant. (CockroachDB does not need this because it reuses
  `db-kind=postgresql` through the PostgreSQL wire protocol.)
- The MySQL JDBC driver is GPL-licensed and cannot be bundled
  with Apache Polaris under the ASF third-party license policy
  (see discussion on apache#2491 and apache#2704; both MySQL Connector/J and
  MariaDB Connector/J fall under ASF Category X). The driver is
  therefore NOT shipped with the default build:

    ./gradlew :polaris-server-mysql:build                              # no driver (ASF compliant)
    ./gradlew :polaris-server-mysql:build -PincludeMysqlDriver=true    # opt-in

  A `checkMysqlDriverOptIn` Gradle task fails fast with a
  descriptive message when the flag is missing but a Quarkus
  build is requested.
- Integration tests for MySQL live under this module and reuse
  the existing Polaris integration test base classes via
  `testFixtures`. `./gradlew :polaris-server-mysql:intTest
  -PincludeMysqlDriver=true` runs them against a MySQL
  testcontainer.

- `runtime/test-common` gains `MysqlRelationalJdbcLifeCycleManagement`
  and `MysqlRelationalJdbcProfile`, mirroring the existing
  CockroachDB equivalents and continuing the lifecycle-management
  pattern from apache#2704. The MySQL container is started with
  `--lower-case-table-names=1` because Polaris uses a mix of
  UPPER and lower-case table identifiers; MySQL on Linux is
  case-sensitive by default.

- `runtime/server-mysql/README.md`: rationale for the opt-in
  driver flag and end-to-end build/run instructions.
- `site/content/in-dev/unreleased/metastores/relational-jdbc.md`:
  a MySQL section pointing at the runtime module and explaining
  the licensing constraint.
- `CHANGELOG.md`: entry matching the CockroachDB precedent.

None. The only shared-code change,
`JdbcBasePersistenceImpl.deleteFromPolicyMappingRecords`, adds a
four-line early-return branch for MySQL and leaves the existing
PostgreSQL / CockroachDB / H2 body untouched.

- `./gradlew :polaris-server-mysql:intTest -PincludeMysqlDriver=true`
  — 309 pass, 0 fail, 6 skipped (skipped are inherited from the
  integration test base classes).
- `./gradlew :polaris-runtime-service:intTest --tests "*Cockroach*"`
  — 309 pass; verifies the CockroachDB backend is unaffected.
- `./gradlew :polaris-relational-jdbc:test` — unit tests continue
  to pass.

Thanks to @Jayden-Chiu for the original work in apache#2704 — its
`DatabaseType` extension, schema-file layout, and test-infrastructure
patterns are the foundation of this PR. Thanks also to @adutra,
@dimas-b, @eric-maynard, @jbonofre, and @flyrain for the review
feedback on apache#2704 and apache#2491 that shaped this design.

Parts of this change were drafted with AI assistance (Claude). The
author reviewed, tested, and takes full responsibility for the final
code.

Fixes apache#2491
Continues apache#2704
@Y-Wakuta Y-Wakuta force-pushed the feat/mysql-support branch from dd548f9 to 745ec63 Compare May 30, 2026 13:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support MySQL as Metastore

8 participants