Default schema exceeds mysql row limits #550

Closed
laszlof opened this issue Dec 22, 2021 · 4 comments

laszlof commented Dec 22, 2021

Describe the bug
The default schema included with this project exceeds the maximum row size enforced by MySQL (65,535 bytes) when a utf8 character set is used. The issue is further compounded when using a clustered storage engine (NDB), which has an even lower row-size limit (30,000 bytes).

To Reproduce
Import the default schema into a MySQL database that uses a utf8 character set (for example utf8mb4).

Expected behavior
Database tables are imported successfully

What actually happens

mysql> CREATE TABLE oauth2_authorization (
    ->     id varchar(100) NOT NULL,
    ->     registered_client_id varchar(100) NOT NULL,
    ->     principal_name varchar(200) NOT NULL,
    ->     authorization_grant_type varchar(100) NOT NULL,
    ->     attributes varchar(15000) DEFAULT NULL,
    ->     state varchar(500) DEFAULT NULL,
    ->     authorization_code_value blob DEFAULT NULL,
    ->     authorization_code_issued_at timestamp DEFAULT NULL,
    ->     authorization_code_expires_at timestamp DEFAULT NULL,
    ->     authorization_code_metadata varchar(2000) DEFAULT NULL,
    ->     access_token_value blob DEFAULT NULL,
    ->     access_token_issued_at timestamp DEFAULT NULL,
    ->     access_token_expires_at timestamp DEFAULT NULL,
    ->     access_token_metadata varchar(2000) DEFAULT NULL,
    ->     access_token_type varchar(100) DEFAULT NULL,
    ->     access_token_scopes varchar(1000) DEFAULT NULL,
    ->     oidc_id_token_value blob DEFAULT NULL,
    ->     oidc_id_token_issued_at timestamp DEFAULT NULL,
    ->     oidc_id_token_expires_at timestamp DEFAULT NULL,
    ->     oidc_id_token_metadata varchar(2000) DEFAULT NULL,
    ->     refresh_token_value blob DEFAULT NULL,
    ->     refresh_token_issued_at timestamp DEFAULT NULL,
    ->     refresh_token_expires_at timestamp DEFAULT NULL,
    ->     refresh_token_metadata varchar(2000) DEFAULT NULL,
    ->     PRIMARY KEY (id)
    -> ) Engine=InnoDB;
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
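
To put numbers on this (my own back-of-the-envelope arithmetic, not taken from the schema itself): summing the declared VARCHAR lengths at 4 bytes per character under utf8mb4 already lands well past the limit, before counting length prefixes, timestamps, or BLOB pointers:

SELECT (100 + 100 + 200 + 100   -- id, client id, principal, grant type
      + 15000                   -- attributes
      + 500                     -- state
      + 4 * 2000                -- the four *_metadata columns
      + 100 + 1000              -- token type, scopes
       ) * 4 AS declared_varchar_bytes;  -- 100400, well over 65535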

Suggested fix

The simplest fix here is to convert the large VARCHAR columns to TEXT. A VARCHAR column counts its full declared size toward the row-size calculation, whereas TEXT/BLOB columns do not, since their contents are stored by reference.

The problem is compounded by the fact that most databases now default to some utf8 encoding. Under utf8mb4, each VARCHAR(n) contributes up to 4 * n bytes plus a 1- or 2-byte length prefix to the row size, so the single VARCHAR(15000) attributes column alone accounts for more than 60,000 bytes.
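
For illustration, here is a rough sketch of a MySQL-friendly variant of the table (untested, and not an official schema from the project; it assumes InnoDB with utf8mb4, omits DEFAULT clauses on TEXT/BLOB columns since MySQL rejects literal defaults on those types, and declares the timestamps NULL explicitly so they stay nullable across MySQL versions):

CREATE TABLE oauth2_authorization (
    id varchar(100) NOT NULL,
    registered_client_id varchar(100) NOT NULL,
    principal_name varchar(200) NOT NULL,
    authorization_grant_type varchar(100) NOT NULL,
    -- large values moved off-row: TEXT and BLOB columns do not count
    -- their full declared length toward the 65,535-byte row limit
    attributes text,
    state varchar(500) DEFAULT NULL,
    authorization_code_value blob,
    authorization_code_issued_at timestamp NULL DEFAULT NULL,
    authorization_code_expires_at timestamp NULL DEFAULT NULL,
    authorization_code_metadata text,
    access_token_value blob,
    access_token_issued_at timestamp NULL DEFAULT NULL,
    access_token_expires_at timestamp NULL DEFAULT NULL,
    access_token_metadata text,
    access_token_type varchar(100) DEFAULT NULL,
    access_token_scopes varchar(1000) DEFAULT NULL,
    oidc_id_token_value blob,
    oidc_id_token_issued_at timestamp NULL DEFAULT NULL,
    oidc_id_token_expires_at timestamp NULL DEFAULT NULL,
    oidc_id_token_metadata text,
    refresh_token_value blob,
    refresh_token_issued_at timestamp NULL DEFAULT NULL,
    refresh_token_expires_at timestamp NULL DEFAULT NULL,
    refresh_token_metadata text,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

With the variable-length values stored by reference, the fixed portion of the row shrinks to a few kilobytes, which should also fit under NDB's lower limit, since NDB keeps only a short in-row prefix for TEXT/BLOB columns.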

laszlof added the type: bug label on Dec 22, 2021
sjohnr (Member) commented Dec 22, 2021

@laszlof thanks for the report. It seems possible that this began happening in MySQL with 14cb58d.

As mentioned in this comment:

It is very difficult to provide an implementation that works out of the box for all databases. This implementation strives to use standard SQL data types and is a simplified JDBC implementation. However, it is designed to be customizable so users can provide customizations for database vendors that deviate from the standard SQL types.

In this case, I don't think MySQL is non-standard, but the point stands: it is difficult to make a simplified JDBC implementation work out of the box across the board. This schema is really intended as a starting point, and both the schema and the implementation of JdbcOAuth2AuthorizationService are meant to be customized to your needs. You can start by customizing the SQL to see if it can be adjusted for your case. I'm unsure, but you may be able to switch to TEXT columns without any code changes.

If that doesn't work, take a look at JdbcOAuth2AuthorizationServiceTests.tableDefinitionWhenCustomThenAbleToOverride(), which demonstrates how to override the table definition altogether. Other tests in that class demonstrate how to customize further.

You may also be interested in implementing this with Spring Data; see the JpaOAuth2AuthorizationService gist if you'd like to try that approach. Note that we hope to provide a concrete example of this in our upcoming reference documentation and how-to guides. Please upvote #545 if this is of interest to you.

I'm going to leave this issue open for now, so we can review it relative to the change that increased attributes to 15,000. However, I recommend you pursue customizing the schema and/or implementation and see how far you can get, as our intention really is to provide a starting point rather than an out-of-the-box solution for all databases.

laszlof (Author) commented Dec 22, 2021

@sjohnr Thanks for the reply. We've already customized the schema on our end. I just wanted to raise awareness that the recent change in 14cb58d effectively breaks out-of-the-box usage for all MySQL databases.

I agree it's difficult to provide an implementation that will work out of the box with all databases. But perhaps building the schema in a way that works with at least the two most common, MySQL and Postgres, would be worthwhile.

For our own deployment, we're using an NDB cluster, so the issue is compounded further: we face a 30,000-byte row limit rather than the typical 65,535 bytes. I realize this is more of an edge case, so I don't expect it to be supported out of the box.

jgrandja (Collaborator) commented

@laszlof The original commit that caused this issue has been reverted via e175f4f.

Also, see gh-604, which provides an improvement for large data column support.

aka-unk commented Feb 9, 2024

Is it possible to provide separate scripts for different databases, rather than trying to initialize all of them with a single script?
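
For what it's worth, a database-specific script is largely a matter of swapping column types. A hand-written sketch of a PostgreSQL variant (not a script shipped by the project) might look like the following; PostgreSQL has no blob type, so bytea stands in for the token values, and text replaces the oversized varchars:

CREATE TABLE oauth2_authorization (
    id varchar(100) NOT NULL,
    registered_client_id varchar(100) NOT NULL,
    principal_name varchar(200) NOT NULL,
    authorization_grant_type varchar(100) NOT NULL,
    -- text and bytea have no declared length, so PostgreSQL
    -- imposes no equivalent row-size problem here
    attributes text,
    state varchar(500),
    authorization_code_value bytea,
    authorization_code_issued_at timestamp,
    authorization_code_expires_at timestamp,
    authorization_code_metadata text,
    access_token_value bytea,
    access_token_issued_at timestamp,
    access_token_expires_at timestamp,
    access_token_metadata text,
    access_token_type varchar(100),
    access_token_scopes varchar(1000),
    oidc_id_token_value bytea,
    oidc_id_token_issued_at timestamp,
    oidc_id_token_expires_at timestamp,
    oidc_id_token_metadata text,
    refresh_token_value bytea,
    refresh_token_issued_at timestamp,
    refresh_token_expires_at timestamp,
    refresh_token_metadata text,
    PRIMARY KEY (id)
);

Since bytea maps to byte[] in the PostgreSQL JDBC driver, this would likely work with the default JdbcOAuth2AuthorizationService row mappers, though that would need verifying.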
