Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add primary keys to SQL tables for percona/galera support #1337

Closed
Ornias1993 opened this issue Sep 23, 2020 · 16 comments
Closed

Add primary keys to SQL tables for percona/galera support #1337

Ornias1993 opened this issue Sep 23, 2020 · 16 comments
Labels
area/storage SQL Storage related features/bugs Good First Issue New contributors, join in! priority/3/medium Medium priority items type/feature Request for adding a new feature
Milestone

Comments

@Ornias1993
Copy link

Ornias1993 commented Sep 23, 2020

Percona is quite strict on having primary keys on all tables (and rightly so).
Please add primary keys to all tables, even if it's just an auto-increment.

It's mostly just the following tables:
authentication_logs (could just get a auto-increment primary key added)
identity_verification_tokens (could just set the token as primary key)

@james-d-elliott
Copy link
Member

This is definitely something we can do. We have a plan to make some changes to the DB schema shortly so I will try to squeeze that in then.

@james-d-elliott james-d-elliott added type/feature Request for adding a new feature priority/3/medium Medium priority items labels Oct 23, 2020
@Ornias1993
Copy link
Author

@james-d-elliott Awesome.
I did some initial tests by manually adding those colums and it worked out just fine btw :)

@james-d-elliott
Copy link
Member

Yeah making them primary should be fine, it's just making sure everyone is on the same schema state when we do it. That way there is no desync.

@Ornias1993
Copy link
Author

Yeah the migration is the most work indeed 👍
Thanks for picking it up! :)

@nightah nightah added the Good First Issue New contributors, join in! label Nov 19, 2020
@james-d-elliott james-d-elliott added this to the v4.28.0 milestone Feb 2, 2021
@james-d-elliott james-d-elliott modified the milestones: v4.28.0, v4.30.0 Apr 14, 2021
@james-d-elliott james-d-elliott modified the milestones: v4.30.0, v4.32.0, v4.31.0 Jun 12, 2021
@james-d-elliott james-d-elliott modified the milestones: v4.31.0, v4.33.0 Oct 8, 2021
@pdf
Copy link

pdf commented Oct 9, 2021

This has been pushed for a while, and I'm hesitant to modify the tables locally to get Authelia working, for fear that future DB migrations will fail. Any chance of this getting some love in the near future?

I'd also strongly recommend not using varchar primary keys - they're terrible for clusters (and in general are a bad idea). My recommendation would be to use an AUTO_INCREMENT primary key on all tables, remove PRI from varchars, and add indexes for those varchars that you need to query (you can also set them to unique, rather than primary).

@james-d-elliott
Copy link
Member

james-d-elliott commented Oct 9, 2021

It has definitely been pushed a bit, it is also a major holdup for future features so it's something I'm actively working on. #2431

I would estimate this will be done within a week, or less. As to when it'll be released.. there are a few other features that will accompany it, all requiring changes to the schema.

@pdf
Copy link

pdf commented Oct 10, 2021

FWIW, my preference would be to cut a release with the updated base table structure, so that the app can at least be deployed on a cluster with default configuration, and push feature updates afterwards with additional migrations as/when needed.

@Ornias1993
Copy link
Author

FWIW, my preference would be to cut a release with the updated base table structure, so that the app can at least be deployed on a cluster with default configuration, and push feature updates afterwards with additional migrations as/when needed.

That's not really a good practice.
It would require a semver major release, that happens to not-be breaking on a later date due to added migrations. Thats... weird and super unclear for the average user understanding semver.

@pdf
Copy link

pdf commented Oct 10, 2021

That's not really a good practice. It would require a semver major release, that happens to not-be breaking on a later date due to added migrations. Thats... weird and super unclear for the average user understanding semver.

I don't understand exactly what you're suggesting here - unless the DB schema is part of your public API (which seems like a terrible idea), migrations should not require a major semver bump.

@james-d-elliott
Copy link
Member

The issue is we have two tables which require upgrades that are not possible via migrations. I want both of these changes to occur in one step in order to prevent a horrible migration process. This way I can realistically create a simple method to migrate down as well (either now or later, more likely to be now).

@pdf
Copy link

pdf commented Oct 10, 2021

I'd suggest all the existing tables should probably be modified to add an int primary key. Getting the structure right this pass is a sound idea.

What prompted my repliy is only that your earlier comment made it sound like release of these schema changes would be predicated on some future feature work, and the current schema is a blocker for deployment in some environments.

All changes should be possible via migration though (depending I guess on your definition of migration), can you expand on what problems you're facing in updating the schema?

@james-d-elliott
Copy link
Member

james-d-elliott commented Oct 10, 2021

I'd suggest all the existing tables should probably be modified to add an int primary key. Getting the structure right this pass is a sound idea.

I'm actually a DBA by profession, this was always the plan. It's relatively easy to do this part.

What prompted my repliy is only that your earlier comment made it sound like release of these schema changes would be predicated on some future feature work, and the current schema is a blocker for deployment in some environments.

A majority of the changes that will be bundled are small. However we'd rather take our time and ensure the process is going to be smooth and non-disruptive. We could definitely do it the way you're suggesting, and I know this has been pushed a few times already. However the key thing that this is needed for is OIDC storage, which is the number 1 thing on our priority list, it's very unlikely that that change will be bundled with this one. This will be released (as v1) prior to the OIDC change to be later released (as v2).

All changes should be possible via migration though (depending I guess on your definition of migration), can you expand on what problems you're facing in updating the schema?

I would argue a migration is something that can be purely done via SQL syntax. The problem only exists when you have to do data transformation that can't be done in one of our SQL providers. Encrypting stored data, and modifying a byte array stored as a base64 TEXT type to a byte array native type (BLOB/BYTEA). Both are not possible on SQLite. For MySQL/PostgreSQL/SQLite we also have to ensure the encryption method is consistent, so realistically that's something that has to be done with software.

@james-d-elliott
Copy link
Member

james-d-elliott commented Oct 10, 2021

Here's a peek at the schema, considering your interest in scaling databases I'd guess you're using postgres:

CREATE TABLE IF NOT EXISTS authentication_logs (
    id SERIAL PRIMARY KEY,
    time INTEGER NOT NULL,
    successful BOOLEAN NOT NULL,
    username VARCHAR(100) NOT NULL,
    remote_ip VARCHAR(47) NULL DEFAULT NULL,
    request_uri TEXT NULL DEFAULT NULL,
    request_method VARCHAR(4) NULL
);

CREATE TABLE IF NOT EXISTS identity_verification_tokens (
    id SERIAL PRIMARY KEY,
    token VARCHAR(512) UNIQUE NOT NULL
);

CREATE TABLE IF NOT EXISTS migrations (
    id SERIAL PRIMARY KEY,
    time INTEGER NOT NULL,
    schema INTEGER NOT NULL,
    prior INTEGER NULL DEFAULT NULL,
    version VARCHAR(64) NOT NULL
);

CREATE TABLE IF NOT EXISTS totp_configurations (
    id SERIAL PRIMARY KEY,
    username VARCHAR(100) UNIQUE NOT NULL,
    algorithm VARCHAR(6) NOT NULL DEFAULT 'sha1',
    digits INTEGER NOT NULL DEFAULT 6,
    totp_period INTEGER NOT NULL DEFAULT 30,
    secret VARCHAR(64) NOT NULL
);

CREATE TABLE IF NOT EXISTS u2f_devices (
    id SERIAL PRIMARY KEY,
    username VARCHAR(100) UNIQUE NOT NULL,
    key_handle BYTEA NOT NULL,
    public_key BYTEA NOT NULL
);

CREATE TABLE IF NOT EXISTS user_preferences (
    id SERIAL PRIMARY KEY,
    username VARCHAR(100) UNIQUE NOT NULL,
    second_factor_method VARCHAR(11) NOT NULL
);

@pdf
Copy link

pdf commented Oct 10, 2021

Ah right, yeah SQLite is frequently the problem child. Schema looks generally fine at a glance, though:

  • identity_verification_tokens.token probably wants UNIQUE
  • migrations.time being an int seems problematic (2038 problem)

@james-d-elliott
Copy link
Member

Ah yep, it is UNIQUE in my DB, just had not dumped the new schema yet. Regarding time, I'm deciding what to do. Regardless it has to be stored in the same type for migrations and authentication_logs in my opinion. It's on my list of things to play with regarding serialization/scanning.

@james-d-elliott
Copy link
Member

james-d-elliott commented Nov 23, 2021

The change has been merged however it's not ready for use in production. Please wait until the following PR's are merged before using it:

#2588 (ETA Wednesday UTC+11)
#2137 (ETA Wednesday UTC+11)
#2589 (ETA Thursday UTC+11)

If you decide to use it you'll have to downgrade again to pre1 before using any newer commit or the final release.

@james-d-elliott james-d-elliott added the area/storage SQL Storage related features/bugs label Apr 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/storage SQL Storage related features/bugs Good First Issue New contributors, join in! priority/3/medium Medium priority items type/feature Request for adding a new feature
Projects
None yet
Development

No branches or pull requests

4 participants