Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDEV-31125 Add option for default local schema when I_S is queried #2628

Closed
wants to merge 2 commits into from

Conversation

LinuxJedi
Copy link
Contributor

This adds the additional system variable information_schema_local_database. When this variable is enabled it will cause built-in INFORMATION_SCHEMA tables and SHOW commands to use the current database as the TABLE_SCHEMA when one has not been explicitly set in a query.

This is so that for hosting companies with thousands of databases, these queries that are in applications they cannot control do not hit the disk hard scanning all FRM files. These would not be relevant to the application anyway.

SHOW DATABASES will still show all databases and if there is no default database currently set then these commands will behave as normal.

  • The Jira issue number for this PR is: MDEV-31125

How can this PR be tested?

Regression suite test added.

Basing the PR against the correct MariaDB version

  • This is a new feature and the PR is based against the latest MariaDB development branch
  • This is a bug fix and the PR is based against the earliest maintained branch in which the bug can be reproduced

Backward compatibility

New behaviour is off by default.

PR quality check

@LinuxJedi LinuxJedi force-pushed the MDEV-31125 branch 2 times, most recently from 0eed573 to 848b0e8 Compare May 10, 2023 06:54
@LinuxJedi LinuxJedi added the MariaDB Foundation Pull requests created by MariaDB Foundation label May 10, 2023
LinuxJedi and others added 2 commits May 16, 2023 11:05
This adds the additional system variable
`information_schema_local_database`. When this variable is enabled it
will cause built-in `INFORMATION_SCHEMA` tables to use the current
database as the `TABLE_SCHEMA` when one has not been explicitly set in
a query.

This is so that for hosting companies with thousands of databases, these
queries that are in applications they cannot control do not hit the disk
hard scanning all FRM files. These would not be relevant to the
application anyway.

`SHOW DATABASES` will still show all databases and if there is no
default database currently set then these commands will behave as
normal.
This patch:
* Adds description code comments to a couple of functions
* Explicitly marks a function as static
* Uses LEX_CSTRING to remove the usage of `strlen()`
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ LinuxJedi
❌ montywi
You have signed the CLA already but the status is still pending? Let us recheck it.

@LinuxJedi
Copy link
Contributor Author

Has been reviewed by @montywi. Amendments made:

  • Changed the system variable description to make it more readable
  • Changed the test case so that it doesn't use the mysql tables
  • Added patches by Monty to add a description and remove a couple of strlen() uses

@vuvova
Copy link
Member

vuvova commented May 17, 2023

This is weird. Normally, if such an application simply shouldn't have privileges on other schemas, then I_S automatically will only show one schema. Couldn't they do that instead?

@LinuxJedi
Copy link
Contributor Author

This is weird. Normally, if such an application simply shouldn't have privileges on other schemas, then I_S automatically will only show one schema. Couldn't they do that instead?

As far as I can tell the privilege check happens at a table level, after the tables have been scanned. So, whilst it only shows a filtered list, it is scanning the FRMs. If the way I'm reading it is true, then that would not solve the problem.

@vuvova
Copy link
Member

vuvova commented May 27, 2023

privileges on databases happen before tables are read. Or should happen, if they don't — we'll fix it.
Also, if only one database is allowed from privileges, I_S can read it directly, without even scanning the datadir (same optimization as for TABLE_SCHEMA = const

@LinuxJedi
Copy link
Contributor Author

privileges on databases happen before tables are read. Or should happen, if they don't — we'll fix it. Also, if only one database is allowed from privileges, I_S can read it directly, without even scanning the datadir (same optimization as for TABLE_SCHEMA = const

Having run some tests and looked through the permissions section better, you are correct, but only at a table level. This patch still solves two problems:

  1. Without a WHERE condition I_S will run a stat on every directory in the datadir, regardless of permissions. This is an expensive operation on hosting setups. When filtered with a WHERE condition (and with this patch) the stat is only on the required directory.
  2. (a lesser case) There are cases where hosting companies will have multiple schemas under one user. The application likely only needs to know about one schema.

There may be some smart privilege check you can do for part 1 here but I suspect it will be complicated to code up as the privileges system gets more complicated.

@vuvova
Copy link
Member

vuvova commented Jun 14, 2023

I don't see how these two are "problems". Looks to me like "works as expected". Otherwise I can point out to few more problems:

  1. Without a WHERE condition a SELECT might do a full table scan. When filtered with an @@implicit_where_clause variable, this can be avoided and the user will only see what he wanted to see.
  2. (a lesser case) There are cases when a user forgets to create the table before using it. With the @@implicit_create_table variable (that holds a full CREATE TABLE statement for such a table) the table can be automatically created as needed.

Back to the PR, why on Earth a user who runs a SELECT with a clear and well defined WHERE clause, gets the result limited by some user variable that he has to set beforehand?

@LinuxJedi
Copy link
Contributor Author

Closing as we are going to do this a little differently

@LinuxJedi LinuxJedi closed this Jul 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
MariaDB Foundation Pull requests created by MariaDB Foundation
4 participants