Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow optimizer hints #4243

Merged
merged 4 commits into from Jan 27, 2021
Merged

Conversation

martinmacko47
Copy link
Contributor

@martinmacko47 martinmacko47 commented Jan 20, 2021

Implements #4199

Optimizer hints using comment /*+ syntax support in dialects:

  • MySQL: Supported and working.
  • Postgres: Not supported in vanilla Postgres. Supported in some extensions like in EDB Postgres Advanced Server, where the syntax is the same as in MySQL. Not tested as I don't have access to EDB Postgres Advanced Server.
  • Oracle: Supported. The syntax is the same as in MySQL. Not tested as I don't have access to Oracle DB.
  • MSSQL: Not supported.
  • Sqlite3: Not supported.

It is still good to keep the hints implementation even for dialects that don't support them as there may be proxy interpreters / middlewares supporting similar hints syntax.

@@ -1000,6 +1001,11 @@ export declare namespace Knex {
): QueryBuilder<TRecord, TResult2>;
}

interface OptimizerHint<TRecord extends {} = any, TResult extends {} = any> {
(...hints: string[]): QueryBuilder<TRecord, TResult>;
(hints: readonly string[]): QueryBuilder<TRecord, TResult>;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this typescript rule is correct. Please check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lorefnon Could you take a look?

throw new Error('Optimizer hint cannot include "/*" or "*/"');
}
if (hints.some((hint) => hint.includes('?'))) {
throw new Error('Optimizer hint cannot include "?"');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is ? in the hint it messes up parameter bindings. So I'm forbidding ? symbols in hints. Later we can somehow escape ? in hints or actually allow parameter bindings in hints if they are supported by any dialect.

@@ -148,6 +148,25 @@ class Builder extends EventEmitter {
return this;
}

// Adds a one or more optimizer hits to the list of "optimizerHints" on the query.
optimizerHint() {
const hints = helpers.normalizeArr.apply(null, arguments);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really should get rid of all the helpers.normalizeArr.apply(null, arguments); calls.
Could this be rewritten to be helpers.normalizeArr(...arguments) instead? Bonus points if you could avoid using arguments at all and just use explicit array as a param.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad Sure, I just followed the patterns in code arround

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad Changed in 65f488b. I removed variadic arguments, therefore it's now just simple optimizerHint(hints)

hints = hints.map((hint) => compact(hint.value).join(' '));
hints = compact(hints).join(' ');
if (hints) {
return `/*+ ${hints} */ `
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Later we may add an optional parameter to configure the + mark in the hint if there are some dialects or proxy / server interprets that require different marks. I don't know about any so far though.

if (hints.some((hint) => !isString(hint))) {
throw new Error('Optimizer hint must be a string');
}
if (hints.some((hint) => hint.includes('/*') || hint.includes('*/'))) {
Copy link
Collaborator

@kibertoad kibertoad Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fastest way to do this check is .indexOf <...> !== -1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad I followed this review comment in #2815 (comment)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough :)

let hints = this.grouped.optimizerHints || [];
hints = hints.map((hint) => compact(hint.value).join(' '));
hints = compact(hints).join(' ');
if (hints) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be a ternary statement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad Sure I can change it, I just lookend arround the codebase, and it seems that

if (...) return ...
return ...

pattern is much more frequent than

return ... ? ... : ....

pattern. So I followed the more common style.

Should I change it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codebase, unfortunately, is full of suboptimal code, definitely don't follow it :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but either is fine, pick the one that looks most readable to you

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed in 8414e94

@@ -150,6 +150,78 @@ module.exports = function (knex) {
});
});

it('#4199 - adheres to optimizer hints', async function () {
const expectedErrors = {
mysql: {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also implement this at least for PostgreSQL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad Vanilla PostgreSQL does not suport optimizer hints of this kind. It is supported just in extensions like EDB Postgres Advanced Server.

@@ -150,6 +150,78 @@ module.exports = function (knex) {
});
});

it('#4199 - adheres to optimizer hints', async function () {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to test any other hint besides max_execution_time in MySQL as they don't alter the result just the alhorithm.

Also I don't know how to test any of the hints used in Oracle.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, PostgreSQL just doesn't support hints and don't intend to. Could you make it throw an error when you try to use it on PG? Also I think SQLite doesn't support it either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it good to throw errors if hints are used for dialects that don't support hints? There are proxies or middlewares that may listen to the hints. Like in #1982 they needed comments for passing hints for their router. Also there are DB extensions like EDB Postgres Advanced Server which support hints while vanilla Postgres does not.

I guess we should just make sure the hints won't interfer in dialects that don't support them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, then we shouldn't throw an error.

@kibertoad
Copy link
Collaborator

Great job!
This also needs documentation update at https://github.com/knex/documentation

).to.eventually.be.rejected.and.to.deep.include(expectedErrors[knex.client.driverName])
});

it('#4199 - ignores invalid optimizer hints', async function () {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither Postres, Sqlite3 nor MSSQL support this kind of hints. So I at least test they don't interfere if used.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MSSQL does, I think: https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-query?view=sql-server-ver15

Everything that does not should be throwing an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad The problem with MSSQL is that it uses OPTION clause instead of /*+ ... */ comment syntax, which works differently.

For instance we cannot just simply validate the hint for non-existence of */ within the hint to prevent a premature end of the hint as mentioned in #2815 (comment). We would need to parse the hint itself to know whether there are not more ) than ( in it. Otherwise any code could leak into the query passing somehting like .optimizerHint('HASH GROUP) injected sql (...').

Also as the OPTION clause is a normal SQL clause, it supports parameter bindings, while the comment hints syntax does not. So I'm not sure if it is good to mingle two so different syntexes into a single method. Possibly we could implement the OPTION clause using a separate method.

OTOH, we may let MSSQL use the OPTION clause for the hints, but without checking for injections within hints. And with forbidden parameter bindings for now, as adding parameter bindings will mess with the comment syntax. Assuming there are no proxy interpreters / middlewares / routers for MSSQL using the comments hint syntax. Otherwise we would prevent users from passing the comment hints to them when using MSSQL dialect.

Perhaps we could rename the .optimizerHint(hint) method to something like .commentHint(hint) to emphasize it is for hints passed via the /*+ comment syntax regardless of used dialect.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinmacko47 Then it sounds like current validation logic should be moved from generic querybuilder into MySQL querybuilder. Then we could implement it differently for MSSQL. Would that allow to use same optimizerHint for both?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad I guess we need to decide whether we want:

a) A semantic method for optimizer hints as they are supported in each dialect. That would be comment-like syntax for MySQL and Oracle, OPTION clause for MSSQL, and unsupported exception for Postgres and SQLite. With no way to pass comment-like hints for proxies / routers / whatever middlewares people may use.

b) A common method for any hints with comment-like syntax regardless they are consumed by the DB itself or any party on the way to DB.

I don't really have an opinion which one is more useful.

Copy link
Collaborator

@kibertoad kibertoad Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a tough one to pick. On one hand one of the main goals of knex is to abstract away dialect differences between different DBs, so that swapping them works as transparently as possible. Now in this case one would obviously need to rewrite hints across dbs either way, but using unified syntax for that is still likely to be more familiar for people doing the change.

However, I see how that would be limiting the use of the hints for non-optimizer cases, hence I probably would vote for the second one, with very explicit documentation of what use-cases it could be used for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad Then let's keep the comment-like syntax hints for all dialects and we will try to document it properly. Just perhaps let's name the method not optimizerHint so users won't confuse it with other types of optimizer hints in other dialects, but somehow else.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MySQL documentation calls them "optimizer hint comments", so either optimizerHintComment or hintComment works for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad Renamed to hintComment in ad8b9a6. The other is too long IMHO.

types/index.d.ts Outdated
@@ -1000,6 +1001,11 @@ export declare namespace Knex {
): QueryBuilder<TRecord, TResult2>;
}

interface OptimizerHint<TRecord extends {} = any, TResult extends {} = any> {
(...hints: string[]): QueryBuilder<TRecord, TResult>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to support this at all and just force user to pass array always

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad You mean like either a single hint or an array of hints? Or you mean that even for a single hint an array with this single hint should be passed?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either would work. Either support a singular argument and wrap it into array inside, or always require an array; main thing I don't want to support is an arbitrary amount of separate arguments being passed and spread later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad Changed in 65f488b

@martinmacko47
Copy link
Contributor Author

martinmacko47 commented Jan 20, 2021

This also needs documentation update at https://github.com/knex/documentation

I'll add docs after we consolidate the new API. Number of parameters, comments syntax vs OPTIONS clause for hints, parameter bindings and so.

@martinmacko47
Copy link
Contributor Author

This also needs documentation update at https://github.com/knex/documentation

@kibertoad Docs added in knex/documentation#306

@martinmacko47
Copy link
Contributor Author

@kibertoad I guess the PR is ready. What else can I do to have it merged and released? Thanks.

@kibertoad
Copy link
Collaborator

will review in the evening, thanks!

@kibertoad kibertoad merged commit 50be910 into knex:master Jan 27, 2021
@kibertoad
Copy link
Collaborator

@martinmacko47 Since this is a part of a next semver major release which is not yet fully ready, I'm afraid I can't cut a new stable version yet. I see two options if you need it released sooner than in two-three weeks: either you'd have to backport it to 0.21 branch, or I can release a preview NEXT version with this change.

@martinmacko47
Copy link
Contributor Author

@martinmacko47 Since this is a part of a next semver major release which is not yet fully ready, I'm afraid I can't cut a new stable version yet. I see two options if you need it released sooner than in two-three weeks: either you'd have to backport it to 0.21 branch, or I can release a preview NEXT version with this change.

@kibertoad Sure, I can backport it to 0.21 branch. How to do it? Just to cherrypick the commit to 0.21 branch, resolve conflicts and create a new PR onto 0.21? Or is there anything else not to forget about?

@kibertoad
Copy link
Collaborator

@martinmacko47 yeah, that would be worth a try, but I expect some conflicts, as master was heavily refactored since then. as long as tests will pass, should be good.

martinmacko47 added a commit to martinmacko47/knex that referenced this pull request Jan 28, 2021
Conflicts:
	lib/query/builder.js
	test/unit/query/builder.js
martinmacko47 added a commit to martinmacko47/knex that referenced this pull request Jan 28, 2021
Conflicts:
	lib/query/builder.js
	test/unit/query/builder.js
@martinmacko47
Copy link
Contributor Author

martinmacko47 commented Jan 28, 2021

@kibertoad Here is the backport PR #4258 (CI seems not starting, but I run tests locally and they passed)

I didn't find 0.21 base branch for docs. Is there one? Or how to backport docs?

@kibertoad
Copy link
Collaborator

@martinmacko47 There is no way to do that, unfortunately, so I'll just link to the documentation PR in the changelog.

Travis CI is sloooooooooow, give it a couple of hours to kick in.

kibertoad pushed a commit that referenced this pull request Jan 30, 2021
@martinmacko47 martinmacko47 deleted the mysql-optimizer-hints branch February 27, 2021 19:02
This was referenced Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants