Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add ALL and ANY operators accepting array, subquery, or table parameters #1886

Merged
merged 12 commits into from
Jan 3, 2024

Conversation

ShreckYe
Copy link
Contributor

ALL and ANY are 2 SQL operators to perform a comparison between a single column value and a range of other values.

According to java - PreparedStatement IN clause alternatives? - Stack Overflow, using = ANY(?) and PreparedStatement.setArray if supported can be more efficient than using IN (inList in Exposed) when passing lists of different sizes because a single cached prepared statement can be reused. This is the major motive of this PR.

Since according to references online they are called operators not functions, and CustomOperator in Exposed has 2 operands, I am not sure whether it is better to extend Op or CustomFunction, and so I have implemented both and they are both tested to work. Please help remove the less appropriate one if this PR is accepted.

@e5l e5l requested a review from bog-walk December 6, 2023 12:15
@e5l
Copy link
Member

e5l commented Dec 6, 2023

@bog-walk, could you please check?

@e5l
Copy link
Member

e5l commented Dec 6, 2023

Hey @ShreckYe, thanks for the PR. Could you please run apiDump task and commit the changes?

@bog-walk bog-walk self-assigned this Dec 6, 2023
Copy link
Member

@bog-walk bog-walk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ShreckYe and thanks for working on this PR. These operators will be a great addition to the API, so I've requested some changes to ensure that this implementation is extensible enough to be adapted later.

Using these operators with an array argument is very niched to PostgreSQL, and the standard way of using ANY|ALL is with a subquery on the right hand side (supported by all except SQLite). This will be the expectation and natural progression if these operators are supported.
MySQL also has a very niched use case, accepting a table on the right hand side. We need to be able to add that functionality if users ask for it, especially if we're adding the PostgreSQL array.

The most ideal way to keep your functionality and future use cases possible would be to implement the feature in a way similar to SubQueryOp, but this would mean creating a class and a function for every single comparison operator that can precede ANY|ALL and the DSL would become more verbose.

I think your way of incorporating the existing comparison operators reads better, so I would suggest extending Op and creating a sealed class that allows us to build on it later. I've included this class and how to use it in my comments below.

Comment on lines +988 to +991
object UntypedAndUnsizedArrayColumnType : ColumnType() {
override fun sqlType(): String =
currentDialect.dataTypeProvider.untypedAndUnsizedArrayType()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the need for this and for the scope of this PR it's ok. It does have some problems, for example:

  • If a logger is enabled, the array object value is not processed in the output string, which isn't helpful to users wanting to see the values.
  • It won't work with edge cases like if a user wants to check if a datetime column value is in an array of LocalDateTimes.

But we need to flesh out the ArrayColumnType more thoroughly anyway, so I can work on these problems after this PR is merged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't remove this because the implemented Ops still depend on it. As you commented this will be fleshed out, so to not keep these new definitions in the PR, one way I can think of is to replace its usage with a temporary inlined anonymous ColumnType with its sqlType being the empty string or just "ARRAY" and the tests still pass.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that sounds like a good temporary option to me.

Copy link
Contributor Author

@ShreckYe ShreckYe Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so shall I remove it and adopt this approach in the next commit?

Comment on lines 733 to 734
/** Returns the specified [value] as an array query parameter. */
fun <T> arrayParam(value: Array<T>): Expression<Array<T>> = QueryParameter(value, UntypedAndUnsizedArrayColumnType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this as it isn't necessary for the Op implementation. An arrayParam() will be introduced to the API when the ArrayColumnType is fully supported for storing arrays.

Comment on lines 101 to 105
/** Returns this array of data wrapped in the `ALL` operator. */
fun <T> Array<T>.allOp(): Op<T> = AllAnyOp("ALL", this)

/** Returns this array of data wrapped in the `ANY` operator. The name is explicitly distinguished from [Array.any]. */
fun <T> Array<T>.anyOp(): Op<T> = AllAnyOp("ANY", this)
Copy link
Member

@bog-walk bog-walk Dec 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having these be extension functions makes it difficult to extend the functionality to, for example, a subquery. It also doesn't read like a natural SQL query: WHERE { Users.id eq arrayOf(1, 2).anyOp() }.

Please replace them with anyFrom which takes an array argument, that way we can make an override for subqueries and both will read well:

fun <T> anyFrom(array: Array<T>): AnyFromArrayOp<T> = AnyFromArrayOp(array)
fun <T> allFrom(array: Array<T>): AllFromArrayOp<T> = AllFromArrayOp(array)

// use
WHERE { Users.id eq anyFrom(arrayOf(1, 2)) }
WHERE { Users.id eq anyFrom(Users.selectAll()) }

Also, please place the functions in the subsection titled "// Array Comparisons".

Comment on lines 115 to 118
/** Checks if this expression is equal to any element from [array].
* This is a more efficient alternative to [ISqlExpressionBuilder.inList] on PostgreSQL and H2. */
infix fun <T> ExpressionWithColumnType<T>.eqAny(array: Array<T>): Op<Boolean> =
this eq array.anyOp() // TODO or `array.anyFunction()`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't be necessary once the functions above are switched to take an argument instead of the more verbose extension function.

Comment on lines 7 to 9
class AllAnyOp<T>(val opName: String, val array: Array<T>) : Op<T>() {

override fun toQueryBuilder(queryBuilder: QueryBuilder) = queryBuilder {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion would be to replace this with a sealed class, something like this:

sealed class AnyOrAllFromBaseOp<T>(
    val isAny: Boolean,
    val subSearch: Any
) : Op<T>() {
    override fun toQueryBuilder(queryBuilder: QueryBuilder): Unit = queryBuilder {
        if (isAny) {
            +"ANY ("
        } else {
            +"ALL ("
        }
        when (subSearch) {
            is Array<*> -> registerArgument(UntypedAndUnsizedArrayColumnType, subSearch)
        }
        +")"
    }
}

class AnyFromArrayOp<T>(array: Array<T>) : AnyOrAllFromBaseOp<T>(isAny = true, subSearch = array)

class AllFromArrayOp<T>(array: Array<T>) : AnyOrAllFromBaseOp<T>(isAny = false, subSearch = array)

With that, we'd be able to easily go in an extend support to subqueries, etc. by just adding a branch to the when block and some more subclasses.

Copy link
Contributor Author

@ShreckYe ShreckYe Dec 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored opName into isAny as you suggested but there might still be other row and array comparison operators, for example SOME though it's only an alias of ANY. (see PostgreSQL docs and MySQL docs) If it's necessary to leave room for them (though I haven't find many cases supporting this), this class can be renamed to RowAndArrayComparisonOp as how PostgreSQL calls them and the old opName approach can be adopted.

Also I added a SubSearch type parameter instead of using Any for better type safety and implemented registerArgument in subclasses for better extensibility from my point of view. If this is unnecessary I can change them to what you suggested.

Comment on lines 130 to 133
/** Data type for arrays with no specified size or element type, used only as types of [QueryParameter]s for PostgreSQL and H2.
* An array with no element type cannot be used for storing arrays in a column in either PostgreSQL or H2. */
abstract fun untypedAndUnsizedArrayType(): String

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go by the majority here, since only 2 dialects are implemented. Please swap this with the following and remove all exceptions from unsupported dialect files:

open fun untypedAndUnsizedArrayType(): String =
        throw UnsupportedByDialectException("This vendor does not support array data type", currentDialect)

@@ -211,6 +213,55 @@ class SelectTests : DatabaseTestsBase() {
}
}

val testDBsSupportingArrays =
listOf(TestDB.POSTGRESQL, TestDB.POSTGRESQLNG, TestDB.H2, TestDB.H2_MYSQL, TestDB.H2_MARIADB, TestDB.H2_PSQL, TestDB.H2_ORACLE, TestDB.H2_SQLSERVER)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestDB.allH2TestDB is available to replace listing all H2 modes.

Comment on lines 221 to 229
fun testEqAny(eqOp: Column<String>.() -> Op<Boolean>) {
withDb(testDBsSupportingArrays) {
withCitiesAndUsers { _, users, _ ->
val r = users.select { users.id.eqOp() }.orderBy(users.name).toList()

assertEquals(2, r.size)
assertEquals("Alex", r[0][users.name])
assertEquals("Andrey", r[1][users.name])
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate that this helps reduce redundancy, especially since you implemented the feature using both Op and Function. After refactoring, please use this (or whatever code you're testing) directly in each unit test instead. That way it will be more clear what the DSL we're testing actually looks like when used.

The tests that would be great to see are:

  1. eq anyFrom
  2. One other comparison op: the example below is good -> greaterEq anyFrom
  3. neq anyFrom
  4. eq anyFrom(emptyArray())

Major changes:
1. support the `ALL` and `ANY` operators taking subquery arguments and table arguments, and the `IN` operator taking table arguments
1. remove some no longer needed code, simplify code, and rename some functions
1. add related tests
1. run `apiDump`

The operators (`IN`, `ANY`, and `ALL`) with table arguments are only supported by PostgreSQL and H2. MySQL claims supporting this but it does not work with the JDBC driver as tested. Those with subquery arguments are not supported by SQLite.
@ShreckYe
Copy link
Contributor Author

@bog-walk thank you for your comments. I have modified the sources as you suggested and requested. For some of them I still have some questions which I leave in their respective reply comments.

Comment on lines 98 to 99
/** Returns this subquery wrapped in the `ANY` operator. This function is not supported by the SQLite dialect. */
fun <T> anyFrom(subQuery: Query): Op<T> = AllAnyFromSubQueryOp(true, subQuery)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ANY and ALL operators with subquery arguments are not supported by SQLite as is also commented in the commit message. Actually ANY and ALL are not keywords of SQLite at all according to this SO question.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShreckYe I forgot to mention something. Please take AbstractQuery<*> as the argument so we don't restrict it to just Query. It should be possible to use a SetOperation for example as a subquery in the SQL.

Comment on lines 104 to 105
/** Returns this table wrapped in the `ANY` operator. This function is only supported by PostgreSQL and H2 dialects. */
fun <T> anyFrom(table: Table): Op<T> = AllAnyFromTableOp(true, table)
Copy link
Contributor Author

@ShreckYe ShreckYe Dec 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, The operators with table arguments are supported by PostgreSQL and H2 as tested though they haven't clearly documented about this, but the tests fail on MySQL who claims supporting it. I think it's probably because it's not supported by its JDBC connector. I verified by copying the SQL statement to an online MySQL executor and it works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is currently an issue with the mysql8 docker setup in our tests, so the test failing is a false negative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so shall I allow the tests to run on MySQL related dialects and just let them fail, or skip the tests as how they are now?

Copy link
Member

@bog-walk bog-walk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShreckYe Thanks for all the work so far, it's looking great. I've left some comments and requested a few small edits.

Also, please address the detekt edit requested in the failed action.

Comment on lines +988 to +991
object UntypedAndUnsizedArrayColumnType : ColumnType() {
override fun sqlType(): String =
currentDialect.dataTypeProvider.untypedAndUnsizedArrayType()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that sounds like a good temporary option to me.

Comment on lines 104 to 105
/** Returns this table wrapped in the `ANY` operator. This function is only supported by PostgreSQL and H2 dialects. */
fun <T> anyFrom(table: Table): Op<T> = AllAnyFromTableOp(true, table)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is currently an issue with the mysql8 docker setup in our tests, so the test failing is a false negative.

Comment on lines 116 to 126
// TODO Currently these functions delegate to `anyFrom`. Should the actual `SOME` SQL operator be supported?

/** An alias for [anyFrom]. */
fun <T> someFrom(subQuery: Query): Op<T> = anyFrom(subQuery)

/** An alias for [anyFrom]. */
fun <T> someFrom(array: Array<T>): Op<T> = anyFrom(array)

/** An alias for [anyFrom]. */
fun <T> someFrom(table: Table): Op<T> = anyFrom(table)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShreckYe There's already quite a bit in this PR. Please remove these aliases and then SOME can be introduced in another PR once this one is finalized. It won't take much tweaking with the current setup.

Comment on lines 671 to 672
/** Checks if this expression is equal to any element from the column of [table] with only a single column. This function is only supported by PostgreSQL and H2 dialects. */
infix fun <T> ExpressionWithColumnType<T>.notInTable(table: Table): InTableOp = InTableOp(this, table, false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A "not" is probably missing from the KDocs here.

@ShreckYe
Copy link
Contributor Author

@ShreckYe Thanks for all the work so far, it's looking great. I've left some comments and requested a few small edits.

Also, please address the detekt edit requested in the failed action.

Sorry I don't know how to fix the detekt failure. I don't see a blank line at the the location it mentions and my detekt plugin keeps crashing. I downloaded the detekt command line client and ran auto correction but nothing changed. Please help me check if it persists.

@ShreckYe
Copy link
Contributor Author

ShreckYe commented Dec 19, 2023

OK I got the plugin to work on another computer and ran "AutoCorrect by detekt Rules" which removes Line 236 of the mentioned file.

@bog-walk bog-walk changed the title feat: Add the ALL and ANY operators accepting array parameters with an added untyped and unsized array type for PostgreSQL and H2 feat: Add ALL and ANY operators accepting array, subquery, or table parameters Jan 2, 2024
@bog-walk bog-walk requested a review from e5l January 2, 2024 22:55
@e5l e5l merged commit 98df06d into JetBrains:main Jan 3, 2024
3 checks passed
@e5l
Copy link
Member

e5l commented Jan 3, 2024

Hey @ShreckYe, thanks for the PR!
LGTM. Merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants