
Adding explicit do-nothing to on-conflict clauses #255

Merged
merged 3 commits into kotlin-orm:v3.4.x on Mar 13, 2021

Conversation

@PedroD commented Mar 5, 2021

@vincentlauvlwj

This PR allows using doNothing() within onConflict clauses in the PostgreSQL extension methods.
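For illustration, a minimal sketch of the new clause, assuming the insertOrUpdate/onConflict DSL from ktorm-support-postgresql and a hypothetical Employees table:

database.insertOrUpdate(Employees) {
    set(it.id, 1)
    set(it.name, "vince")
    onConflict(it.id) {
        // Generates INSERT ... ON CONFLICT (id) DO NOTHING
        // instead of a DO UPDATE SET clause.
        doNothing()
    }
}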

Additionally, I improved the bulk operations by allowing them to batch an arbitrary number of items by chunking internally. Before that, Ktorm was capped at 32767 (Short.MAX_VALUE) parameters per statement, as specified by the PostgreSQL protocol, which allowed only a very limited number of items. These bulk operations are now ready to handle millions, if not billions, of items.

@PedroD (Author) commented Mar 5, 2021

Travis sometimes fails to initialize Docker for the tests, which is odd.

Not a problem with the PR.

org.ktorm.support.postgresql.PostgreSqlTest > classMethod FAILED

    java.lang.ExceptionInInitializerError at Unsafe.java:-2

        Caused by: org.testcontainers.containers.ContainerFetchException at PostgreSqlTest.kt:37

            Caused by: org.testcontainers.containers.ContainerFetchException at PostgreSqlTest.kt:37

                Caused by: com.github.dockerjava.api.exception.DockerClientException at PostgreSqlTest.kt:37

1 test completed, 1 failed

@vincentlauvlwj vincentlauvlwj changed the base branch from master to v3.4.x March 13, 2021 15:32
@vincentlauvlwj vincentlauvlwj merged commit b998f10 into kotlin-orm:v3.4.x Mar 13, 2021
@vincentlauvlwj (Member)

Awesome!!

@vincentlauvlwj (Member)

Additionally, I improved the bulk operations by allowing them to batch an arbitrary number of items by chunking internally. Before that, Ktorm was capped at 32767 (Short.MAX_VALUE) parameters per statement, as specified by the PostgreSQL protocol, which allowed only a very limited number of items. These bulk operations are now ready to handle millions, if not billions, of items.

Finally, I decided to remove the bulk chunking feature. I made this decision because chunking runs into a transaction issue: Ktorm doesn't know how transactions should be handled here. When an exception is thrown in one of the chunk operations, should we ignore it or rethrow it? If we rethrow it, should the transaction roll back or not? Or should we just abort and leave the data in an inconsistent state?

Since transactions should be handled at the application level, I would rather not implement chunking. If you need this feature, you can implement it in your own application. The bulkInsert & bulkInsertOrUpdate functions in Ktorm should be atomic operations.
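For example, such application-level chunking could look roughly like this; a sketch only, assuming ktorm's useTransaction and the bulkInsert DSL from ktorm-support-postgresql, with employees (the list of rows), the Employees table, its columns, and the chunk size as hypothetical placeholders:

database.useTransaction {
    // Each bulkInsert call is one atomic statement; wrapping all chunks in a
    // single transaction makes the whole import roll back if any chunk fails.
    employees.chunked(1000).forEach { chunk ->
        database.bulkInsert(Employees) {
            for (e in chunk) {
                item {
                    set(it.name, e.name)
                    set(it.job, e.job)
                }
            }
        }
    }
}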

You can check the latest code in the v3.4.x branch. It's almost ready to be released. #277

@PedroD (Author) commented Apr 29, 2021

@vincentlauvlwj what if we allow the user to pass an exception handler as a parameter to those bulk functions, letting Ktorm know what to do when an exception is thrown?

The speed difference between these bulk operations and the basic batch operations, plus the "returning" feature, makes Ktorm the most performant tool for working with PostgreSQL.

As for them being atomic operations: technically that's not possible with PostgreSQL's current parameter count limit, unless we wrap the whole thing in a transaction.
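To make that limit concrete, a quick back-of-the-envelope sketch (the table width here is just an assumption for illustration):

fun main() {
    // PostgreSQL's extended query protocol encodes the number of bind parameters
    // as a signed 16-bit integer, so a single statement can carry at most 32767.
    val maxParams = Short.MAX_VALUE.toInt()      // 32767
    val columnsPerRow = 10                       // hypothetical table width
    println(maxParams / columnsPerRow)           // => 3276 rows per bulk statement
}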

@PedroD (Author) commented Apr 29, 2021

@vincentlauvlwj Two possible solutions to make these atomic:

The reason they are not atomic is that I am splitting them into batches of MAX_SQL_EXPR_BATCH_SIZE, which means that every time this limit is exceeded, a new query is performed; hence, if any subsequent query fails, the previously successful ones will not be rolled back.

E.g., if we want to bulk-insert MAX_SQL_EXPR_BATCH_SIZE * 2 + 5 elements, this results in 3 batches (and 3 batch queries):

  • Batch 1, size: MAX_SQL_EXPR_BATCH_SIZE
  • Batch 2, size: MAX_SQL_EXPR_BATCH_SIZE
  • Batch 3, size: 5

If Batch 3 fails, Batch 3 is rolled back, but Batches 1 and 2 are not.
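A tiny Kotlin sketch of that split (the batch size here is a placeholder for MAX_SQL_EXPR_BATCH_SIZE):

fun main() {
    val maxBatchSize = 1_000                         // placeholder for MAX_SQL_EXPR_BATCH_SIZE
    val items = List(maxBatchSize * 2 + 5) { it }
    val batches = items.chunked(maxBatchSize)
    println(batches.map { it.size })                 // [1000, 1000, 5] -> three separate queries
}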

Solution 1: Instead of batching, just throw an exception if MAX_SQL_EXPR_BATCH_SIZE is exceeded, so developers are forced to handle batching and atomicity concerns on their side.

Solution 2: Just make these bulk methods transactional:

private fun <T : BaseTable<*>> Database.bulkInsertReturningAux(
    table: T,
    returningColumns: List<Column<*>>,
    block: BulkInsertStatementBuilder<T>.(T) -> Unit
): Pair<Int, CompositeCachedRowSet> {
    var affectedTotal = 0
    val cachedRowSets = CompositeCachedRowSet()

    // Collect the row assignments produced by the caller's DSL block.
    val builder = BulkInsertStatementBuilder(table).apply { block(table) }

    if (builder.assignments.isEmpty()) return Pair(0, CompositeCachedRowSet())

    // Executes one batch: builds a single bulk INSERT ... RETURNING expression
    // and accumulates the affected-row count and the returned rows.
    val execute: (List<List<ColumnAssignmentExpression<*>>>) -> Unit = { assignments ->
        val expression = BulkInsertExpression(
            table.asExpression(),
            assignments,
            returningColumns = returningColumns.map { it.asExpression() }
        )

        val (total, rows) = executeUpdateAndRetrieveKeys(expression)

        affectedTotal += total
        cachedRowSets.add(rows)
    }

    this.useTransaction { // <-------------- The transaction should make this atomic
        executeQueryInBatches(builder, execute)
    }

    return Pair(affectedTotal, cachedRowSets)
}

Maybe you are leaning more towards Solution 1.

Just let me know which one, and I can PR it.

Fully removing the bulk functions would be a big penalty for Ktorm, IMO.

@vincentlauvlwj (Member)

Oh, you got me wrong. I didn't remove the bulk functions, I just removed the chunking logic in them. When 32767 parameters are exceeded, an exception is thrown to force developers to handle chunks on their side, which is exactly Solution 1. I've updated the code, so there's no need for another PR.

@PedroD (Author) commented May 3, 2021

@vincentlauvlwj ah ok!

Perfect! Thanks!
