
Adding explicit do-nothing to on-conflict clauses #255

Merged
merged 3 commits into kotlin-orm:v3.4.x on Mar 13, 2021

Conversation

@PedroD commented Mar 5, 2021

@vincentlauvlwj

This PR allows using doNothing() within onConflict clauses in the PostgreSQL extension methods.
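For illustration, a minimal sketch of the new clause, assuming the insertOrUpdate/onConflict DSL from ktorm-support-postgresql and a hypothetical Employees table:

database.insertOrUpdate(Employees) {
    set(it.id, 1)
    set(it.name, "vince")
    onConflict(it.id) {
        // Generates INSERT ... ON CONFLICT (id) DO NOTHING
        // instead of a DO UPDATE SET clause.
        doNothing()
    }
}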

Additionally, I improved the bulk operations by allowing them to batch an arbitrary number of items by chunking internally. Before that, Ktorm was capped at 32767 (Short.MAX_VALUE) parameters per statement, as specified by the PostgreSQL protocol, which allowed only a very limited number of items. These bulk operations are now ready to handle millions, if not billions, of items.

@PedroD (Author) commented Mar 5, 2021

Travis sometimes fails to initialize Docker for the tests, which is odd.

Not a problem with the PR.

org.ktorm.support.postgresql.PostgreSqlTest > classMethod FAILED

    java.lang.ExceptionInInitializerError at Unsafe.java:-2

        Caused by: org.testcontainers.containers.ContainerFetchException at PostgreSqlTest.kt:37

            Caused by: org.testcontainers.containers.ContainerFetchException at PostgreSqlTest.kt:37

                Caused by: com.github.dockerjava.api.exception.DockerClientException at PostgreSqlTest.kt:37

1 test completed, 1 failed

@vincentlauvlwj vincentlauvlwj changed the base branch from master to v3.4.x March 13, 2021 15:32
@vincentlauvlwj vincentlauvlwj merged commit b998f10 into kotlin-orm:v3.4.x Mar 13, 2021
@vincentlauvlwj (Member)

Awesome!!

@vincentlauvlwj (Member)

Additionally, I improved the bulk operations by allowing them to batch an arbitrary number of items by chunking internally. Before that, Ktorm was capped at 32767 (Short.MAX_VALUE) parameters per statement, as specified by the PostgreSQL protocol, which allowed only a very limited number of items. These bulk operations are now ready to handle millions, if not billions, of items.

Finally, I decided to remove the bulk chunking feature. I made this decision because chunking runs into a transaction issue: Ktorm doesn't know how transactions should be handled here. When an exception is thrown in one of the chunk operations, should we ignore it or rethrow it? If we rethrow it, should the transaction roll back or not? Or should we just abort and leave the data in an inconsistent state?

Since transactions should be handled at the application level, I would rather not implement chunking. If you need this feature, you can implement it in your own application. The bulkInsert & bulkInsertOrUpdate functions in Ktorm should be atomic operations.
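For example, such application-level chunking could look roughly like this; a sketch only, assuming ktorm's useTransaction and the bulkInsert DSL from ktorm-support-postgresql, with employees (the list of rows), the Employees table, its columns, and the chunk size as hypothetical placeholders:

database.useTransaction {
    // Each bulkInsert call is one atomic statement; wrapping all chunks in a
    // single transaction makes the whole import roll back if any chunk fails.
    employees.chunked(1000).forEach { chunk ->
        database.bulkInsert(Employees) {
            for (e in chunk) {
                item {
                    set(it.name, e.name)
                    set(it.job, e.job)
                }
            }
        }
    }
}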

You can check the latest code in the v3.4.x branch. It's almost ready to be released. #277

@PedroD (Author) commented Apr 29, 2021

@vincentlauvlwj what if we allow the user to pass an exception handler as a parameter to those bulk functions, letting Ktorm know what to do when an exception is thrown?

The speed difference between these bulk operations and the basic batch operations, plus the "returning" feature, makes Ktorm the most performant tool for working with PostgreSQL.

As for them being atomic operations: technically that's not possible with PostgreSQL's current parameter count limit, unless we wrap the whole thing in a transaction.
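To make that limit concrete, a quick back-of-the-envelope sketch (the table width here is just an assumption for illustration):

fun main() {
    // PostgreSQL's extended query protocol encodes the number of bind parameters
    // as a signed 16-bit integer, so a single statement can carry at most 32767.
    val maxParams = Short.MAX_VALUE.toInt()      // 32767
    val columnsPerRow = 10                       // hypothetical table width
    println(maxParams / columnsPerRow)           // => 3276 rows per bulk statement
}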

@PedroD (Author) commented Apr 29, 2021

@vincentlauvlwj Two possible solutions to make these atomic:

The reason they are not atomic is that I am splitting them into batches of MAX_SQL_EXPR_BATCH_SIZE, which means that every time this limit is exceeded, a new query is performed; hence, if any subsequent query fails, the previously successful ones will not be rolled back.

E.g., if we want to bulk-insert MAX_SQL_EXPR_BATCH_SIZE * 2 + 5 elements, this results in 3 batches (and 3 batch queries):

  • Batch 1, size: MAX_SQL_EXPR_BATCH_SIZE
  • Batch 2, size: MAX_SQL_EXPR_BATCH_SIZE
  • Batch 3, size: 5

If Batch 3 fails, Batch 3 is rolled back, but Batches 1 and 2 are not.
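A tiny Kotlin sketch of that split (the batch size here is a placeholder for MAX_SQL_EXPR_BATCH_SIZE):

fun main() {
    val maxBatchSize = 1_000                         // placeholder for MAX_SQL_EXPR_BATCH_SIZE
    val items = List(maxBatchSize * 2 + 5) { it }
    val batches = items.chunked(maxBatchSize)
    println(batches.map { it.size })                 // [1000, 1000, 5] -> three separate queries
}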

Solution 1: Instead of batching, just throw an exception if MAX_SQL_EXPR_BATCH_SIZE is exceeded, so developers are forced to handle batching and atomicity concerns on their side.

Solution 2: Just make these bulk methods transactional:

private fun <T : BaseTable<*>> Database.bulkInsertReturningAux(
    table: T,
    returningColumns: List<Column<*>>,
    block: BulkInsertStatementBuilder<T>.(T) -> Unit
): Pair<Int, CompositeCachedRowSet> {
    var affectedTotal = 0
    val cachedRowSets = CompositeCachedRowSet()

    // Collect the row assignments produced by the caller's DSL block.
    val builder = BulkInsertStatementBuilder(table).apply { block(table) }

    if (builder.assignments.isEmpty()) return Pair(0, CompositeCachedRowSet())

    // Executes one batch: builds a single bulk INSERT ... RETURNING expression
    // and accumulates the affected-row count and the returned rows.
    val execute: (List<List<ColumnAssignmentExpression<*>>>) -> Unit = { assignments ->
        val expression = BulkInsertExpression(
            table.asExpression(),
            assignments,
            returningColumns = returningColumns.map { it.asExpression() }
        )

        val (total, rows) = executeUpdateAndRetrieveKeys(expression)

        affectedTotal += total
        cachedRowSets.add(rows)
    }

    this.useTransaction { // <-------------- The transaction should make this atomic
        executeQueryInBatches(builder, execute)
    }

    return Pair(affectedTotal, cachedRowSets)
}

Maybe you are leaning more towards Solution 1.

Just let me know which one, and I can PR it.

Fully removing the bulk functions would be a big penalty for Ktorm, IMO.

@vincentlauvlwj (Member)

Oh, you got me wrong. I didn't remove the bulk functions, I just removed the chunking logic in them. When 32767 parameters are exceeded, an exception is thrown to force developers to handle chunks on their side, which is exactly Solution 1. I've updated the code, so there's no need for another PR.

@PedroD (Author) commented May 3, 2021

@vincentlauvlwj ah ok!

Perfect! Thanks!
