Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction support #304

Open
gramakri opened this issue Jun 5, 2014 · 8 comments
Open

Transaction support #304

gramakri opened this issue Jun 5, 2014 · 8 comments

Comments

@gramakri
Copy link

gramakri commented Jun 5, 2014

Transactions can be performed either using

  • db.exec("BEGIN TRANSACTION; ... ; COMMIT") - When creating multi-statement strings, I cannot find a way to escape the values using sqlite3. It would be nice to either export an escape() function for arguments or allow an argument array/binding just like run/all/get.
  • Using db.serialize(function() { /* issue one or more db.run() commands */ }); I cannot see where to issue ROLLBACK or COMMIT if one of the queries in the transaction fail.
db.serialize(function () {
  db.run('begin transcation');
  db.run('some_command1', function (err) { });
  db.run('some_command2', function (err) { });
  // now need to issue rollback or commit based on success/failure of above 2 commands
});

Note that we cannot call rollback/commit in the callbacks of db.run() because they aren't part of the serialize.

Am I missing something obvious?

@yelouafi
Copy link

yelouafi commented Jun 5, 2014

It seems impossible to ensure transaction properties with db.serialize only.
As documentation say

Isolation Between Database Connections

If the same database is being read and written using two different database connections (two different sqlite3 objects returned by separate calls to sqlite3_open()) and the two database connections do not have a shared cache, then the reader is only able to see complete committed transactions from the writer. Partial changes by the writer that have not been committed are invisible to the reader. This is true regardless of whether the two database connections are in the same thread, in different threads of the same process, or in different processes. This is the usual and expected behavior for SQL database systems.
...

No Isolation Between Operations On The Same Database Connection

SQLite provides isolation between operations in separate database connections. However, there is no isolation between operations that occur within the same database connection.

If i understand properly, new sqlite3.Database(...) rely on the sqlite3_open_...()
so you have to use multiples connections to ensure proper isolation between concurrent updates

function tx_handler(callback, commit) {
    return function() {
        var args = Array.prototype.slice.call(arguments);
        var err = args[0];
        if(err)  db.run('rollback');             
        else if(commit) db.run('commit');
        if(callback)
            callback.apply(this, args);
    }
}
var db1 = new sqlite3.Database('file');
var db2 = new sqlite3.Database('file');

db1.serialize(function () {
  db1.run('begin transcation');
  db1.run('insert something', tx_handler( ... ) );
  ...
  db1.run('insert something', tx_handler( ..., true ) ); // will commit upon success
});

db2.serialize(function () {
  db2.run('begin transcation');
  db2.run('insert something', tx_handler( ... ) );
  ...
  db2.run('insert something', tx_handler( ..., true ) ); // will commit upon success
});

@gramakri
Copy link
Author

gramakri commented Jun 5, 2014

@yelouafi Right, separate database objects are required.

So, I guess the conclusion is: If I want to do a transaction then I have to create a new sqlite3.Database object. I should not use this database object for anything else other than this transaction (alternatively i have to somehow track that there is no pending transaction).

The reason for the above conclusion is because https://github.com/mapbox/node-sqlite3/wiki/Control-Flow says that

Note that queries scheduled not directly in the callback function are not necessarily serialized:

This means that the commit and rollback commands in your tx_handler are not serialized. So if something makes calls after db1.serialize using db1 then we are in trouble.

This further leads me to conclude that db.serialize() itself is pointless with transactions. If I am doing a transaction, I just create a new database object and throw it away after the transaction.

Do you agree?

I guess a feature suggestion at this point is to implement database object pooling with the above in mind.

@yelouafi
Copy link

yelouafi commented Jun 6, 2014

@gramakri

I should not use this database object for anything else other than this transaction

Yes, this could be achieved by some pool implementation (see below)

alternatively i have to somehow track that there is no pending transaction

I don't think this is a good idea; you'll have to ensure Transaction semantics yourself (specially Isolation) ; this is a burden the database should take care of (and can better handle)

This further leads me to conclude that db.serialize() itself is pointless with transactions. If I am doing a transaction, I just create a new database object and throw it away after the transaction.

This is true if there is no dependence between executed statements; otherwise (as in master-detail updates) you'll have to ensure proper order of execution

I guess a feature suggestion at this point is to implement database object pooling with the above in mind.

+1 for pooling. note that you can have separate needs following the operation you need

  • For simple read operations you may not need multiples database objects. a single database object can be shared between readers. It will read only from committed data.
  • Simple update statements can also could be also achieved with a a shared database object as there is no risk of overlapping transactions (unless i miss something)
  • For complex transactions that needs multiples writes (and reads) we create separate databases objects to ensure proper isolation;

According to documentation it would be safe to start the transaction with 'begin immediate'

If X starts a transaction that will initially only read but X knows it will eventually want to write and does not want to be troubled with possible SQLITE_BUSY_SNAPSHOT errors that arise because another connection jumped ahead of it in line, then X can issue BEGIN IMMEDIATE to start its transaction instead of just an ordinary BEGIN. The BEGIN IMMEDIATE command goes ahead and starts a write transaction, and thus blocks all other writers. If the BEGIN IMMEDIATE operation succeeds, then no subsequent operations in that transaction will ever fail with an SQLITE_BUSY error.

@gramakri
Copy link
Author

gramakri commented Jun 6, 2014

@yelouafi Thanks for your comments and analysis! Really helped me think it through.

@Strix-CZ
Copy link

alternatively i have to somehow track that there is no pending transaction

I have implemented exactly that some time ago. It works quite well for small amount of transactions.
https://github.com/Strix-CZ/sqlite3-transactions

@akc42
Copy link

akc42 commented Aug 19, 2014

I too am trying go get my head around this issue. In particular trying to understand why @Strix-CZ thinks you need https://github.com/Strix-CZ/sqlite3-transactions

I have a web application https://github.com/akc42/MBBall which I am about to rewrite the client, and am trying to understand whether to leave the server in php or convert to javascript for the server using express. .With node.js and potentially a single thread for multiple requests, I want to ensure transaction isolation between the different requests coming in from multiple clients (all running the same version of the application).

The thing that I think is confusing me is the same words that @gramakri reference

Note that queries scheduled not directly in the callback function are not necessarily serialized:

If at the start of every request I do something like

var express = require('express');
var app     = express();
...
app.get('/sample', function(req, res) {

var db = new Database('database.db',function(){
    db.serialize(function(){
        db.exec('BEGIN TRANSACTION')
            .run("INSERT something")
            .run("SELECT the result")
            .exec("COMMIT")
            .close(function(){
                res.send("Some data from the database"); 
            });
    });
});
});


Will it be isolated from all other requests. Will other simulteneous requests somehow be able to be interspersed on the same connection.

In particular see my thoughts here http://stackoverflow.com/questions/3630/sqlite-vs-mysql/3933794#3933794 about using WAL mode, and having a very efficient transactions. If I have to effectively isolate each request myself - which I think @Strix-CZ sqlite3-transactions module does, I think it would be inefficient.

@Strix-CZ
Copy link

Just a quick answer for now. I will have more time in the evening. In this case you don't need it. You need it only when there is an asynchronous function call between BEGIN and COMMIT like so:

cb.exec('BEGIN');
setTimeout(function() { // Something else might be executed in between.
    cb.exec('COMMIT'); 
}, 100);

@kroggen
Copy link

kroggen commented Mar 22, 2023

One option to avoid problems with transactions is to use stored procedures

On this case your code would store pre-defined procedures on the database, and later it will just call the procedure with arguments, like this:

  db.run("CALL add_new_sale(ARRAY( ARRAY('DJI Avata',1,1168.00), ARRAY('iPhone 14',1,799.90), ARRAY('iWatch',2,249.99) ))")

Procedure calls are handled on their own transactions, so they are isolated by default. Changes from one call will not interfere in changes from another call.

You will need to rebuild node-sqlite3 to use the modified sqlite3 library, though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants