Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Programming model for runTransaction is hard to use with promises. #249

Closed
majelbstoat opened this issue Jun 26, 2018 · 31 comments
Closed
Assignees
Labels
api: spanner Issues related to the googleapis/nodejs-spanner API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@majelbstoat
Copy link

majelbstoat commented Jun 26, 2018

For single operations, using promises is straightforward:

function getUser() {
  return transaction
    .run(query)
}

return getUser().then((result) => console.log(result)).catch((err) => console.log(err))

However, the story becomes a lot more complicated with Database.runTransaction. It takes a callback, and doesn't return a promise.

I have a use-case that involves reading from the database, making a request to a remote service, then conditionally writing a row. Finally, I need to do some post-processing on the result. A read-write transaction is important for data integrity. However, the remote request is made with a promise.

Programming this is really quite complicated. In the happy case, I need to put all the post-processing code inside the transaction callback because this:

let user
db.runTransaction(async (err, transaction) => {
 user = await transaction.run(userQuery)
 // ...
})

convertUser(user)

clearly doesn't work because the transaction is run asynchronously and user isn't set by the time convertUser is executed. However, it seems wrong to put object marshalling code inside the transaction when I'm done with all the data needs earlier than that. Error handling for the bad case is also a pain, because any failures in the query or remote request will trigger a promise-rejection, and the handling of that looks pretty confusing code-wise when you're inside a callback.

Ideally, I'd love to be able to do something like the simple case:

user = await db.runTransaction(async (err, transaction) => {
  user = await transaction.run(queryQuery)
  // ...
  transaction.update('users', {updated: data})
  return transaction.commit().then(() => user)
})
.catch((err) => {
  console.log("Error: ", err)
})

where runTransaction takes care of running the callback, returning the result of the commit, or rejecting a promise with an error that happens inside the callback.

(As a side-node, the callback would then no longer need to take an error as its first parameter, because runTransaction could just reject with it before the callback is even started. As it stands, I have to handle the possibility of that error on the first line of every transaction.)

Environment details

  • Node.js version: 8.11.2
  • @google-cloud/spanner version: 1.5.0
@JustinBeckwith JustinBeckwith added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Jun 26, 2018
@callmehiphop
Copy link
Contributor

This has been suggested before, but truthfully I can't remember what the reasoning was for shooting it down. The only tricky thing I see with this proposal would be continuing support for a callback mode.

@stephenplusplus @googleapis/yoshi-nodejs thoughts?

@stephenplusplus
Copy link
Contributor

I recall we had promise support and had to ditch it because we can't replay a promise chain when retrying a transaction.

@walshie4
Copy link

^ Yeah this is the blocker I've run into when going down this path. This is something we're actively working on creating a cleaner, less error-prone, way to do that fits a bit better with promise based codebases.

I'll report back with any findings, our current idea is to use a using/disposer pattern.

http://bluebirdjs.com/docs/api/promise.using.html

@callmehiphop
Copy link
Contributor

I think what @majelbstoat is proposing is not trying to keep track of the entire promise chain, but allow the callback to return a promise.

database.runTransaction(async (err, transaction) => {
  return transaction.run(`SELECT * FROM Table`)
    .then(([rows]) => transaction.commit());
}).then(() => {
  // transaction finished
});

I know this has been pitched before and shot down

@stephenplusplus
Copy link
Contributor

Hmm, could you see if you can find those discussions?

@eric4545
Copy link
Contributor

eric4545 commented Jun 27, 2018

We also get same pain, but we used follow method to kinda make it compatible with async/await .

exports.withTranscation = (fn) => {
	return new Promise((res, rej) => {
		database.runTransaction(async (err, transaction) => {
			if (err) return rej(err);
			let result;
			try {
				result = await fn(transaction);
			} catch (error) {
				return transaction.rollback((rollBackError) => {
					if (rollBackError) {
						logger.error({message: 'Rollback error occur', error: rollBackError});
					}
					return rej(error);
				});
			}

			return transaction.commit((commitError) => {
				if (commitError) {
					logger.error({message: 'Commit error occur', error: commitError});
					return rej(commitError);
				}
				return res(result);
			});
		});
	});
};

Usage:

const result = await withTranscation(async (transaction)=>{
// code
})

@majelbstoat
Copy link
Author

majelbstoat commented Jun 27, 2018

@callmehiphop: Yeah, I understand you might want to re-run that callback multiple times, which makes sense (and which I'm glad of!). But when it finally succeeds, getting the final result of it back would be very helpful.

Async error handling inside the callback is still painful, but I can just make do with handling those manually myself for now.

FWIW, this is basically what the golang SDK does, though transaction.commit does not need to be specified and it returns ts, err as the result of the transaction, as is idiomatic in go.

@majelbstoat
Copy link
Author

@eric4545, thank you for sharing your code :) do you ever have problems with that if the inner callback runs multiple times due to retries or something? Like promise resolved after rejected, or similar?

@eric4545
Copy link
Contributor

@majelbstoat, inner callback you mean the runFn of database.runTransaction ?

@majelbstoat
Copy link
Author

@eric4545 yes. As I understand it, the spanner client will under certain circumstances run that callback more than once. (In case of transaction conflicts, IIRC?)

@eric4545
Copy link
Contributor

Yes, we did some load test to our system, and our APM show that the runFn have called 26 times.

You are right, the spanner client will exec runFn again and again as long it is retry-able error.

@callmehiphop
Copy link
Contributor

callmehiphop commented Jun 27, 2018

Typically we only retry runTransaction calls for ABORT errors - these usually happen when one or more transactions perform conflicting operations. We also expose a getTransaction method that does not retry, you could use this method to either:

  • Not retry aborted errors
  • Create a promisified version of runTransaction
const retry = require('p-retry');
const Spanner = require('@google-cloud/spanner');

const ABORTED = 10;

const spanner = new Spanner();
const database = spanner.instance('my-instance').database('my-database');

function runTransaction(fn) {
  return retry(async () => {
    const [transaction] = await database.getTransaction();

    try {
      return await fn(transaction);
    } 
    catch (e) {
      if (e.code === ABORTED) throw e;
      throw new retry.AbortError(e.message);
    }
  });
}

runTransaction(async (transaction) => {
  const [rows] = await transaction.run('SELECT * FROM MyTable');
  const data = rows.map(row => row.thing);
  
  await transaction.commit();
  return data;
}).then(data => {
  // ...
});

@majelbstoat
Copy link
Author

That's really helpful, thanks, @callmehiphop.

@majelbstoat
Copy link
Author

majelbstoat commented Jun 28, 2018

If it's helpful to anyone else struggling with this, here's a Typescriptified wrapper:

import { Query, Row, Database as SpannerDatabase, Table, Transaction } from '@google-cloud/spanner'
import retry from 'p-retry'

const ABORTED = 10

type TransactionFn<T> = (transaction: Transaction) => Promise<T>

class Database {
  private db: SpannerDatabase
  constructor(db: SpannerDatabase) {
    this.db = db
  }

  runTransaction = <T extends {}>(fn: TransactionFn<T>): Promise<T> => {
    return retry(async () => {
      const [transaction] = await this.db.getTransaction()

      try {
        return await fn(transaction)
      } catch (e) {
        if (e.code === ABORTED) throw e
        throw new retry.AbortError(e.message)
      }
    })
  }

  run = (query: Query): Promise<[Row[], object]> => {
    return this.db.run(query)
  }

  table = (name: string): Table => {
    return this.db.table(name)
  }
}

export default Database

Used like this:

      await this.db.runTransaction<boolean>((transaction) => {
        this.positions.create(position, transaction)
        return transaction.commit().then(() => true)
      })

Which gives you type-safe transaction results. A reasonable update would be to allow it to take transaction options as an additional parameter.

(This uses typescript types that i defined and submitted to DefinitelyTyped. Currently failing some of their strict lint cases, and probably slightly wrong in one or two places, but mostly usable anyway. Gist of them here: https://gist.github.com/majelbstoat/cfe80d665baec8631d25f10a7b23e917)

@walshie4
Copy link

walshie4 commented Jul 1, 2018

Trying to make a version of this without retries but am finding some conflicting info on how to handle errors from commit.

I remember being told on a call with the Spanner team that we should be calling .end() after .commit() if we receive an error from the .commit() call.

Is this correct or is the usage described here what I should follow [0]?

[0] #103 (comment)

@walshie4
Copy link

walshie4 commented Jul 1, 2018

Here's what we've cooked up thus far on this front, requires Bluebird:

We're calling rollback() in case a non-spanner related error occurs in the transaction. I think I remember this function being referred to as safe to call in this way. .end() call refers to my previous question if commit() fails. Users of this function could also bail out manually by calling dbTxn.rollback() and returning from their trxFunc below.

// Add disposer to getTransaction
function getTransaction(spannerConn) {
  return spannerConn.getTransaction()
  .disposer(async (dbTrx, trxActionResultPromise) => {
    if (trxActionResultPromise.isRejected()) {
      // Call in case of non-spanner related error being thrown
      await dbTrx.rollback();
      // Make sure failed txn gets closed
      // Waiting on response about if we should actually do this
      // https://github.com/googleapis/nodejs-spanner/issues/249#issuecomment-401637757
      await dbTrx.end();

    }
  });
}

/*
 * Background info: https://github.com/googleapis/nodejs-spanner/issues/249
 *
 * Promisified transaction helper
 * Returns Promise with transaction result.
 * Will *not* retry automatically on transient errors.
 *
 * @param {Database} spannerConn - Spanner db instance
 * @param {Function} trxFunc - Function to run with db transaction as an argument to perform operations
 */
export function getAndRunTransaction(spannerConn, trxFunc) {
  return Promise.using(getTransaction(spannerConn), ([dbTrx]) => trxFunc(dbTrx));
}


// Elsewhere...


const spanner = new Spanner();
const doStuff = (dbTxn) => {
  dbTxn.insert(...);
  /* ... */
  dbTxn.commit();
};
getAndRunTransaction(spanner, doStuff) // txn cleanup handled for you via disposer
.then((result) => {
  // do success stuff
}).catch((err) => {
  // Do failure things
});

Could fairly easily be combined with the above approaches to retry on transient errors as well.

Disposer docs: http://bluebirdjs.com/docs/api/disposer.html
.using docs: http://bluebirdjs.com/docs/api/promise.using.html

@callmehiphop
Copy link
Contributor

@walshie4 you should call end() after commit() if you're not retrying (if there's an error). end() will release the session from that transaction.

@majelbstoat
Copy link
Author

@callmehiphop The documentation says end() should only be called in a readonly transaction. Is that incorrect? https://cloud.google.com/nodejs/docs/reference/spanner/1.4.x/Transaction#end

@callmehiphop
Copy link
Contributor

@majelbstoat generally that is true. All end() does is release the associated session back into the session pool. We basically assume that whenever commit() or rollback() are called, that you are done with the Transaction. In the event that either of those calls fail, we don't call end() in case you start making more requests.

@stephenplusplus
Copy link
Contributor

@callmehiphop what's the takeaway from this issue? Do we need to make something better / something new-- in either case, can you please state the specifics of what that would be? :)

@callmehiphop
Copy link
Contributor

The takeaway is that our users want a Promise friendly version of runTransaction. I think our best plan of action would be to wait and follow the changes outlined in googleapis/google-cloud-node#2556 (comment) - which will make supporting this feature much easier.

@walshie4
Copy link

We basically assume that whenever commit() or rollback() are called, that you are done with the Transaction. In the event that either of those calls fail, we don't call end() in case you start making more requests.

A bit confused by this. What do you mean by in case you start making more requests?

Would that include the case of the transaction being retried automatically?

Related to that, if .end() were called inside a transaction after a transient .commit() or .rollback() error was thrown would the retry try to acquire a new session or because the one it was holding was released would fail with a non-transient error?

@callmehiphop
Copy link
Contributor

A bit confused by this. What do you mean by in case you start making more requests?

If for some reason commit or rollback throw an error (this includes us retrying internally and it still failing) we don't release the session. I think the main use case is to allow users to try and rollback on a failed commit. @snehashah16 should we always call end here?

Would that include the case of the transaction being retried automatically?

We don't throw an error until after we've retried internally.

Related to that, if .end() were called inside a transaction after a transient .commit() or .rollback() error was thrown would the retry try to acquire a new session or because the one it was holding was released would fail with a non-transient error?

Assuming the callback for rollback/commit was called, we've already stopped retrying the transaction function. These two functions callbacks shouldn't get called until we've either 1. succeeded or 2. hit our retry limits.

As far as retrying a transaction goes, we do not fetch a new session, but instead begin a new transaction on the same session that was used prior.

@walshie4
Copy link

Made a gist here [0] with a sample runTransaction setup to help explain this in the future, would really appreciate it if @callmehiphop @stephenplusplus or someone else could chime in if this is correct.

The main thing I'm unsure of is if I need to call rollback if commit fails.

On that note regarding your answer above:

I think the main use case is to allow users to try and rollback on a failed commit.

Can you elaborate here? How could one rollback a failed commit? If it failed isn't there nothing to rollback? Or are you referring to the locally queued operations?

[0] https://gist.github.com/walshie4/d340f7f91cf396326933af5a7b835fa6

@callmehiphop
Copy link
Contributor

@walshie4 currently that is correct, there would be nothing to rollback, however my understanding is that soon there will be other asynchronous ways to update/insert data besides commit.

@majelbstoat
Copy link
Author

"soon there will be other asynchronous ways to update/insert data besides commit"

that is intriguing 🤔

@walshie4
Copy link

Yeah for sure, @callmehiphop do you have any other details or general examples in the space you could point to for more details about this future functionality?

Also less related will either you or @stephenplusplus be at Google Next next week in SF?

@majelbstoat
Copy link
Author

Just to update this, you actually get (further) errors if you're trying to rollback after an error on commit():

Cannot rollback a transaction after Commit() has been called

It does make it a bit tricky to write a single general-purpose wrapper for a transaction where you don't know if commit was called or not (if the error happened earlier in the transaction, like a not found error or something). Would be nice if rollback() after commit() was just a no-op.

@callmehiphop
Copy link
Contributor

@walshie4 I've probably already said too much! And I wish, but I am unable to attend. 😦

@majelbstoat I'm not sure if the client library is the right place to make a change like that, but we could add a committed property to the Transaction object that indicates whether or not a commit occurred, making it easier to determine. Thoughts?

@majelbstoat
Copy link
Author

@callmehiphop yeah, that would be great :)

@callmehiphop
Copy link
Contributor

@crwilcox this one should be safe to close now, yeah?

@google-cloud-label-sync google-cloud-label-sync bot added the api: spanner Issues related to the googleapis/nodejs-spanner API. label Jan 31, 2020
@JustinBeckwith JustinBeckwith self-assigned this Feb 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner Issues related to the googleapis/nodejs-spanner API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
Projects
None yet
Development

No branches or pull requests

6 participants