Race conditions in Customer.get_or_create() #455

jleclanche · 2017-03-10T04:38:48Z

Since landing #440 I've been trying to figure out the race conditions in Customer.get_or_create().

The issue is the same as #429, except that in this case we don't have a stripe ID to play with. Let's look at the code:

    @classmethod
    def get_or_create(cls, subscriber, livemode=djstripe_settings.STRIPE_LIVE_MODE):
        try:
            return Customer.objects.get(subscriber=subscriber, livemode=livemode), False
        except Customer.DoesNotExist:
            return cls.create(subscriber), True

When this path is reached by two threads at the same time and the customer does not exist in the local db, both of them reach the last line, which creates the customer.
The Stripe API, having no concept of our per-user unique, will happily create both of them. dj-stripe will crash on the second thread because the unique_together constraint fails... but the customer is still created upstream; not only are we in a desynced state now, but the stripe db is polluted with a useless and confusing object.

So, what to do? The correct acid-compliant fix is to create the Customer object in the db beforehand. But... we don't have a stripe_id for it yet. We can save it with an empty stripe id, or a random one, but there is still a synchronicity problem: The second thread will be able to get the customer object, but the data on it is not current. That's detectable and not a problem right? Well, no, because we dont have a stripe_id, we can't sync it.

I'm a bit stumped. My first idea is we could create the Customer beforehand, attach the user id in the metadata and use that for synchronization? This is part of a larger overall issue of keeping customers in sync between downstream and upstream: If a customer is deleted locally, it's impossible to retrieve the upstream object.

kavdev · 2017-03-10T09:22:01Z

I'm adding a new label for the race conditions you're finding :)

jleclanche · 2017-03-10T14:56:54Z

Okay, a quick email conversation with Stripe support pointed out that we can use Stripe idempotency keys for this scenario.

Now I'm wondering, how to integrate these keys into the database to ensure ACID compliance?

I think there's two ways to do this: Either we integrate them directly on the Customer object, or we make a separate table for keys.
Now, we're likely to need these keys for more than just this scenario. It just occured to me that dj-stripe does not use them anywhere right now, even in places where it really, really should (subscribing/charging users namely).

In the scenario I described above, what needs to happen is that get_or_create looks for an idempotency key in the database first, corresponding to the request. First drycoded prototype:

class IdempotentRequest(models.Model):
    idempotency_key = UUIDField(max_length=36, primary_key=True, editable=False)
    object_type = CharField(choices=...) # "customer", ...
    denominator = IntegerField(unique=True)

    class Meta:
        unique_together = ("object_type", "denominator")

Now the get or create would look like this:

def get_or_create(cls, subscriber, livemode=djstripe_settings.STRIPE_LIVE_MODE):
	try:
		return cls.objects.get(subscriber=subscriber, livemode=livemode), False
	except cls.DoesNotExist:
		random_key = uuid.uuid4()
		request, _ = IdempotentRequest.objects.get_or_create(
			object_type="customer", denominator=subscriber.id,
			defaults={"idempotency_key": random_key}
		)
		return cls.create(subscriber, idempotency_key=request.idempotency_key)

This system lets us reuse the IdempotentRequest model for other requests as well as long as there's an object type and a common denominator. I don't know how useful that is though because I haven't thought about charges yet.

jleclanche · 2017-03-10T15:03:55Z

The alternative to this is not to have a model at all. Instead, we add an idempotency key as creation_key on the Customer model itself. I'm torn, that means it's not reusable ... although, I don't know how reusable that unique_together denominator system is either.

jleclanche · 2017-03-10T15:05:34Z

Could also make that denominator a unique CharField and have a pattern for it. Like "customer:123" in case of a customer. It just has to be consistently reproducible.

lskillen · 2017-03-10T15:13:58Z

@jleclanche Fantastic digging - I came across the idempotence documentation before and had meant to bring it up for discussion, but got sidetracked away from djstripe work at the time. A potential suggestion is, assuming that our model instances are hashable, we could use either __hash__ or a special/new method for calculating the idempotence key based on attributes of the object. This would allow it to vary the key based on the object itself. I would also add some sort of "bypass" mechanism on the create for the situations where we really did want to create (for example) duplicate charges with exactly the same details.

jleclanche · 2017-03-10T15:18:36Z

I don't think hashing would work. I believe it would fail if some attributes change.

I actually quite like the denominator being a mere charfield. With a consistent pattern I think it's super flexible. This is my current prototype, I'm going to try it on live:

        try:
            return Customer.objects.get(subscriber=subscriber, livemode=livemode), False
        except Customer.DoesNotExist:
            action = "customer:create:%i" % (subscriber.id)
            idempotency_key, _ = IdempotencyKey.objects.get_or_create(action=action)
            return cls.create(subscriber, idempotency_key=idempotency_key.uuid), True
...
class IdempotencyKey(models.Model):
    uuid = UUIDField(max_length=36, primary_key=True, editable=False, default=uuid.uuid4)
    action = CharField(unique=True)
    created = DateTimeField(auto_now_add=True)

lskillen · 2017-03-10T15:27:56Z

Sorry, what I actually meant was something like __hash__ and not __hash__ itself - I realise that offering every attribute wouldn't work because it wouldn't be consistent (e.g. timestamps updated on save, or other fields on derived classes that we don't control).

If we offered a method or a property on each model then the model itself could determine what parts of itself contribute to its own idempotence key. In your case it might be more like:

StripeObject:

def get_idempotence_key(self):
    fields = self.get_idempotence_key_fields()
    # calculate idempotence key based on field hashing (or something)
    # possible to also store/retrieve/check idempotence key at this point, if a model-based key is desired

@abc.abstractmethod()  # maybe, or sane default
def get_idempotence_key_fields(self):
   pass

@property
def idempotence_key(self):
    return self.get_idempotence_key()

Customer:

def get_idempotence_key_fields(self):
    return {
        'subscriber': subscriber.pk
    }

A nice bonus is that we don't need to touch the create code to modify the idempotence key, and it also supports changes made by derived classes (if necessary). We'd also be able to support the idempotence key from the base create method instead of implementing it on create for each type of object. Thoughts?

jleclanche · 2017-03-10T16:02:10Z

This limits you to one action per model though. What if you want, say, an idempotency key for deleting a customer as well?
I'm also quite liking more verbosity for the lookup :)

Anyway, I'm trying this on a live site right now:
https://github.com/jleclanche/dj-stripe/compare/1e3eef88cc350955f510115c08583b2ac4677158~...0e5f27b

Will report back if it's suitable.

lskillen · 2017-03-10T16:06:58Z

Good point. In that case I would suggest offering a grouping based on action then let the subclass itself decide what it actually means.

E.g. self.get_idempotence_key('create') calls self.get_idempotence_key_fields('create'). :-)

Sounds good though. Let us know how your testing goes!

kavdev · 2017-03-10T19:34:49Z

Awesome digging @jleclanche. Idempotency has always been in the back of my mind; I just never put aside the time to sit down and think of an architecture for it.

I think the best way to go about this is to combine the best of both solutions:

@lskillen, I like your idea of having each StripeObject tell the idempotency generator what key (thoughts on using the word "seed"?) to use. @jleclanche, it seems using the database to constrain idempotency keys might be the easiest way to do this; I like your mini-architecture.

We'll want some clever way of selecting CRUD action names to feed to the generator.

@lskillen, can you come up with a use case for opening up seed fields to the dev?

jleclanche · 2017-03-11T05:46:17Z

I've been thinking about this and I'm still -1 on having a get_idempotency_key method. The reason for that being that, two different actions on the same model may want different fields to unify on.

I'm not against making the interface publicly available but I don't see a way of doing it that is actually useful in any scenario. I also made certain that the idempotency key can be long (max length of 100) and to use %s so that UUIDs are usable.

These keys are scoped to a particular action, therefore the action should control how to generate the key. The only part where I really hesitated is whether to make the object type a separate field or not. I like keeping it simple though, less chances to screw up.

lskillen · 2017-03-11T22:20:52Z

I've been thinking about this and I'm still -1 on having a get_idempotency_key method. The reason for that being that, two different actions on the same model may want different fields to unify on.

@jleclanche Yep, agreed on the last bit, but that's why I said about using a differentiator in my comment above. The client method call it can pass it an identifier/key for the action, such as "create", and the class itself (or a derived class) can determine what fields actually contribute towards it. This would also make it much easier to (a) implement the main logic in one place; (b) make it easier to roll out idempotency to all of the other models that probably should have it also; and (c) consistent hashing is (I think) superior to a call to uuid4.

In the PR that you have, if you are also proposing supporting test mode and live mode in the same installation then using "customer:create:%s" % (subscriber.pk) as the idempotency key would result in the next call failing if the first had used livemode=True and the second had used livemode=False (for example). Separating the logic of what fields are used would make this extremely easy to vary depending on the use case.

The main reason that I'm personally not extremely keen on the model-based approach with a string-based key is that it doesn't protect against duplication in all scenarios. E.g. If I make an API call to create a customer, but my code then (for whatever reason) throws an exception, and I'm running inside a transaction, then the idempotency key also won't have been committed, which might defeat the point of having it if on a subsequent request the key differs. This probably won't happen with your Customer example, but what about other models that aren't necessarily keyed by a primary key, such as a Charge? With the model-based approach an exception thrown might mean a subsequent charge is duplicated.

These are more minor but I'm also not keen on the overhead of maintaining a transient table of keys that will come and go, because it means an insert and delete each time we make a call. Or the performance overhead of looking up an interpolated string-based key (on a larger website) per action. If we don't delete them there will also need to be some form of reaper task or management command to cleanup old keys. Also, depending on the isolation level in the database (and whether you're using transactions or not), if two duplicate requests happen at the same time, and we're relying on the database table for the unique uuid, it is still possible to have duplicate requests.

Thoughts on all of that? Mostly just trying to ensure we're moving in the right direction, but I think any movement with this is the right direction. As a fellow contributor and user of dj-stripe, I'm more than grateful for additional help in improving the library! 👍

lskillen · 2017-03-11T22:27:45Z

@kavdev Good question on the key fields.

To be honest it looks like most fields should apply to the idempotency key. In fact if we did just start with all of the keys, that would still result in a stronger guarantee of idempotence than having no idempotency key at all. I'd be in two minds if we should make it a whitelist or a blacklist. If it's a whitelist, we'll explicitly list the keys that contribute. If it's a blacklist, we'll explicitly list the keys that don't.

The intention isn't really to open them up to derived classes as a primary use case, it's more to enable easier implementation and maintenance. In saying that, there might be things that users would like to contribute to idempotence that is out of our control and understanding (e.g. a metadata value that the expects to contribute). I personally don't have a use case for changing the fields from what would be defined as the standard set.

jleclanche · 2017-03-11T22:32:23Z

Oh my, good call on the live vs. test mode! I'll think about that. I also need to verify whether the idempotency keys are scoped to live/test mode, or global to all modes, on stripe's side.

E.g. If I make an API call to create a customer, but my code then (for whatever reason) throws an exception, and I'm running inside a transaction, then the idempotency key also won't have been committed

Elaborate? I don't see this happening.

There's four scenarios in which the code can crash:

Before the idempotency key is committed to the DB
After the idempotency key is committed to the DB, but before the request is sent to stripe
After the request is sent to stripe, but before the result is written back to the DB.
After the result is written to the DB.

Scenario 1: Nothing matters, no risk of data duplication or desync as no data is ever written/sent.
Scenario 2: The key will be picked up on retry, but the database is not out of sync because no data was sent to stripe.
Scenario 3: The key will be picked up on retry, thus syncing the Stripe state to the local DB
Scenario 4: The DB is up to date. Does not matter if it crashes.

jleclanche · 2017-03-11T22:35:00Z

@lskillen The idempotency keys can't take in stripe fields as idempotence source, because Stripe creates those fields.

lskillen · 2017-03-11T22:54:56Z

@jleclanche For the scenarios above, it absolutely could happen and I've seen something like this happen in a completely different context (not dj-stripe). Assuming the user is in a mode like ATOMIC_REQUESTS (or at least the entire request is in a transaction), then any synchronous calls to Stripe would result in the creation of a uuid on your table. Although these are written to the database, if anything causes the entire transaction to fail, of which you have no real control over in ATOMIC_REQUESTS mode, then the write for your IdempotencyKey table would be rolled back along with everything else in that request if an exception is raised.

The Stripe call was synchronous, so it happened already. The next request will generate a new uuid with the model-based solution, despite the other parts of the contributing key being static "customer:create:1234" in this case. I realise that this partially depends on the user, the client code and what actual keys are used but if we're going to implement it we want to offer a fairly strong guarantee about consistency, right?

The other scenario I had mentioned related to duplicate requests depends on the isolation level of the client code. I realise you're a Postgres user, but the default isolation level for transactions in MySQL is "REPEATABLE READ", which means writes aren't visible to other transactions after they've read once. In this case any writes to the IdempotencyKey table just won't be visible to other requests until the entire transaction has completed, which does mean that there might be two entries for the same key, each with a different uuid, and yet the request was intended to be the same.

As for the contributing keys/fields themselves, yep, I realise that Stripe generated keys don't count. It'd have to be anything on our side, i.e. what the user themselves has provided, but I think that's OK/acceptable.

jleclanche · 2017-03-11T23:14:46Z

I think you are misunderstanding the way idempotency keys work and how the system is implemented. The keys will reliably generate the same action for the same customer getting created, none of it depends on the return from Stripe. The uuid is created first, alongside the action, and because both are constrained the consistency is guaranteed up to this point. If an item matching the action already exists, we get the same key back. Then, we send the key to stripe, thus guaranteeing that the request never duplicates. Stripe is now in charge of guaranteeing the consistency of its returns within the next 24 hours. The result we get back does not even really matter by that point. We are in a completely new transaction when we write it to the db.

…

On Sun, Mar 12, 2017, 00:55 Lee Skillen ***@***.***> wrote: @jleclanche <https://github.com/jleclanche> For the scenarios above, it absolutely could happen and I've seen something like this happen in a completely different context. Assuming the user is in a mode like ATOMIC_REQUESTS (or at least the entire request is in a transaction), then any *synchronous* calls to Stripe would result in the creation of a uuid on your table. Although these are written to the database, if anything causes the entire transaction to fail, of which you have no real control over in ATOMIC_REQUESTS mode, then that means thewrite for your IdempotencyKey table would be rolled back along with everything else in that request if an exception is raised. The Stripe call was synchronous, so it happened already. The next request will generate a new uuid with the model-based solution, despite the other parts of the contributing key being static "customer:create:1234" in this case. I realise that this partially depends on the user, the client code and what actual keys are used but if we're going to implement it we want to offer a fairly strong guarantee about *consistency*, right? The other scenario I had mentioned related to duplicate requests depends on the isolation level of the client code. I realise you're a Postgres user, but the default isolation level for transactions in MySQL is "REPEATABLE READ", which means writes aren't visible to other transactions after they've read once. In this case any writes to the IdempotencyKey table just won't be visible to other requests until the entire transaction has completed, which does mean that there might be two entries for the same key, each with a different uuid, and yet the request was intended to be the same. As for the contributing keys/fields themselves, yep, I realise that Stripe generated keys don't count. It'd have to be anything on our side, i..e what the user themselves has provided, but I think that's OK/acceptable. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#455 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAOXkirahGLXtwieZkqnRiY7Coyh1fKhks5rkyZBgaJpZM4MY9l5> .

lskillen · 2017-03-11T23:26:05Z

I don't believe I am misunderstanding. I realise that Stripe isn't involved at all and I realise that they use the idempotency key to guarantee that the same request won't happen twice, but only if they get the same key from us twice. What I'm saying is that the UUID that we generate will be lost, despite it being created first, despite the same input key of "customer:create:1234" being used, despite the request being sent to Stripe, if anything causes that transaction to rollback. The next time we try to do the exact same actions it will result in a brand new uuid. So the point is, we have no way to send them the same UUID again in the disaster scenario. Maybe using Customer is a bad example of this, or I'm just awful at explaining ... or it isn't a valid concern. Be worth seeing what the others think, but concurrency and consistency of anything is typically a nightmare. :-)

@kavdev Tell me I'm an idiot and I'll be quiet. ;-)

jleclanche · 2017-03-11T23:35:53Z

Why would the transaction roll back? The transaction completes before we submit the first POST. If it rolls back we are not in a state where Stripe knows about the key yet, as we do not send it until it's complete.

…

On Sun, Mar 12, 2017, 01:26 Lee Skillen ***@***.***> wrote: I don't believe I am misunderstanding. I realise that Stripe isn't involved at all and I realise that they use the idempotency key to guarantee that the same request won't happen twice, but *only if they get the same key from us twice*. What I'm saying is that the UUID that we generate will be lost, despite it being created first, despite the same input key of "customer:create:1234" being used, despite the request being sent to Stripe, if *anything* causes that transaction to rollback. The next time we try to do the exact same actions it will result in a brand new uuid. So the point is, we have *no way* to send them the same UUID again in the disaster scenario. Maybe using Customer is a bad example of this, or I'm just awful at explaining ... or it isn't a valid concern. Be worth seeing what the others think, but concurrency and consistency of anything is typically a nightmare. :-) @kavdev <https://github.com/kavdev> Tell me I'm an idiot and I'll be quiet. ;-) — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#455 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAOXkgOcci2o4PCUg4hND8pxASAqdeFlks5rky2OgaJpZM4MY9l5> .

lskillen · 2017-03-11T23:38:10Z

It'd roll back if any exception is raised after the call into dj-stripe, and that's out of our control. If something happens in the client code it'll occur. For example, imagine the user is batch processing and then hits an error, or maybe they create a customer but then do something else that's intended to be an atomic set of actions. These aren't multiple transactions when you use something like ATOMIC_REQUESTS, they all happen within the same transaction, so rolling back any point of that will roll absolutely everything back to the start of the request, including our insertion into IdempotencyKey. I haven't looked too in-depth into the code, but unless we're using something like transaction.on_commit hooks in order to actually dispatch to Stripe then it isn't protected by the transaction rollback.

jleclanche · 2017-03-11T23:39:34Z

To be clear I'm still talking about the Customer object here. But I believe this pattern applies equally to other actions as long as we follow it very strictly: - Generate a consistent action key scoped to the action being performed, with its parameters - Get-or-create an idempotencykey object using the action key - Send a request specifying the idempotency_key

…

On Sun, Mar 12, 2017, 01:35 Jerome Leclanche ***@***.***> wrote: Why would the transaction roll back? The transaction completes before we submit the first POST. If it rolls back we are not in a state where Stripe knows about the key yet, as we do not send it until it's complete. On Sun, Mar 12, 2017, 01:26 Lee Skillen ***@***.***> wrote: I don't believe I am misunderstanding. I realise that Stripe isn't involved at all and I realise that they use the idempotency key to guarantee that the same request won't happen twice, but *only if they get the same key from us twice*. What I'm saying is that the UUID that we generate will be lost, despite it being created first, despite the same input key of "customer:create:1234" being used, despite the request being sent to Stripe, if *anything* causes that transaction to rollback. The next time we try to do the exact same actions it will result in a brand new uuid. So the point is, we have *no way* to send them the same UUID again in the disaster scenario. Maybe using Customer is a bad example of this, or I'm just awful at explaining ... or it isn't a valid concern. Be worth seeing what the others think, but concurrency and consistency of anything is typically a nightmare. :-) @kavdev <https://github.com/kavdev> Tell me I'm an idiot and I'll be quiet. ;-) — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#455 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAOXkgOcci2o4PCUg4hND8pxASAqdeFlks5rky2OgaJpZM4MY9l5> .

jleclanche · 2017-03-11T23:42:34Z

These aren't multiple transactions when you use something like ATOMIC_REQUESTS, they all happen within the same transaction

I don't think that's correct; but if it is, that is what's wrong. The entire system relies on one transaction to get-or-create the key, then a separate transaction to perform the write. If we want to batch-write, we can use a larger scoped action.

lskillen · 2017-03-11T23:46:49Z

You may perform subtransactions using savepoints in your view code, typically with the atomic() context manager. However, at the end of the view, either all or none of the changes will be committed.

It's correct, as seen here, in the sense that it isn't possible to do partial-writes with subtransactions in this mode of operation (but partial rollbacks are of course supported with the subtransaction savepoints, but that doesn't help here). It'll also still happen if the client application wraps the outer code within a transaction. The only way for an actual commit to be guaranteed is that the wrapping code isn't within a transaction, which although possible with ATOMIC_REQUESTS the client has to explicitly disable the transaction using a decorator or similar call.

jleclanche · 2017-03-11T23:56:11Z

Hm yeah I see what you mean. Are we using that pattern in djstripe though? If this is something clients are using then we can probably pass the responsibility downstream. How else would you do it to prevent this from happening?

…

On Sun, Mar 12, 2017, 01:46 Lee Skillen ***@***.***> wrote: You may perform subtransactions using savepoints in your view code, typically with the atomic() context manager. However, at the end of the view, either all or none of the changes will be committed. It's correct, as seen here <https://docs.djangoproject.com/en/1.10/topics/db/transactions/>. It isn't possible to do partial-writes with subtransactions in this mode of operation. It'll also still happen if the user wraps the outer code within a transaction. The only way for an actual commit to be guaranteed is that the wrapping code *isn't* within a transaction, which although possible with ATOMIC_REQUESTS the client has to explicitly disable the transaction using a decorator or similar call. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#455 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAOXkhoVv_iG9lnLGLBvXn3tnyLqlgwlks5rkzJpgaJpZM4MY9l5> .

lskillen · 2017-03-12T00:07:42Z

Sorry @jleclanche, I didn't mean to derail you or ruin your night. My company personally runs with ATOMIC_REQUESTS enabled for various reasons, and we've encountered this issue in other places outside of dj-stripe, which is why I've brought all of this up as this would definitely apply to me. I imagine ATOMIC_REQUESTS can't be too uncommon since it is mentioned almost at the top of the Django documentation. As they say, concurrency is dark and full of terrors, or something like that!

My main suggestions are:

(a) Use a deterministic method of calculating the idempotency (my suggestions from earlier).
(b) Instead of the database use a caching mechanism for storing idempotency keys.
(c) Examine the use of transaction.on_commit as an alternative to performing synchronous Stripe calls.

I've already ranted about (a) enough, but (b) and (c) still have issues.

(b) is very specific to client application deployment, is still fraught with some potential racey headaches, and deploying a caching backend that is shared between all instances of a Django application (if the client runs a horizontally scaled application) isn't simple.

(c) might be interesting but means that Stripe calls are no longer synchronous in that they would happen after the current (or request) transaction completes successfully. To be honest this is probably something we should look into as an option anyway, regardless of the idempotency key support.

jleclanche · 2017-03-12T07:52:01Z

No problem at all! This is why we have these discussions. Looking at this with a fresh head, how about an overridable get_idempotency_key callback? This way the idempotency key can be stored anywhere you would want to implement it... Redis, the db, a different model/db, etc. Would this work?

…

On Sun, Mar 12, 2017, 02:07 Lee Skillen ***@***.***> wrote: Sorry @jleclanche <https://github.com/jleclanche>, I didn't mean to derail you or ruin your night. My company personally runs with ATOMIC_REQUESTS enabled for various reasons, and we've encountered this issue in other places outside of dj-stripe, which is why I've brought all of this up. As they say, concurrency is dark and full of terrors, or something like that! My main suggestions are: - (a) Use a deterministic method of calculating the idempotency (my suggestions from earlier). - (b) Instead of the database use a caching mechanism for storing idempotency keys. - (c) Examine the use of transaction.on_commit as an alternative to performing synchronous Stripe calls. I've already ranted about (a) enough, but (b) and (c) still have issues. (b) is very specific to client application deployment, is still fraught with some potential racey headaches, and deploying a caching backend that is shared between all instances of a Django application (if the client runs a horizontally scaled application) isn't simple. (c) might be interesting but means that Stripe calls are no longer synchronous in that they would happen after the current (or request) transaction completes successfully. To be honest this is probably something we should look into as an option anyway, regardless of the idempotency key support. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#455 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAOXkmhAw_RQLxWmsUT_G4axC9Dc3fnhks5rkzdOgaJpZM4MY9l5> .

kavdev · 2017-03-13T01:26:13Z

@lskillen you're not being an idiot lol; this is a very worthwhile discussion. Highlighting why a proposal won't work for your (and possibly others') use case is why all major decisions are brought to the dj-stripe community for a weigh in.

I think this is going in the right direction. @lskillen as far as fields go, keys are so closely tied to actions that there's a good chance the fields will be different for each action. Also, some seeds should be based on fk instance ids, such as in Customer.get_or_create example

lskillen · 2017-03-13T16:04:54Z

@kavdev @jleclanche The more I think about it, the more it seems like it should be the client application's responsibility to calculate the actual key, but we should facilitate making it as easy as possible to use/specify. We're in danger of making the wrong assumptions for generating a key based on the above information, and we need to consider other scenarios, such as "failed upstream", where the request failed at Stripe (for whatever reason) but we've stored an idempotence key.

The next request might be ignored because the idempotence key hasn't changed, but something external has happened that we would now expect the request to succeed. E.g. A Charge is made but fails because the user has no source. The user adds a source and the charge is tried again, but still "fails" because the idempotence key hasn't changed. If it was the client application providing the idempotency key, then it would know the previous API call failed for this reason and would know to use a different idempotency key with the next request.

It's a tough call because I like the autonomy that @jleclanche is proposing, but I suppose these concerns need to be aired. If I were to ignore all of the above concerns then my suggestion would have been to default to a consistent hashing method with "sane defaults" (or even to default it to being user-provided only) that are changeable by the developer (in case we get it wrong), allow it to be configurable with a heavily caveated model-based and cache-based backend (and I think the latter is preferable because of the transactional concerns), and allow it also the key and/or fields be overridable by the client application on each call through to the API.

jleclanche · 2017-03-13T16:15:22Z

@lskillen what do you think of the approach in #460 that uses a callback? It lets you define the storage behaviour yourself.

The next request might be ignored because the idempotence key hasn't changed, but something external has happened that we would now expect the request to succeed. E.g. A Charge is made but fails because the user has no source.

Such an action is not a second instance of a single idempotent request, it's a separate request. We consider it a bug if the action key is not specific enough to be consistently generated.

The question really is, is the framework I'm providing here flexible enough to allow usage in other patterns such as idempotent charge requests? I feel like the question cannot be answered without one of us getting down to it and designing such a system.

lskillen · 2017-03-15T00:12:47Z

Comment on #460:

I'd still like to consider the consistent hashing approach as the default rather than the model-based approach (but obviously leave it as a backend implementation for the keys). If I get the time I'll try to code up a slightly more concrete suggestion, but at the very least some of this functionality should be in StripeObject IMHO.

If we only delegate to a single callback function then it makes it really difficult to implement something like the consistent hashing approach because there's no easy way to get back to source model. If StripeObject had the get_idempotence_key then delegated to the callback for calculation then at the very least it means each model object can define its own rules prior to calling the callback. IYSWIM.

I wonder if anyone at Stripe or anyone else has considered any of this? I know that their documentation just says "create a UUID ... somehow", and actually finding any meaningful discussion on this hasn't yielded anything particularly compelling (disclaimer: I didn't try very hard).

@jleclanche You're right though, we probably do need to delve into design a little bit more, but I think for your use case your model-based approach is definitely adequate. Although I still have reservations for the model-based from the performance overhead (string-based interpolation and insert/delete per request cycle) and cleanup (when/who/how are the records removed) angle. It probably isn't a concern for a small installation, but might make a difference at an Enterprise-level application. I'll think on it tomorrow and see if I can make a suggestion that would also fit the model-based approach for you.

Sorry about the delay in contributing to the discussion for this, it's just been very busy this week. :-(

lskillen · 2017-03-15T00:24:10Z

Other comments from #460:

@jleclanche:

Stripe doesn't care what the idempotency key looks like, just that it's a stream of usable characters and is sufficiently random and unique for your purposes.

I still don't see what your approach of consistent hashing would even look like and its benefits. It seems extremely specific to the use case of performing actions on customers, very little so on other objects.

@lskillen:

Yeah, I get that. Maybe thinking about a different model object would help. How do you think you'd apply a string-based idempotence key to something like a Charge? Note that I know we generate a UUID, and I know Stripe doesn't care what it is other than it hasn't been seen before, but in the model-based approach it is identified by a string-interpolated key, and it's that bit I'm talking about.

jleclanche · 2017-03-15T00:30:28Z

Yep no problem at all.

string-based interpolation

There's no way this is a performance issue even at extreme scales. We're interpolating directly in Python, not at the DB level.

insert/delete per request cycle

Yes, this is more of an issue. Right now though I'm not doing a delete, so it's a single insert per cycle. Deletes can be batched every 24 hours (when the keys themselves expire); when they happen depends very much on the action.

I'm not sure the insert is significant even at scale. If it is though, that's why there's the possibility to switch to another idempotency key backend.

and cleanup (when/who/how are the records removed)

I have my concerns on this. It's worth mentioning that because Idempotency Keys have a lifetime of 24 hours, they're completely safe to just leave in the db. The only issue is when the table grows large enough that queries slow down (which takes a very long time for the type of model this is).

Also remember that currently, because it's only used for customer creation, the idempotency key table will never grow larger than your user table (x2 if you're doing live + test mode in the same db which you probably aren't at that scale :P). If you clean it up regularly, it'll never grow past your daily new customer count (a subset of your daily new user count). This makes it, right now, a very small table even if you're Ebay.

This will change of course when we get to using those keys for other actions, but we still haven't gotten to a design for those :)

lskillen · 2017-03-15T00:38:07Z

Sorry, when I said string-based interpolation, what I really meant was indexing on potentially sizable string-based keys in a potentially large table (but like you said, it only starts to matter at scale and when API calls other than customer are introduced, but are really not a concern otherwise). You're absolutely right though, it's hard to judge before we've even designed any other part of it, so this is all just conjecture right now! Let me have a look again tomorrow to give better feedback (although I'm really trying not to hold you back). :-)

jleclanche · 2017-03-15T00:41:19Z

Sounds good! Don't worry about holding me back - I'm using my fork in prod, and I have to use it until a lot more stuff lands (Postgres JSONField support, stuff like that). But right now I'm getting roughly at the point where my fork is in a production-ready state, as our product is subscription-based (we don't really have a lot of idempotency-critical paths).

kbrownlees · 2017-03-15T00:55:20Z

@jleclanche presumably a client side lock would be sufficient (on subscriber in this case), not quite as elegant as having stripe deal with it for you - but at 34 comments in it doesn't appear simple either.

(having only skimmed this thread feel free to ignore!)

jleclanche · 2017-03-15T00:56:15Z

A clientside lock does not solve the issue for distributed servers. The lock has to be at the db level (which this achieves).

kbrownlees · 2017-03-15T01:10:22Z

Obviously I am not talking about 'threading.Lock'. select_for_update would solve this example, but if you used a callback people could use what ever they like (etcd, redis, etc).

jleclanche · 2017-03-15T01:17:58Z

Yeah, I considered select_for_update(). I ended up not using it simply because Stripe themselves advised using their Idempotency Key system for this scenario, which I figured we would want to introduce anyway.

kbrownlees · 2017-03-15T20:31:45Z

Sorry if I missed it, but what is the need to turn two unique_together fields into a random uuid? Why can't we just send 'live:subscriber:<subscriber_id>:create:customer' (or a hash of it) as the idempotency_key?

I like @lskillen's idea to allow the classes to specify the fields which should contribute to the key, with a callback so the user can override it. I wonder if you would need two different methods - one for when the model is an identifier to another model's action (subscriber is the unique identifier to customer in this case), and one to handle when the model itself is the target of the action. Eg:

On Subscriber:

def get_idempotence_identifier(self):
    return 'subscriber:{}'.format(self.pk) # 'subscriber:12'

On Customer:

@classmethod
def get_idempotence_key(cls, action, live_mode, **extras):
    if action == 'create':
        return '{}:{}:create:customer'.format(live_mode, extras['subscriber'].get_idempotence_identifier()
    
    ...

jleclanche · 2017-03-15T21:39:23Z

Having a natural key as the idempotency key creates a risk that, if the request was bad the first time around, it's now impossible to recreate the request for 24 hours.

kbrownlees · 2017-03-15T21:52:01Z

Fair enough, but that doesn't mean you can't use the natural key by default and have a way to add exceptions. Eg:

key = cache.get(key, key)

jleclanche · 2017-03-15T21:53:02Z

I don't understand how that gives you exceptions -- also, if you get exceptions, doesn't that defeat the whole point?

kbrownlees · 2017-03-15T22:12:59Z

cache.get is cache.get(key, default) so in the general case you would just get the natural key back, but it would allow the exception handler (or a user) to do something like:

cache.set('live:subscriber:12:create:customer', 'live:subscriber:12:create:customer:retry-2', ttl=24h)

Obviously this could also be done with a table.

You are proposing adding an extra insert to every stripe request (assuming this expands to all / most requests) for the sake of a <<1% exception case?

The natural key could be passed into the user callback function so it would be trivial for a user to implement a table like the one you propose in #460 if that was important to them.

jleclanche · 2017-03-15T22:15:24Z

assuming this expands to all / most requests

Right now it expands just to Customer creation. In the future it'll probably expand to charge and subscription creation. It's unlikely to expand to more than that.

kavdev · 2017-03-22T17:26:56Z

@lskillen @kbrownlees: @jleclanche has a working implementation that's battle tested in production at high volume. I'm rolling with it unless you can give me a concrete reason not to.

lskillen · 2017-03-22T18:11:40Z

@kavdev @jleclanche Nope, you're good to go. Sorry about the delay in getting back, I'm currently on the ninth level of Hell with work and just couldn't get to it on time. We can examine this again if the idempotency functionality is added to anything else (especially Charge), but it's a thumbs up from me. It might still be at least worth mentioning that with the model-based approach, a rollback of the current transaction may also rollback the idempotency key. I also see that @kavdev already mentioned added a utility to cleanup the old keys. I think that's worthwhile. If it's a utility function that is wrapped by the management command then we advise users to automate it either via command-line or via something like a periodic celery task.

kavdev · 2017-03-22T18:26:07Z

I'm currently on the ninth level of Hell with work and just couldn't get to it on time

It's nice and cozy here, isn't it? lol

kavdev added bug discussion The premise or details of this issue is open for discussion race condition and removed bug labels Mar 10, 2017

jleclanche mentioned this issue Mar 11, 2017

Introduce an IdempotencyKey model #460

Merged

kavdev mentioned this issue Mar 23, 2017

Release 1.0.0 #307

Closed

kavdev closed this as completed in #460 Mar 23, 2017

Race conditions in Customer.get_or_create() #455

Race conditions in Customer.get_or_create() #455

Comments

jleclanche commented Mar 10, 2017

kavdev commented Mar 10, 2017

jleclanche commented Mar 10, 2017 • edited Loading

jleclanche commented Mar 10, 2017

jleclanche commented Mar 10, 2017

lskillen commented Mar 10, 2017

jleclanche commented Mar 10, 2017

lskillen commented Mar 10, 2017 • edited Loading

jleclanche commented Mar 10, 2017 • edited Loading

lskillen commented Mar 10, 2017 • edited Loading

kavdev commented Mar 10, 2017

jleclanche commented Mar 11, 2017

lskillen commented Mar 11, 2017 • edited Loading

lskillen commented Mar 11, 2017 • edited Loading

jleclanche commented Mar 11, 2017 • edited Loading

jleclanche commented Mar 11, 2017

lskillen commented Mar 11, 2017 • edited Loading

jleclanche commented Mar 11, 2017 via email

lskillen commented Mar 11, 2017

jleclanche commented Mar 11, 2017 via email

lskillen commented Mar 11, 2017 • edited Loading

jleclanche commented Mar 11, 2017 via email

jleclanche commented Mar 11, 2017

lskillen commented Mar 11, 2017 • edited Loading

jleclanche commented Mar 11, 2017 via email

lskillen commented Mar 12, 2017 • edited Loading

jleclanche commented Mar 12, 2017 via email

kavdev commented Mar 13, 2017

lskillen commented Mar 13, 2017

jleclanche commented Mar 13, 2017

lskillen commented Mar 15, 2017

lskillen commented Mar 15, 2017

jleclanche commented Mar 15, 2017

lskillen commented Mar 15, 2017 • edited Loading

jleclanche commented Mar 15, 2017

kbrownlees commented Mar 15, 2017

jleclanche commented Mar 15, 2017

kbrownlees commented Mar 15, 2017

jleclanche commented Mar 15, 2017

kbrownlees commented Mar 15, 2017

jleclanche commented Mar 15, 2017

kbrownlees commented Mar 15, 2017

jleclanche commented Mar 15, 2017

kbrownlees commented Mar 15, 2017

jleclanche commented Mar 15, 2017

kavdev commented Mar 22, 2017

lskillen commented Mar 22, 2017

kavdev commented Mar 22, 2017

jleclanche commented Mar 10, 2017 •

edited

Loading

lskillen commented Mar 10, 2017 •

edited

Loading

jleclanche commented Mar 10, 2017 •

edited

Loading

lskillen commented Mar 10, 2017 •

edited

Loading

lskillen commented Mar 11, 2017 •

edited

Loading

lskillen commented Mar 11, 2017 •

edited

Loading

jleclanche commented Mar 11, 2017 •

edited

Loading

lskillen commented Mar 11, 2017 •

edited

Loading

lskillen commented Mar 11, 2017 •

edited

Loading

lskillen commented Mar 11, 2017 •

edited

Loading

lskillen commented Mar 12, 2017 •

edited

Loading

lskillen commented Mar 15, 2017 •

edited

Loading