
WIP: Payment batching blog post #461

Merged

Conversation

bitschmidty
Contributor

This PR:

  • Creates a blog post version of the Payment Batching scaling book chapter
  • Makes some minor updates to the content of the post
  • (optional) Removes the scaling book version of the chapter to avoid the need to maintain two versions

Once everything looks good, I will squash the commits. I think Harding should be author of the final commit for the post.

@bitschmidty bitschmidty added the book Scaling book label Sep 21, 2020
@bitschmidty bitschmidty self-assigned this Sep 21, 2020
@harding
Contributor

harding commented Sep 21, 2020

Quickly skimmed the diff and this LGTM. Thanks!

I think Harding should be author of the final commit for the post.

Not necessary. You're moving a file to which git history already attributes me as author; it's a little hard to follow authorship back through file moves (you can't use git blame), but it's possible, so I don't think we need to do anything special---and you deserve credit for updating the post.

_includes/articles/payment-batching.md
Comment on lines 39 to 43
being useful, although it does reduce its effectiveness. For example,
we expect a typical service to
receive payments of about the same value as the payments they make, so
for every output they add, they need to add one input on average.
Savings in this typical case peak at about 30%.
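The ~30% figure in the quoted passage can be sanity-checked with a rough size model (my own illustrative numbers, not from the chapter: ~10.5 vbytes of transaction overhead, ~68 vbytes per P2WPKH input, ~31 vbytes per output):

```python
OVERHEAD, INPUT, OUTPUT = 10.5, 68, 31  # assumed P2WPKH vbyte sizes

# Unbatched: each payment is its own 1-input, 2-output (payment + change) tx.
unbatched_per_payment = OVERHEAD + INPUT + 2 * OUTPUT

# Large batch where every added output also requires one added input,
# as in the quoted "typical service" scenario; overhead amortizes away.
batched_per_payment = INPUT + OUTPUT

savings = 1 - batched_per_payment / unbatched_per_payment
print(f"{savings:.0%}")  # 30%
```

With these assumed sizes the asymptotic savings are 1 − 99/140.5 ≈ 30%, matching the text.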
Collaborator

This assumption seems a bit surprising to me. I would consider that rather untypical. If I think about services, a lot of things that come to mind receive many small payments in exchange for the service they provide, but then make larger payments to their suppliers. Others, like brokerages, do exactly the opposite: large payments in, tons of small payments out. In fact, I'm having difficulty coming up with a service for which I'd expect that, except perhaps the odd bitcoin exchange. I'd actually surmise that a service usually has a payment asymmetry rather than a payment symmetry.

Contributor

Would it make sense to change the word from 'services' to 'users or services'? From what @xekyo is saying, large users such as exchanges and payment processors are more likely to have asymmetric payment sizes, whereas smaller users such as individuals may be more likely to have symmetric payment sizes?

_includes/articles/payment-batching.md
Comment on lines 213 to 217
3. Within each time window, send all payments together in the same
transaction. For example, create an hourly [cronjob][] that sends all pending payments.
Ideally, your prior consolidations should allow the
transaction to contain only a single input.
Collaborator

I have been recommending the implementation of three limits:

  • a time limit on how long a payment has been waiting
  • a maximum count of outputs that are waiting
  • a maximum total output value that is being paid out

If any of the three is exceeded, the batched payment should be kicked off.
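The three limits could be sketched roughly as follows (a purely illustrative sketch; `BatchQueue` and all of its names are hypothetical, not taken from any real wallet software):

```python
import time

class BatchQueue:
    """Illustrative payment-batching queue with three trigger limits:
    maximum wait time, maximum output count, and maximum total value."""

    def __init__(self, max_wait_secs, max_outputs, max_total_btc):
        self.max_wait_secs = max_wait_secs    # time limit
        self.max_outputs = max_outputs        # output-count limit
        self.max_total_btc = max_total_btc    # total-value limit
        self.pending = []                     # (queued_at, address, amount_btc)

    def add_payment(self, address, amount_btc):
        self.pending.append((time.time(), address, amount_btc))

    def should_send(self, now=None):
        """Return True if any of the three limits is exceeded."""
        if not self.pending:
            return False
        now = time.time() if now is None else now
        oldest = min(t for t, _, _ in self.pending)
        total = sum(a for _, _, a in self.pending)
        return (now - oldest >= self.max_wait_secs
                or len(self.pending) >= self.max_outputs
                or total >= self.max_total_btc)
```

A service would call `should_send()` periodically (or after each enqueued payment) and broadcast the batch as soon as it returns true.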

Contributor

What's the point of the second and third limits? The second only makes sense to me if the goal is to ensure your transaction is under the 100,000 vbyte limit (in which case, maybe make the limit "a maximum transaction size"). The third only makes sense to me to ensure the value is under what you have in the wallet being used (e.g. the hot wallet); in which case, there should probably already be some withdrawal limit mechanism in place.

Collaborator

Output count: the text argued earlier that batching has diminishing returns. Even in the optimal case of only a single input, e.g. in a transaction with 100 outputs, the cost per output is 31.78 vbytes. For a transaction with one input and 200 outputs, the cost goes down only 1% to 31.39 vbytes.
On the other hand, even when customers may wait multiple confirmations before being fully confident that they have been paid (although in my experience most people are happy after one or two confirmations even for amounts that I would consider sizeable), end-customers do tend to be antsy about seeing the service initiate the payment as soon as possible, either because they just want to wrap up the open thread in their mind, or because they are looking to use the withdrawn amount elsewhere. Surely there is some expectation management involved there, but in my experience, services aim to keep the time until sending very brief because of the reduction in support burden it brings (some literally batch multiple times per minute, just to show the customers the transactions as quickly as possible). In a way, the output-count limit bounds the total customer wait time: it still allows the service to wait a bit longer for a bigger batch when there is little traffic in the middle of the night, but it pushes the transaction out more quickly during rush hour, when there are already ten customers waiting after a minute.
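Those per-output figures are reproducible with a quick vbyte estimate (assuming a single ~68-vbyte P2WPKH input, ~31-vbyte outputs, and ~10.5 vbytes of transaction overhead; these sizes are my assumption, not stated in the comment):

```python
def tx_vbytes(n_inputs: int, n_outputs: int) -> float:
    # Assumed P2WPKH sizes: ~10.5 vbytes overhead, ~68 per input, ~31 per output.
    return 10.5 + 68 * n_inputs + 31 * n_outputs

print(tx_vbytes(1, 100) / 100)  # 31.785 vbytes per output
print(tx_vbytes(1, 200) / 200)  # 31.3925 vbytes per output, ~1% less
```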

Collaborator

Total value: The text noted that once the value of the summed outputs exceeds the value of the largest available inputs, the addition of further inputs reduces the efficiency of batching. In practice, many send-and-receive hot wallets tend to mill through their bigger deposits rather quickly, because at higher fees they can be used to build the most space efficient transactions.
Some services consolidate diligently and keep a good stock of large UTXOs, but especially in periods of prolonged congestion, consolidation transactions tend to be delayed.
Once the wallet gets into the bulk of its UTXOs at e.g. ~0.02–0.07 BTC, it will combine multiple inputs to fund its batches anyway. In that case, it may make sense to have a dynamic limit for transaction values at e.g. 20 times the value of a representative bulk UTXO. Keeping a few of the largest UTXOs in stock is important for processing the occasional huge withdrawal.

Collaborator

So, in sum, I'd argue that limiting transaction size leads to chains of unconfirmed transactions exceeding the ancestor limits less often, enables quicker turnaround for end-users, and has only a marginal efficiency loss versus the biggest possible transactions.

Edit:
Of course, I have to admit that the services I have been working most with were brokerages and exchanges. I do realize though, that all of these thoughts are a lot less rigorous than the prior text. :)

Contributor

Output count: the text argued earlier that batching has diminishing returns. [...] On the other hand, [...] end-customers do tend to be antsy about seeing the service initiate the payment as soon as possible.

I agree about users tending to be antsy---I've been that user wondering, oh, crap, did I enter the wrong address? Did I just blow a month's income?---but I guess the Spock part of my brain says that, if you're willing to make the people initiating withdrawals during slow periods wait up to x amount of time, you ought to be consistent and make everyone wait up to that amount of time. Varying the maximum amount of waiting time by current demand seems inconsistent. Also, it means your previous sends had less time to get confirmed and so you need a larger UTXO pool. (E.g. a wallet that only sends once per hour should work 99.99% of the time with just two UTXOs, which makes implementing consolidation easier.)

Keeping a few of the largest UTXOs in stock is important for processing the occasional huge withdrawal.

That makes sense: the earlier you send, the earlier you confirm (all other things being equal). Though maybe the rule should be tailored to the change value rather than the tx value.

(An alternative, but non-technical, reason to send high-total-value txes quickly is perhaps just to please your platform's whales.)

Collaborator

to ensure the value is under what you have in the wallet being used

So, basically, a good batching implementation definitely needs to track the value of the pending queue anyway. I'd just argue that it is probably better to keep it lower than 100% of the hot wallet. If dynamic, maybe more like 5% of the hot wallet.

(E.g. a wallet that only sends once per hour should work 99.99% of the time with just two UTXOs, which makes implementing consolidation easier.)

I don't think that's necessarily true. If there is mempool congestion, all funds could be in flight within an hour for a wallet with two UTXOs. Is that assuming that the spender uses unconfirmed inputs and automatically CPFPs unconfirmed preceding transactions? I've had a lot of fun this spring with thrifty services that mixed low-feerate and high-feerate sends as well as permitting any unconfirmed change unspents. There are a bunch of traps there that are fairly obvious when one understands how the mempool and UTXOs work, but there is a reason that certain wallet service providers avoid unconfirmed inputs by default in input selection. ;)

As mentioned before, I am wondering whether we are just generally thinking about wallets of very different sizes. When I read the term high-frequency spender, I'm thinking at least 1,000+ payments per day, but it sounds to me that you might have been thinking about a different scale. Maybe it would be good to define the term in the text. ;)

Contributor

I think the second point (batch size) is good and should be added. If the fee savings from no batching to batching 100 payments is ~75% and the savings from batching 100 payments to 200 payments is an additional ~1%, then it makes sense for the service provider to broadcast a batch as soon as the marginal gains from a larger batch are no longer worth it. It seems like this would make it less likely to suffer issues around tx pinning and fee bumping too.

I'm less convinced of the third point (batch value) since I think it's difficult to give universal advice. The best practice seems very dependent on lots of factors specific to the wallet/service.
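The ~75% and ~1% figures above can be checked with a back-of-the-envelope vbyte model (my own assumed sizes: ~10.5 vbytes of overhead, ~68-vbyte P2WPKH inputs, ~31-vbyte outputs, one change output per transaction):

```python
OVERHEAD, INPUT, OUTPUT = 10.5, 68, 31  # assumed P2WPKH vbyte sizes

def tx_vbytes(n_in, n_out):
    return OVERHEAD + INPUT * n_in + OUTPUT * n_out

# 100 separate 1-input, 2-output (payment + change) transactions
# vs one 1-input, 101-output batch (100 payments + change).
per_payment_1 = tx_vbytes(1, 2)
per_payment_100 = tx_vbytes(1, 101) / 100
per_payment_200 = tx_vbytes(1, 201) / 200

print(1 - per_payment_100 / per_payment_1)    # ~0.77, roughly the "~75%" savings
print(1 - per_payment_200 / per_payment_100)  # ~0.017, roughly the marginal "~1%"
```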

@harding
Contributor

harding commented Oct 4, 2020

I interpret several of @xekyo's comments as him finding this documentation not comprehensive enough to be truly useful to readers. That's something I feared would be the case when I wrote this---I have minimal experience working with organizations on actually implementing batching or any of the other techniques we want to recommend. Murch is much more experienced than I am there and I don't want to ignore the theme of his comments by picking on details.

Maybe an option for this post and other posts in this planned series would be to focus on the theoretical benefits of the techniques without mentioning implementation details. It's easy to prove that, under certain conditions, batching reduces overall space and fees, consolidation reduces fees, the ability to fee bump sometimes lets you get away with lower fees, etc...

For actual practical suggestions, we could focus on field reports. Because each report focuses on only a specific organization's need, they're simpler than trying to write universally-applicable documentation and, as a bonus, they usually come with a nice narrative structure (we had a problem; we did X, Y, and Z; we lived happily ever after).

What do y'all think?

@murchandamus
Collaborator

Uff, I didn't mean to prompt a big restructuring of the blog post. I think the article is a great intro to the topic for organizations that currently send single-payment transactions.
I was just trying to point out that best practices for batching might differ for various scales of operation, so in the places where fairly concrete recommendations are made, it might help to be more specific about the scale of operation. Possibly, it would help to have a second example at a different scale in some instances.

@murchandamus
Collaborator

murchandamus commented Oct 29, 2020

Sorry, I must have missed the commit added by @harding in a157b5e previously, and only saw the comment. These changes look great to me. I'd say ship it! :)

@harding
Contributor

harding commented Oct 29, 2020

Re-skimmed it, looks fine to me (once the date is updated per the FIXME).

@jnewbery jnewbery force-pushed the 2020-09-payment-batching-blog branch from 91b516d to d043fdf on January 11, 2021
@jnewbery
Contributor

I've rebased and squashed the update commits.

_includes/articles/payment-batching.md
@@ -69,8 +66,8 @@ scenario.

In addition to payment batching directly providing a fee savings,
batching also uses limited block space more efficiently by reducing the
Contributor

perhaps /limited/the limited/

Collaborator

I concur:

Suggested change
batching also uses limited block space more efficiently by reducing the
batching also uses the limited block space more efficiently by reducing the

Contributor

I've changed this to "the limited". @harding - let me know if you disagree and think it should be reverted.

Contributor

@jnewbery jnewbery left a comment

Sorry for leaving it so long to review this. It's great, and I think we should definitely publish and publicize it.

I have a bunch of very minor suggestions. @harding, feel free to take or leave whichever of them you like.

GitHub doesn't let me comment on parts of the text that are just moved from one file to another. Here are some additional comments:

(1) In the Concerns sections, there's a point of view shift from:

"In order to batch payments, the service must get the user to accept that their payment"

to:

"...until you send the batch containing their payment."

I think it would be easiest to just make this whole section use 'you' instead of service provider, starting with the introductory sentence:

"The fee-reduction benefits of payment batching do create tradeoffs and concerns that you will need to address before using the technique."

and in the Delays section:

"Although some situations naturally lend ..., your service may primarily be sending money to users when those users make a withdrawal request. In order to batch payments, you..."

(2) In the Delays section perhaps it makes sense to change "accept that their payment will not be sent immediately" to "accept that their payment transaction will not be broadcast immediately", and in the Recommendations summary section change "users and customers don’t expect their payments immediately" to "users and customers don’t expect their payment transactions to be broadcast immediately". Bitcoin as a system cannot process payments immediately, since users should always wait for transactions to be confirmed, so whether or not a service implements batching, they should not create the expectation that payments are immediate.

@jnewbery
Contributor

I've updated the dates to 2020-01-1.

I think this is ready to merge once @harding has had a chance to review the latest comments.

@harding
Contributor

harding commented Jan 12, 2021

I don't understand #461 (comment) and I slightly prefer the original over #461 (comment) but @jnewbery's other suggestions SGTM.

@bitschmidty did you want to update this yourself, or do you want me to send you a patch?

@bitschmidty
Contributor Author

@harding I think a patch / additional commit from you would be best.

Collaborator

@murchandamus murchandamus left a comment

Looks good to me, just the one nit where it seems to me that an article is missing. The rest reads very nicely.

@jnewbery
Contributor

I've pushed a commit that addresses my final comment (#461 (comment)), and updates the date to March 24. Everything else looks great.

I suggest that we squash all these and push tomorrow at the same time as the newsletter, with a link from the newsletter to the blog post. No rush though, we could also release it later this week and advertise it in next week's newsletter.

Thanks for your perseverance on this @harding, and sorry for taking so long to review it.

@jnewbery
Contributor

Perhaps release this Friday and then advertise it in next week's newsletter?

@bitschmidty
Contributor Author

Perhaps release this Friday and then advertise it in next week's newsletter?

ACK on this approach

@bitschmidty bitschmidty force-pushed the 2020-09-payment-batching-blog branch from 321720e to ddbf768 on March 24, 2021
@bitschmidty
Contributor Author

Changed post date to 2021-03-26, squashed, rebased on master.

@jnewbery
Contributor

Thanks @bitschmidty!

@bitschmidty bitschmidty force-pushed the 2020-09-payment-batching-blog branch from ddbf768 to 1895a42 on March 24, 2021
@jnewbery jnewbery merged commit db75257 into bitcoinops:master Mar 26, 2021
@harding
Contributor

harding commented Mar 27, 2021

Thanks from me also, @bitschmidty!
