Re-write the weight/size API #2076

tcharding · 2023-09-17T22:00:53Z

Audit and re-write the weight/size API for Block and Transaction. First two patches are trivial, patch 3 contains justification and explanation for this work, copied here:

    Recently we introduced a bug in the weight/size code, while
    investigating I found that our `Transaction`/`Block` weight/size APIs
    were in a total mess because:
    
    - The docs were stale
    - The concept of weight (weight units) and size (bytes) were mixed up
    
    I audited all the API functions, read some bips (141, 144) and re-wrote
    the API with the following goals:
    
    - Use terminology from the bips
    - Use abstractions that mirror the bips where possible

Please note, this PR introduces panics if a scriptPubkey overflows the calculation weight = spk.size() * 4.

Fix #2049

tcharding · 2023-09-17T22:09:30Z

FTR I feel I may have been responsible for some of the confusion introduced recently because of my review suggestions being either incorrect or misleading.

apoelstra · 2023-09-19T14:15:55Z

bitcoin/src/blockdata/transaction.rs

+        // FIXME: Is this vbyte/byte comment correct, sane, and meaningful?
+        // For outputs size is equivalent to virtual size since it is the same for legacy and segwit.
+        // FIXME: Is this unchecked usage correct?
+        Weight::from_vb_unchecked(self.size() as u64) // Unchecked because size() cannot overflow.


The comment would be simpler as size is equivalent to virtual size since all bytes of a TxOut are non-witness bytes.

As for the claims of overflow, strictly speaking no. When converting from vsize to weight there is a *4 multiplier, which could cause an overflow even if size does not overflow.

IMO we should panic here and document that if the script size exceeds 2^62 then it'll panic.
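To make the 2^62 bound concrete, here is a minimal sketch using plain `u64` arithmetic rather than the library's `Weight` type. The helper names `vb_to_wu` and `vb_to_wu_checked` are hypothetical and are not part of the rust-bitcoin API.

```rust
/// Converts a virtual size (in vbytes) to weight units, panicking on overflow.
/// Hypothetical helper for illustration only.
fn vb_to_wu(vb: u64) -> u64 {
    vb.checked_mul(4).expect("weight overflow: size must be below 2^62 vbytes")
}

/// Non-panicking variant: returns `None` when `vb * 4` does not fit in a `u64`.
fn vb_to_wu_checked(vb: u64) -> Option<u64> {
    vb.checked_mul(4)
}
```

The largest size that still converts is 2^62 - 1; at 2^62 the multiplication no longer fits in a `u64`, even though the size itself is nowhere near `u64::MAX`.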

Cool thanks, will work in the suggestions.

Would it not be better to have all the weight functions return Option<Weight> and add weight_unchecked functions?

I dunno, this feels like a crappy API given that under normal usage weights will never come within a factor of a million of overflowing.

We could have weight functions that panic or truncate and weight_checked ones that don't.

I solved this by just using saturating add/mul and commenting that it's for defensive reasons.

I think panicking would be better because if someone hits it it means something is very wrong somewhere else. Finding why the value is u64::MAX is much harder.

Agreed, after discussion I have also come around to "better to panic than to saturate" viewpoint. In theory it's a DoS vector. In practice if a user manages to hit this panic, they'll also hit panics using + on normal int types, and there's nothing we can do to prevent that.
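A small sketch of why saturation can hide the root cause, again on bare `u64` values with hypothetical function names: two very different oversized inputs collapse to the same result, while the panicking version stops at the point where the arithmetic went wrong.

```rust
/// Saturating: clamps at `u64::MAX`, so distinct overflowing inputs
/// become indistinguishable from each other.
fn weight_saturating(size: u64) -> u64 {
    size.saturating_mul(4)
}

/// Panicking: fails loudly where the overflow happens.
fn weight_panicking(size: u64) -> u64 {
    size.checked_mul(4).expect("weight overflow")
}
```

With the saturating version a caller that later sees `u64::MAX` has no way to tell which input produced it.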

apoelstra · 2023-09-19T14:16:40Z

bitcoin/src/blockdata/transaction.rs

+fn size_from_script_pubkey(script_pubkey: &Script) -> usize {
+    let len = script_pubkey.len();
+    Amount::SIZE + VarInt::from(len).size() + len
+}


In 3d69a9b:

Same story: this will panic if the script size is within a few bytes of 2^64. We should just document it and move on.
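For reference, a self-contained sketch of the same size calculation with the overflow made explicit. `AMOUNT_SIZE`, `var_int_size` and `tx_out_size` are stand-in names for this example (the PR uses `Amount::SIZE` and `VarInt`); the length-prefix thresholds are those of Bitcoin's variable-length integer encoding.

```rust
/// Serialized length of an amount (a u64 count of satoshis).
const AMOUNT_SIZE: usize = 8;

/// Length in bytes of Bitcoin's variable-length integer encoding of `n`.
fn var_int_size(n: u64) -> usize {
    match n {
        0..=0xFC => 1,
        0xFD..=0xFFFF => 3,
        0x1_0000..=0xFFFF_FFFF => 5,
        _ => 9,
    }
}

/// Serialized size of a TxOut: amount + script length prefix + script bytes.
/// Panics if the sum overflows `usize`.
fn tx_out_size(script_len: usize) -> usize {
    AMOUNT_SIZE
        .checked_add(var_int_size(script_len as u64))
        .and_then(|n| n.checked_add(script_len))
        .expect("script too large: TxOut size overflows usize")
}
```

A 22-byte P2WPKH script gives 8 + 1 + 22 = 31 bytes; only a script length within a few bytes of `usize::MAX` triggers the panic.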

Is it worth introducing an invariant on the ScriptBuf and Script types to limit the size?

A quick websearch brought up this https://bitcoin.stackexchange.com/questions/117594/what-are-bitcoins-transaction-and-script-limits#117595

In case anyone is not using it this search site is awesome: https://bitcoinsearch.xyz/

Yeah, I think we should restrict Script to have a maximum 2^31 size.

Leaving this for another day because I don't want to hackishly add an invariant and I don't feel like thinking hard about it right now. I used saturating add/mul to resolve this issue.

Same comment as I made elsewhere, but just saturating can be bad IMO since it leads to a silent failure. I agree with @apoelstra here to just panic and mark as panic and move on if wrapping in an error or none is too heavy to do now.

@apoelstra why limit the script? I don't remember this limit in consensus rules. Also I suspect it could make using scripts much more annoying.

@Kixunil in practice, scripts have to fit into blocks, which are limited to 4M.

Currently we have limited script to "whatever the maximum number of elements a Vec can hold" which is at most 2^32 on 32-bit systems, but which is likely to be 2^31 in practice because 32-bit allocators want to be able to hold allocation-sized signed diffs. These system-specific limits require us to have overflow checks all over the place when manipulating scripts, such as in this case where we are trying to add 12 to a script size and are worried that we'll overflow usize.

If we had a 2^31 limit this would:

- Never be encountered in practice.
- Not affect the API for any of the "template constructors" like p2wsh, or conversions from addresses, or parsing from transactions (which I believe already have a "maximum vector size" check which has a much lower limit).
- In the case of script::Builder we could defer the API impact to into_script so that users could continue to use push_* whatever without needing to check errors constantly, and only check for overflow in the end. (And we could provide a panicking variant for users who were sure they weren't going to overflow.)

We would also have to limit input and output counts and transaction counts, since you could technically have so many of them it overflows. I'm not convinced putting all these checks everywhere is better than overflow checks everywhere. Overflow checks are at least pretty easy to understand and get right.

This is true -- trying to bound all types to stay within a non-overflowing window would be really hard (and probably impossible in some cases; e.g. you can have a taptree with 2^128 leaves in theory which would still be legal to use on-chain).

apoelstra

ACK 3d69a9b

apoelstra

ACK 07ad0bc

tcharding · 2023-09-21T04:29:38Z

Used saturating add/mul.

tcharding · 2023-09-21T04:36:20Z

So my effort yesterday did not adhere to the epic list of PR requirements #843, I'm not sure how we are supposed to enforce that other than everyone internalising it.

apoelstra · 2023-09-21T17:06:52Z

Works for me.

I wonder if we should change the Weight type to have multiplication/addition ops which saturate, and comment that we've done this. (Or have an internal "overflow" flag? Or something?)

apoelstra

ACK 14e796e

tcharding · 2023-09-22T00:46:06Z

I wonder if we should change the Weight type to have multiplication/addition ops which saturate, and comment that we've done this. (Or have an internal "overflow" flag? Or something?)

I had a play with it but it requires a bit of thinking and feels like there are more important things to do right now, I added #2086 to flag it.

yancyribbens · 2023-09-22T08:46:05Z

bitcoin/src/blockdata/block.rs

+    /// > Block weight is defined as Base size * 3 + Total size.
+    pub fn weight(&self) -> Weight {
+        // This is the exact definition of a weight unit, as defined by BIP-141 (quote above).
+        let wu = self.base_size().saturating_mul(3).saturating_add(self.total_size());


I'm still not convinced that using saturating_mul is better than checked_mul. As a downstream consumer, I'd prefer to know when the bounds are exceeded and get an error or panic instead of silently succeeding at the saturation point. There are a number of checked helper methods in weight: https://github.com/rust-bitcoin/rust-bitcoin/blob/master/bitcoin/src/blockdata/weight.rs#L97

No sane use case should ever hit this limit, if a user really wants to know they can use base_size() and total_size() themselves and use the checked versions.
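As a sketch of what that could look like on the caller's side, assuming the caller has already obtained the two sizes (for example from `base_size()` and `total_size()`): the function name `checked_block_weight` is hypothetical and works on plain `usize` values rather than the `Weight` type.

```rust
/// BIP-141: block weight = base size * 3 + total size.
/// Returns `None` if the arithmetic overflows, instead of saturating or wrapping.
fn checked_block_weight(base_size: usize, total_size: usize) -> Option<usize> {
    base_size.checked_mul(3)?.checked_add(total_size)
}
```

For a block with no witness data the base size equals the total size, so the weight is four times the size; witness bytes only count once because they appear in the total size alone.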

As a downstream consumer, I'd prefer to know when the bounds are exceeded

I can tell you right now, without my reading or your writing the code, that the bounds are not exceeded. Would you really like to type a ton of untested panicking extra code every time you use the API, as a reminder?

Don't have any strong opinions on usage here, but I don't like this method returning a result/option. Any other way (including how it is done) of doing this is fine with me.

yancyribbens · 2023-09-22T10:01:26Z

bitcoin/src/blockdata/transaction.rs

@@ -298,6 +310,9 @@ impl Sequence {
    /// disables relative lock time.
    pub const ENABLE_RBF_NO_LOCKTIME: Self = Sequence(0xFFFFFFFD);

+    /// The number of bytes that a sequence number contributes to the size of a transaction.
+    const SIZE: usize = 4; // Serialized length of a u32.


If this is the serialized length of a u32, then why not just make this a u32 instead of a usize?

Lengths are type usize in Rust by convention.

yancyribbens · 2023-09-22T10:05:17Z

bitcoin/src/blockdata/transaction.rs

-    /// Keep in mind that when adding a TxOut to a transaction, the total weight of the transaction
-    /// might increase more than `TxOut::weight`. This happens when the new output added causes
-    /// the output length `VarInt` to increase its encoding length.
+    /// # Panics


It doesn't look like this ever panics now.

Ah my bad, thanks. Will fix.

yancyribbens · 2023-09-22T10:06:48Z

bitcoin/src/blockdata/transaction.rs

+            Some(weight) => weight,
+            // This should not happen under normal conditions, but in case someone is doing
+            // something malicious return max so as not to silently overflow.
+            None => Weight::MAX,


It doesn't seem any better to return MAX silently instead of overflowing silently. Could None be returned instead of MAX if the bounds are exceeded, so that the consumer knows?

In my opinion this code shows that the Weight::from_vb API is wrong and that it should do saturating multiplication, please see #2086.

yancyribbens · 2023-09-22T10:14:02Z

bitcoin/src/blockdata/transaction.rs

-        let outputs = self.output.iter().map(|txout| txout.script_pubkey.len());
-        predict_weight(inputs, outputs)
+        // This is the exact definition of a weight unit, as defined by BIP-141 (quote above).
+        let wu = self.base_size() * 3 + self.total_size();


Is it possible that this could overflow?

Wow sloppy work by me when I re-did this version of the PR, all math operations in the PR should be handled the same way. Will fix, thanks.

yancyribbens · 2023-09-22T10:20:00Z

This seems to conflict with an earlier PR #2069. I'll close 2069 when/if this one is merged.

One of our stated aims is to make it possible to learn bitcoin by using our library. To help with this aim add two private consts for the segwit transaction marker and flag serialization fields.

In an attempt to help super new devs add code comments about transaction serialization formats pre and post segwit.

tcharding · 2023-09-23T04:55:38Z

I've documented the saturating add/mul on all the size functions. I have not documented the saturating behaviour on weight functions because, as stated elsewhere, I think the code in this PR is highlighting that our Weight API is not ergonomic to use in the typical use case (see #2086).

Also fixed my mistakes highlighted in review, sorry for the Friday afternoon sloppy AF push yesterday.

tcharding · 2023-09-23T05:00:20Z

To be explicit I have gone ahead with the saturating add/mul here. Can we leave it like that and argue the pros/cons of that over in #2086? The point of this PR is just to clear up the size/weight stuff before the imminent release. The saturating behaviour can and should be argued through more thoroughly before v1.0.0.

yancyribbens · 2023-09-23T09:17:15Z

bitcoin/src/blockdata/transaction.rs

@@ -200,9 +208,6 @@ pub struct TxIn {
 }

 impl TxIn {
-    /// The weight of a `TxIn` excluding the `script_sig` and `witness`.
-    pub const BASE_WEIGHT: Weight = Weight::from_wu(32 + 4 + 4);


I was using this. Can you leave this in? There's a comment by @stevenroose which is correct in saying this should be multiplied by 4.

No, "base [size]" is a well defined term in the bip and this is not what it means. I should never have acked this const, it's badly named, badly documented, and wrong. If you want this you can use Weight::from_vb(Sequence::SIZE + OutPoint::SIZE) (or from_vb_unchecked if you want const).
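To spell out the arithmetic behind the missing factor of 4, here is a sketch with locally defined constants (all names here are hypothetical stand-ins, not the library's items). The removed const counted 32 + 4 + 4 = 40, which is a byte count; as non-witness bytes each of them weighs 4 weight units.

```rust
/// Serialized length of an outpoint: 32-byte txid + 4-byte output index.
const OUT_POINT_SIZE: u64 = 32 + 4;
/// Serialized length of a sequence number (a u32).
const SEQUENCE_SIZE: u64 = 4;
/// BIP-141 scale factor: each non-witness byte counts as 4 weight units.
const WITNESS_SCALE_FACTOR: u64 = 4;

/// Weight, in weight units, of the fixed-length non-witness fields of an input.
const TX_IN_FIXED_WEIGHT: u64 = (OUT_POINT_SIZE + SEQUENCE_SIZE) * WITNESS_SCALE_FACTOR;
```

So the fixed fields of an input contribute 160 weight units, not 40.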

I should never have acked this const

You do a lot of code review which is a thankless job most of the time and helps the project a lot. And when one of the many you reviewed has a mistake then people notice your review. There are a lot of maintainers on this project that never bother to code review as far as I can tell. Really I'm not sure what the point is of having lots of maintainers that never code review.

yancyribbens · 2023-09-23T09:44:00Z

To be explicit I have gone ahead with the saturating add/mul here. Can we leave it like that and argue the pros/cons of that over in #2086? The point of this PR is just to clear up the size/weight stuff before the imminent release. The saturating behaviour can and should be argued through more thoroughly before v1.0.0.

IMO it would be better to just fix the semantics of the API and leave all of the saturation changes for a different PR. Although I can understand wanting to make sure there is no hidden overflows.

sanket1729

I have a small preference for serialized_size instead of size in the API names. Overall, I am okay with any decision in the PR as long as:

We don't silently overflow.
We don't have option/result types.

Any implementation using panics/saturation is good with me.

sanket1729 · 2023-09-24T00:10:43Z

bitcoin/src/blockdata/block.rs

+    /// > Block weight is defined as Base size * 3 + Total size.
+    pub fn weight(&self) -> Weight {
+        // This is the exact definition of a weight unit, as defined by BIP-141 (quote above).
+        let wu = self.base_size().saturating_mul(3).saturating_add(self.total_size());


Don't have any strong opinions on usage here, but I don't like this method returning a result/option. Any other way (including how it is done) of doing this is fine with me.

sanket1729 · 2023-09-24T00:16:32Z

bitcoin/src/blockdata/block.rs

+        let mut size = Header::SIZE;
+
+        size = size.saturating_add(VarInt::from(self.txdata.len()).size());
+        for tx in self.txdata.iter() {


nit: slightly prefer more rust idiomatic

self.txdata.iter().map(|tx| tx.base_size()) .fold(size, |acc, elem| acc.saturating_add(elem))

Some part of me does not like the inefficiency of having a bounds check every time in the loop when we are sure that we are not exceeding it. But I can live with it :)
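For comparison, a standalone sketch of the two styles on a plain slice of sizes (the function names are made up for this example; in the PR the sizes come from `tx.base_size()`):

```rust
/// Sums sizes with an explicit loop, saturating at `usize::MAX`.
fn total_with_loop(start: usize, sizes: &[usize]) -> usize {
    let mut size = start;
    for s in sizes {
        size = size.saturating_add(*s);
    }
    size
}

/// The same computation expressed as an iterator fold.
fn total_with_fold(start: usize, sizes: &[usize]) -> usize {
    sizes.iter().fold(start, |acc, s| acc.saturating_add(*s))
}
```

Both perform one saturating addition per element, so the choice between them is a matter of style rather than behaviour.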

Same comment suggesting to convert loop to rust iterators where possible. More reading if you are interested: https://ipthomas.com/blog/2023/07/n-times-faster-than-c-where-n-128/

Nice article. Besides a performance increase, using bitwise operations also helps prevent against timing attacks by avoiding branching. It would be possible to write a C version using bitwise operations as well I think.

tcharding · 2023-09-24T08:32:57Z

Ok, this whole saturating thing is starting to piss me off. I'm going to change this whole PR back to use normal operators +, * and just punt the whole panic/checked/saturate business to another day.

tcharding · 2023-09-24T21:54:21Z

We don't silently overflow.

We have option/result types.

Any implementation using panics/saturation is good with me.

In (2) did you mean to write "We don't return option/result types"?

tcharding · 2023-09-24T22:29:32Z

This now introduces silent overflows, the whole saturating/panic thing turned out to be an epic bikeshedding event and totally out of scope for this PR which is explicitly "fix the weight/size mess we recently introduced" - leaving all arithmetic stuff for #2086

@sanket1729 this is against your review, requires your concept ack/nack to proceed please.

apoelstra

ACK c34e3cc

sanket1729

ACK c34e3cc.

Sorry @tcharding, I forgot my "don't" in the comment. I meant we don't have Option/Result types.

I have this really bad habit of sometimes omitting negation while writing. I might need some professional diagnosis :P

tcharding · 2023-09-26T01:56:19Z

LOLZ

yancyribbens · 2023-09-27T09:40:54Z

Thanks for fixing this @tcharding. I'd have preferred to fix my own f* ups but I understand with imminent 1.0 release it's important to fix this quickly.

Kixunil · 2023-10-03T05:56:02Z

with imminent 1.0 release it's important to fix this quickly.

LOL, not really that imminent.

tcharding · 2023-10-03T06:01:38Z

I think OP meant v0.31.0 being imminent.

Kixunil · 2024-01-18T19:39:42Z

bitcoin/src/blockdata/transaction.rs

+            .input
+            .iter()
+            .map(|input| {
+                if self.use_segwit_serialization() {


Nice quadratic complexity we made here. How TF I didn't see this before?!

How TF I didn't see this before?!

The name for this function is bad. If the core developers make mistakes like that, they will be happening downstream a lot, and there will be no one to notice them for years.

Could be it, I'm not sure. I vaguely remember looking at it during my review but seeing that I didn't post a proper review I think I didn't actually do a proper review. Looking at my calendar there was a conference around that time so clearly I was preparing for that. :D

I agree that the name is bad and even after seeing this code it does not look quadratic.
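A simplified illustration of how this pattern ends up quadratic, assuming the segwit check itself scans every input. The `Tx` and `Input` types and all method names below are hypothetical stand-ins, not the library's definitions.

```rust
/// Hypothetical, simplified input: only the witness matters here.
struct Input {
    witness: Vec<Vec<u8>>,
}

/// Hypothetical, simplified transaction.
struct Tx {
    input: Vec<Input>,
}

/// Total number of bytes across all witness elements of one input.
fn witness_len(input: &Input) -> usize {
    input.witness.iter().map(|elem| elem.len()).sum()
}

impl Tx {
    /// O(n): scans the inputs looking for a non-empty witness.
    fn uses_segwit(&self) -> bool {
        self.input.iter().any(|i| !i.witness.is_empty())
    }

    /// O(n^2) in the worst case: the O(n) check is re-run for every input.
    fn witness_bytes_quadratic(&self) -> usize {
        self.input
            .iter()
            .map(|i| if self.uses_segwit() { witness_len(i) } else { 0 })
            .sum()
    }

    /// O(n): the check is evaluated once, outside the loop.
    fn witness_bytes_linear(&self) -> usize {
        let segwit = self.uses_segwit();
        self.input
            .iter()
            .map(|i| if segwit { witness_len(i) } else { 0 })
            .sum()
    }
}
```

Both methods return the same value; only the number of times the inputs are scanned differs, which is why the closure form does not look quadratic at a glance.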

tcharding force-pushed the 09-18-weight branch 3 times, most recently from 4cf8b56 to 3d69a9b Compare September 18, 2023 20:40

tcharding marked this pull request as ready for review September 19, 2023 00:06

apoelstra reviewed Sep 19, 2023

apoelstra reviewed Sep 19, 2023

apoelstra previously approved these changes Sep 19, 2023

tcharding mentioned this pull request Sep 20, 2023

dust_value duplicates logic #2083

Open

tcharding dismissed apoelstra’s stale review via 07ad0bc September 20, 2023 03:09

tcharding force-pushed the 09-18-weight branch from 3d69a9b to 07ad0bc Compare September 20, 2023 03:09

apoelstra previously approved these changes Sep 20, 2023

tcharding dismissed apoelstra’s stale review via 14e796e September 21, 2023 04:29

tcharding force-pushed the 09-18-weight branch from 07ad0bc to 14e796e Compare September 21, 2023 04:29

apoelstra previously approved these changes Sep 21, 2023

tcharding mentioned this pull request Sep 22, 2023

Silent overflow in release mode in size and weight functions #2086

Open

yancyribbens reviewed Sep 22, 2023

tcharding added 2 commits September 23, 2023 14:38

Add segwit serialization constants

29f20c1

One of our stated aims is to make it possible to learn bitcoin by using our library. To help with this aim add two private consts for the segwit transaction marker and flag serialization fields.

Add code comments to transaction serialization

73f7fbf

In an attempt to help super new devs add code comments about transaction serialization formats pre and post segwit.

tcharding dismissed apoelstra’s stale review via 7d1f04e September 23, 2023 04:53

tcharding force-pushed the 09-18-weight branch from 14e796e to 7d1f04e Compare September 23, 2023 04:53

yancyribbens reviewed Sep 23, 2023

sanket1729 reviewed Sep 24, 2023

tcharding force-pushed the 09-18-weight branch from 7d1f04e to c34e3cc Compare September 24, 2023 22:25

apoelstra approved these changes Sep 25, 2023

sanket1729 approved these changes Sep 26, 2023

apoelstra merged commit 0de8ec5 into rust-bitcoin:master Sep 26, 2023
29 checks passed

tcharding deleted the 09-18-weight branch September 28, 2023 22:53

apoelstra mentioned this pull request Jan 17, 2024

Remove BASE_WEIGHT #2350

Closed

Kixunil reviewed Jan 18, 2024

Kixunil mentioned this pull request Jan 18, 2024

Transaction::total_size has quadratic complexity instead of linear #2357

Closed

jirijakes mentioned this pull request May 12, 2024

[Draft] Upgrade rust-bitcoin dependency to 0.31 lightningdevkit/rust-lightning#3063

Draft

6 tasks
