Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add taproot data structures #677

Merged
merged 7 commits into from Nov 12, 2021

Conversation

sanket1729
Copy link
Member

@sanket1729 sanket1729 commented Oct 8, 2021

I am still working on adding tests and testing these changes against bitcoin core. Feedback on code and API design is welcome.

Supercedes #666

  • Builder API to construct trees
  • Huffman encoding API that outputs optimal taproot tree given odds.

TODO:

  • Add control block APIs. Verify merkle proofs, create a control block from spend data
  • Add tests and verify them against bitcoin consensus

@sanket1729 sanket1729 force-pushed the taptree_utils branch 4 times, most recently from 637ee89 to 032894b Compare October 11, 2021 01:50
@sanket1729 sanket1729 marked this pull request as ready for review October 11, 2021 01:53
@sanket1729 sanket1729 added this to the 0.28.0 milestone Oct 11, 2021

/// Serialize self as bytes
pub fn serialize(&self) -> Vec<u8> {
let mut buf = Vec::with_capacity(TAPROOT_CONTROL_NODE_SIZE * self.0.len());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe more compact as: self.0.iter().map(|e| e.as_inner()).flatten() ?
Not sure if it's not as efficient...

Also, isn't maybe better to have the serialization logic inside a non-allocating method using a writer and calling it here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will add another method that has serialization logic inside a writer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterators should be more efficient because of TrustedLen specialization. (My understanding is that it pre-allocates correct size and then fills it without additional bounds check.) However I'm not sure if it can see that all those slices are equal-sized.

pub fn from_slice(sl: &[u8]) -> Result<ControlBlock, TaprootError> {
if sl.len() < TAPROOT_CONTROL_BASE_SIZE
|| sl.len() > TAPROOT_CONTROL_MAX_SIZE
|| (sl.len() - TAPROOT_CONTROL_BASE_SIZE) % TAPROOT_CONTROL_NODE_SIZE != 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe more rusty to have first let var_len =sl.len().checked_sub(TAPROOT_CONTROL_BASE_SIZE).unwrap_or_else(...) and then this if simplified?

/// Merkle Tree depth must not be more than 128
InvalidMerkleTreeDepth(usize),
/// Nodes must be added specified in DFS order
NodeNotInDFSOrder,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC there is a naming convention in rust for acronyms to be like this NodeNotInDfsOrder

src/util/taproot.rs Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
src/util/sighash.rs Outdated Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
@GeneFerneau
Copy link

Concept ACK 032894b

Some small suggestions from an initial review, will spend more time reviewing against the spec.

Will also take some time to experiment with building Taproot contracts with this API, and report back.

src/util/taproot.rs Outdated Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
Comment on lines 585 to 622
self.0.push(h);
if self.0.len() > TAPROOT_CONTROL_MAX_NODE_COUNT {
return Err(TaprootBuilderError::InvalidMerkleTreeDepth(self.0.len()));
} else {
Ok(())
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: check for error condition before mutating internal state

Suggested change
self.0.push(h);
if self.0.len() > TAPROOT_CONTROL_MAX_NODE_COUNT {
return Err(TaprootBuilderError::InvalidMerkleTreeDepth(self.0.len()));
} else {
Ok(())
}
if self.0.len() + 1 > TAPROOT_CONTROL_MAX_NODE_COUNT {
Err(TaprootBuilderError::InvalidMerkleTreeDepth(self.0.len() + 1))
} else {
self.0.push(h);
Ok(())
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewriting it to self.0.len() >= TAPROOT_CONTROL_MAX_NODE_COUNT would avoid overflow if the invariant that self.0.len() <= TAPROOT_CONTROL_MAX_NODE_COUNT is ever accidentally broken.

src/util/taproot.rs Outdated Show resolved Hide resolved
Copy link
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few little comments, I didn't get all the way through the review.

src/util/taproot.rs Outdated Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
Copy link
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more minor comments, thanks.

src/util/taproot.rs Show resolved Hide resolved
Comment on lines 659 to 660
/// script path. This serialization does not the VarInt prefix which would be
/// applied why encoding this element as a witness.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it meant to be 'This serialization does not include the VarInt prefix'? Also: s/why/when

src/util/taproot.rs Outdated Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
{
return Err(TaprootError::InvalidControlBlockSize(sl.len()));
}
let output_key_parity = (sl[0] & 1) == 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turning the slice into iterator would make it easier to reason about this code IMO, but we really need the extension trait I mentioned somewhere already.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you suggest approaching this? I don't see a way to process an iterator u8 and ask it to yield slices without explicating allocating expected size arrays.

@sanket1729
Copy link
Member Author

Sorry for the force push, addressed the feedback from @GeneFerneau and @tcharding, did not address comments from @Kixunil. I will add fixup commits from now on for easier review and later squash them when it is ready.

src/util/taproot.rs Outdated Show resolved Hide resolved
src/util/taproot.rs Outdated Show resolved Hide resolved
@sanket1729
Copy link
Member Author

@dr-orlovsky addressed all of your comments. I think if we get another review for this. We should just merge this and proceed with other taproot stuff. I can work on fixing up and making sure changes in #691 are reflected here.

Finally, I think about the API guidelines, I think we should have a broader discussion. My rule is to always follow the current convention in the codebase. Using new_* and from_* over with_*. If we both(with and new) of these it would be even worse. If we want to switch to with_ (assuming we have consensus), we should do it for all APIs in a non-breaking way in a future release.

@Kixunil
Copy link
Collaborator

Kixunil commented Nov 10, 2021

My understanding of naming:

  • new - obvious constructor, may be empty (String::new(), Box::new(T))
  • from_ - takes same information as new just in different representation (e.g. String::from_utf8(Vec<u8>))
  • with_ - takes additional information along with what new takes String::with_capacity(usize)

const KEY_VERSION_0: u8 = 0u8;

/// Information related to the script path spending
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
pub struct ScriptPath<'s> {
script: &'s Script,
code_separator_pos: u32,
leaf_version: u8,
leaf_version: LeafVersion,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this be better as Option? Or I suppose ScriptPath always only in Tap context (in which case, small nit, but maybe TapScriptPath would be clearer name)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a part of the current PR under discussion. But I like the name TapScriptPath over ScriptPath

@tcharding
Copy link
Member

My understanding of naming:

* `new` - obvious constructor, may be empty (`String::new()`, `Box::new(T)`)
* `from_` - takes _same_ information as `new` just in different representation (e.g. `String::from_utf8(Vec<u8>)`)
* `with_` - takes _additional_ information along with what `new` takes `String::with_capacity(usize)`

I thought String was a bit of a special case but a quick look at the std lib shows its not. Perhaps conventions have changed for more recent code? I was under the impression that

  • Constructors with zero args should use Default
  • Constructors with one arg should use From<Foo> (unless the foo needs further restriction like the String;:from_utf8 case)
  • Constructors with multiple args should use new.
  • Constructors using the builder pattern should use with(self, foo: Foo)

I've never managed to work out the pattern for times when we need multiple constructors with multiple args?

Amusingly I was looking at all the instances of single arg constructors for Script yesterday wondering why they used the pattern new_.

I don't mean to bike shed here, I'm genuinely interested in these sorts of conventions.

Copy link
Contributor

@JeremyRubin JeremyRubin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did some light review

Comment on lines +107 to +111
if let Some(h) = merkle_root {
eng.input(&h);
} else {
// nothing to hash
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and_then would also work, but i'm indifferent (fine as is, altho superfluos else)

/// a witness stack consisting of the script's inputs, plus the script itself and the control block.
///
/// If one or more of the spending conditions consist of just a single key (after aggregation),
/// the most likely one should be made the internal key.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: or the most fee sensitive one? e.g. contested closing?

TAPROOT_CONTROL_BASE_SIZE + TAPROOT_CONTROL_NODE_SIZE * TAPROOT_CONTROL_MAX_NODE_COUNT;

// type alias for versioned tap script corresponding merkle proof
type ScriptMerkleProofMap = BTreeMap<(Script, LeafVersion), BTreeSet<TaprootMerkleBranch>>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a BtreeSet internally in case there are >1 instances of a given script?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is to allow > 1 instances of a given script. In practice, this should never be used, but since descriptors spec does not disallow it, we have to support it.

/// constructing the tree as a Huffman tree is the optimal way to minimize average
/// case satisfaction cost. This function takes input a list of tuple(usize, &Script)
/// where usize represents the satisfaction weights of the branch.
/// For example, [(3, S1), (2, S2), (5, S3)] would construct a TapTree that has optimal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm i think you generally shouldn't construct the vec tuple from a btreemap/hashmap as it can introduce subtle bugs (e.g., when you might want a script to appear > 1 time)

///
/// # Errors:
///
/// - When the optimal huffman tree has a depth more than 128
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this shouldn't be an error condition IMO, should fallback to creating a binary tree of all failing (>128) scripts and attaching at the largest possible depth?

but I think it's OK to have it this way for now...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an improvement. I can create a good first issue for this. If someone really wants this, they are welcome to implement this and I am happy to review it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the optimal huffman tree has depth exceeding 128 it means the user has set insane weights. We should error rather than silently doing the wrong thing.

pub fn new_key_spend<C: secp256k1::Verification>(
secp: &Secp256k1<C>,
internal_key: schnorr::PublicKey,
merkle_root: Option<TapBranchHash>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given this return type, this should either be a free function or bound to the TapBranchHash impl.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type is TaprootSpendInfo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i misread as TapBranchHash

// Combine the last two elements and insert a new node
let (p1, s1) = node_weights.pop().expect("len must be at least two");
let (p2, s2) = node_weights.pop().expect("len must be at least two");
// Insert the sum of first two in the tree as a new node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend doing a checked_add here because users could be dumb with what values they put in.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

internal_key: schnorr::PublicKey,
script_weights: I,
) -> Result<Self, TaprootBuilderError>
where
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: avoid usize in API, prefer u64 or u32

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking: otherwise the serialization/deserialization of types for pluggin in to this function would be platform dependent.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probability is usually expressed as float in range 0..=1, so it may be unintuitive/confusing to use integer. Perhaps worth a newtype with some helper methods? Or maybe more details explaining how would one compute the numbers?

// This would return Err if the merkle root hash is the negation of the secret
// key corresponding to the internal key.
// Because the tweak is derived as specified in BIP341 (hash output of a function),
// this is unlikely to occur (1/2^128) in real life usage, it is safe to unwrap this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack this seeming safe, as PK is fed into the hash.

Would prefer the comment clarifying the hash cycle, but a 'Safe' result API might be future proof if this would ever have any other reason to fail in the future?

secp: &Secp256k1<C>,
internal_key: schnorr::PublicKey,
node: NodeInfo,
) -> TaprootSpendInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can be Self

Add tests for taproot Builder
Add tests for taproot huffman tree encoding
Add tests for merkle proof verification
@apoelstra
Copy link
Member

This PR has exceeded Github's ability to reasonably display comments. I would prefer future reviews to be in the form of followup PRs.

Suggestions for non-github platforms are also welcome but that's a long term thing.

@sanket1729
Copy link
Member Author

sanket1729 commented Nov 12, 2021

So far, I think the only things I see remaining are:

Anything else that I am missing? If there is nothing critical I would request to proceed on this and I can address all of the nits in a followup PR.

Copy link
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fa8c3f6

@sanket1729
Copy link
Member Author

@GeneFerneau @RCasatta, @dr-orlovsky, sorry for dismissing your Acks. if you are interested in taking another look.

Taproot automation moved this from In review to Ready for merge Nov 12, 2021
Copy link
Collaborator

@dr-orlovsky dr-orlovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fa8c3f6

@dr-orlovsky dr-orlovsky merged commit 425c616 into rust-bitcoin:master Nov 12, 2021
Taproot automation moved this from Ready for merge to Done Nov 12, 2021
This was referenced Nov 12, 2021
@sanket1729 sanket1729 mentioned this pull request Nov 12, 2021
dr-orlovsky added a commit that referenced this pull request Dec 2, 2021
…g tweak for UntweakedPublicKey

5b21a9c Use TapTweakHash method for computing tweak (Noah)

Pull request description:

  Quick follow up PR to #691 using a method from #677.

  ### Changes
  - Updated `UntweakedPublicKey::tap_tweak(...)` to use `TapTweakHash::from_key_and_tweak(...)`

ACKs for top commit:
  Kixunil:
    ACK 5b21a9c
  dr-orlovsky:
    utACK 5b21a9c

Tree-SHA512: d00455bba51981e9ec942a6cf69672666e227850d073b1fdcd92d2eb6ad553659fb2967aec2ce12d3ed109cee5fa125cdda649cddb25404f08adae2bfd3e19bb
RCasatta added a commit that referenced this pull request Dec 23, 2021
7aacc37 Add tests from BIP341 (sanket1729)
61629cc Make taproot hashes forward display (sanket1729)

Pull request description:

  Add tests for taproot.
  - ~Also fixes one bug in #677, namely, I was returning `LeafVersion::default()` instead of given version~
  - ~ Fixes a bug in #691 about taking secp context as a reference instead of consuming it. This should have not passed my review, but this is easy to miss. ~
  - Makes the display on taproot hashes forward instead of the reverse (because the BIP prints in a forward way, I think we should too and it is more natural. )

ACKs for top commit:
  RCasatta:
    ACK 7aacc37
  apoelstra:
    ACK 7aacc37

Tree-SHA512: 2e0442131fc036ffa10f88c91c8fc02d9b67ff6c16c592aa6f4e6a220c26a00fc6ca95a288f14aa40667a289fb0446219fd6c76c0196ead766252356592b9941
dr-orlovsky added a commit that referenced this pull request Dec 30, 2021
7d982fa Add all tests from BIP 371 (sanket1729)
d22e014 Taproot psbt impl BIP 371 (sanket1729)
108fc3d Impl encodable traits for TapLeafhash (sanket1729)
c7478d8 Derive serde for taproot stuctures (sanket1729)

Pull request description:

  Built on top of #677 . Will rebase and mark ready for review after #677 is merged.

ACKs for top commit:
  apoelstra:
    ACK 7d982fa
  dr-orlovsky:
    re-tACK 7d982fa basing on `git range-diff`. The original PR before last re-base was tested commit-by-commit.

Tree-SHA512: feb30e4b38d13110a9c0fabf6466d8f0fb7df09a82f4e01d70b8371b34ab0187004a6c63f9796c6585ee30841e8ee765ae9becae139d2e1e3d839553d64c3d1e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API break This PR requires a version bump for the next release
Projects
Development

Successfully merging this pull request may close these issues.

None yet

8 participants