# Goal of this notebook

The goal of this notebook is to provide newcomers reading BIP 340 with additional explanatory context.
I'm mainly writing this for myself as I'm reading and dissecting BIP 340, and will learn it better by making a tutorial as I'm reading it.
Also if I ever go back to BIP 340 in the future I'll have a deeper reference here than the source BIP PR.

Before reading BIP 340, I highly recommend understanding elliptic curve groups and the original ECDSA algorithm first.
This repository has another notebook named `elliptic_curves` with a tutorial there.

# Schnorr Signature Basics

Suppose Alice has a private/public key pair $d_A$, $Q_A$ and wishes to sign a message hash $m$.
Recall the original ECDSA algorithm:

Sign:
1. Generate random number $k$
2. Calculate $R=k \times G$
3. Calculate $r=\text{X-coordinate}(R)$
4. Calculate $s=(m + r \cdot d_A) \cdot k^{-1}$
5. Broadcast $(r,s)$

Verify:
1. Calculate $u_1 = m \cdot s^{-1}$
2. Calculate $u_2 = r \cdot s^{-1}$
3. Calculate $R' = u_1 \times G + u_2 \times Q_A$
4. Calculate $r'=\text{X-coordinate}(R')$
5. Check that $r'=r$
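
The ECDSA steps above can be sketched in plain Python. This is a toy illustration only, assuming the secp256k1 curve, Python 3.8+ (for `pow(x, -1, n)`), and arithmetic on scalars taken modulo the group order $n$. The helper names `add`, `mul`, `ecdsa_sign`, and `ecdsa_verify` are made up for this notebook, and the code is not constant-time or otherwise safe for real keys.

```python
import hashlib
import secrets

# secp256k1 domain parameters: field prime p, group order n, base point G
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """Add two curve points; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1] % p, -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow((Q[0] - P[0]) % p, -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    """Scalar multiplication k x P by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, P)
        P = add(P, P)
        k >>= 1
    return result

def ecdsa_sign(d, m):
    """Sign the integer message hash m with private key d."""
    while True:
        k = secrets.randbelow(n - 1) + 1          # step 1: random k
        r = mul(k, G)[0] % n                      # steps 2-3: x-coordinate of R
        s = (m + r * d) * pow(k, -1, n) % n       # step 4
        if r and s:                               # retry in the (negligible) zero case
            return r, s

def ecdsa_verify(Q, m, sig):
    """Check signature (r, s) on m against public key Q."""
    r, s = sig
    if not (0 < r < n and 0 < s < n):
        return False
    s_inv = pow(s, -1, n)
    u1, u2 = m * s_inv % n, r * s_inv % n
    R = add(mul(u1, G), mul(u2, Q))
    return R is not None and R[0] % n == r

d_A = secrets.randbelow(n - 1) + 1
Q_A = mul(d_A, G)
m = int.from_bytes(hashlib.sha256(b"hello").digest(), "big") % n
sig = ecdsa_sign(d_A, m)
print(ecdsa_verify(Q_A, m, sig))
```

Note the modular inverses in both signing and verification; those are what the Schnorr scheme below avoids.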

The Schnorr signature algorithm is also over elliptic curves, but is even simpler than ECDSA. The Schnorr algorithm works as follows:

Sign:
1. Generate random number $k$
2. Calculate $R=k \times G$
3. Calculate $s = k + \text{Hash}(R||m) \cdot d_A$
4. Broadcast $(R,s)$

Verify:
1. Calculate $P = s \times G$
2. Calculate $P' = R + \text{Hash}(R||m) \times Q_A$
3. Check that $P=P'$

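Correctness is easy to check: since $R=k \times G$ and $Q_A = d_A \times G$, we get $s \times G = (k + \text{Hash}(R||m) \cdot d_A) \times G = R + \text{Hash}(R||m) \times Q_A$. It is also secure as long as $k$ stays secret and is never reused, since an attacker then has no way to split $s$ into its additive components $k$ and $\text{Hash}(R||m) \cdot d_A$.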
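
Here is a minimal sketch of the basic $(R,s)$ scheme in Python, under a few assumptions of my own: the curve is secp256k1, the hash is plain SHA-256 reduced modulo the group order, and a point is encoded as its 32-byte x and y coordinates concatenated. None of that matches BIP 340's actual encoding; the helper names (`add`, `mul`, `H`, `enc`) are hypothetical, and the code is not safe for real keys.

```python
import hashlib
import secrets

# secp256k1 domain parameters: field prime p, group order n, base point G
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """Add two curve points; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1] % p, -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow((Q[0] - P[0]) % p, -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    """Scalar multiplication k x P by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, P)
        P = add(P, P)
        k >>= 1
    return result

def enc(P):
    """Toy point encoding (x || y); BIP 340 encodes points differently."""
    return P[0].to_bytes(32, "big") + P[1].to_bytes(32, "big")

def H(*parts):
    """SHA-256 of the concatenated parts, as an integer modulo n."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % n

def schnorr_sign(d, m):
    k = secrets.randbelow(n - 1) + 1      # fresh secret nonce, never reused
    R = mul(k, G)
    s = (k + H(enc(R), m) * d) % n
    return R, s

def schnorr_verify(Q, m, sig):
    R, s = sig
    return mul(s, G) == add(R, mul(H(enc(R), m), Q))

d_A = secrets.randbelow(n - 1) + 1
Q_A = mul(d_A, G)
m = hashlib.sha256(b"hello").digest()
sig = schnorr_sign(d_A, m)
print(schnorr_verify(Q_A, m, sig))
```

Compared with ECDSA there is no modular inverse anywhere, and the verification equation is linear in $s$, $R$, and $Q_A$.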

## Schnorr n-of-n Multi-Sig

One of the neat things about Schnorr is that because it is based on addition, it provides n-of-n multi-sig "for free" simply by adding the individual public keys $Q_i$ and the signature components $R_i$ and $s_i$. Since the resulting $Q$, $R$, and $s$ are just another key/sig, it looks identical to a single-sig, resulting in shorter scripts and more privacy. Suppose Alice and Bob are constructing and signing a 2-of-2 multi-sig, then it works as follows.

1. Alice constructs a private/public key pair $d_A$, $Q_A$
2. Bob constructs a private/public key pair $d_B$, $Q_B$
3. Alice and Bob share their individual public keys with each other and derive $Q=Q_A + Q_B$
4. They use $Q$ as the address to send funds (or technically speaking in Bitcoin, the hash of $Q$)

Sign:
1. Alice generates a random number $k_A$ and calculates $R_A=k_A \times G$
2. Bob generates a random number $k_B$ and calculates $R_B=k_B \times G$
3. Alice shares $R_A$ with Bob and vice-versa Bob shares $R_B$ with Alice
4. Alice and Bob both calculate $R=R_A + R_B$
5. Alice calculates $s_A = k_A + \text{Hash}(R||m) \cdot d_A$
6. Bob calculates $s_B = k_B + \text{Hash}(R||m) \cdot d_B$
7. Alice shares $s_A$ with Bob or vice-versa Bob shares $s_B$ with Alice
8. Alice or Bob calculates $s=s_A + s_B$ and broadcasts $(R,s)$

Verify:
1. Calculate $P = s \times G$
2. Calculate $P' = R + \text{Hash}(R||m) \times Q$
3. Check that $P=P'$

It works because:
\begin{align*}
P &= s \times G \\
&= (s_A + s_B) \times G \\
&= s_A \times G + s_B \times G \\
P' &= R + \text{Hash}(R||m) \times Q \\
&= R_A + R_B + \text{Hash}(R||m) \times (Q_A + Q_B) \\
&= (R_A + \text{Hash}(R||m) \times Q_A) + (R_B + \text{Hash}(R||m) \times Q_B)
\end{align*}

Each partial signature satisfies $s_i \times G = R_i + \text{Hash}(R||m) \times Q_i$ (using the shared $R$ in the hash), so the two sums are equal term by term and $P=P'$. This generalizes directly to any n-of-n signature scheme.
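
The 2-of-2 flow can be sketched as follows. Same caveats as any toy code here: secp256k1, plain SHA-256 reduced mod $n$, and an x-and-y point encoding are my assumptions, the helper names are hypothetical, and each partial signature is computed as $s_i = k_i + \text{Hash}(R||m) \cdot d_i$. This is the naive key-sum construction from this section only, not a production multi-signature protocol.

```python
import hashlib
import secrets

# secp256k1 domain parameters: field prime p, group order n, base point G
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """Add two curve points; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1] % p, -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow((Q[0] - P[0]) % p, -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    """Scalar multiplication k x P by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, P)
        P = add(P, P)
        k >>= 1
    return result

def enc(P):
    """Toy point encoding (x || y)."""
    return P[0].to_bytes(32, "big") + P[1].to_bytes(32, "big")

def H(*parts):
    """SHA-256 of the concatenated parts, as an integer modulo n."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % n

def rand_scalar():
    return secrets.randbelow(n - 1) + 1

m = hashlib.sha256(b"spend the 2-of-2 output").digest()

# Key setup: each party keeps d_i private and shares Q_i
d_A, d_B = rand_scalar(), rand_scalar()
Q_A, Q_B = mul(d_A, G), mul(d_B, G)
Q = add(Q_A, Q_B)

# Round 1: exchange nonce points and sum them
k_A, k_B = rand_scalar(), rand_scalar()
R_A, R_B = mul(k_A, G), mul(k_B, G)
R = add(R_A, R_B)

# Round 2: both hash the shared R, then produce partial signatures
e = H(enc(R), m)
s_A = (k_A + e * d_A) % n
s_B = (k_B + e * d_B) % n
s = (s_A + s_B) % n

# A verifier only ever sees Q, R, s and runs the ordinary single-sig check
aggregate_ok = mul(s, G) == add(R, mul(e, Q))
partial_A_ok = mul(s_A, G) == add(R_A, mul(e, Q_A))
print(aggregate_ok, partial_A_ok)
```

The second check shows that a partial signature can be validated on its own before the parts are combined, as long as the shared $R$ is used inside the hash.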

# [BIP 340](https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki)

This BIP is long and dense, so let's start with the [design](https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki#Design) section:

> ### Design
> **Schnorr signature variant** Elliptic Curve Schnorr signatures for message m and public key P generally involve a point R, integers e and s picked by the signer, and the base point G which satisfy e = hash(R || m) and s⋅G = R + e⋅P. Two formulations exist, depending on whether the signer reveals e or R:

The (R,s) algorithm described in this notebook is only one of two possible formulations. The BIP explores both, discusses the various trade-offs, and picks the one which reveals (R,s). Let's see why that is.

> 1. Signatures are pairs (e, s) that satisfy e = hash(s⋅G - e⋅P || m). This variant avoids minor complexity introduced by the encoding of the point R in the signature (see paragraphs "Encoding R and public key point P" and "Implicit Y coordinates" further below in this subsection). Moreover, revealing e instead of R allows for potentially shorter signatures: Whereas an encoding of R inherently needs about 32 bytes, the hash e can be tuned to be shorter than 32 bytes, and a short hash of only 16 bytes suffices to provide SUF-CMA security at the target security level of 128 bits. However, a major drawback of this optimization is that finding collisions in a short hash function is easy. This complicates the implementation of secure signing protocols in scenarios in which a group of mutually distrusting signers work together to produce a single joint signature (see Applications below). In these scenarios, which are not captured by the SUF-CMA model due its assumption of a single honest signer, a promising attack strategy for malicious co-signers is to find a collision in the hash function in order to obtain a valid signature on a message that an honest co-signer did not intend to sign.
> 2. Signatures are pairs (R, s) that satisfy s⋅G = R + hash(R || m)⋅P. This supports batch verification, as there are no elliptic curve operations inside the hashes. Batch verification enables significant speedups.[4]

> Since we would like to avoid the fragility that comes with short hashes, the e variant does not provide significant advantages. We choose the R-option, which supports batch verification. 

## Description and advantages of the e-variant

The e-variant works as follows.

Sign:
1. Generate random number $k$
2. Calculate $R=k \times G$
3. Calculate $e = \text{Hash}(R||m)$
4. Calculate $s = k + e \cdot d_A$
5. Broadcast $(e,s)$

Verify:
1. Calculate $R' = s \times G - e \times Q_A$
2. Calculate $e'=\text{Hash}(R'||m)$
3. Check that $e=e'$
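
A rough sketch of the e-variant, for comparison with the $(R,s)$ version. The verifier recomputes the nonce point as $s \times G - e \times Q_A$ and hashes it together with the message. As before, secp256k1, SHA-256 reduced mod $n$, the point encoding, and all helper names (including `neg`) are assumptions made for illustration, and the hash is left at full length rather than truncated to 16 bytes.

```python
import hashlib
import secrets

# secp256k1 domain parameters: field prime p, group order n, base point G
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """Add two curve points; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1] % p, -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow((Q[0] - P[0]) % p, -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    """Scalar multiplication k x P by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, P)
        P = add(P, P)
        k >>= 1
    return result

def neg(P):
    """Negate a point (reflect across the x-axis)."""
    return None if P is None else (P[0], (-P[1]) % p)

def enc(P):
    """Toy point encoding (x || y)."""
    return P[0].to_bytes(32, "big") + P[1].to_bytes(32, "big")

def H(*parts):
    """SHA-256 of the concatenated parts, as an integer modulo n."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % n

def sign_e_variant(d, m):
    k = secrets.randbelow(n - 1) + 1
    R = mul(k, G)
    e = H(enc(R), m)
    return e, (k + e * d) % n             # R itself is never published

def verify_e_variant(Q, m, sig):
    e, s = sig
    R = add(mul(s, G), neg(mul(e, Q)))    # R' = s x G - e x Q
    return R is not None and H(enc(R), m) == e

d_A = secrets.randbelow(n - 1) + 1
Q_A = mul(d_A, G)
m = hashlib.sha256(b"hello").digest()
sig = sign_e_variant(d_A, m)
print(verify_e_variant(Q_A, m, sig))
```

Notice that the verifier has to do the elliptic curve arithmetic before it can evaluate the hash, which is the "elliptic curve operations inside the hashes" that the BIP says gets in the way of batch verification.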

### Simpler Implementation
Because (e,s) are just two numbers, whereas (R,s) is an elliptic curve point and a number, it's easier to encode (e,s).
However, there are known ways of dealing with the implementation complexity of encoding a point, so it's not a huge deal.

### Shorter Signatures
In bitcoin, all signatures must be saved to the blockchain, so shorter signatures allow for more transactions per block and lower fees per transaction. Since $s$ is 32 bytes, $R$ is 32 bytes, but $e$ can be trimmed to 16 bytes in the e-variant, the e-variant is 48 bytes while the R-variant is 64 bytes, resulting in a space savings of 25%!

However, the shortened hashes are only secure for single-sig transactions, and are not secure for the n-of-n multisig described above. If an honest party owns one key, and a malicious party owns the rest, the malicious party can look for a hash collision, which takes roughly $2^{N/2}$ operations, where $N$ is the number of bits in the hash. For a 256-bit hash this is fine (about $2^{128}$ operations), but for a 128-bit hash, a determined attacker needs only about $2^{64}$ operations. This means if the e-variant were used, and we wanted native multisig, we'd still require 256-bit hashes. But if we're using 256-bit hashes either way, then the e-variant offers no space savings.
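
To get a feel for why short hashes are fragile, here is a small birthday-search sketch. It truncates SHA-256 to 3 bytes (24 bits), an arbitrary size I picked so the search finishes instantly, and looks for two different inputs with the same truncated hash. It is a generic collision search, not the actual attack on a co-signing protocol, and the function names are made up.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, nbytes: int) -> bytes:
    """SHA-256 cut down to its first nbytes bytes."""
    return hashlib.sha256(data).digest()[:nbytes]

def find_collision(nbytes: int):
    """Hash the inputs 0, 1, 2, ... until two of them collide.

    Returns (first_input, second_input, number_of_hashes_computed).
    """
    seen = {}
    for i in count():
        msg = i.to_bytes(8, "big")
        h = truncated_hash(msg, nbytes)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg

a, b, tries = find_collision(3)
print(a.hex(), b.hex(), tries)
```

With a 24-bit hash there are $2^{24}$ possible outputs, yet a collision typically shows up after a few thousand tries, on the order of $2^{12}$. Scaling the same reasoning up gives the $2^{64}$ figure for a 128-bit hash.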

## Advantages of the R-variant

Something called batch verification, which I don't know about yet.

## Key-Prefixing

From the BIP:

> **Key prefixing** Using the verification rule above directly makes Schnorr signatures vulnerable to "related-key attacks" in which a third party can convert a signature (R, s) for public key P into a signature (R, s + a⋅hash(R || m)) for public key P + a⋅G and the same message m, for any given additive tweak a to the signing key. This would render signatures insecure when keys are generated using BIP32's unhardened derivation and other methods that rely on additive tweaks to existing keys such as Taproot.

A signature that uses the challenge hash above, $\text{Hash}(R||m)$, is insecure when the signer has two public keys whose private keys differ by some offset $a$ known to the attacker. This is actually the case if the attacker knows your xpub and the keys come from unhardened derivation, as described in BIP32.

The attack works as follows. Suppose Alice has two addresses in her wallet with public keys $Q_{A1}$ and $Q_{A2}$ and private keys $d_{A1}$ and $d_{A2}$, in which $d_{A2} - d_{A1} = a$.

Suppose Alice sends 1 BTC to Bob from her $Q_{A1}$ address. That transaction corresponds to a message $m$ and signature $(R,s)$ such that $s \cdot G = R + \text{Hash}(R||m) \cdot Q_{A1}$.

Bob can use his knowledge of $a$ to generate a new signature $(R, s')$ with $s'=s + a \cdot \text{Hash}(R||m)$. This will produce a valid signature for $Q_{A2}$, but only for the same message $m$. The math works as follows.

\begin{align*}
s' \times G &= (s + a \cdot \text{Hash}(R||m)) \times G \\
&= s \times G + a \cdot \text{Hash}(R||m) \times G \\
&= R + \text{Hash}(R||m) \times Q_{A1} + a \cdot \text{Hash}(R||m) \times G \\
&= R + (\text{Hash}(R||m) \cdot d_{A1} + a \cdot \text{Hash}(R||m)) \times G \\
&= R + (\text{Hash}(R||m) \cdot (d_{A1} + a)) \times G \\
&= R + (\text{Hash}(R||m) \cdot d_{A2}) \times G \\
&= R + \text{Hash}(R||m) \times Q_{A2}
\end{align*}
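
The attack is easy to reproduce in a toy setting. In the sketch below Alice signs with $d_{A1}$ using the unprefixed hash $\text{Hash}(R||m)$, and Bob, who knows only the offset $a$ and public data, converts her signature into one that verifies under $Q_{A2}$. The curve (secp256k1), SHA-256 reduced mod $n$, the point encoding, and the helper names are all my assumptions for illustration.

```python
import hashlib
import secrets

# secp256k1 domain parameters: field prime p, group order n, base point G
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """Add two curve points; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1] % p, -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow((Q[0] - P[0]) % p, -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    """Scalar multiplication k x P by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, P)
        P = add(P, P)
        k >>= 1
    return result

def enc(P):
    """Toy point encoding (x || y)."""
    return P[0].to_bytes(32, "big") + P[1].to_bytes(32, "big")

def H(*parts):
    """SHA-256 of the concatenated parts, as an integer modulo n."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % n

def sign(d, m):
    """Unprefixed Schnorr: the challenge is Hash(R || m)."""
    k = secrets.randbelow(n - 1) + 1
    R = mul(k, G)
    return R, (k + H(enc(R), m) * d) % n

def verify(Q, m, sig):
    R, s = sig
    return mul(s, G) == add(R, mul(H(enc(R), m), Q))

# Alice's two related keys; the offset a is known to Bob
d_A1 = secrets.randbelow(n - 1) + 1
a = secrets.randbelow(n - 1) + 1
d_A2 = (d_A1 + a) % n
Q_A1, Q_A2 = mul(d_A1, G), mul(d_A2, G)

m = hashlib.sha256(b"send 1 BTC to Bob").digest()
R, s = sign(d_A1, m)                      # Alice's real signature

# Bob's conversion uses only public values (R, s, m) plus the offset a
s_forged = (s + a * H(enc(R), m)) % n
print(verify(Q_A1, m, (R, s)), verify(Q_A2, m, (R, s_forged)))
```

Bob never learns $d_{A1}$ or $d_{A2}$, and the converted signature is only valid for the same message $m$.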

A simple fix is to put the public key $Q$ inside the hash as well:

> To protect against these attacks, we choose key prefixed[5] Schnorr signatures which means that the public key is prefixed to the message in the challenge hash input. This changes the equation to s⋅G = R + hash(R || P || m)⋅P. It can be shown that key prefixing protects against related-key attacks with additive tweaks. In general, key prefixing increases robustness in multi-user settings, e.g., it seems to be a requirement for proving the MuSig multisignature scheme secure (see Applications below). 

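Now if Bob tried to convert Alice's signature on $m$ into one for $Q_{A2}$, he would be unable to, since verification against $Q_{A2}$ uses a different hash, $\text{Hash}(R||Q_{A2}||m)$, than the $\text{Hash}(R||Q_{A1}||m)$ that Alice signed with.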
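
To see the fix in action, here is a sketch where the challenge is $\text{Hash}(R||Q||m)$ and Bob tries the same additive conversion against a key-prefixed signature. The setup is again a toy: secp256k1, plain SHA-256 reduced mod $n$, and an x-and-y point encoding are my assumptions, not BIP 340's actual encoding or hashing, and the helper names are hypothetical.

```python
import hashlib
import secrets

# secp256k1 domain parameters: field prime p, group order n, base point G
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """Add two curve points; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1] % p, -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow((Q[0] - P[0]) % p, -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    """Scalar multiplication k x P by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, P)
        P = add(P, P)
        k >>= 1
    return result

def enc(P):
    """Toy point encoding (x || y)."""
    return P[0].to_bytes(32, "big") + P[1].to_bytes(32, "big")

def H(*parts):
    """SHA-256 of the concatenated parts, as an integer modulo n."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % n

def sign_prefixed(d, m):
    """Key-prefixed Schnorr: the challenge is Hash(R || Q || m)."""
    Q = mul(d, G)
    k = secrets.randbelow(n - 1) + 1
    R = mul(k, G)
    return R, (k + H(enc(R), enc(Q), m) * d) % n

def verify_prefixed(Q, m, sig):
    R, s = sig
    return mul(s, G) == add(R, mul(H(enc(R), enc(Q), m), Q))

# Same related-key setup as in the attack: d_A2 = d_A1 + a, with a known to Bob
d_A1 = secrets.randbelow(n - 1) + 1
a = secrets.randbelow(n - 1) + 1
Q_A1, Q_A2 = mul(d_A1, G), mul((d_A1 + a) % n, G)

m = hashlib.sha256(b"send 1 BTC to Bob").digest()
R, s = sign_prefixed(d_A1, m)

# Bob tries the additive conversion with either challenge he can compute
e1 = H(enc(R), enc(Q_A1), m)
e2 = H(enc(R), enc(Q_A2), m)
attempt_1 = verify_prefixed(Q_A2, m, (R, (s + a * e1) % n))
attempt_2 = verify_prefixed(Q_A2, m, (R, (s + a * e2) % n))
print(verify_prefixed(Q_A1, m, (R, s)), attempt_1, attempt_2)
```

Both attempts fail because the algebra from the attack needs the same challenge on both sides, and the two challenges now differ whenever the keys differ.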

But is this even really a big deal? After all, in regular bitcoin transactions the public key is already committed to by the hash implicitly, because the message references the UTXO being spent. If Bob converted the signature on the *same* message to $Q_{A2}$, it would be useless, because the funds belonging to $Q_{A2}$ are locked in a *different* UTXO.

> We note that key prefixing is not strictly necessary for transaction signatures as used in Bitcoin currently, because signed transactions indirectly commit to the public keys already, i.e., m contains a commitment to pk. However, this indirect commitment should not be relied upon because it may change with proposals such as SIGHASH_NOINPUT (BIP118), and would render the signature scheme unsuitable for other purposes than signing transactions, e.g., signing ordinary messages. 

Essentially, there are proposals to add a new SIGHASH flag to OP_CHECKSIG which would allow signing transactions that don't reference a specific UTXO. For example, if ten different transactions sent funds to a certain address, a single signature would be valid for spending any of those outputs, rather than being tied to exactly one of them. This SIGHASH is also needed for eltoo, a proposed improvement to how Lightning Network channels are updated.

If Alice had signed her transaction with SIGHASH_NOINPUT or SIGHASH_ANYPREVOUT, and the challenge hash did not include the public key, then Bob could steal from her using the related-key attack above. For those reasons, the hash used in BIP 340 is $\text{Hash}(R||P||m)$.