New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make provably unsignable standard P2PK and P2MS outpoints unspendable. #28400
Make provably unsignable standard P2PK and P2MS outpoints unspendable. #28400
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Changing the When the pruning of OP_RETURN outputs was implemented over 10 years ago (see #2791) this was seemingly less of a problem, but now that many projects depend on a normalized view of the UTXO set (muhash, assumeutxo, utreexo etc.), I think that would lead to much unwanted chaos. Happy to hear other inputs how this could be solved, maybe I'm missing something. |
3840cd8
to
5892815
Compare
I think muhash provides a pretty straightforward way of dealing with this. Namely: rather than only maintaining a single muhash for the utxo set you have, also maintain one for the outputs you've pruned as unspendable (or multiple indexes if you introduce more over time). Then you might have:
if you compare with a node running current master, you then verify that your It's easy to initialize these values on upgrade: Z=0, then iterate through your utxo set looking for bad pubkeys, adding them to Z as you find them, and removing them from the utxo set. Downgrading breaks things though. I don't think you can bootstrap utreexo from just the utxo set at an arbitrary height in the first place, so I don't think its really affected by this. For assumeutxo, if you want to be able to distribute the utxo set snapshots over p2p, then peers that have pruned unspendable utxos won't have them to share with peers that don't realise they're unspendable in the first place, but if you're just producing signed snapshots, then you could probably just publish two versions until the old software goes out of its support window. |
@ajtowns: Neat idea with the additional MuHash! I agree this should work, but obviously at the cost of increased complexity (as there is much more to change than only the logic of a single method), and there should be a really good reason for doing it.
Right, and you'd also need to somehow detect if an upgrade has already happened. In #2791 it was proposed to either scan for a particular output in the UTXO set (#2791 (comment)) or to introduce a flag in the chainstate database (#2791 (comment)). @russeree: Can you give some additional motivation? While I enjoy very much reasoning about these kind of topics, it's still unclear to me what concrete problem this PR is trying to solve. The description claims that about 20k of outputs could be pruned. As of now (block 806205), that's merely 0.016% of the total UTXO set size, freeing up about 1,24 MB of chainstate space (if we assume an average size of 65 bytes per UTXO). I'd argue that these numbers are way too low (even if you 10x them for the sake of projecting into the future) to justify the increased complexity in different areas that would need to be touched (i.e. differentiating between different Also, calling |
src/script/script.cpp
Outdated
|
||
// Last byte is equal to OP_CHECKSIG / OP_CHECKSIGVERIFY | ||
const unsigned char last_byte = *(this->end() - 1); | ||
if (!(last_byte != OP_CHECKSIG ^ last_byte != OP_CHECKSIGVERIFY)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (back() != OP_CHECKSIG && back() != OP_CHECKSIGVERIFY)
?
src/script/script.cpp
Outdated
} | ||
|
||
// The script contains 2 ops and is equal to the script length | ||
if (first_byte + 2 == size){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've already checked size is between 2 and 77, so if first_byte + 2 == size
, first_byte
is between 0 and 75, so the check for OP_PUSHDATA1
is redundant (and the check for OP_0
could be subsumed if you checked size < 3
).
Also, first_byte == front()
.
if (size() > MAX_SCRIPT_SIZE) { | ||
return true; | ||
} else if (this->IsPayToPublicKey()) { | ||
std::vector<unsigned char> pubkey(this->begin() + 1, this->end() - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should just be a span
return true; | ||
} | ||
} else if (this->IsPayToMultisig()) { | ||
CScript::const_iterator pc = this->begin() + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like it would be better to have this->IsInvalidMultisig()
and move this code all into one place? Something like:
if (size() < 3) return false;
if (back() != OP_CHECKMULTISIG && back() != OP_CHECKMULTISIGVERIFY) return false;
int n;
if (!IsOpN(*(end() - 2), n ) return false;
int m;
if (!IsOpN(front(), m) return false;
std::vector<Span<unsigned char>> keys;
keys.resize(n);
// parse into keys, finish checking it's well-formed
if (m > n) return true;
int good = 0;
int max_good = n;
for (k : keys) {
if (CPubKey(k).IsFullyValid()) {
++good;
} else {
--max_good;
}
if (good >= m) return false;
if (max_good < m) return true;
}
return false; // unreachable
@@ -201,6 +202,71 @@ unsigned int CScript::GetSigOpCount(const CScript& scriptSig) const | |||
return subscript.GetSigOpCount(true); | |||
} | |||
|
|||
bool CScript::IsPayToPublicKey() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be simpler to follow the style of the existing script template checks like 'IsPayToScriptHash'. You can also use the existing constants to check against the compressed and uncompressed sizes
@@ -343,6 +447,40 @@ bool IsOpSuccess(const opcodetype& opcode) | |||
(opcode >= 187 && opcode <= 254); | |||
} | |||
|
|||
bool IsOpN(unsigned char op_code){ | |||
return OP_0 || (OP_1 <= op_code && op_code <= OP_16); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you meant:
return op_code == OP_0 || (OP_1 <= op_code && op_code <= OP_16);
return OP_0 || (OP_1 <= op_code && op_code <= OP_16); | ||
} | ||
|
||
bool IsOpN(unsigned char op_code, unsigned char& value){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be more concise if you used the above function in this one:
bool IsOpN(unsigned char op_code, unsigned char &value) {
if (IsOpN(op_code)) {
if (op_code == OP_0) {
value = 0;
} else {
value = op_code - 0x50;
}
return true;
}
return false;
}
Sorry for the lack of work on this PR last week, was at TABConf working on #27260 . Work has resumed and I will update this thread with changes over the next few days. |
5892815
to
ec9fa74
Compare
Author: russeree <git@qrsnap.io> @ajtowns - Remove OP_O case from IsPayToPublicKey()
ec9fa74
to
d684194
Compare
d601e1a
to
131ee4d
Compare
131ee4d
to
3733665
Compare
Concept NACK. At the moment the UTXO set has 130 million entries. Reducing it by 0.015% isn't worth the technical risk. By comparison, there have been 53 million OP_Return outputs. Removing those from the UTXO set was a not-so-trivial ~30% reduction. |
🐙 This pull request conflicts with the target branch and needs rebase. |
Concept ACK |
I have no business adding comments here but I may humbly add that, despite this only affecting 0.016% of the UTXOs, it is a smart preemptive defense against arbitrary data on chain using a method that is far more damaging than inscriptions. |
There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
So this would make all current and future STAMPS unspendable? Or just future ones? How would this affect other potential legitimate use cases? What is the projected growth of node requirements if this situation were to get worse and how does that track with Moore's Law of the increase in computer performance? Or is there a better way to think about this? It's worth noting, making these unspendable may not stop their proliferation. For example, solutions that allow you to trade private keys off-chain, in the event that there is market demand for STAMPS, would still incentivize their creation to the same degree even if they are unspendable. |
Concept ACK Responses:
@owenstrevor it does nothing of the sort. You should read more carefully before asserting a slippery slope economic argument about STAMPS.
These outputs are already unspendable mathematically, and the person spending to them agreed to that condition upon signing.
OP_RETURN outputs are not provably unspendable though, they have valid spending paths. So the comparison is irrelevant, the point is why would you incentivize the storage of something that is basically a burned output? Sure right now the storage space is negligible, however you would have to have a completely non-adversarial mindset to suggest that no one would ever exploit this further. |
Fun challenge for everyone: can you link to a specific bare multisig UTXO that has been created within the past year (let's say, since block 769785) where the prunable detection in this PR would hit? Looking at an UTXO snapshot from now (block 827855), from all the 749496 P2MS UTXOs that have been created since the start of the year 2023, I haven't managed to find a single one which is provably unspendable. |
they have provided a list: https://gist.github.com/russeree/85fb9519e0a1fc166177a6d5e0e15be2 |
The most recently confirmed txs in that list are from block 792783 (June 2023), namely c18fe6..., 74c96e... and a6062e.... All but ~50 of the txs are over 300k blocks ago (ie 6+ years). The ones that aren't are:
|
Most STAMPS txs are spendable and are not affected by this PR. As I understand it, that protocol sets up a 1-of-3 multisig, where of the keys is valid (providing a spendable path) and the other two are used for data, and are thus provably unspendable about 50% of the time as the data doesn't match a valid secp point. The only times those txs become unspendable are when the one valid key is replaced by an invalid key -- eg the txs linked in my previous comment have a pubkey |
Concept ACK. Note, however, that pubkeys can be modified to be valid and still be used to store data, for example by using a byte that is shuffled until the pubkey becomes valid. |
However, note that from the time this PR run on a node, its UTXO Set hash will start to differ from any peer that didn't start using this PR at the same time. This can be solved by cleaning up the UTXO Set when the node starts. Unsure how expensive would it be. It would be nice to know how much space is saved running this PR from genesis.
EDIT: It turns out that our problem wouldn't be solved see #28400 (comment) |
Sorry for not understanding this, but why 50%? I thought compressed pubkeys covered almost everything on the X axis because the y is derived? Also many of the pruned outpoints are not compressed pubkeys but instead uncompressed. |
In my own non-cryptographer words: for about half of all possible field elements x, there exists no field element y such that the secp256k1 equation |
The formula is |
This pull request won't make any difference. Top 5 output types in UTXO set are P2PKH, P2WPKH, P2TR, P2SH and P2WSH. https://txstats.com/d/000000054/utxo-set-repartition-by-output-type?orgId=1 |
@1440000bytes the repartition you are showing is on outputs count, not size. Since ordinal, utxoset size grew from 5GB to 10GB. Even if they are marginal in term of output count, in term of size, it takes around 50% of the UTXO Set at the moment. I haven't looked how they put JPEG and JSON files on the chain exactly, I assumed it was with invalid P2PK, but didn't look closely into it. |
Ordinal inscriptions don't use P2PK or bare multisig, they use P2TR with tapscripts. In order for an inscription to be in the blockchain, it must create and then spend a UTXO, so inscriptions themselves have no impact on the UTXO set. Ordinals may have an impact as there are people creating small outputs for their "rare" sats, but those are all perfectly valid P2TR outputs, so nothing can be done there. Stamps is using the old Counterparty protocol that uses bare multisigs. These are 1-of-3 multisigs where one key is valid, and the other two are the data. These are also spendable so this PR would not remove them.
The point of OP_RETURN outputs (defined as the ones following the template |
Closing because the fragility of this PR does not justify it's limited impact. |
Overview
This pr introduces additional conditionals into
IsUnspendable()
to remove provably unspendable P2PK and P2MS tx outpoints from the UTXO set. This is done by usingIsFullyValid()
to check that the public key(s) for the outpoints script are valid. This trims nearly 20K outpoints from the UTXO set at height 805618. https://gist.github.com/russeree/85fb9519e0a1fc166177a6d5e0e15be2A side effect of this PR is it removes the use case for uncompressed public keys through standard tx types to store arbitrary data in the UTXO set.
P2PK
P2PK outpoints with a single pubkey that is invalid can by flagged as unspendable becuase the public key does not exist on the SECP256K1 curve and thus no private key exists to make OP_CHECKSIG evaluate to true.
Script must be in the format of
OP_PUSHBBYTES PUBKEY OP_CHECKSIG(VERIFY)
P2MS
P2MS outpoints that do not have enough valid public keys to meet the threshold. where (n-k < m) where k is the number of invalid public keys.
Script must be in the format of
OP_(N) PUBKEY1 .... OP_(N) OP_CHECKMULTISIG(VERIFY)