
Witness scripts being abused to bypass datacarriersize limit (CVE-2023-50428) #29187

Open

luke-jr opened this issue Jan 5, 2024 · 54 comments

@luke-jr (Member) commented Jan 5, 2024

The datacarriersize policy option is meant to limit the size of extra data allowed in transactions for relaying and mining. Since the end of 2022, however, attackers have found a way to bypass this limit by obfuscating their spam inside OP_FALSE OP_IF patterns instead of using the standardized OP_RETURN. This remains under active exploitation today, to a degree very harmful to Bitcoin.
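
For reference, the bypass wraps arbitrary data pushes in an OP_FALSE OP_IF ... OP_ENDIF "envelope" that the script interpreter never executes. A minimal sketch of measuring such a payload (illustrative only, not Bitcoin Core code; it handles only direct pushes, not PUSHDATA1/2/4):

```python
# Opcode byte values from the Bitcoin Script encoding.
OP_FALSE, OP_IF, OP_ENDIF = 0x00, 0x63, 0x68

def envelope_payload_size(script: bytes) -> int:
    """Return the number of bytes pushed inside an OP_FALSE OP_IF ... OP_ENDIF
    envelope, or 0 if the script contains none. Only direct pushes
    (opcodes 0x01-0x4b) are handled, for brevity."""
    i = script.find(bytes([OP_FALSE, OP_IF]))
    if i < 0:
        return 0
    i += 2
    total = 0
    while i < len(script) and script[i] != OP_ENDIF:
        op = script[i]
        if 0x01 <= op <= 0x4b:      # direct push of `op` bytes
            total += op
            i += 1 + op
        else:                        # any other opcode: skip it
            i += 1
    return total
```

This byte pattern is what "Ordisrespector"-style filters match on.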

A straightforward way to address this is to simply fix the bug (#28408), but that was (inappropriately) closed due to social attacks.

This remains an active issue that needs to be addressed.

Other solutions have been proposed, including:

  • Rejecting transactions entirely if they attempt to bypass the datacarriersize limit (the so-called "Ordisrespector" patch)
  • Extending the existing policy script size limits to Tapscript. Currently, the limit varies between pre-segwit (1650 bytes) and segwit v0 transactions (3600 bytes), but could be unified and applied equally to all scripts. (This is included in Knots v25.1 currently.)
  • Identifying extra data and removing its witness discount, rather than filtering it out entirely. It's not clear this would be effective alone, but it is supported by Knots v25.1.
  • Adding a second datacarriersize option with a broader scope, as in "datacarriersize: Match more datacarrying" #28408 (suggested by glozow).
  • Moving back to a whitelist-based transaction policy, where only known/standardized scripts are accepted.
  • A softfork making data storage impractical. This would likely be very "painful" to implement, review, and even adopt (it would require upgrades from all wallets). It would also be a slow process and entail risks without significant economic support upfront.
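
To make the first and fourth options concrete, here is a hedged sketch of counting every recognized data-carrying byte, whether in an OP_RETURN output or a witness envelope, against a single budget. All names are hypothetical, not Bitcoin Core identifiers; only the 83-byte default mirrors the existing MAX_OP_RETURN_RELAY constant (80-byte payload plus OP_RETURN/push overhead):

```python
# Hypothetical policy check: one shared budget for all recognized
# data-carrying bytes, instead of a limit on OP_RETURN outputs only.
DEFAULT_DATACARRIER_BYTES = 83  # matches Bitcoin Core's MAX_OP_RETURN_RELAY

def tx_within_data_budget(op_return_bytes: int,
                          witness_envelope_bytes: int,
                          limit: int = DEFAULT_DATACARRIER_BYTES) -> bool:
    """Reject ("Ordisrespector"-style) when the combined recognized
    data payload exceeds the configured limit."""
    return op_return_bytes + witness_envelope_bytes <= limit
```

An ordinary 80-byte OP_RETURN still passes, while a multi-kilobyte witness envelope does not.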

I have written code for the first 3, which I am happy to PR if there's interest. Perhaps others can suggest and/or implement other solutions.

@glozow (Member) commented Jan 5, 2024

The datacarriersize policy option is meant to limit the size of extra data allowed in transactions for relaying and mining.

History of this config option suggests datacarriersize is meant to limit the size of data in OP_RETURN outputs, so this statement is untrue.

    -datacarrier=0/1 : Relay and mine "data carrier" (OP_RETURN) transactions if this is 1.
    -datacarriersize=n : Maximum size, in bytes, we consider acceptable for "data carrier" outputs.

The docs from #11058 have not changed except for moving files:

/**
* A data carrying output is an unspendable output containing data. The script
* type is designated as TxoutType::NULL_DATA.
*
* Maximum size of TxoutType::NULL_DATA scripts that this node considers standard.
* If nullopt, any size is nonstandard.
*/
std::optional<unsigned> max_datacarrier_bytes{DEFAULT_ACCEPT_DATACARRIER ? std::optional{MAX_OP_RETURN_RELAY} : std::nullopt};

Instead of retroactively deciding that -datacarriersize applies to more than just OP_RETURNs, why not propose a new config option?

@luke-jr (Member, Author) commented Jan 5, 2024

History of this config option suggests datacarriersize is meant to limit the size of data in OP_RETURN outputs, so this statement is untrue.

It's meant to limit extra data in transactions. OP_RETURN was supposed to be the only tolerated way to do that. datacarriersize has no possible use if it's trivial to bypass. The "Ordisrespector" approach would take us back to that prior status quo.

why not propose a new config option?

It would be confusingly redundant. But if that is Concept ACK'd, I would be happy to do it. (Added to list of solutions in OP for now)

@dergoegge (Member) commented:

It's meant to limit extra data in transactions. OP_RETURN was supposed to be the only tolerated way to do that.

Can you add any references for this? The GitHub and git history very clearly contradict your statements (#29187 (comment)).

@luke-jr (Member, Author) commented Jan 5, 2024

No, it doesn't contradict it. That overview conveniently leaves out the context of OP_RETURN being the only way tolerated, and the datacarriersize limit makes no sense the way you want to spin it.

@wizkid057 commented Jan 5, 2024

TL;DR - My suggestion is to implement the bug fix as above with a combination of these two bullet points. ^

It seems like an acceptable compromise solution would be the combination of these two, provided that the default for the new second otherdatacarriersize option (or whatever it ends up being called) is the same as the existing one for OP_RETURN (80 bytes of arbitrary data), and there is an option to disable the equalization of the fees, if desired (as there is in Knots).

This fully addresses the only somewhat valid objection I saw to the original (now closed) PR, in my opinion, which was that there was no mechanism for users to run nodes which use the current policy if they so desired without code changes. While I'm personally at a loss as to why anyone would want to run a node that permits active exploitation of an obvious bug, I'm all for giving people choices. Adding the options above resolves that objection entirely, as anyone with that mindset can simply set them to whatever they want.

This effectively adds OP_FALSE/OP_IF as a new "tolerated" method of data carrying, with a limit equal to that of the existing tolerated method, OP_RETURN. But keep in mind that the OP_FALSE/OP_IF exploit method of data carrying is arguably worse for the network than OP_RETURN, since these store-data-in-the-witness transactions require two transactions, and they tend to just leave a dust output in the UTXO set forever once the data is "revealed". The above would disincentivize that harm sufficiently, since equalizing the fee per byte with OP_RETURN should, on a cost basis, drive data carrying back to OP_RETURN (which never enters the UTXO set). This is because the two-transaction method of performing the OP_FALSE/OP_IF exploit would end up being more costly overall than one transaction with an OP_RETURN output.
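
The fee-equalization arithmetic behind this argument can be sketched as follows (per BIP 141, non-witness bytes count 4 weight units and witness bytes 1, with vsize = weight / 4; transaction overheads are ignored for simplicity):

```python
def data_vsize(data_len: int, witness_discounted: bool) -> float:
    """Approximate virtual size attributable to data_len payload bytes,
    with or without the segwit witness discount."""
    weight = data_len * (1 if witness_discounted else 4)
    return weight / 4

# 10,000 bytes of witness-embedded data currently pays for 2,500 vbytes;
# the same payload without the discount would pay for 10,000 vbytes.
```

So at equal fee rates, removing the discount quadruples the cost of witness-embedded data, erasing its price advantage over OP_RETURN before even counting the second transaction the envelope scheme requires.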

Further, any implementation of these new flags should be 100% unambiguous in documentation and comments, making clear that they apply to all known data carrying methods besides OP_RETURN, thus avoiding the potential situation where folks in the future attempt to ignore an active exploit by simply changing the documentation.

Finally, let's be clear that this is a valid issue, that the use of OP_FALSE/OP_IF for storing arbitrary data IS in fact an exploit, one that is being actively harmful, and fixing (or even slowing down, in the case of the suggested implementation above) that exploitation is in the best interest of the network as a whole. There is no script that actually validates anything that can make use of such a mechanism. It explicitly and provably does NOT use the arbitrary data for any validation whatsoever. Any such method of storing data is clearly not an intended use case for the Bitcoin network any more than someone storing arbitrary data in other types of outputs (like P2PKH, multisig, etc) is an intended and valid use case, as this is harmful to the network. The latter was effectively discouraged by tolerating a small amount of arbitrary data with OP_RETURN, which is at least not as harmful to the network. While the existence of OP_RETURN doesn't prevent anyone from using more harmful methods, it does make them prohibitively more expensive to do so and seems to have all but eliminated that practice.

The OP_FALSE/OP_IF exploit gets around the clear intentions of allowing OP_RETURN by both avoiding the data size limit, AND getting a huge discount on fee per byte for that data compared to regular transactions. A clear bug to be addressed. If, for example, people figured out a way to exploit OP_RETURN to get around the 80 byte relay limit when it was implemented, I'm certain there would have been a quick fix for that. Also, if you have people constantly dumping garbage in your driveway that you have to clean up any time you want to move your car, and you decide to put in a security system to prevent that as best possible (maybe by making your neighbor's yard more attractive to the dumpers)... making the argument that the ones doing the dumping are now somehow the victims because they can no longer take advantage of you is just not valid. The same goes for people actively exploiting a clear bug in Bitcoin. What happens with people actively exploiting it as a result of a fix that levels the playing field is not of consequence to users of Bitcoin. Heck, this proposal doesn't even prevent the practice, just puts it on par with other data carrying. In my opinion, the exploit should be closed completely, but the two bullet points above are an excellent compromise.

Let's get this fix done, and let's not let perfect be the enemy of good.

@1440000bytes commented:

Since the CVE ID is used to validate this as a vulnerability, I wanted to share that cve.org has added a "disputed" tag for CVE-2023-50428. This tag is used when there are differences of opinion about whether it's a vulnerability based on the CVE program's definition.

[image: cve.org entry for CVE-2023-50428 showing the "disputed" tag]

A note has also been added to CVE by MITRE on 4 Jan 2024:

NOTE: although this is a vulnerability from the perspective of the Bitcoin Knots project, some others consider it "not a bug."

https://nvd.nist.gov/vuln/detail/CVE-2023-50428#VulnChangeHistorySection

@wizkid057 commented Jan 6, 2024

Since CVE ID is used to validate this as a vulnerability

It's not, really. In fact, I didn't even mention "CVE" once in my detailed comment above.

This is clearly a bug, clearly a vulnerability in the software, clearly unintended behavior, and clearly being actively exploited in the wild. That more than qualifies this particular issue as a valid CVE, but absolutely no one credible is doing the reverse of using the CVE ID's existence to "validate this as a vulnerability." It's an unambiguous vulnerability regardless of if it has a CVE number assigned to it.

I wanted to share that cve.org has added "disputed" tag for CVE-2023-50428

The people actively exploiting a bug, or otherwise supporting the exploit of a bug, now also abusing the mechanisms related to CVE reporting... is frankly unsurprising, and really doesn't qualify as something to be used as a "gotcha" here. In fact, I think it gives the need to patch the exploit even more weight, given how desperate the active exploiters are being in attempts to prevent this bug from being patched.

The ability to inject arbitrary data into any system in a way that isn't explicitly documented as permitted for an expected purpose of the system is always a vulnerability and always something that should be addressed. This is cybersecurity 101 stuff, and isn't even remotely specific to Bitcoin. The data following OP_FALSE/OP_IF is clearly and provably arbitrary and not in any way required nor intended for Bitcoin to function as designed.

Storing virtually limitless amounts of arbitrary data (effectively up to the max block size) in the Bitcoin public ledger (and thus forcing everyone running a full node to not only download and deal with this data, but requiring it to be stored for all eternity in one form or another) by using a clearly unintended exploit that gets around not only previously imposed limitations on tolerated arbitrary data size, but also gets a discount for doing so vs standard transactions? That's got to be probably the most obvious exploit on the Bitcoin network since the infinite money exploit. Fixing it shouldn't even be remotely controversial.

If someone can show me code, code comments, documentation, community discussion about the pros/cons of such a mechanism, and confirmation of "rough consensus" on this, from before this exploit was developed and abused, that clearly defines this specific OP_FALSE/OP_IF method of storing arbitrary data in the Bitcoin blockchain as a well defined intended behavior that was agreed upon to be permitted by the community and nodes, then, I'd likely have to concede that this is not a vulnerability that needs fixing since it would then be clearly defined and intended behavior. In the absence of that (which is the reality of the situation, because none of the above exists), this is unintended behavior, and a clear vulnerability that needs addressing.

In fact, this is orders of magnitude worse than similar vulnerabilities in other systems that are clearly and unambiguously defined as a vulnerability and bug to be fixed. The only reason this one is being disputed in any way, in my opinion, is because there's a massive conflict of interest and major financial incentive to the exploiters to retain their ability to abuse the Bitcoin network. So the full resources of those groups are being utilized to sow discord on what should be a cut and dry vulnerability patch. Sadly, that's a very noisy minority in this case.

"Bitcoin is a decentralized digital currency that enables instant payments to anyone, anywhere in the world. Bitcoin uses peer-to-peer technology to operate with no central authority: transaction management and money issuance are carried out collectively by the network." By finally addressing this issue (or heck, even acknowledging that it is an issue), the developers would be upholding the primary purpose of the network, being a digital currency, thereby maintaining user trust in its reliability, long term viability, and overall security. Fortunately some developers are capable of seeing through the noise here, and hopefully other developers will regain their collective sanity and cut through the noise as well to properly address this bug. It's been quite the spectacle, and admittedly very disheartening, watching so many developers on this project whom I've held in pretty high regard over the years be so clearly and blatantly deceived and swayed by the obvious noise from attackers on this particular issue. That's been very sad to see.

@totient commented Jan 8, 2024

It's meant to limit extra data in transactions. OP_RETURN was supposed to be the only tolerated way to do that.

Can you add any references for this? The GitHub and git history very clearly contradict your statements (#29187 (comment)).

"There was [sic] been some confusion and misunderstanding in the community, regarding the OP_RETURN feature in 0.9 and data in the blockchain. This change is not an endorsement of storing data in the blockchain. The OP_RETURN change creates a provably-prunable output, to avoid data storage schemes – some of which were already deployed – that were storing arbitrary data such as images as forever-unspendable TX outputs, bloating bitcoin’s UTXO database." (https://github.com/bitcoin/bitcoin/blob/master/doc/release-notes/release-notes-0.9.0.md#op_return-and-data-in-the-block-chain)

If datacarriersize was meant to apply only to arbitrary data in op_return and not to all data, why wasn't it called op_returncarriersize?

It's quite clear that op_return was meant to deprecate all other arbitrary data injection schemes, and it was then limited to 40 (later 80) bytes. The release notes history never shows an intention to change the client to accommodate large amounts of arbitrary data, either in relay policy or in permanent storage in blockchain data.

This debate actually goes back quite a bit further, at least to v0.3. The size of op_return data was supposed to be limited to a sha256 output (32 bytes), plus a few bytes for an identifier tag. This design choice was in compliance with Satoshi's suggestion "I also support a third transaction type for timestamp hash sized arbitrary data." (https://bitcointalk.org/index.php?topic=2162.msg28549#msg28549). At the time Satoshi wrote the above post signaling his support for 32 bytes of arbitrary data, Satoshi and Gavin (maybe others too) had agreed to implement a script whitelisting policy, which was an intentional effort to stop people from finding clever uses of script to create transactions which didn't relate to the monetary purpose of the client software.

The PR 28408 Review note posted by @glozow starts recording history from 0.10. I'm not sure why the incredibly relevant context about Bitcoin's design principles from 0.9 (and earlier) is being ignored.

@mzumsande (Contributor) commented:

A softfork making data storage impractical. [...] It would also be a slow process

It's going to be a slow process either way:

Even if Bitcoin Core changed its default policy, the effect on propagation would be negligible for a long time.
The p2p network is densely connected, with reachable nodes having 8 tx-relaying outbound peers and typically many more tx-relaying inbounds. Some network simulations I did in the context of full-RBF a few years ago indicated that a minority of just 10% of nodes supporting a feature (one that is non-standard for the rest) is sufficient for propagation with >95% probability (assuming both sender and miner run well-connected reachable nodes).

Since the txns in question are standard for almost all nodes today and the network tends to upgrade to newer versions slowly (https://bitnodes.io/dashboard/1y/#user-agents), it would take several years to get anywhere near that 10% threshold if upgrades just happened naturally. After that, preferential peering would be a simple and effective countermeasure - there wouldn't even be a real need for out-of-band delivery.
In general, I think that using policy to prevent transactions from propagating when there is an active minority that wants them propagated is just not going to work in practice.
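
The propagation claim can be illustrated, though not reproduced, with a crude Monte Carlo sketch; the original simulations assumed well-connected reachable endpoints, which this toy random-graph model does not capture, and all parameters here are assumptions:

```python
import random

def reaches_miner(n=2000, outbound=8, p=0.10, rng=random):
    """One trial: does a tx from a random adopting node reach a random
    adopting 'miner' node when only a fraction p of nodes will relay it?"""
    adopt = [rng.random() < p for _ in range(n)]
    adj = [[] for _ in range(n)]
    for a in range(n):
        for _ in range(outbound):   # each node picks `outbound` random peers
            b = rng.randrange(n)
            adj[a].append(b)
            adj[b].append(a)        # relay works across the link both ways
    adopters = [i for i in range(n) if adopt[i]]
    if len(adopters) < 2:
        return False
    src, dst = rng.sample(adopters, 2)
    seen, stack = {src}, [src]      # DFS restricted to adopting nodes
    while stack:
        cur = stack.pop()
        if cur == dst:
            return True
        for nxt in adj[cur]:
            if adopt[nxt] and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

# e.g. estimate: sum(reaches_miner() for _ in range(100)) / 100
```

With p=1.0 every trial trivially succeeds on a connected graph; at p=0.10 the success rate in this toy model depends heavily on how well-connected the endpoints are, which is exactly the caveat stated above.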

Adding an option that's not enabled by default would be purely symbolic for the purpose of transaction relay. However, it could still make sense to have it because if miners want to run that and earn less fees, a config option would make that easy for them and they wouldn't have to resort to alternative clients.

@saltduck commented Jan 11, 2024 via email

@nsvrn (Contributor) commented Jan 16, 2024

I strongly think that datacarriersize should be updated to do what it was truly supposed to do in the first place (not what its documentation was changed to after ord spam started). The current position is like saying that every option that exists now literally doesn't apply to future additions to the functionality.
And to the people who say it's a cat and mouse game: isn't every attack mitigation in network security like that?
Maybe you want to admit that you haven't found a good solution, but even then, what makes you oppose the accuracy of the options that already exist?
It will never add up for me that the dev community here is basically saying that ord injection is neither a bug nor "data".

@tromp commented Jan 17, 2024

No matter what new restrictions are imposed, it's safe to assume that the economic reality of people wanting to inscribe data on the blockchain will persist and use whatever means is available and cheapest. There will always be means available such as spreading the data over multiple outputs or multiple txs, encoding the data as arbitrary taproot scripts, or encoding the data as fake public keys or key hashes. Some of these encodings will be detectable while others will not.
Raising the cost of inscriptions will reduce the amount of them, but it's not clear by how much. Doubling the cost could cut them in half, or reduce them by only 10%. I don't know of any studies of the price sensitivity of inscribers. There is likely some degree of irrationality there.
It would be nice to see, for any proposed measure, what alternative encoding scheme(s) the inscribers can be expected to migrate to, on a cost minimization basis, and what implications that will have on things like full node resource usage, and whether we would be happy with that.

@GregTonoski commented:

what alternative encoding scheme(s) the inscribers can be expected to migrate to, on a cost minimization basis, and what implications that will have on things like full node resource usage, and whether we would be happy with that.

If it had been about cost minimisation only then they would have spammed testnets (for virtually free) instead of the Mainchain.

@tromp commented Jan 17, 2024

If it had been about cost minimisation then they would have spammed testnets

Their perceived value is of inscriptions on the bitcoin main chain of course. Not on a testnet, not on an altcoin chain, and not of an inscribed link to some other hosting site. They want their data to still be downloadable a century from now, guaranteed. Any cost minimization is conditional on that.

@GregTonoski commented:

and not of an inscribed link to some other hosting site

To the contrary, they put text (link) in many cases, e.g. txid: ddcec4687acb054e94f5ad803b1f87574e93a37e560b0011b76ed8f36d6ad88a

@tromp commented Jan 17, 2024

they put text (link) in many cases

Yes, some do. But many others don't, and those are the ones these proposals are targeting.

@joeyvee1986 commented Jan 17, 2024 via email

@totient commented Jan 17, 2024

No matter what new restrictions are imposed, it's safe to assume that the economic reality of people wanting to inscribe data on the blockchain will persist and use whatever means is available and cheapest. There will always be means available such as spreading the data over multiple outputs or multiple txs, encoding the data as arbitrary taproot scripts, or encoding the data as fake public keys or key hashes. Some of these encodings will be detectable while others will not. Raising the cost of inscriptions will reduce the amount of them, but it's not clear by how much. Doubling the cost could cut them in half, or reduce them by only 10%. I don't know of any studies of the price sensitivity of inscribers. There is likely some degree of irrationality there. It would be nice to see, for any proposed measure, what alternative encoding scheme(s) the inscribers can be expected to migrate to, on a cost minimization basis, and what implications that will have on things like full node resource usage, and whether we would be happy with that.

Read the v0.9 release notes. We already had a debate about this and the community came to a consensus that the only legitimate inscription scheme was op_return, and it should be limited to 40 (or 80) bytes. Exploiting methods of inscribing data outside of op_return is antisocial behavior in violation of a broad agreement we used to have, which was never revised. Exploiting flaws in script should not be tolerated in this community, and especially not tolerated by the codebase.

@ArmchairCryptologist commented:

As far as I can tell, the default value for datacarriersize has always been named MAX_OP_RETURN_RELAY, so claiming that it was intended to be applicable to anything else seems like a bit of a stretch.

It's not that people are disagreeing that storing cat pictures on the blockchain is a bad idea, they disagree with the proposed way to limit it, as it would be a game of whack-a-mole. If you wanted to specifically target people using the segwit discount to store data cheaply in general, a better way would probably be to cap the segwit discount by some factor of the number of UTXOs the transaction consumes - say 500 bytes per consumed UTXO, to make some allowance for complex scripts. That way, transactions that use segwit "properly" would not be penalized, while transactions abusing it for data storage would pay ~4x the rate.
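
The capped-discount idea can be sketched like this (the 500-byte per-input allowance is the illustrative figure from above, not a concrete proposal; the 4x factors follow BIP 141 weight accounting):

```python
def adjusted_weight(base_size: int, witness_size: int, n_inputs: int,
                    allowance_per_input: int = 500) -> int:
    """BIP 141-style weight, except witness bytes beyond the per-input
    allowance lose the 4x discount and count like non-witness bytes."""
    allowance = n_inputs * allowance_per_input
    discounted = min(witness_size, allowance)
    return base_size * 4 + discounted + (witness_size - discounted) * 4
```

A typical spend with modest witnesses is unaffected (e.g. 300 witness bytes over 2 inputs stays fully discounted), while a 4,000-byte inscription witness on a single input would weigh close to 4x what it does today.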

However, it is important to emphasize that this wouldn't actually address the blockchain spam we are currently seeing, which mostly consists of a large volume of BRC-20 tokens with negligible transaction size that, with some tweaking, could probably even fit in an OP_RETURN.

@Earnestly commented Jan 18, 2024

with some tweaking, could probably even fit in an OP_RETURN.

Which could then be discarded by node operators, demonstrating its purpose.

The 80-byte limit on OP_RETURN is a filter, one which can be bypassed simply by having a miner include transactions which exceed it out of band. Why is this not deemed "censorship", and why does Bitcoin Core bother placing an 80-byte limit on it?

@1440000bytes commented:

Which could then be discarded by node operators, demonstrating its purpose.

It won't be discarded by default, because they use less than 80 bytes for the JSON. They could use even fewer bytes by changing the protocol.
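
For context, a typical BRC-20 mint inscription is a small JSON object like the following (field values illustrative), which indeed fits well under the 80-byte payload cap:

```python
# A representative BRC-20 mint payload; p/op/tick/amt are the protocol's fields.
payload = '{"p":"brc-20","op":"mint","tick":"ordi","amt":"1000"}'
print(len(payload.encode()))  # 53 bytes, under the 80-byte OP_RETURN payload limit
```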

The 80 byte limit on OP_RETURN is a filter, it is one which can be bypassed simply by having a miner include transactions which exceed it out of band. Why is this not deemed "censorship" and why does bitcoin core bother placing an 80 byte limit on it?

Related pull request and a project to fix this: #28130

https://x.com/lightcoin/status/1733257392668766494

@Retropex commented:

It won't be discarded by default because they use less than 80 bytes for JSON

You persist in not understanding that the goal is not only to protect the network from spam but also to give the node runners the tools to choose their mempool policies.

@1440000bytes commented:

It won't be discarded by default because they use less than 80 bytes for JSON

You persist in not understanding that the goal is not only to protect the network from spam but also to give the node runners the tools to choose their mempool policies.

If some users really want such options, I had suggested an alternative approach in this comment: #28408 (comment)

OR

You can add another config option which is enabled by default and filters transactions based on opcode templates saved in filter_templates.json file in data directory. This file should be empty by default and users can keep adding templates in it with time based on their preference.

@eragmus commented Jan 18, 2024

Since the CVE ID is used to validate this as a vulnerability, I wanted to share that cve.org has added a "disputed" tag for CVE-2023-50428. This tag is used when there are differences of opinion about whether it's a vulnerability based on the CVE program's definition.

[image: cve.org entry for CVE-2023-50428 showing the "disputed" tag]

A note has also been added to CVE by MITRE on 4 Jan 2024:

NOTE: although this is a vulnerability from the perspective of the Bitcoin Knots project, some others consider it "not a bug."

https://nvd.nist.gov/vuln/detail/CVE-2023-50428#VulnChangeHistorySection

The previous thread was closed by Bitcoin Core due to an obvious lack of consensus, yet for some reason this new thread, which duplicates the agenda of the previous thread, was created. I am confused. People who voiced objections in the previous thread appear largely to be ignoring this thread, since their points were already made.

In addition to the new context @1440000bytes has provided, I want to make some points, just for the record (so it cannot be suggested that this thread lacks objections). I detect a lot of emotion in some posts in this thread, combined with an apparent attachment to a previous era that no longer exists in Bitcoin, since the free market has clearly spoken and changed the game:

Bitcoin moves value* through time + space.

*Value is subjective. (Value, in Bitcoin, is subjectively determined by the fee-rate of the tx.)

Bitcoin is a cypherpunk system. Bitcoin is based on crypto-anarchy. This is a colorful way of saying that Bitcoin is about freedom, including free markets.

Bitcoin's anti-DoS mechanism is not about Bitcoin Core sending political messages to dictate to the economy what it should or should not do.

Bitcoin's anti-DoS mechanism is based on fee-rate of txs.

Therefore, there should be no prejudice on the mempool layer, as far as which kind of value is "good" or "bad", due to Bitcoin's primary ethos being dedicated to freedom, permissionlessness, and censorship-resistance.

In the case of arbitrary data txs, this is even more critical due to financial incentive. In 2023, the free market already voted in favor of arbitrary data txs, with a total of 8,500 BTC paid in fees by arbitrary data buyers of block space to sellers of block space, which was more than 1/3 of total fees.

[image: chart of 2023 fees paid by transactions carrying arbitrary data]

Thus, any attempt to interfere carries obvious and severe risks and unintended consequences, for both Bitcoin Core and Bitcoin:

  1. It sends a message that Bitcoin Core is being political, by trying to interfere in free market activity on the network by censoring valid fee-paying txs. It also would not dissuade activity, rather it would invigorate the arbitrary data economy due to being able to credibly claim censorship attempt. Permissionlessness quality means Bitcoin users do not require permission, not from Bitcoin Core nor from any other. Bitcoin operates on financial incentives of free markets. We have already seen in 2023 that the arbitrary data economy was only motivated by attempts to restrict their activity. It created a counter-culture that seemed "cool", in a rebellious "stick it to the man" sense.

  2. The update to Bitcoin Core is very likely ineffective, just virtue signaling (which Bitcoin Core should not be in the business of doing, as it makes a mockery of Bitcoin Core to implement changes that lack efficacy), since to change the public mempool relay situation, you would need iirc 85-92% of nodes to update. This is very low probability, as a significant minority or more of nodes simply refuse to update, and others wait years before updating. Also, users of arbitrary data obviously have an incentive to respond by running nodes without the change.

  3. Even if (2) is resolved in a low probability scenario, the arbitrary data economy can simply and quickly stop using the public mempool relay, and switch to miners' private mempool relay, by sending txs directly to miners. (Remember, there is a very strong financial incentive to route around any attempted censorship by Bitcoin Core.) This splits the mempool into 2 pieces: public and decentralized vs. private and centralized. Even if miners made their own mempool public at some point, it would need to be trusted, since it could not be verified, which incentivizes miners to lie about the real state of the mempool to increase fees paid. Nothing about this is good. It can cause pressure to centralize mining.

  4. As explained by others in recent comments, the arbitrary data can be easily switched to any number of different formats, with the format chosen based on intended effect. This has various unintended consequences: it would make already-created arbitrary data more valuable, since its supply becomes finite if the old designs cannot be reused, injecting energy into the market instead of removing it. It can cause new designs to be more disruptive, not less. It creates an inevitable cat and mouse game, where Bitcoin Core must waste its precious time, energy, and other resources on a pointless virtue-signaling exercise - in fact vice-signaling, by trying to censor fee-paying txs.

  5. re: UTXO set, various fear-mongering claims about the UTXO set are thrown around to artificially magnify the problem with scary words like "bloat the UTXO set" and "increase RAM usage", but these claims are false or misleading. -- As @achow101 said: "With current usage, even with the bare multisig protocols, I don't think the effect on full nodes is particularly egregious. The UTXOs created for the arbitrary data are still small and bounded. I did a calculation a little while ago which suggested that the UTXO set growth due to bare multisigs is around a few MB. Compared to the rest of the UTXO set and the blockchain, that's nothing. All that means is that the storage required goes up by that amount. It doesn't have any effect on [RAM] memory usage, since there's a fixed amount of cache. If the UTXOs are never spent, then they won't be taking up any memory." -- @murchandamus has said: "We load into the UTXO cache any UTXO that get referenced by transactions, and new UTXOs that get created. We flush the UTXO cache whenever it hits the limit, so UTXOs that haven't been used in a long time are likely not in the RAM." and "By default, Bitcoin Core limits the mempool datastructure to 300 MiB of memory (RAM)." and "When this limit is hit, nodes start to discard the transactions with the lowest "descendant set feerate", i.e. the sum of fees divided by the sum of sizes of the transactions and its descendant transactions, until they are below the limit." -- @evoskuil has said: "The only necessary difference is prunability, as both must be retained for bootstrapping. Yet widespread pruning also isn't good for bootstrapping, so really not an important distinction. Disk space is the cheapest computing resource. There is no good reason to be concerned about linear chain growth."

  6. Bitcoin's governance is not a democracy. It is not about who can vote more in favor vs. against, on Github. It is about achieving rough consensus, where the primary objective is the precautionary principle of "do no harm" (Bitcoin is a conservatively operated network, not "move fast and break things"), and thus where objections have to be properly addressed. If lack of consensus, then obviously no change.

  7. Bitcoin Core developers do not control Bitcoin. Users, businesses, miners, etc. choose which node implementation to run. If someone does not like Bitcoin Core, Bitcoin Core has no obligation to disregard its principles of (6), bow to pressure, and make changes. Bitcoin Core follows its principles and provides software to the market. It is a free market actor. The rest of the free market's actors then choose whether to accept that software, or fork it to create their own software to run.

  8. Uneconomic lower fee-rate txs that are priced out (by economically-viable higher fee-rate txs) are not entitled to be able to be made on L1 Bitcoin with its highest censorship-resistant assurances. L2 networks like LN exist to be able to handle txs with lower economic value (txs with lower fee-rate), by allowing to amortize the cost of the L1 fee across many L2 txs. Really low economic value txs like coffee payments arguably do not need L1's censorship resistance, and if for example they are not viable with L2 either due to fees, then arguably can use different layers: semi-decentralized federated networks like Liquid, or Fedimint, etc. Or, even non-decentralized custodial LN for coffee payments.

  9. The other realistic solution besides (8) is to focus on improving Bitcoin's L1 and L2 efficiency via consensus changes like CTV, for UTXO sharing to increase fee density and thus economic value of lower-value txs (to be better able to compete with higher-value economic txs), and to use CTV to be able to improve L2s like LN and Ark and Enigma.

  10. One other possible way to deal with arbitrary data txs, at its root, is to break Ordinal Theory, iiuc. This requires a fork, such as if Bitcoin implements Confidential Transactions. This too will need consensus, but it at least offers something valuable to many, in exchange for the lost fees: privacy (by default?) on L1 Bitcoin.

I realize the truth is often unpopular, and truth is a bitter pill to swallow sometimes when it conflicts with one's ego, but nothing (not even ego) is more important than truth and reality.

@wizkid057

The previous thread was closed by Bitcoin Core, due to an obvious lack of consensus, yet for some reason this new thread, which duplicates the agenda of the previous thread, was created. I am confused. People who voiced objections in the previous thread appear largely to be ignoring this one, since their points were already made.

Every objection made both here and in the original PR was quite thoroughly and clearly debunked as not having any bearing whatsoever on the issue at hand. The post above is no different, sadly, but unfortunately I don't have the patience to continue to combat such posts. Thus far no one has made any valid arguments against anything I've posted above. In fact, it's the people like myself with completely valid arguments with logical backing that are entirely ignored in this thread and others on the topic, drowned out by noise from people like yourself bringing up completely unrelated points that have nothing to do with the core issue at hand.

Again, don't have the patience to sit here and waste time invalidating every useless post, but let's go through some key points just to prove a point that should at least show that on the whole that such arguments have no merit here.

  1. It sends a message that Bitcoin Core is being political, by trying to interfere in free market activity on the network by censoring valid fee-paying txs.

Exploiting a bug in software is not "interfering in free market activity." It's fixing an unintended exploitable issue. Nothing more. This is, again, cybersecurity 101 stuff. From a code perspective, this is no different from the create-infinite-bitcoin bug that was exploited and subsequently fixed. An exploit is an exploit. There was never consensus to allow discounted arbitrary data injection of limitless size into the blockchain. Never.

This isn't political in any way. It's just sane development practice. Someone exploits a bug, you fix the bug. You don't let the people exploiting the bug gaslight developers.

  1. The update to Bitcoin Core is very likely ineffective, just virtue signaling

Completely false. IsStandard and other such mempool filtering has clearly been quite effective over the years. Otherwise these spammers would just be using 1MB OP_RETURNs for their data. Transactions that comply with IsStandard have made up a clear supermajority over that time period.

  1. Even if (2) is resolved in a low probability scenario, the arbitrary data economy can simply and quickly stop using the public mempool relay, and switch to miners' private mempool relay.

Again, this isn't supported by history nor is it viable on any real scale.

  1. As explained by others in recent comments, the arbitrary data can be easily switched to any number of different formats, with format chosen based on intended effect.

You're basically saying that as exploits are discovered, they shouldn't be patched and fixed... which is again, opposite of common sense in development.

There's always going to be people attempting to attack software like Bitcoin. There should also always be people fixing such issues.

When developers stop caring about people exploiting, projects die. That's the path we're on here, sadly, if we allow exploits to continue.

  1. re: UTXO set, various fear-mongering claims about the UTXO set are thrown around to artificially magnify the problem with scary words like "bloat the UTXO set" and "increase RAM usage", but these claims are false or misleading. [...]

It's not so much a matter of whether people can absorb the costs of dealing with such an exploit. It's that no one agreed to be exploited. That's like saying it's OK for people to rob a store because the store can afford it. Neither situation is acceptable.

The cost of the exploit is borne by full nodes and archive nodes. The miners being bribed to include such data with a broken policy that allows the exploit to continue are not the ones bearing the cost of the exploit. The cost is ongoing, and is non-zero. No matter how much that is downplayed, it's not what the code was intended to support.

  1. Bitcoin's governance is not a democracy.

Not sure the relevance. Has nothing to do with patching an exploit.

  1. Bitcoin Core developers do not control Bitcoin. Users, businesses, miners, etc. choose which node implementation to run.

Sadly, Bitcoin Core is the de facto "standard" implementation of Bitcoin. Sure, people can and do run other implementations. But the majority blindly follow Core.

This isn't ideal, and I think it should change, but again... absolutely nothing whatsoever to do with the issue at hand.

The code has a bug that is being exploited.

If users decide to run an alternative implementation that doesn't patch the bug, that's on them. It doesn't have any bearing whatsoever on whether or not a cut and dry exploit should be fixed or not.

  1. Uneconomic lower fee-rate txs that are priced out (by economically-viable higher fee-rate txs) are not entitled to be able to be made on L1 Bitcoin with its highest censorship-resistant assurances.

Discussions on fees, again, are a non-sequitur. How much someone pays to exploit a bug makes no difference whatsoever.

If high fees are the be-all and end-all of whether or not a bug is allowed to be actively exploited, then what happens if a bug is found that creates bitcoin out of thin air, and the exploiters just pay high fees when doing so? Or somehow override the block size limitation? Or manage to exploit any number of other yet-undiscovered bugs? "Well, they're valid transactions with high fees, so we'll allow it." That's the argument being made here, and it's completely invalid. It's horrible to even suggest such a precedent be set here.

  1. The other realistic solution besides (8) is to focus on improving Bitcoin's L1 and L2 efficiency via consensus changes like CTV, for UTXO sharing to increase fee density and thus economic value of lower-value txs (to be better able to compete with higher-value economic txs), and to use CTV to be able to improve L2s like LN and Ark and Enigma.

Improving efficiency is good. Doing this and fixing an exploit are not mutually exclusive. Again, unrelated to the issue at hand.

  1. One other possible way to deal with arbitrary data txs, at its root, is to break Ordinal Theory, iiuc. This requires a fork, such as if Bitcoin implements Confidential Transactions. This too will need consensus, but it at least offers something valuable to many, in exchange for the lost fees: privacy (by default?) on L1 Bitcoin.

The entire concept of "Ordinal Theory" again, has nothing to do with the issue at hand... a recurring theme here. Something made up by people exploiting a bug in the software to build on top of exploiting that bug doesn't negate the issue or have any bearing on the issue at all.


As is painfully obvious, there's no merit at all in any of the points presented as arguments for not patching an exploited bug. Just a lot of non-sequitur info that is either completely unrelated, means the exact opposite of what the poster intended as a "gotcha" (such as the point about alternate implementations above), or is easily disproven (the IsStandard point above).

Again, I'd suggest folks read my original comments above. It doesn't matter how much someone is willing to pay to exploit a bug. That doesn't make it not a bug.

Away from the technical for a moment, it's also crucial to highlight the broader risks of not addressing this exploit. The implications extend far beyond network inefficiencies; they venture into legal and ethical territory that could jeopardize the integrity and future of Bitcoin itself. By turning a blind eye to arbitrary data schemes, we're inviting some of the very problems we seek to avoid as Bitcoin supporters.

Additionally, failing to address obvious critical issues like this exploit undermines trust in the Bitcoin ecosystem, potentially driving away contributors and users who are vital to Bitcoin's growth and evolution. Leaving an arbitrary data exploit like this unaddressed can and likely will hinder further innovation and adoption. Instead of, for example, finding sane ways for people to tie arbitrary data to the blockchain sensibly, allowing this exploit to go on unaddressed and unpatched sends the message that the community simply won't act to protect the network, and that there's no point in trying.

As I revisit the unfolding events around this issue, a deep sense of disappointment takes hold. The situation is not just disheartening, it's a stark reminder of how even the most skilled and respected developers can be swayed. Many, whom I've held in high regard for their contributions over the years, now seem entangled in defending the exploitation of a bug. This shift, possibly influenced by economic incentives or just the sheer volume of noise and relentless arguments from those profiting from the exploit, is a troubling development.

The fact that fixing this exploit is not only being ignored by, but getting active push back from the same developers, should be enough to give everyone in the Bitcoin community pause.

@1440000bytes

1440000bytes commented Jan 18, 2024

@wizkid057 I searched for the word "exploit" and found you have used it several times in your comments. Which of these opcodes is not working as implemented or documented?

If you want to add another config option for certain transactions that are considered "spam" by a few users, why not open a pull request that allows users to do it without changing defaults?

I have suggested 2 approaches. Some developers suggested their preferred approach in recent meeting: https://bitcoin-irc.chaincode.com/bitcoin-core-dev/2024-01-11#998497

@wizkid057

wizkid057 commented Jan 18, 2024

@wizkid057 I searched for the word "exploit" and found you have used it several times in your comments. Which of these opcodes is not working as implemented or documented?

You already know the answer to this, but are asking as if it's some kind of "gotcha" here. It is not.

Whether or not stuffing arbitrary data between valid opcodes is currently treated as "working" by the current code isn't the issue. There's no documented intention that a loophole be created to permit copious amounts of fee-discounted arbitrary data to be injected into a transaction. I see zero documentation in the code, nor in the links provided, that expresses the use case (OP_FALSE, OP_IF, DATA) as a valid method for storing arbitrary data.

Again, let's be clear: Cybersecurity 101 stuff here. The ability to inject arbitrary data into any system in a way that isn't explicitly documented as permitted for an expected purpose of the system is always a vulnerability and always something that should be addressed.

There is, however, one such permitted method, which is documented in both of the links you posted (OP_RETURN), with a sane limitation on its use. But again, you already knew that.
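For readers following along, the pattern under discussion looks roughly like this. A minimal Python sketch with my own simplified helpers (not consensus-accurate script parsing; single direct pushes only, as an assumption for brevity): data wrapped in an OP_FALSE OP_IF ... OP_ENDIF branch is never executed, so script validation skips it, unlike the documented OP_RETURN carrier.

```python
# Opcode byte values from the Bitcoin Script encoding.
OP_FALSE, OP_IF, OP_ENDIF, OP_RETURN = 0x00, 0x63, 0x68, 0x6A

def envelope(payload: bytes) -> bytes:
    """Build an OP_FALSE OP_IF <push payload> OP_ENDIF fragment.

    Simplification: one direct push only, so payload must fit in 75 bytes.
    """
    assert len(payload) <= 75
    return bytes([OP_FALSE, OP_IF, len(payload)]) + payload + bytes([OP_ENDIF])

def has_envelope(script: bytes) -> bool:
    """Heuristically detect the OP_FALSE OP_IF pattern anywhere in a script."""
    return bytes([OP_FALSE, OP_IF]) in script

spam = envelope(b"hello world")
print(has_envelope(spam))                              # True
print(has_envelope(bytes([OP_RETURN, 5]) + b"hello"))  # False
```

The point of the sketch: the envelope's payload sits in an unexecuted branch of an otherwise valid script, which is why opcode-level rules alone don't distinguish it from "real" script logic.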

If you want to add another config option for certain transaction that are considered "spam" by a few users, why not open a pull request that allows users to do it without changing defaults?

Should we add a config option for other bug fixes as well? Last I checked there wasn't a "disableinfinitemoneyglitch" option that was disabled by default. Bug fixes in code that bring things back to intended operation are just patched in, with no option to disable them.

The only reason anyone's even considering making this an option at all, vs just patching like it rightfully should be from a development standpoint, is because of the noise involved from folks exploiting and supporting the exploiting.

Flipping your question then: why not make fixing the exploit the default, and let anyone wanting to mine transactions that exploit the bug enable it explicitly? Fixing an exploit is the sane approach, and if others then make a case for enabling it on their nodes, that's up to them. The community as a whole shouldn't be running code that is exploitable by known methods.

I have suggested 2 approaches. Some developers suggested their preferred approach in recent meeting: https://bitcoin-irc.chaincode.com/bitcoin-core-dev/2024-01-11#998497

That meeting was completely pointless (as you may have noticed, I was "there"). There was zero opportunity to actually discuss the issue, address concerns, or find a path forward that was potentially acceptable and didn't waste developer time. It was just brushed under the rug as quickly as possible. A complete waste of time. It had the same vibes as people bringing up completely sane points at a town hall meeting, only to have the politicians in charge completely ignore the issues presented and press onward. Disgusting, IMO, especially from the development community.

@tromp

tromp commented Jan 18, 2024

The ability to inject arbitrary data into any system in a way that isn't explicitly documented as permitted for an expected purpose of the system is always a vulnerability and always something that should be addressed.

Bitcoin script is just a simple programming language with specific technical rules that all programs must follow. On top of that, there are additional technical rules that programs must follow to fall into certain categories, like IsStandard.
None of these rules can express intention or purpose, or define what counts as injection of arbitrary data. All they can do is set technical limits.

What you're really saying then, is that having such a programming language in Bitcoin, with so much entropy in its set of valid programs, is a vulnerability that should be addressed.

To force inscribers to use a space limited OP_RETURN, you must then reduce the entropy of program space, and witness program space in particular. And present convincing proof that you did so.

Of course such entropy reduction also means that tons of script developers will find the cost of implementing their programming logic in script much higher than before. And thus limit many other potential uses of Bitcoin.

@1440000bytes

Again, let's be clear: Cybersecurity 101 stuff here. The ability to inject arbitrary data into any system in a way that isn't explicitly documented as permitted for an expected purpose of the system is always a vulnerability and always something that should be addressed.

You can use nlocktime and amounts in outputs to decode it to some text. Will that need a CVE ID?

@wizkid057

Again, let's be clear: Cybersecurity 101 stuff here. The ability to inject arbitrary data into any system in a way that isn't explicitly documented as permitted for an expected purpose of the system is always a vulnerability and always something that should be addressed.

You can use nlocktime and amounts in outputs to decode it to some text. Will that need a CVE ID?

You're clearly using a bit of hyperbole here, in yet another attempt at a "gotcha" moment without addressing anything already rebutted previously. But, regardless, I suppose such an exploit could have had a CVE assigned if something like that had been exploited and/or discovered as an issue before OP_RETURN was made available as standard. If nothing were being done to address the issue, then sure, I guess it could've gotten a CVE. But the developers did act to prevent harmful data-stuffing schemes by making limited, tolerated OP_RETURN data carrying standard, which made using such methods pointless.

The scope of harm a bad actor would be capable of inflicting upon the network with such a silly encoding method is extremely limited. Many orders of magnitude less severe than the actively exploited OP_FALSE/OP_IF issue.

OP_RETURN exists and tolerates up to 640 bits (80 bytes) of arbitrary data per transaction as standard. This is the mitigation for people encoding arbitrary data in other types of fields, and it obviates any need for stashing arbitrary data in other areas that would be less efficient than just using OP_RETURN.

Storing data in nLockTime is not happening, as it's not really even exploitable: 32 bits max, and far less in practice, since the value would need to be one permitted for inclusion in a block... so maybe 27 bits or so max per transaction? Just use OP_RETURN. Cheaper and more efficient.

Storing data in amounts is convoluted, since you'd need a significant amount of bitcoin to encode anything of a useful length. Again... just use OP_RETURN. Cheaper and more efficient.

And that was the purpose of making 40 then 80 byte OP_RETURN standard. It was the exploit bandage for such inefficient and harmful data carrying schemes, and addressed the issue sufficiently.

Additionally, if someone were to use some method of storing data in nLockTime/amounts, presumably there would need to be a way to differentiate this arbitrary data from a normal transaction in order to extract the data and make it useful to the exploiter. In that case, once known, the network (and thus the developers here) should enforce sane limits on the data-carrying practice, like any other known data-carrying method, to prevent its exploitation to the best of our ability and prevent harm to the network. If that weren't happening, then sure, maybe it should get a CVE. I'd support that, despite how unlikely such a situation really is.

Again, there's no real reason to use such inefficient methods of data carrying, since OP_RETURN up to a sensible limit is available and tolerated as standard. So, that's a bit contrived and a non-issue in the real world.

Finally, keep in mind I've noted that the CVE's existence for the OP_FALSE/OP_IF exploit, nor its status have anything to do with whether or not the bug is in fact a vulnerability that's being actively exploited. It's clearly by definition a vulnerability, and even more critical to address since it's being actively exploited in the wild.
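The capacity argument above can be put in rough numbers. A toy Python encoder (the helper names are mine, invented for illustration; this is not any real wallet or library API), assuming at most 4 bytes can be smuggled per transaction via the 32-bit nLockTime field:

```python
def encode_in_nlocktime(data: bytes) -> list[int]:
    """Split data into 4-byte chunks, one per transaction's 32-bit nLockTime."""
    return [int.from_bytes(data[i:i + 4].ljust(4, b"\x00"), "little")
            for i in range(0, len(data), 4)]

def decode_from_nlocktime(locktimes: list[int]) -> bytes:
    # Note: trailing NUL bytes in the payload would be stripped too; fine for a sketch.
    return b"".join(lt.to_bytes(4, "little") for lt in locktimes).rstrip(b"\x00")

msg = b"hello world"
lts = encode_in_nlocktime(msg)
print(len(lts))  # 3 -- three whole transactions for an 11-byte message
assert decode_from_nlocktime(lts) == msg
```

Even this optimistic model (ignoring the practical limits on usable nLockTime values) needs a whole transaction per 4 bytes, whereas a single standard OP_RETURN output carries up to 80 bytes, which is the "cheaper and more efficient" point above.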

@1440000bytes

Storing data in nLockTime is not happening, as it's not really even exploitable: 32 bits max, and far less in practice, since the value would need to be one permitted for inclusion in a block... so maybe 27 bits or so max per transaction? Just use OP_RETURN. Cheaper and more efficient.

https://x.com/1440000bytes/status/1732580146203250731

https://bitcoin.stackexchange.com/questions/23792/can-someone-explain-nlocktime/

Storing data in amounts is convoluted, since you'd need a significant amount of bitcoin to encode anything of a useful length. Again... just use OP_RETURN. Cheaper and more efficient.

https://www.akamai.com/blog/security/bitcoins--blockchains--and-botnets

@wizkid057

Things you've noted are addressed by the availability of OP_RETURN.

@ktecho

ktecho commented Jan 19, 2024

Which of these opcodes is not working as implemented or documented?

It's not a single opcode that is not working as implemented or documented. It's the whole system that is not behaving as expected.

If you have an API so users of your site can upload pictures for blog posts, and you found that somebody uploaded a 35 GB Blu-ray-ripped movie, you would do something to fix the software, right?

@eragmus

eragmus commented Jan 19, 2024

Which of these opcodes is not working as implemented or documented?

It's not a single opcode that is not working as implemented or documented. It's the whole system that is not behaving as expected.

If you have an API so users of your site can upload pictures for blog posts, and you found that somebody uploaded a 35 GB Blu-ray-ripped movie, you would do something to fix the software, right?

Bitcoin is a decentralized system that operates based on market incentives; it's not a centralized system. So, that's an apples-to-oranges comparison.

@ktecho

ktecho commented Jan 19, 2024

Bitcoin is a decentralized system that operates based on market incentives; it's not a centralized system. So, that's an apples-to-oranges comparison.

That has nothing to do with what I said. It was designed to be money, and it's being exploited to be used as a general purpose database. That must be fixed.

@eragmus

eragmus commented Jan 19, 2024

Bitcoin is a decentralized system that operates based on market incentives; it's not a centralized system. So, that's an apples-to-oranges comparison.

That has nothing to do with what I said.

I responded to what you said:

If you have an API so users of your site can upload pictures for blog posts, and you found that somebody uploaded a 35 GB blueray-ripped movie, you do something to fix the software, right?

You made a comparison to support your position, and I explained how it's an apples-to-oranges comparison.

It was designed to be money, and it's being exploited to be used as a general purpose database. That must be fixed.

Also, because bitcoin is a decentralized system, it needs consensus to be changed to be "fixed", which does not exist for many reasons, as I explained in my fairly comprehensive post. There is no benevolent dictator.

@Bitcoin-Lebowski

Everything @wizkid057 has said in the thread above is in line with my thoughts. I find it amazing and deeply worrying that those with ulterior motives could so easily drive the Bitcoin discussion around this obvious exploit.

Bitcoin is a monetary network; other use cases which make it more difficult or expensive for users to use it as such are damaging to its adoption. It's common sense.

Some of the comments and behaviour I've seen from people defending the exploit are shocking to me. If we truly believe that Bitcoin is the technology we think it is, then we are simply the current custodians of it for future generations, and we have a duty to protect it from bad actors. Allowing exploits which bloat node memory and the UTXO set, attach who-knows-what data/images to the chain, etc. is nothing short of a dereliction of that duty. It's unbelievable and hugely disappointing to me that this discussion has taken the course it has.

@tromp

tromp commented Jan 26, 2024

Nobody is defending the exploit. Rather, some of us recognize that this exploit is inherent in Bitcoin's design, whose script language provides plenty of room for embedding arbitrary data. What transactions miners put in blocks is driven purely by profit motives and the fact is that inscriptions are very profitable. Declaring them "bad" doesn't change that. As long as they follow consensus rules, they are "good" to the miners.

The intentions behind this proposal are good. What I object to is the suggestion that this closes the exploit. It doesn't. It only tries to make it harder to relay inscriptions to miners. But when so much money wants to find its way to miners, and miners are so eager to accept this extra income, most of it will still find its way with this proposal enacted.

But it will have the side effect of distorting mempool views between regular nodes and miners, with negative effects on fee estimation.

As phrased, the proposal fails to acknowledge this reality and fails to address the possible negative effects. Pointing that out is not "defending the exploit".

@michaelfolkson
Contributor

The incorrect use of terms like "exploit", "bug" and "spam" is ensuring this discussion is extremely subpar. The absolute best this pull request can accomplish is to slow down the confirmation of certain transactions and ensure they are submitted directly to mining pools rather than propagated over the P2P network. So even if it was an "exploit" (which it isn't) this pull request doesn't fix the "exploit".

If it truly was an "exploit" it would need to be comprehensively fixed. The way to comprehensively fix an "exploit" would be to change the consensus rules and ensure none of these transactions were mined into blocks rather than merely slowing down the confirmation of these transactions. Whilst I personally think attempting to do that in this particular example would be foolish in the extreme that is what had to be done for an actual exploit (e.g inflation bug 2018).

"Spam" is arguably a more acceptable term than "exploit" but I wouldn't even describe it as that. Assuming the confirmation of these transactions is only being slowed down (which is what this PR is attempting to accomplish) every full node will eventually have to see and verify these transactions once they are mined. If I receive a spam email I never want to see it. If one of these transactions is mined my full node has to not only receive it and see it but verify it also. How can it be "spam" if my full node is required to see it and verify it to ensure it does full verification?

Personally I'm not sure of the history of what default and custom policy rules were used for as a blunt tool many years ago. But the reality is today submission of high fee rate, consensus (rule) valid transactions directly to mining pools bypassing the P2P network is happening and will continue to happen. The only (extremely foolish in my view) way to try to prevent this from happening is to change the consensus rules.

@GregTonoski

with negative effects on fee estimation.

What is the negative effect? Could you describe (and possibly quantify in order to put it into perspective), please?

Could you explain the mechanism or provide evidence that supports the opinion, please? Why negative and not neutral? Isn't the fee estimation flawed and conflated with the proposal, perhaps?

@GregTonoski

GregTonoski commented Jan 26, 2024

The absolute best this pull request can accomplish is to slow down the confirmation of certain transactions and ensure they are submitted directly to mining pools rather than propagated over the P2P network.

No, that's untrue. To the contrary, that pull request doesn't ensure that transactions are submitted directly to mining pools rather than propagated over the P2P network.

In my view, the discussion here is about a deficiency in the Bitcoin Core implementation: it does not support configuration of mempools (networking). Why not add more options to Bitcoin Core and let users decide about their mempools (and resource usage)? A hardcoded one-size-fits-all configuration is suboptimal, especially in the context of mempools, which are always unique (by definition).

@michaelfolkson
Contributor

@GregTonoski:

What is the negative effect? Could you describe (and possibly quantify in order to put it into perspective), please?

The point here is that if you prevent full nodes from receiving and verifying high-fee-rate, consensus-valid transactions prior to them being mined, then a full node is unaware of a growing number of such transactions until they are mined. Hence, looking at its mempool to work out the current market fee rate gives a distorted picture, because many of the high-fee-rate, consensus-valid transactions aren't in its mempool; they were submitted directly to a mining pool instead. The full node's mempool is partially blinded to which high-fee-rate, consensus-valid transactions are actually out there waiting to be mined, and hence it can't produce accurate estimates of the current market fee rate.
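This blinding effect can be illustrated with a toy model (the feerates and the helper are invented purely for illustration, not any real estimator): a node whose policy filtered out the highest-feerate "spam" transactions produces a much lower next-block feerate estimate than a node that saw everything the miner saw.

```python
import statistics

all_feerates = [200, 150, 120, 90, 60, 40, 25, 10]   # sat/vB of txs awaiting confirmation
filtered     = [f for f in all_feerates if f < 100]  # node's policy dropped the "spam"

def next_block_estimate(mempool: list[int], block_txs: int = 4) -> float:
    """Naive estimate: median feerate of the top block_txs candidates."""
    top = sorted(mempool, reverse=True)[:block_txs]
    return statistics.median(top)

print(next_block_estimate(all_feerates))  # 135.0
print(next_block_estimate(filtered))      # 50.0
```

The filtering node would badly underquote the fee needed to make it into the next block, because the transactions actually competing for that block never entered its mempool.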

@GregTonoski

GregTonoski commented Jan 26, 2024

@GregTonoski:

What is the negative effect? Could you describe (and possibly quantify in order to put it into perspective), please?

(...) Hence looking at their mempool and trying to work out what the current market fee rate is distorted because many of the high fee rate, consensus (rule) valid transactions aren't in their mempool (...). The full node's mempool is being partially blinded to what high fee rate, consensus (rule) valid transactions are actually out there waiting to be mined and hence can't do accurate estimates of the current market fee rate.

Then I don't see a problem with impact on the fee estimation implementation since each mempool is different/unique anyway. (I turned off my mempool because it was full of high-fee spam).

@michaelfolkson
Contributor

@GregTonoski: So what are you using to do fee estimation and assessing what fee you need to pay to get your transaction into an upcoming block? I assume you are using someone else's mempool (e.g. whatever block explorer you use). This is fine, you are free to do this and many do. But other people would like the option of assessing fees using the mempool of their own full node and not relying on a centralized third party's mempool.

@tromp

This comment was marked as off-topic.

@GregTonoski

GregTonoski commented Jan 26, 2024

@GregTonoski: So what are you using to do fee estimation and assessing what fee you need to pay to get your transaction into an upcoming block? (...)

I use the data from a few recent blocks to find out median of fees of transactions that were included. I also consider broadcasting a duplicated transaction with higher fee if the first one doesn't get in a block for too long.

Besides, nobody uses fee estimation by Bitcoin Core in my circles.
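The approach described above (median fee of transactions included in a few recent blocks) can be sketched in a few lines of Python. The block data here is made up for illustration:

```python
import statistics

# Feerates (sat/vB) of transactions confirmed in each of the last few blocks
# (toy numbers, not real chain data).
recent_blocks = [
    [180, 95, 60, 44, 30],
    [150, 110, 72, 38, 28],
    [210, 130, 66, 50, 25],
]

all_confirmed = [f for block in recent_blocks for f in block]
estimate = statistics.median(all_confirmed)
print(estimate)  # 66
```

Unlike a mempool-based estimator, this only uses what was actually mined, which is why (as noted below) it implicitly includes transactions that were submitted to miners directly.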

@michaelfolkson
Contributor

I use the data from a few recent blocks to find out median of fees of transactions that were included. I also consider broadcasting a duplicated transaction with higher fee if the first one doesn't get in a block for too long.

Besides, nobody uses fee estimation by Bitcoin Core in my circles.

Ok, fair enough. That is a criticism of the Bitcoin Core wallet's fee estimation. But the "data from a few recent blocks" you use for fee estimation includes transactions that were submitted to the miner directly, and transactions like those that drove you to turn off your mempool. So you aren't ignoring these transactions for fee estimation: you accept that they made it into a mined block, and hence they factor into your own fee estimate. Someone who is using their mempool for fee estimation likewise doesn't want to exclude them.

@GregTonoski

Let's not divert from the subject. Fee estimation techniques and tradeoffs are beside the point.

@michaelfolkson
Contributor

Let's not divert from the subject. Fee estimation techniques and tradeoffs are beside the point.

Just pointing out your inconsistency @GregTonoski. If you don't want people to have some of these high fee rate transactions in their mempools so they can do better fee estimation, then you should also filter these transactions out of the mined blocks in your own fee estimation data. If they are to keep their heads in the sand about the fact that some of these transactions will be mined, you should keep your head in the sand about the fact that they ended up being mined.

@bitcoin bitcoin deleted a comment from joeyvee1986 Jan 26, 2024
@ArmchairCryptologist

Hardcoded one-size-fits-all configuration is suboptimal, especially in context of mempools which are always unique (by definition).

If you do not run a mining node, you do in fact want your mempool configuration to be as compatible with the miner majority as possible if you want it to be "optimal", both from the perspective of your node and from the network as a whole. In most cases, this means using the defaults, but possibly with a larger size.

To see why, let's assume that you configure your local node to discard transactions based on your individual preferences. This means that whenever a block is mined, any mined transactions that you previously discarded will inevitably be missing from your node's mempool. As other people have mentioned, outside of degrading your node's ability to predict what the next block will be, this makes compact blocks less efficient, since mined transactions your node has no prior knowledge of will need to be downloaded from another node before the block can be validated. In other words, your node has to download and validate the filtered transaction twice - once when it was originally broadcast, and again when the block was mined. This slows down block validation, and adds overhead both on your own node and the node you (re-)fetched the transaction from.

The idea of filtering spam transactions may be laudable, but unfortunately, mempool rules that aren't incentive compatible do not really work in general. Some are necessary to prevent network-level DoS, like the RBF replacement rules, but even if mempool policy prevents a transaction from being accepted and/or broadcast by default, this doesn't stop a miner from accepting it directly. And many/most of them will if it increases their mining profits. Which means that playing whack-a-mole with data storage schemes like ordinals on the mempool policy level does not work in practice, and will have no real effect outside of making your node perform worse in various ways.

@luke-jr
Member Author

luke-jr commented Feb 14, 2024

It is intentional that compact blocks become less efficient the more miners deviate from the network norms. That's a reason for miners to comply with nodes, not nodes to comply with miners.

@1440000bytes

Are Ocean contributors (miners) complying with the nodes?

@GregTonoski

GregTonoski commented Feb 18, 2024

Common Weakness Enumerations (non-exhaustive list) regarding the Bitcoin Script implementation in Bitcoin Core:
CWE-561: Dead Code
CWE-570: Expression is Always False
CWE-1164: Irrelevant Code
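For context on how those CWEs apply: the envelope pattern described in this issue wraps data in an `OP_FALSE OP_IF ... OP_ENDIF` branch whose condition is always false, so the wrapped data is never executed (dead code). A simplified byte-level detector, for illustration only (the opcode values are from the Bitcoin Script opcode table; the naive scan is a sketch, not Bitcoin Core code):

```python
# Bitcoin Script opcode values (per the standard opcode table).
OP_FALSE, OP_IF, OP_ENDIF = 0x00, 0x63, 0x68

def has_false_if_envelope(script: bytes) -> bool:
    """Return True if the raw script bytes contain an
    `OP_FALSE OP_IF ... OP_ENDIF` sequence. The branch body is dead
    code because the condition is always false -- the pattern the
    CWE entries above describe.

    Naive byte scan for illustration: it does not parse push opcodes,
    so pushed data containing the bytes 0x00 0x63 could false-positive.
    """
    start = script.find(bytes([OP_FALSE, OP_IF]))
    return start != -1 and script.find(bytes([OP_ENDIF]), start + 2) != -1

# "ord"-style envelope: OP_FALSE OP_IF <push of 3 bytes "ord"> OP_ENDIF
envelope = bytes([OP_FALSE, OP_IF, 0x03]) + b"ord" + bytes([OP_ENDIF])
print(has_false_if_envelope(envelope))        # True
print(has_false_if_envelope(bytes([OP_IF])))  # False
```

A production filter would need a real script parser that walks push opcodes, which is roughly what the "Ordisrespector"-style patches mentioned at the top of this issue do.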

@ben-arnao

ben-arnao commented Mar 13, 2024

I was under the impression this change might (however unlikely) censor legitimate transactions in order to fight inscriptions. I never want to risk legitimate txs being censored, but I don't think that concern holds here, as no legitimate transaction seems to need to exceed the limit. That being said, I think the original approach may be correct. As for the original author's intentions: I do think they were more than likely in line with what this change is trying to resolve, and even if they weren't, does that matter?

If we're considering not making this change, then the more fundamental question is whether the Bitcoin blockchain should be used for inscriptions or for transacting. If it always boils down to miners doing whatever is most profitable, why have the original limit at all? That said, I do think miners would choose to support the store-of-value/transacting use case, as it will make the coins they mine worth more in the long run.
