Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a size limit for tx_extra in tx pool #8733

Merged
merged 1 commit into from Mar 18, 2023

Conversation

tevador
Copy link
Contributor

@tevador tevador commented Feb 5, 2023

Related to #6668

This patch limits the maximum size of tx_extra that's allowed in the transaction pool. Mined transactions bypass this check, which means this is a non-consensus change.

I set the limit to 1060 bytes. A typical transaction with 16 outputs needs up to 547 bytes for the transaction public keys, which still leaves 32 bytes of space for some custom data to send to each recipient.

Note that Bitcoin also has a limit for arbitrary data payloads that defaults to a much lower value of just 83 bytes per transaction.

The purpose of this change is to make it harder to "upload" large amounts of unprunable data onto the blockchain. Here is an example of a transaction with a tx_extra size of 126 KB containing a PDF file. The limit of ~1 KB should be sufficient for useful content such as the metadata of decentralized exchanges.

Ping @kayabaNerve

@jeffro256
Copy link
Contributor

I am generally against adding relay rules to tackle the tx_extra issue, because it will cause hard-to-debug issues for people trying to broadcast transactions that are otherwise compliant, especially if they are not using the reference code.

However, if a soft limit were to be implemented, this value should hard coded and NOT easily adjustable by the user (unless they want to compile their own node). Their should also be an error/warning for trying to construct a transaction with tx_extra being too large, so a user can know if a transaction won't be relayed before constructing and broadcasting the transaction.

@tevador
Copy link
Contributor Author

tevador commented Feb 6, 2023

it will cause hard-to-debug issues for people trying to broadcast transactions that are otherwise compliant

The daemon RPC call will return an error message "tx-extra too big", which should make it obvious.

this value should hard coded and NOT easily adjustable by the user

The adjustability is mostly for miners if they wanted to mine their own non-standard tx. But I see your point, since we also don't have an option to override the default minimum fee.

@kayabaNerve
Copy link

kayabaNerve commented Feb 6, 2023

Concept ACK. For a 16-out TX, I calculate 557 bytes off-hand (due to the mandatory payment ID). Accordingly, I'd actually suggest the limit of just 1024 bytes.

The methodology for deciding a limit here is sane if you believe in short memos to recipients. I don't believe anyone is currently using Monero that way, instead focusing on payloads. While yes, we may want to leave the door open to people using Monero that way, I don't believe Monero should be a messaging layer except for brief messages. 16 messages, in one TX so effectively one large message with 16 parts, isn't brief.

1024 would leave ~465 bytes of usable message space (post-TLV), or two tweets. While this is still rather large, I provide a few use cases:

  1. 16 messages with ~30 characters per recipient ("Jim Jones, payroll 1/1/2023", "Happy birthday!")

  2. Including a JAMTIS address in a TX. With a binary encoding, I believe it's 123 bytes? Yet with base32, it has 196 bytes. That only leaves 260 bytes usable, or the size of a single tweet.

  3. To be a degenerate, Uniswap v3 contract calls. If we want to discuss Monero, and having the ability to access other networks from it and to it, which is an extension of classical discussions on achieving integration, we should acknowledge the ecosystem Ethereum is, along with its tokens (notably stablecoins). For someone to send Monero, attempting to trigger a Uniswap v3 action (say, to acquire DAI), they'll have to encode ~260 bytes of Ethereum calldata.

If these use case don't sound desired, then shrink it further (768 bytes?). If these use cases sound tolerable, 1024 makes sense. If this sounds a bit limiting, increase the usable space to 512 bytes (so each recipient could get the hash of a message/you could do a JAMTIS address and a Uniswap v3 call). That'd be 1071 byte limit, I believe (I may be off by a byte or two).

Also, please note I appreciate tevador acknowledging DEX metadata as a usecase, and I'd like to caveat I would explicitly push for 512 byes at a minimum if it wasn't for the fact that I expect DEX payloads to be large coming in, not going out, and coming in they're generally just 2-out TXs. The lack of so many additionally public keys frees up hundreds of bytes, creating a usable amount >512. This is not specific to Serai and should also be true for other similar projects.

Atomic swaps could technically feasibly use on-chain data thanks to the COPZ discrete log equality proof. It'd be ~1500 bytes, from my understanding (for just a 128-bit private key, double it for a full bit option). If we also want to leave that use case on the table, then a slightly higher limit of (1 + 32 + 1 + 1 + (16 * 32) + 1 + 1 + 8 + (9 * (1 + 2)) + 2048) = 2632 would be my suggestion (if my length calculation is correct for a standard TX extra with 2048 bytes added in <= 255 byte chunks). While we may not want to explicitly support that use case, we can acknowledge how a larger size enables these options and accordingly want a larger size.

I'd rather have this limit now and further refine later than endlessly discuss it though. At the very least, this concept as proposed should be merged (I have not reviewed the impl).

TL;DR 2632 for atomic swaps without an additional communication channel (which is idiotic as it removes the privacy benefit but eh, has its benefits). 1024 to be even more concise yet still tolerable to most protocols. At least do this as proposed.

@jtgrassie
Copy link
Contributor

this value should hard coded and NOT easily adjustable by the user

I tend to agree with this, as it's not obvious to me why even a miner should/would need to override.

16 messages with ~30 characters per recipient ("Jim Jones, payroll 1/1/2023", "Happy birthday!")

The decision was already made wrt payment IDs to do away with custom non-encrypted messages in favor of 8 byte encrypted payment IDs, so I don't think we need revisit this use case.

1024 to be even more concise yet still tolerable to most protocols. At least do this as proposed.

1024 should suffice, but if others want to err on the side of caution, 2150 as proposed should be more than adequate.

@kayabaNerve
Copy link

The decision was already made wrt payment IDs to do away with custom non-encrypted messages in favor of 8 byte encrypted payment IDs, so I don't think we need revisit this use case.

Somewhat agree, largely disagree with this specific point. Not only are payment IDs only usable for a known context, which jumps past the issue of context establishment, there's only one encrypted payment ID per TX. While no, I don't think people should be placing "payroll" nor "happy birthday" on chain, encrypted payment IDs aren't a complete solution.

@jtgrassie
Copy link
Contributor

While no, I don't think people should be placing "payroll" nor "happy birthday" on chain, encrypted payment IDs aren't a complete solution.

Yet you are advocating for "~30 characters per recipient" (for a solution to an undefined use case).

The encrypted payment IDs are a solution to a specific use case, and one that removed the ability to easily embed custom data. I'd prefer we limit to known use cases and adjust in the future (as and when compelling use cases are presented).

@jeffro256
Copy link
Contributor

jeffro256 commented Feb 6, 2023

The daemon RPC call will return an error message "tx-extra too big", which should make it obvious.

One problem:This PR doesn't return how big tx_extra is allowed to be, so what should the wallet do in this scenario? Start binary searching how big it can make it? However, that could be fixed by making the RPC interface more expressive or making the limit constant and hardcoded

@tevador tevador changed the title Add a configurable size limit for tx_extra in tx pool Add a size limit for tx_extra in tx pool Feb 6, 2023
@tevador
Copy link
Contributor Author

tevador commented Feb 6, 2023

Thanks for the feedback. Changes:

  • Removed the command line option. The limit is now a compile-time constant.
  • Reduced the limit from 2150 bytes to 1060 bytes.

For a 16-out TX, I calculate 557 bytes off-hand (due to the mandatory payment ID)

The dummy payment ID is added only for 2-output transactions.

// we don't add one if we've got more than the usual 1 destination plus change
if (destinations.size() > 2)
add_dummy_payment_id = false;

The current limit of 1060 bytes equals 547 + 1 + 512, so exactly enough space for 16 outputs, a 1-byte tag and 32 bytes for each recipient.

@SamsungGalaxyPlayer
Copy link
Collaborator

@jtgrassie at the minimum, it's somewhat reasonable to allow for a refund address to be encoded in a transaction. Ideally this can be communicated out of band ofc.

I agree with the proposed limit. It won't prevent everything, but it'll prevent most forms of drive-by, low-effort malice.

@kayabaNerve
Copy link

The current limit of 1060 bytes equals 547 + 1 + 512, so exactly enough space for 16 outputs, a 1-byte tag and 32 bytes for each recipient.

Slight caveat. Will Monero fail to scan such a TX?

bool parse_tx_extra(const std::vector<uint8_t>& tx_extra, std::vector<tx_extra_field>& tx_extra_fields)
{
tx_extra_fields.clear();
if(tx_extra.empty())
return true;
binary_archive<false> ar{epee::to_span(tx_extra)};
do
{
tx_extra_field field;
bool r = ::do_serialize(ar, field);
CHECK_AND_NO_ASSERT_MES_L1(r, false, "failed to deserialize extra field. extra = " << string_tools::buff_to_hex_nodelimer(std::string(reinterpret_cast<const char*>(tx_extra.data()), tx_extra.size())));
tx_extra_fields.push_back(field);
} while (!ar.eof());
CHECK_AND_NO_ASSERT_MES_L1(::serialization::check_stream_state(ar), false, "failed to deserialize extra field. extra = " << string_tools::buff_to_hex_nodelimer(std::string(reinterpret_cast<const char*>(tx_extra.data()), tx_extra.size())));
return true;
}

I'm pretty sure it requires intelligibility of extra. That'd mean it'd need 3 NONCE fields, each with <= 255 bytes. That's 8 bytes of TL data (tag, 224, tag, 224, tag, 32), not 1. Sorry if I'm misreading this.

@jtgrassie
Copy link
Contributor

@kayabaNerve

Slight caveat. Will Monero fail to scan such a TX? ... I'm pretty sure it requires intelligibility of extra. That'd mean it'd need 3 NONCE fields, each with <= 255 bytes. That's 8 bytes of TL data (tag, 224, tag, 224, tag, 32), not 1. Sorry if I'm misreading this.

There is no enforced structure to the data in tx extra.

@SamsungGalaxyPlayer

at the minimum, it's somewhat reasonable to allow for a refund address to be encoded in a transaction

1060 (or 1024) bytes still allows plenty of room for such a feature.

@kayabaNerve
Copy link

@jtgrassie I said fail to scan. I understand monerod will accept and mine the TX. I'm unsure the 16 recipients will realize they received funds though under the current PR and one proposed usage.

@tevador
Copy link
Contributor Author

tevador commented Feb 6, 2023

@kayabaNerve You may be right, but fixing the tx_extra parser is out of scope of this PR. I don't see why the whole tx would need to be skipped if some unknown data are found after the tx public keys.

@jtgrassie
Copy link
Contributor

jtgrassie commented Feb 6, 2023

I said fail to scan. I understand monerod will accept and mine the TX. I'm unsure the 16 recipients will realize they received funds though under the current PR and one proposed usage.

This patch doesn't prevent scanning. It prevents txs being placed in the tx pool if they have too big extra.

@kayabaNerve
Copy link

kayabaNerve commented Feb 7, 2023

@tevador My comment was if you use 8 bytes of tag data, Monero won't reject it. You calculated a number using only 1 byte of tag data. I was saying you may want to 'correct' it to +7. Else, it's actually just 505 bytes of usable data. I wasn't calling for fixing the extra parser which I question if will ever be feasible.

@tevador
Copy link
Contributor Author

tevador commented Feb 7, 2023

it's actually just 505 bytes of usable data

Yes, with a buggy parser, it turns out to be just 505 bytes. I don't see it as a reason to increase the size limit.

fixing the extra parser which I question if will ever be feasible

Why wouldn't it be feasible? Any 3rd party wallet can fix it. The official wallet code should also get fixed eventually.

@kayabaNerve
Copy link

👍 to being 505.

I accused the extra parser, in Monero specifically, as it's a low priority topic which is always contentious. There's also no good way to handle unrecognized tags (immediately returning, I guess?). But a debate/PR for another day. Sorry if this sidetracked this PR too much.

@jeffro256
Copy link
Contributor

jeffro256 commented Feb 26, 2023

Sorry I don't know how to do PR code changes suggestions in GH for files outside the PR tevador#1. This should go alongside this PR if it gets merged

@tevador
Copy link
Contributor Author

tevador commented Feb 26, 2023

@jeffro256 Thanks, added.

@jtgrassie
Copy link
Contributor

This patch is really needed now that someone has released something to (ab)use the unconstrained size of the field.

Copy link
Collaborator

@SamsungGalaxyPlayer SamsungGalaxyPlayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a sensible limitation. It doesn't completely prevent people from using tx_extra for any purpose, but it limits one's ability to include these transactions without direct miner cooperation. This should be the start of the response to tx_extra, not the end. We should set a consensus limit of 1060 bytes later as well.

Let's not avoid putting up any defenses while are still discussing the extent of the defenses we all want. 1060 bytes is an uncontroversial number, let's enforce it today and enforce other sensible limits later after those discussions have reached consensus.

@plowsof
Copy link
Contributor

plowsof commented Mar 17, 2023

With this PR. running the monero-wallet-cli from https://github.com/mooonero/mordinals with --do-not-relay to save the raw transaction file and then trying to sendrawtransaction i get this response:

{
  "credits": 0,
  "double_spend": false,
  "fee_too_low": false,
  "invalid_input": false,
  "invalid_output": false,
  "low_mixin": false,
  "not_relayed": false,
  "overspend": false,
  "reason": "",
  "sanity_check_failed": false,
  "status": "Failed",
  "too_big": false,
  "too_few_outputs": false,
  "top_hash": "",
  "tx_extra_too_big": true,
  "untrusted": false
}

the morbinal has been denied

@luigi1111 luigi1111 merged commit 76dd14d into monero-project:master Mar 18, 2023
@Gonbatfire
Copy link

Gonbatfire commented Mar 24, 2023

I understand that no one wants bloat on a digital cash protocol, but can @tevador, @SamsungGalaxyPlayer or someone else explain to me how does using 50 transactions to inscribe 50KB image do LESS(?) harm to the privacy of the protocol than just 1 transaction?

Are we still prioritizing transaction privacy? Please point out what I'm not seeing.

Edit: I despise non-digital cash txs as much as the next guy, but I also want to make sure we are not losing sight of the privacy aspect when attempting to suppress these things

@adapt-L
Copy link

adapt-L commented Mar 25, 2023

@Gonbatfire Your logic doesn't necessarily hold. You might as well say that a limit of 50KB is harming privacy because someone who wants to upload a 50MB image would have to use a thousand transactions, so we should actually raise the limit. What will probably happen is that they will be uploading 1KB files instead of 50KB files. Most of these images are less than a kilobyte anyways...

I agree that it doesn't completely prevent an adversarial use of tx_extra spam, but it does discourage the behavior and limits them to a kilobyte unless they want to redesign their protocol to handle files split across multiple transactions. Instead of uploading a massive file, now they will just post a Bittorrent or IPFS link or file hash, and distribute the whole file out-of-band.

I am not an expert, but my intuition is that it makes privacy slightly worse, even if these "fake NFT transactions" are identified and not used in decoy selection, because they compete with transactions that could be used as decoys. But I would suspect that the net effect is that less space will be used by these things because users will generally be uploading of 1kb pixel art instead of 50kb photographs.

@Gonbatfire
Copy link

Gonbatfire commented Mar 25, 2023

Instead of uploading a massive file, now they will just post a Bittorrent

@adapt-L Completely disagree, like you said it's trivial to update their protocol to just use multiple transactions, but most importantly: storing the data off-chain misses the point, the "value" is doing it all on-chain

At most what this change did is slightly raise the fee cost to store arbitrary data, which I don't oppose, but it also comes with the tradeoff that each "upload" will harm the network's privacy even more.

What I would like to make clear is that this is fundamentally different from any other flood attack, because there is now a financial incentive playing a role, that changes everything. (tagging @kayabaNerve)

For example 51% attacks are not a preoccupation because it generally goes against the financial incentive of miners to execute one, but if miners were to be financially incentivized to do 51% attacks (game theory is broken) then we would have a really bad time. Right now there is a group users who are financially incentivized to flood the chain, not good.

Recently a new ATH of $257,460 (9.28 BTC) was spent in fees (single day) to inscribe ordinals in Bitcoin, this is something that may as well catch on in Monero eventually, we should strive to minimize their privacy impact as much as possible for when that happens.

@adapt-L
Copy link

adapt-L commented Mar 25, 2023

@Gonbatfire

the "value" is doing it all on-chain

Is it? I think this is a common misconception. I would argue that all the value is supposed to come from the issuer's signature, and the assumption that they (or others) will respect (care about) the metadata associated with the enote in the future. See projects like bitcoin's colored coins project, which existed way before projects like etherium's NFTs or bitcoin ordinals which standardized uploading tons of unnecessary metadata to the ledger. For example, if a vendor attaches a coupon to an enote (as described in this video: https://www.youtube.com/watch?v=889JSfIaPzs), the value of the colored coin rests on the assumption that they will accept the coupon at a later date. Or if some game developer like CS:GO uses colored coins to represent skins or something, the value of the skin is dependent on the authority issuing the colored coin not changing their game's client code to disrespect the metadata in the going forwards. It's the requirement of an issuer's signature which makes them scarce (if not valuable, like a autograph), because anyone else can easily create an output with similar metadata.

Whether or not others will respect the metadata is the chief issue of NFTs, (if not cryptocurrency in general). You can goad other nodes into synchronizing the metadata by including it in OP_RETURNs or tx_extra's, but you can not force them to respect or use the metadata whatsoever. In monero, a good portion if its value comes from their fungibility, their lack of metadata (a.k.a. de-anonymization heuristics).

A similar system of colored coins could be easily instituted in monero, where the "minter" simply signs a certificate file that points to one of their outputs, and you use a log of tx proofs to indicate ownership.
In fact, I would even be willing to help implement such a system if it would make these "mordinals" people stop spamming the chain with their image files! This would not only save on-chain data, but it would also improve privacy to the extent that these certs and tx_proofs are shared privately. Perhaps you can have a sidechain that shares valid tx_proofs and treats them like blackball transactions. However, it would not allow for rarity on the basis of the "order of inscriptions" (e.g. you could just make a certificate that points to an old enote). It may even be possible to create new type of cryptographic proof that replaces the tx_proofs in this situation with something that improves fungibility.

Additionally, you could just make it so that wallets only mix outputs with the same tx_extra data. That's a way of implementing colored coins that takes advantage of monero's existing privacy features! For example, some issuer could create 30 colored coins that all have identical metadata, and they all use the same pool of outputs for decoy selection, so they are fungible amongst each other insofar as the other colored coins owners don't disclose their ownership.

At most what this change did is slightly raise the fee cost to store arbitrary data, which I don't oppose, but it also comes with the tradeoff that each "upload" will harm the network's privacy even more.

What I would like to make clear is that this is fundamentally different from any other flood attack, because there is now a financial incentive playing a role, that changes everything. (tagging @kayabaNerve)

It doesn't hurt privacy by output spamming, because the fact that they are not real transactions is transparent/public. Take whatever public stenographic mechanism they use to plant files on monero, and you have a mechanism to detect the fake transactions and add them to a blackball list. This is what the "mordinals" creator encourages and suggests. The problem is that these files will displace real transactions that would otherwise increase the anonset of monero. The solution is to encourage them not to waste so much space! I think the best way to do that is a combination of increasing the tx_extra fee and capping tx_extra at some reasonable limit to encourage them to use smaller files (perhaps use some sort of membership proofs to discourage output spamming? there was some debate about this on IRC)

For example 51% attacks are not a preoccupation because it generally goes against the financial incentive of miners to execute one, but if miners were to be financially incentivized to do 51% attacks (game theory is broken) then we would have a really bad time. Right now there is a group users who are financially incentivized to flood the chain, not good.

That's predicated on the notion that their "inscriptions" will be valuable in the future, which is debatable. A more accurate analogy would be if it were uncertain if a 51% attack would be profitable or not.

Recently a new ATH of $257,460 (9.28 BTC) was spent in fees (single day) to inscribe ordinals in Bitcoin, this is something that may as well catch on in Monero eventually, we should strive to minimize their privacy impact as much as possible for when that happens.

agreed. But if it helps monero gain some publicity it may actually improve the anonset of monero in the long run by attracting new users. It's not the greatest publicity though, as it encourages use amongst NFT grifters. It's also important to notice that monero does not have the same blocksize restrictions as bitcoin, so these types of comparisons are apples-and-oranges in a sense.

@adapt-L
Copy link

adapt-L commented Mar 25, 2023

Sorry for the long post, all.

@kayabaNerve
Copy link

If people want to spam Monero with garbage, that's their perogative. Monero's job is to do its best to be a currency. Not to cater to garbage. This PR disincentives garbage, yet does move people from one type of garbage to a type of garbage arguably worse for privacy. I'm not mad at Monero for disincentivizing large file uploads in a TX by not accepting such TXs. I'm mad at idiots putting large files onto Monero. Use Bitcoin. Use Filecoin. Use torrents. Use IPFS. There's no advantage to the Monero protocol for this, at all.

I'd also point out, as prior pointed out, that by gonbatfire's interpretation the opposite of this patch would increase privacy, so we should adopt a 1 GB TX size limit so anyone uploading 1 GB files doesn't damage privacy by splitting across 10,000 TXs with a minimum of 20k outputs. I'd say don't put 1 GB files on Monero. Apparently some people may say differently.

@Gonbatfire
Copy link

Gonbatfire commented Mar 28, 2023

@kayabaNerve I actually do think that it serves no point to limit tx_extra in any way (minus the blocksize limit of course), if we don't want people putting 1GB files on Monero, then fees should reflect that (and thankfully they do, it's expensive to put 1gb on-chain)

I would prioritize privacy over scalability, this patch has done the opposite, I just wanted to point it out.

I suppose it isn't useful to continue this discussion until an actual dedicated attacker comes, so we will deal with it when time comes I suppose. Maybe by the time arbitrary data storage catches on we will already have full membership proofs, hopefully.

@plowsof
Copy link
Contributor

plowsof commented Mar 28, 2023

Following up to my earlier test. an image of 985 bytes was broadcast / mined on a node running this patch. my understanding of this patch is wrong, i assumed it would limit sending an image of 32 bytes, or, 255 bytes in 1 tx if you split the image over all outputs. Although small / low quality - a 985 byte image is still 'something to work with'.
https://testnet.xmrchain.net/tx/8bae4c8163824a566556ed8914f44812768854465c48cae8d4377303eccda63a

@kayabaNerve
Copy link

It limits the entire TX extra field to 1060 bytes. If only 75 bytes were used by Monero, that leaves 985 bytes free. The entire field is so limited due to how 16-output TXs will use ~550 bytes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants