Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a getutxos command to the p2p protocol #4351

Merged
merged 1 commit into from Aug 25, 2014
Merged

Conversation

@mikehearn
Copy link
Contributor

mikehearn commented Jun 16, 2014

Introduction

The getutxo command allows querying of the UTXO set given a set of of outpoints. It has a simple implementation and the results are not authenticated in any way. Despite this, there are times when it is a useful capability to have. I believe @jgarzik also has a use case for this, though I don't know what it is.

As a motivating example I present Lighthouse, an app I'm writing that implements assurance contracts:

http://blog.vinumeris.com/2014/05/17/lighthouse/

Lighthouse works by collecting pledges, which contain an invalid transaction signed with SIGHASH_ANYONECANPAY. Once sufficient pledges are collected to make the combination valid, we say the contract is complete and it can be broadcast onto the network, spending the pledged outputs. Before that occurs however, a pledge can be revoked and the pledged money redeemed by double spending the pledged output. For instance you might want to do this if it becomes clear not enough people care about the assurance contract for it to reach its goal in a timely manner, or if you simply need the money back due to some kind of cashflow crunch.

It is convenient to be able to see when a pledge has been revoked, so the user interface can be updated, and so when the final contract is created revoked pledges can be left out. For this purpose "getutxos" is used.

Protocol

The getutxos message takes a boolean which controls whether outputs in the memory pool are considered, and a vector of COutPoint structures. It returns a bitmap with the same number of bits as outputs specified rounded up to the nearest 8 bits, and then a list of CTxOut structures, one for each set bit in the bitmap. The bitmap encodes whether the UTXO was found (i.e. is indeed unspent).

Authentication

The results of getutxos is not authenticated. This is because the obvious way to do this requires the work maaku has been doing on UTXO commitments to be merged, activated by default, miners to upgrade and a forking change made to enforce their accuracy. All this is a big project that may or may not ever come to fruition.

For the Lighthouse security model however, this doesn't matter much. The reason is that the pledge transactions you're getting (which may be malicious) don't come from the P2P network. They come in the form of files either from a simple rendezvous server, or e.g. a shared folder or email attachments. The people sending these files have no way to influence the choice of peers made by the app. Once the outputs are returned, they are used to check the signatures on the pledge, thus verifying that the pledge spends the UTXO returned by the P2P network.

So we can be attacked in the following ways:

  • The pledge may be attempting to pledge non-existent outputs, but this will be detected if the majority of peers are honest.
  • The peers may be malicious and return a wrong or bogus output, but this will be detected when the signature is checked, except for the value (!) but we want to fix this by including the value into the sighash at some point anyway because we need it to make the TREZOR efficient/faster.
  • The peers may bogusly claim no such UTXO exists when it does, but this would result in the pledge being seen as invalid. When the project creator asks the pledgor why they revoked their money, and learns that in fact they never did, the bogus peers will be detected. No money is at risk as the pledges cannot be spent.
  • If the pledgor AND all the peers collaborate (i.e. the pledgor controls your internet connection) then they can make you believe you have a valid pledge when you don't. This would result in the app getting "jammed" and attempting to close an uncloseable contract. No money is at risk and the user will eventually wonder why their contract is not confirming. Once they get to a working internet connection the subterfuge will be discovered.

There is a final issue: the answer to getutxos can of course change the instant the result is generated, thus leading you to construct an uncloseable transaction if the process of revocation races with the construction. The app can detect this by watching for either a reject message, or an inv requesting one of the inputs that is supposed to be in the UTXO set (i.e. the remote peer thinks it's an orphan). This can then cause the app to re-request the UTXO set and drop the raced pledge.

In practice I do not anticipate such attacks are likely to occur, as they're complicated to pull off and it's not obvious what the gain is.

There may be other apps that wish to use getutxos, with different security models. They may find this useful despite the lack of UTXO commitments, and the fact that the answer can change a moment later, if:

  • They are connecting to a trusted peer, i.e. localhost.
  • They trust their internet connection and peer selection, i.e. because they don't believe their datacenter or ISP will commit financial fraud against them, or they are tunnelling via endpoints they trust like a randomly chosen Tor exit.
  • They are not using the response for anything important or worth attacking, like some kind of visualisation.

Upgrade

If enforced UTXO commitments are added to the block chain in future, it would make sense to rev the P2P protocol to add the proofs (merkle branches) to the response.

Testing

I attempted to write unit tests for this, but Core has no infrastructure for building test chains .... the miner_tests.cpp code does it but at the cost of not allowing any other unit test to do so, as it doesn't reset or clean up the global state afterwards! I tried to fix this and ended up down a giant rabbit hole.

So instead I've tested it with a local test app I wrote, which also exercises the client side part in bitcoinj.

BIP

If the code is ACKd then I will write a short BIP explaining the new message.

Philosophy

On IRC I have discussed this patch a little bit before. One objection that was raised is that we shouldn't add anything to the P2P protocol unless it's unattackable, because otherwise it's a sharp knife that people might use to cut themselves.

I personally disagree with this notion for the following reasons.

Firstly, many parts of the P2P protocol are not completely unattackable: malicious remote nodes can withhold broadcast transactions, invent fictional ones (you'd think they're orphans), miss out Bloom filter responses, send addr messages for IP's that were never announced, etc. We shouldn't hold new messages to a standard existing messages don't meet.

Secondly, even with UTXO commitments in the block chain, given the sick state of mining this only requires a collaboration of two people to undo, although that failure would be publicly detectable which is at least something. But anyway, there's a clean upgrade path if/when UTXO authentication becomes available.

Thirdly, we have a valid use case that's actually implemented. This isn't some academic pie in the sky project.

Finally, Bitcoin is already the sharpest knife imaginable. I don't think we should start rejecting useful features on the grounds that someone else might screw up with them.

If the above analysis ends up not holding for some reason, and people do get attacked due to the lack of authentication, then Lighthouse and other apps can always fall back to connecting to trusted nodes (perhaps over SSL). But I would like to optimistically assume success up front and see what happens, than pessimistically assume the worst and centralise things up front.

@petertodd
Copy link
Contributor

petertodd commented Jun 16, 2014

Why is there absolutely no privacy at all in this feature? You could easily search by prefix rather than being forced to always give the peer the exact outputs you are interested in. (recall how leveldb queries work re: the iterators)

Also, re: security, Lighthouse is particularly bad as lying about UTXO's - falsely claiming they don't exist/are spent when they are unspent - can certainly lead to serious exploits where clients are fooled into thinking an assurance contract is not fully funded when in fact it is over-funded, leading to large fees being paid to miners. You've not only got a potential exploit, you've got a strong financial motivation to exploit that exploit.

@petertodd
Copy link
Contributor

petertodd commented Jun 16, 2014

One last thing: needs a NODE_GETUTXO service bit - having an unencrypted copy of the UTXO set is definitely a service that not all nodes can be expected to have. (recall @gmaxwell's clever suggestion of self-encrypting the UTXO set to avoid issues around storage of problematic data)

@mikehearn
Copy link
Contributor Author

mikehearn commented Jun 16, 2014

If the app thinks a pledge is revoked it won't be included in the contract that is broadcast, so it can't lead to overpayment.

Re: encrypted UTXO set. That makes no sense. Nodes must be able to do this lookup internally to operate. Gregory's suggestion was to obfuscate the contents on disk only to avoid problems with silly AV scanners, not that the node itself can't read its own database.

There is no prefix filtering because that would complicate the implementation considerably. You are welcome to implement such an upgrade in a future patch, if you like.

@petertodd
Copy link
Contributor

petertodd commented Jun 16, 2014

If the app thinks a pledge is revoked it won't be included in the contract that is broadcast, so it can't lead to overpayment.

The attacker would of course broadcast the pledges themselves; pledges are public information.

Re: encrypted UTXO set. That makes no sense.

I was thinking in the case where privacy is implemented, but actually on second thought my complaint is invalid for this implementation as you're not returning the UTXO data associated with the UTXO.

There is no prefix filtering because that would complicate the implementation considerably. You are welcome to implement such an upgrade in a future patch, if you like.

You can query leveldb with the prefix, get a cursor in return, then just scan the cursor until the end of the prefix range. There's no good reason to create yet more infrastructure with zero privacy, and the lack of privacy makes attacking specific targets without being detected much easier.

@mikehearn
Copy link
Contributor Author

mikehearn commented Jun 16, 2014

Pledges are not public. You're making assumptions about the design without understanding it.

Your second statement is nonsensical. The code does return "the UTXO data associated with the UTXO", what else would it do?

Your third statement is something I already know: I am the one who implemented LevelDB for Bitcoin. My point stands. This patch is what it is. If you'd like it to be better feel free to contribute code to make it so.

@petertodd
Copy link
Contributor

petertodd commented Jun 16, 2014

Pledges are not public. You're making assumptions about the design without understanding it.

Either pledges are public information and can be attacked, or they are not and some single user is running the crowdfund, (the project owner) in which case the overhead of just using existing systems is not a big deal. In particular, in the "single project owner design" all pledges can easily be added to a single bloom filter and the chain scanned to keep the state of spent/unspent up to date at the same low cost as keeping an SPV wallet up-to-date. (remember that users are going to be pledging specific amounts and/or using a specific wallet for their pledges, so in the vast majority of cases you'll need to create a transaction output for that pledge, which means the bloom filter behavior is identical to that of a standard SPV wallet)

Relying on pledges not being public information for your security is a rather risky design with difficult to predict consequences. Easy to imagine, for instance, a user publishing a list of pledge amounts showing the progress of their campaign, and that list being used for the attack. (trick project owner into publishing an invalid tx with an input spent, record signatures, then make it look like other pledges are now spent and get more pledges) Even a "multiple utxo's is one pledge" design can easily fall to an attacker who just guesses what UTXO's are probably part of the pledge based on amounts pledged. Again, all attacks that are much more difficult to pull off if the app isn't giving away exact info on what transaction outputs it's looking for. (although UTXO anonymity does suffer from the inherent problem that UTXO size grows indefinitely, so your k-anonymity set is much weaker than it looks as many entries can be ignored due to old age - another argument for the bloom filter alternative)

Your second statement is nonsensical. The code does return "the UTXO data associated with the UTXO", what else would it do?

It can return only a spent/unspent bit. What's the use-case for requiring more than that? It's easy to foresee this encouraging applications to abuse the UTXO set as a database. (back to the policy question: do we really want to redefine NODE_NETWORK as being able to provide UTXO data to peers as well?)

@laanwj
Copy link
Member

laanwj commented Jun 17, 2014

Looks OK to me, implementation-wise. Talking of testing, if you cannot integrate testing into the unit tester suite for some reason I think at least some Python script should be included in qa/... to be able to test this functionality. Such a test script could create a node, import a bootstrap file, and launch getutxos queries at it.

I think we are at the point that we need to define an extension mechanism - whether that's done with a NODE_* bit or some other way doesn't matter. That would encourage experimentation, and I think this is an excellent use-case for such. No need to force all NODE_NETWORK nodes above a certain version to provide a specific query service. Alternative implementations (obelisk, btcd) may or not want to implement this, and may want to experiment with their own extensions. Then the bootstrapping network needs a way to find only nodes that provide a certain extension. Let's not repeat the bloom debacle.

@petertodd
Copy link
Contributor

petertodd commented Jun 17, 2014

@laanwj I suggested awhile back we use a simple bitmask:

x0000000000000001.testnet-seed-mask.bitcoin.petertodd.org

Returning all seeds with at least NODE_NETWORK set. There's most likely to be a relatively small number of combinations people use, so DNS caching will still work fine. (though I'm no DNS expert) Of course, as always relying heavily on seeds is foolish, so just setting up app-specific seeds probably makes more sense in many cases and lets the authors of those apps implement whatever feature tests they need to ensure they're serving up seeds that actually support the features required by the apps. (e.g. right now if there ever exist NODE_BLOOM-using nodes on the network and someone does a bloom IO attack against the nodes returned by the seeds, maliciously or by accident, you'll easily wind up with only NODE_BLOOM-supporting nodes being returned, breaking anything relying on bloom filters)

Also, as an aside it'd be reasonable to set aside a few service bits for experimental usage, with an understanding that occasional conflicts will be inevitable. In my replace-by-fee implementation that uses preferential peering to let replace-by-fee nodes find each other quickly I have:

// Reserve 24-31 for temporary experiments
NODE_REPLACE_BY_FEE = (1 << 26)

https://github.com/petertodd/bitcoin/blob/f789d6d569063fb92d1ca6d941cc29034a7f19ef/src/protocol.h#L66

@laanwj
Copy link
Member

laanwj commented Jun 17, 2014

@petertodd The problem with service bits is that there is only a very limited number of them. It would, IMO, be better to have a string namespace defining extensions. A new version of the network protocol could add a command that returns a list of strings defining the supported extensions and maybe even per-extension versions. It's still very simple and conflicts could be much more easily avoided.

@petertodd
Copy link
Contributor

petertodd commented Jun 17, 2014

@laanwj Well we've got 64 of them, 56 if you reserve some for experiments; I don't see us using up that many all that soon. A string namespace thing can be added in the future for sure, but I just don't see the short-term, or even medium-term, need. After all, NODE_BLOOM was AFAIK the first fully fleshed out proposal to even use a single service bit, with the closest runner up being @sipa's thoughts on pruning.

That said, strings, and especially UUIDs, (ugh) would definitely reduce the politics around them.

@laanwj
Copy link
Member

laanwj commented Jun 17, 2014

It's not about fear of running out but about reducing the need for central coordination. Anyhow, let's stop hijacking this thread.
Using a service bit in this case is fine with me.

@petertodd
Copy link
Contributor

petertodd commented Jun 17, 2014

@laanwj Agreed.

@mikehearn
Copy link
Contributor Author

mikehearn commented Jun 17, 2014

I don't think the attack you have in mind works.

Let's assume that pledges are public for a moment, e.g. because the user chooses to publish them or collect them in a way that inherently makes them public, like people attaching them to forum posts. I don't fully get what attack you have in mind, but I think you're saying if you can control the internet connection of the fundraiser for an extended period of time, you could ensure they don't close the contract as early as possible and continue to solicit pledges. Then Dick Dasterdly steps in, takes all the pledges and steals the excess by working with a corrupt miner.

But this attack makes no sense. If the pledges are public any of the legitimate pledgors can also observe the contracts state and close it. The attacker has no special privileges. Unless you control the internet connection of all of them simultaneously and permanently, the attack cannot work: legitimate users will stop pledging once they see it's reached the goal and then either close it themselves, or ask the owner via a secure channel why they aren't doing so.

What you're talking about is only an issue if the pledges are NOT public, but the attacker is able to obtain them all anyway, AND control the users internet connection so they do not believe the contract is closeable AND they continue to solicit funds and raise money. Given that Lighthouse includes an integrated Tor client and can therefore tunnel through your control anyway, I don't think this is a realistic scenario.

It can return only a spent/unspent bit. What's the use-case for requiring more than that?

It's explained in the nice document I wrote above, please read it! Then you would actually know what data is returned and why.

@laanwj As far as I know only Peter thinks Bloom filtering was a debacle: if we had added a service bit, all nodes on the network would set it except for one or two that don't follow the protocol properly anyway, so who knows what they would do. If a node doesn't wish to support this simple command they can just not increase their protocol version past 90000. If they do, it should be only a few lines of code to add. As you note, using a service bit means implementations can be found but there are only a handful of them to go around, and not using a service bit means some entirely new mechanism must be designed which is way out of scope for this patch.

Re; qa tests, as far as I can tell they are only capable of covering the JSON-RPC interface. We don't seem to have any testing infrastructure for doing P2P tests except for the pull tester. I could try adding some code to that, but that code is maintained in the bitcoinj repository.

@laanwj
Copy link
Member

laanwj commented Jun 17, 2014

@mikehearn Binding features to version numbers assumes a linear, centralized progression. It means that everyone that implements A also needs to implement B even though they are unrelated. I don't think this is desirable anymore.

And as said above, using a service bit is fine with me. I do think we need another mechanism for signalling extensions to the protocol in the future, but for now we're stuck with that.

@mikehearn
Copy link
Contributor Author

mikehearn commented Jun 17, 2014

OK, I can add a service bit, although AFAIK nobody actually has any code that searches out nodes with a particular bit? I'm not sure Core does and bitcoinj definitely doesn't. But that can be resolved later.

The question of optionality in standards is one with a long history, by the way. The IETF has a guide on the topic here:

http://tools.ietf.org/html/rfc2360#section-2.10

Deciding when to make features optional is tricky; when PNG was designed, there were (iirc) debates over whether gz compression should be optional or mandatory. The feeling was that if it were optional, at least a few implementations would be lazy and skip it, then in order to ensure that their images rendered everywhere PNG creators would always avoid using it, thus making even more implementations not bother, and in the end PNGs would just end up bigger for no good reason: just because a few minority implementors didn't want to write a bit of extra code. So they made gz compression mandatory. Another feature that GIF had (animation) was made optional and put into a separate MNG format (later another attempt, APNG). Needless to say, the situation they feared did happen and animated images on the web today are all GIFs.

So I will add a service bit, even though this feature is so trivial everyone could implement it. If it were larger and represented a much bigger cost, I'd be much keener on the idea. As is, I advise caution - simply making every feature from now on optional is not necessarily good design. The tradeoffs must be carefully balanced.

@maaku
Copy link
Contributor

maaku commented Jun 17, 2014

NODE_NETWORK is a hack. It is conflating two things: storing the whole block chain, and storing the current UTXO set. These are orthogonal things. I think there should be a service bit here, but the meaning is not constrained to just a 'getutxos' call. NODE_NETWORK should be split into NODE_ARCHIVAL and NODE_UTXOSET, with the latter eventually indicating presence of other things as well, such as a future p2p message that returns utxo proofs.

@laanwj
Copy link
Member

laanwj commented Jun 17, 2014

I agree it would be preferable for everyone to agree and do the same thing, but that makes progress incredibly difficult. From my (maybe over-cynical) view of the bitcoin community that means that nothing new ever happens. There's always some reason not to agree with a change, it could be some perceived risk, disagreement on the feature set or how the interface should look, or even paranoid fantasies.

Having optional features could mean the difference between something like this, which is useful but not absolutely perfect, being merged, or nothing being done at all. So I also advice caution on trying to push it to the entire network with a version bump.

@maaku
Copy link
Contributor

maaku commented Jun 17, 2014

@laanwj how else do you indicate presence of this one particular p2p message except by version bump? That's what the version field is for.

@mikehearn
Copy link
Contributor Author

mikehearn commented Jun 17, 2014

Ah, you're right, that's why software projects have maintainers instead of requiring universal agreement from whoever shows up :) There will always be people who disagree or want something better (but don't want to do the work). Sometimes those disagreements will make sense, and other times they will be bike shedding.

If we look at projects like the kernel, it's successful partly because Linus lets debates run for a while, he develops opinions and then if things aren't going anywhere he steps in and moves things forward. Bitcoin has worked the same way in the past with @gavinandresen doing that, and I hope we will retain good project leadership going forward.

Gavin, what are your thoughts on protocol extensibility / optionality? As it seems nobody has problems with the code in this patch itself.

@maaku
Copy link
Contributor

maaku commented Jun 17, 2014

@mikehearn it would be better imho if the return value included the height and hash of the best block. That would help you figure out what is going on when you get different answers from peers, and parallels the information returned by a future getutxos2 that returns merkle proofs.

@mikehearn
Copy link
Contributor Author

mikehearn commented Jun 17, 2014

Good idea! I'll implement that this afternoon or tomorrow.

@petertodd
Copy link
Contributor

petertodd commented Jun 17, 2014

@mikehearn Sybil attacking the Bitcoin really isn't all that hard; I really hope Lighthouse doesn't blindly trust the DNS seeds like so much other bitcoinj code does. re: having getutxos return actual UTXO's vs. spent/unspent, I see nothing in the design of Lighthouse that prevents pledges from containing the transactions required to prove the UTXO data. Also, last I talked to Gregory Maxwell about the issue he had strong opinions that NODE_BLOOM was the right idea - he did after all ask me to implement it. Warren Togami also is in that camp. (and asked me to re-base the patch and submit the BIP)

@laanwj
Copy link
Member

laanwj commented Jun 17, 2014

@mikehearn It may have been that way in the past, but Bitcoin Core is not the only node implementation anymore. Don't confuse leadership over this project with leadership over the global P2P network, which has various other actors as well now.

Edit: another concrete advantage of an optional-feature approach is that features can be disabled again if they either prove to be not so useful for what they were imagined for, or the implementation causes problems, or a later extension provides a better alternative. Locking it to >= a protocol version means every version in the future is expected to implement it.

@mikehearn
Copy link
Contributor Author

mikehearn commented Jun 17, 2014

@laanwj New version numbers can mean anything, including "feature X is no longer supported". So I don't think we need service bits for that.

@sipa
Copy link
Member

sipa commented Jun 17, 2014

We've talked about it, and I'm sure you're aware of my opinion already, but
I'll still repeat it here to offer for wider discussion.

I do not believe we should encourage users of the p2p protocol to rely on
unverifiable data. Anyone using 'getutxos' is almost certainly not
maintaining a UTXO set already, and thus not doing chain verification, so
not able to check that whatever a peer claims as respond to getutxos is
remotely meaningful. As opposed to other data SPV clients use, this does
not even require faking PoW.

Yes, there are other parts of the protocol that are largely unverified.
Addr messages for new peers, the height at startup, requesting the mempool
contents, ... But those are either being deprecated (like the height at
startup), or have infrastructure in place to minimize the impact of false
data. In contrast, I do not see any use of getutxos where the result can
be verified; if you're verifying, you don't need it. To the extent
possible, Bitcoin works as zero-trust as possible, and I believe improving
upon that should be a goal of the core protocol.

Of course, that does not mean that the ability to request UTXO data is
useless. I just don't believe it should be part of the core protocol.

I think the problem is that in some cases, there are very good reasons to
connect to a particular (set of) full node, and trusting its responses. For
example, when you have different Bitcoin-related services running within a
(local and trusted) network, connected to the outside world using a
bitcoind "gateway". In this case, you are using bitcoind as a service for
your system, rather than as a pure p2p node.

So far, we have separated service providing done through RPC than through
P2P. This makes often sense, but is not standardized, is not very
efficient, and is inconvenient when most of the data you need is already
done through P2P.

My proposal would therefore be to add "trusted extensions" to the P2P
protocol. They would only be available to trusted clients of a full node
(through IP masking, different listening port, maybe host authentication at
some point, ...). I've seen several use cases for these:

  • When a local network wallet rebroadcasts transactions, you want the
    gateway to rebroadcast as well. Default current behavior is to only relay
    the first time you see a transaction.
  • You want local clients to bypass rate limiting systems, without
    triggering DoS banning (currently done for localhost, whuch is broken for
    Tor).
  • Some functionality really is only available when you have a trusted
    bitcoind. Mempool acceptance checking for detecting conflicting wallet
    transactions is one, getutxos is another. Mechanisms for these could be
    available but only to trusted clients.

This may be controversial (and probably needs a separate
mailinglist/issue), as it could all be done through a out of band separate
non-P2P protocol, or just RPC.

Comments?

@mikehearn
Copy link
Contributor Author

mikehearn commented Jun 17, 2014

I think I put all my comments on that in the original writeup. Yes, in the ideal world everything would be perfect and authenticated by ghash.io ;) However we do not live in such a world and are dragging ourselves towards it one step at a time.

BTW, on height in version message being "deprecated", that's the first I've heard of this. SPV clients use it. If someone wants to deprecate that they're welcome to update all the clients that require it. But let's discuss that in a separate thread.

@sipa
Copy link
Member

sipa commented Jun 17, 2014

Oh, not in the protocol. I just mean that full nodes don't use it at all
anymore. I wish it didn't exist in the first place, but it's too
unimportant to bother changing in the protocol.

@mikehearn
Copy link
Contributor Author

mikehearn commented Jun 17, 2014

Ah, OK.

@petertodd
Copy link
Contributor

petertodd commented Jun 17, 2014

@sipa +1

There's nothing wrong with trust. We'd like everything to be decentralized, but we don't live in a perfect world so occasionally we introduce trust to solve problems that we don't have decentralized solutions for yet. We did that in the payment protocol because we had no other way to authenticate payments; we should be doing that in UTXO lookup, because we have no other way to authenticate UTXO's. (yet)

We also have a responsibility to design systems that naturally lead to safe implementations that are robust against attack. This patch is anything but that on multiple levels - even little details like how it gives you 100% unauthenticated UTXO data rather than just a "known/unknown" response encourage inexperienced programmers to take dangerous shortcuts, like relying on your untrusted peer(s) for utxo data corresponding to a tx rather than at least getting actual proof via the tx and its merkle path to the block header. (PoW proof) Equally the presence of vulnerable targets encourages attackers to sybil attack the Bitcoin network to exploit those targets - we don't want to encourage that.

That said, I'm not convinced we need to add trusted extensions to the Bitcoin Core P2P protocol; that functionality already exists in the form of Electrum among others. UTXO's can be looked up easily, you can authenticate the identity of the server you're talking to via SSL, and it is already used for that purpose by a few applications. (e.g. the SPV colored coin client ChromaWallet) A client implementation is simple(1) and Electrum supports things like merkle paths where possible to reduce the trust in the server(s) to a minimum. Why reinvent the wheel?

  1. https://github.com/bitcoinx/ngcccbase/blob/master/ngcccbase/services/electrum.py
@laanwj
Copy link
Member

laanwj commented Jun 17, 2014

@sipa Agreed - getutxos is in the same category of 'information queries from trusted node' as the mempool check for unconfirmed/conflicted transactions that an external wallet could use.

Regarding the height in version messages: yes, nodes have lied about this, resulting in 'funny' information in the UI so we don't use it anymore, not even behind a median filter. See #4065.

@genjix
Copy link

genjix commented Aug 27, 2014

wow what a stupid change, all the more reason why we can't have a single group making unilateral decisions on one codebase. I had never heard of this. I think you guys need to stop trying to throw all this crap into the Bitcoin protocol, and focus on keeping it small + focused. Bloated software overextends itself introducing security flaws through new attack surfaces.

@mikehearn
Copy link
Contributor Author

mikehearn commented Aug 27, 2014

I actually can't make my bitcoind crash even when firing many such bogus queries in parallel, even without the fix. Memory usage does go up, and if you don't have much RAM that could be a problem, but it drops down again immediately after.

With the extra ten lines to track bytes used the code is clearly better, but it's way overkill to go running to reddit and claim it's an "easy way to crash bitcoind". Heck our networking code doesn't even have a limit to aim for. Bitcoind could run out of RAM on some systems just by handling a lot of clients, or if there was a lot of transactions in the mempool.

@petertodd
Copy link
Contributor

petertodd commented Aug 27, 2014

Lots of nodes out there without all that much RAM. Other DoS attacks are prevented by existing resource limits, e.g. tx fees, coin-age, etc. and "handling lots of clients" is something we already have limits on via the connection limits. I should know - I've spent a lot of time looking for and fixing cheap DoS attacks. (e.g. the sigops one I fixed) getutxos is unique in how easy it is to use to crash systems at no cost.

@mikehearn You'd be smart to just own up and say "oops, I screwed that up" rather than trying to make excuses. Heck, I personally screwed up a bit by ACKing the patch without noticing that flaw.

@laanwj
Copy link
Member

laanwj commented Aug 27, 2014

I had never heard of this

It is not as if this was kept secret. To be fair, the BIP was posted to the bitcoin development mailing list, https://sourceforge.net/p/bitcoin/mailman/message/32590257/
The BIP was reviewed, finalized and merged at some point, bitcoin/bips#88
This pull has been open for months as well.
You could have heard of this, and given your opinion in all of those instances...

@mikehearn
Copy link
Contributor Author

mikehearn commented Aug 27, 2014

Peter, I wrote a patch right? I'm grateful that Dave did this testing and found this problem. I am less grateful that he then ran to reddit and started complaining about how badly unit tested Bitcoin Core is (blame Satoshi for that one, if he must).

I think it's an open question about how such things can be caught in future. Any future change could result in large temporary memory usage without anyone noticing. The lack of any definition for "large" makes this harder - as I said, I can't actually make my testnet node crash even when repeating the conditions that were given. So how to find this sort of thing systematically is tricky. Ideally Bitcoin Core would print a warning if its memory usage went over a certain amount, but we don't have that.

One solution that will definitely NOT work is blaming people for being imperfect programmers. Core has shipped fatal bugs before and will do so again. The right solution is usually to ask "how can we stop such mistakes systematically"?

@genjix
Copy link

genjix commented Aug 27, 2014

There's a ton of silly discussion on that mailing list which consists of "who has the most stamina to invest in arguments" which I don't have time for. Therefore I cannot sift through all that looking for the gems of important discussion to register my single objection.

@mikehearn you can avoid mistakes by taking features out not trying to stuff more features in (which we don't need). This is a case of developers going mad for features that want which should be happening on another layer of Bitcoin, not the core protocol which should stay pure and focused. I see very little impetus for real implementation work happening apart from odd bugfixes or cosmetic changes, and a lot of "THE NEXT BIG THING" spurned by corporations who see Bitcoin as the new payments innovation instead of wanting to protect the integrity, security and values of Bitcoin. I'm a protocol conservative.

@btcdrak
Copy link
Contributor

btcdrak commented Aug 27, 2014

Reverted in 70352e1

@dgenr8
Copy link
Contributor

dgenr8 commented Sep 1, 2014

Nothing about this change is harmful enough to violate the process. A BIP was even merged, for crying out loud.

It would be great if core supported optional and p2p-queryable indexes for everything. An optional way to authenticate data served would also be great. Lack of these extra features should not doom this change. They could be in a layer maintained by the core project. There is absolutely no reason to punt stuff like this to third parties if open-source developers want to create it.

@btcdrak
Copy link
Contributor

btcdrak commented Sep 1, 2014

@dgenr8 A BIP getting merged doesn't make it a standard, it just starts it in the 'draft' workflow status: https://github.com/bitcoin/bips/blob/master/bip-0001/process.png

@gmaxwell
Copy link
Contributor

gmaxwell commented Sep 1, 2014

There are several bad bips which (IMO) no one should ever use, the BIP process doesn't tell you if something is good or not... it just specifies it.

jonasschnelli added a commit to jonasschnelli/bitcoin that referenced this pull request Dec 3, 2014
has parts of @mhearn bitcoin#4351
* allows querying the utxos over REST
* same binary input and outputs as mentioned in Bip64
* input format = output format
* various rpc/rest regtests
jonasschnelli added a commit to jonasschnelli/bitcoin that referenced this pull request Mar 4, 2015
has parts of @mhearn bitcoin#4351
* allows querying the utxos over REST
* same binary input and outputs as mentioned in Bip64
* input format = output format
* various rpc/rest regtests
jonasschnelli added a commit to jonasschnelli/bitcoin that referenced this pull request Mar 4, 2015
has parts of @mhearn bitcoin#4351
* allows querying the utxos over REST
* same binary input and outputs as mentioned in Bip64
* input format = output format
* various rpc/rest regtests
jonasschnelli added a commit to jonasschnelli/bitcoin that referenced this pull request Apr 21, 2015
has parts of @mhearn bitcoin#4351
* allows querying the utxos over REST
* same binary input and outputs as mentioned in Bip64
* input format = output format
* various rpc/rest regtests
rebroad added a commit to rebroad/bitcoin that referenced this pull request Sep 1, 2016
Copy link

bmstaten left a comment

How do you know which Node to use B.S.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

You can’t perform that action at this time.