add in a system to enforce namespace consensus #1

Closed
shea256 opened this issue Dec 2, 2014 · 24 comments
Labels
feature Brand new functionality. New pages, workflows, endpoints, etc.

Comments

@shea256
Contributor

shea256 commented Dec 2, 2014

No description provided.

@shea256 shea256 changed the title add support for historical record hashes add namespace checkpointing Dec 4, 2014
@shea256 shea256 added the feature Brand new functionality. New pages, workflows, endpoints, etc. label Dec 11, 2014
@shea256 shea256 changed the title add namespace checkpointing add a way for nodes to agree on a snapshot of the namespace Dec 11, 2014
@shea256 shea256 changed the title add a way for nodes to agree on a snapshot of the namespace require nodes to regularly demonstrate agreement on namespace snapshots Dec 11, 2014
@shea256 shea256 changed the title require nodes to regularly demonstrate agreement on namespace snapshots add a way for nodes to regularly demonstrate agreement on namespace snapshots Dec 11, 2014
@shea256 shea256 changed the title add a way for nodes to regularly demonstrate agreement on namespace snapshots add in a system to enforce namespace consensus Dec 11, 2014
@shea256
Contributor Author

shea256 commented Dec 11, 2014

I'm thinking that the way to go is to require nodes to regularly demonstrate agreement on top-level namespace snapshots.

They could do this by:

  1. taking a snapshot of the namespace at a given block
  2. building a merkle tree from the namespace
  3. using the merkle root to uniquely identify that namespace snapshot

In order for nodes to show each other which namespace they think is the right one, and from there come to a consensus, they do the following:

  1. Require that certain name operations include the namespace merkle root.
  2. Ignore all operations that have a merkle root that differs from the locally calculated merkle root.
  3. Include the locally calculated merkle root in all name operations of a certain type.

Which name operations should require a merkle root?

The only ones with enough room to fit in OP_RETURN are preorders and transfers. My instinct is that either (a) just preorders or (b) both preorders and transfers should require the merkle root.
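
For concreteness, a minimal sketch of how a node could derive such a snapshot root (assuming a plain name-to-value-hash mapping and Bitcoin-style double SHA-256, both of which are just illustrative choices):

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    # Bitcoin-style double SHA-256 (an assumption; any fixed hash would do).
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def namespace_merkle_root(namespace: dict) -> str:
    # Leaves are hashes of (name, value-hash) pairs, sorted so every node
    # holding the same snapshot derives the same root.
    level = [sha256d(f"{name}:{value}".encode())
             for name, value in sorted(namespace.items())]
    if not level:
        return sha256d(b"").hex()
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])        # duplicate last node on odd levels
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0].hex()

# Two nodes with identical snapshots compute identical roots:
print(namespace_merkle_root({"alice": "ab12...", "bob": "cd34..."}))
```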

@shea256
Contributor Author

shea256 commented Dec 11, 2014

Quote from @cpacia:

"What would prevent someone looking to create a duplicate name from adding a merkle root of a fake namespace to the duplicate name registration. Would a light client be able to tell the difference?"

@cpacia

cpacia commented Dec 11, 2014

Yeah, I've been trying to think of ways to make it work. I feel like there has to be some way out there to do it, but I haven't yet worked it out.

I was thinking one thing you could do is hash chain the name registrations together. Then a checkpoint could be hardcoded into a client and a server could serve up the namespace (including the bitcoin transactions) from the checkpoint forward. A lite client could verify the transactions and verify they chain back to the checkpoint.

There are two problems with this approach:

  1. Nothing would stop an attacker from maintaining a fork of the hash chain with duplicate name registrations (all using keys the attacker owns). The one benefit of doing that, however, is that the attack would be public. People with the full blockchain could see that a duplicate name chain is being maintained. Lite clients would not be able to see that, however, and could be fed the bogus namespace.

  2. Lite clients would have to download the entire namespace. That isn't a big deal if the namespace (including transactions) is small, but if it grows to a million names, say, then that's a heavy burden on the client.
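
A rough sketch of that hash-chain idea, as I understand it (the field layout and checkpoint value are made up for illustration):

```python
import hashlib

def chain_hash(prev_hash: str, registration: str) -> str:
    # Each name registration is hashed together with the running chain hash.
    return hashlib.sha256((prev_hash + registration).encode()).hexdigest()

def verify_from_checkpoint(checkpoint: str, registrations: list,
                           claimed_tip: str) -> bool:
    # A lite client replays the registrations a server hands it and checks
    # that they chain from the hard-coded checkpoint to the claimed tip.
    h = checkpoint
    for reg in registrations:
        h = chain_hash(h, reg)
    return h == claimed_tip
```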

@cpacia

cpacia commented Dec 11, 2014

A merkle root could work well if there is a way to prove consensus on the merkle root.

@shea256
Contributor Author

shea256 commented Dec 11, 2014

Here's what I'm thinking:

A light client starts from the first block and downloads the whole namespace until the first checkpoint it decides on. Once it reaches that checkpoint, it saves and "trusts" that checkpoint merkle root, then deletes the name operations up until that point. It repeats this process until it gets to the most recent checkpoint, and does that continuously. Not sure if this fully makes sense - have to think about it a bit more.
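
Something like this, maybe (a toy sketch; the segment format and the flat snapshot hash are stand-ins for whatever we actually end up using):

```python
import hashlib

def snapshot_root(namespace: dict) -> str:
    # Stand-in for the real Merkle-root calculation over the namespace.
    digest = hashlib.sha256()
    for name, value in sorted(namespace.items()):
        digest.update(f"{name}:{value}".encode())
    return digest.hexdigest()

def sync_to_checkpoints(segments):
    # segments: [(checkpoint_height, [(name, value), ...]), ...], one batch
    # of name operations per checkpoint interval (an invented input format).
    namespace, trusted = {}, []
    for height, name_ops in segments:
        for name, value in name_ops:
            namespace[name] = value      # apply each name operation
        trusted.append((height, snapshot_root(namespace)))
        name_ops.clear()                 # discard raw ops once the root is trusted
    return trusted
```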

@cpacia

cpacia commented Dec 12, 2014

OK. I don't think a blockchain-only lightweight proof is possible without an additional consensus mechanism (blockchain). In fact, I think this is why counterparty and mastercoin don't have SPV implementations ― because you can't do it.

Nevertheless, here is a simple hack to fix it. You have the user include the txid in the social media proofs.

This immediately solves two problems:

  1. While someone could register a duplicate name (and technically a malicious server could serve it), the txid in the social media proofs would not match, and hence any software verifying the proof would know you were served the wrong name.

  2. This also solves the utxo problem. If we include the txid in the social media proofs, we can prove it's the latest tx and not an earlier one. This means the user would need to update their proofs every time they want to update their name profile, but that can be largely automated and they wouldn't update that often.

Problem solved :)
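
Roughly, the client-side check could look like this (the fetch is just urllib against whatever URL the profile lists; nothing here is a fixed format):

```python
import re
import urllib.request

def proof_mentions_txid(proof_url: str, txid: str) -> bool:
    # Fetch the social media post and check that it commits to the txid.
    with urllib.request.urlopen(proof_url) as response:
        body = response.read().decode("utf-8", errors="replace")
    return re.search(re.escape(txid), body, re.IGNORECASE) is not None
```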

@cpacia

cpacia commented Dec 12, 2014

Come to think of it... the user might not need to update the txid each time. Maybe only including the txid of the name registration would be enough. Then a light client could track subsequent transactions.

That could work for some use cases... such as a wallet which receives txs from the network since it will always be able to see the latest tx in the chain. But it might not work for applications that don't connect to the bitcoin network.

@shea256
Contributor Author

shea256 commented Dec 12, 2014

Hm, this is an interesting idea. But I don't think it works because there's a circular dependency:

to check if the transaction/TXID is correct, check the tweet in the supposed data --> to check if the tweet URL and by extension the data are correct, check the served transaction --> to check if the served transaction/TXID is correct, check the tweet in the supposed data ...

If we can't first establish that the user's data is correct, we're not even able to establish that the twitter handle is correct in the first place.

The attack:

  1. malicious server sends SPV client a transaction with a name owner or value hash that doesn't match the valid one
  2. SPV client looks up the hash and gets the profile data
  3. SPV client finds the (INCORRECT!) tweet URL and checks the tweet to verify the transaction hash
  4. SPV client verifies that the incorrect tweet matches the incorrect transaction hash
  5. SPV client thinks it has verified something when in fact it really hasn't

I might be missing something here but this sounds like it doesn't work.

The reason for this issue is that all the tweet verifies is that the user who owns the twitter account acknowledges that they have the name. But... it doesn't verify that a given name is owned by a given twitter account. Any twitter account can claim that it owns the name through this attack.

@shea256
Contributor Author

shea256 commented Dec 12, 2014

Also, I don't understand why you wouldn't be able to do a lightweight proof. Let's say you have an SPV wallet for Bitcoin. You have all the headers and can know with pretty high certainty that the 6-block-deep merkle root is correct. If you know that all the transactions up until that point are correct, then you can verify whether a given OP_RETURN transaction really existed in the blockchain.

OK, now that we can verify whether OP_RETURN transactions actually existed in the blockchain, we need to determine whether those transactions are considered valid openname operations.

To deem a given openname registration valid, we do the following:

  1. each day, calculate a merkle root hash that takes into account all the valid openname transactions in that day and the previous merkle root hash
  2. in each transaction, include the merkle root hash of the previous day
  3. only accept transactions that have what you consider the correct merkle root hash
  4. each day, delete all merkle root hashes that are older than 2 days

This has the following properties:

  1. each day, you are certain of the name owners and data up until the previous day
  2. each day, you need to download all transactions from the previous day
  3. you only need to store the transactions of the past two days at any given time
  4. if there are 5M operations/year = 14K operations/day, that's 4 MB/day (the size of a song) to stay synced
  5. if there's a release every 100 days with a published checkpoint, you can separately verify the checkpoint in that latest release and at most need to temporarily download 400 MB to catch up to the network from that checkpoint

This made sense to me. Does this work?
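
For concreteness, here's a flat-hash stand-in for the daily root chaining described in steps 1-4 above (illustrative only, not a proposed format):

```python
import hashlib

def daily_consensus_root(prev_root: str, todays_op_hashes: list) -> str:
    # Fold yesterday's root into today's digest so the daily roots form a
    # chain and a node only ever needs the last two days of data.
    digest = hashlib.sha256()
    digest.update(prev_root.encode())
    for op_hash in sorted(todays_op_hashes):
        digest.update(op_hash.encode())
    return digest.hexdigest()

def accept_operation(op_embedded_root: str, local_prev_root: str) -> bool:
    # Step 3: only accept operations that carry what we consider the
    # correct previous-day root.
    return op_embedded_root == local_prev_root
```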

@cpacia

cpacia commented Dec 12, 2014

If we can't first establish that the user's data is correct, we're not even able to establish that the twitter handle is correct in the first place.

The first time you look up a name, you would have to perform manual verification that the twitter handle belongs to the correct person. The client can save the information after the first time, saving you from having to re-verify on each subsequent look-up.

What I described ends up looking a lot like how keybase works. Except here that model would only be used for a lite client with people still able to do (optional) full verification if they want to download the whole chain.

Let's say I want to send you coins (wallet example):

  1. I enter ryanshea into the address field.
  2. Wallet downloads your profile data, transaction, and SPV proof and verifies the tx is in the chain.
  3. Wallet checks social media proofs from the profile against the transaction hash and verifies they match.
  4. Finally, the wallet shows me that the social media proofs were validated for @ryanshea on twitter, ryanshea on facebook, and rxl on github and asks me if this is the correct ryanshea.
  5. If I answer yes, it sends the payment and saves the social media proofs so that I don't have to manually verify them next time. As long as the social media accounts match the previously verified accounts, it can send a payment without re-verifying the accounts.

If someone registered a fake ryanshea and the server sent me that profile... the social media accounts would not match yours and, presumably, I would catch that before sending the payment.

This should provide at least the same level of security as keybase as it's a similar model. The difference being we're using bitcoin keys instead of PGP keys to sign the social media proof.
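
In pseudocode-ish Python, the wallet-side check might look like this (the lookup, SPV, and fetch helpers are placeholders for the wallet's own networking code, not a real API):

```python
def check_profile(name, expected_accounts, lookup_profile, verify_spv, fetch_post):
    # lookup_profile(name) -> (profile, tx); verify_spv(tx) -> bool;
    # fetch_post(url) -> str. All three are hypothetical callables.
    profile, tx = lookup_profile(name)
    if not verify_spv(tx):
        return False                       # tx not provably in the chain
    for account, proof_url in profile["proofs"].items():
        if tx["txid"] not in fetch_post(proof_url):
            return False                   # post doesn't commit to this tx
    # Final, human step: are these the accounts I expected for this person?
    return set(profile["proofs"]) == set(expected_accounts)
```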

@cpacia

cpacia commented Dec 12, 2014

This made sense to me. Does this work?

Let me think about that for a little bit and then I'll get back to you.

@shea256
Contributor Author

shea256 commented Dec 13, 2014

  1. Finally, the wallet shows me that the social media proofs were validated for @ryanshea on twitter, ryanshea on facebook, and rxl on github and asks me if this is the correct ryanshea.

You're right, this step fixes the problem, provided that the user already knows the recipient's social media accounts.

Unfortunately, my usernames on Twitter and Facebook are "ryaneshea" not "ryanshea", so you would have sent money to the wrong person 😮

@cpacia

cpacia commented Dec 13, 2014

Unfortunately, my usernames on Twitter and Facebook are "ryaneshea" not "ryanshea", so you would have sent money to the wrong person

haha yes perfect example.

I think your example using the merkle root is similar to the hash chaining I mentioned earlier. The concern I had with that is that someone could maintain a parallel chain.

For example, starting at a certain point in time, for every new registration transaction an attacker could publish a duplicate registration containing a merkle root that is valid for the parallel namespace.

If you end up being sent the transactions from the parallel chain, you could end up with a false picture of the namespace, and I don't think your client could tell the difference. Of course, this type of attack would be public to those who can see the full chain but not to a lite client.

@shea256
Contributor Author

shea256 commented Dec 13, 2014

A few things:

  1. There could be warnings when parallel namespaces develop. Light clients would see the multiple namespaces.
  2. A light client would never trust a parallel namespace that "broke the naming rules". The only worry is that the servers the client connects to censor transactions and allow certain names to be registered when they were already registered (thus allowing the owner of an incorrect bitcoin address to impersonate the user). So all you have to do here is make sure you're seeing all of the relevant transactions. You could talk to multiple servers to check for this. Maybe there's some clever variation/expansion on this idea.
  3. You could corroborate the merkle hash with dozens of peers. This is much easier than corroborating each individual name lookup.
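
A toy sketch of that corroboration check (the quorum threshold is arbitrary):

```python
from collections import Counter

def corroborate_root(local_root: str, peer_roots: list, quorum: float = 0.8) -> bool:
    # Accept the local snapshot only if a large majority of the sampled
    # peers report the same Merkle root.
    if not peer_roots:
        return False
    agreeing = Counter(peer_roots).get(local_root, 0)
    return agreeing / len(peer_roots) >= quorum

# e.g. corroborate_root(my_root, [root_from(p) for p in sample_peers(30)])
```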

@muneeb-ali
Member

Just catching up on this thread. Jumping straight to

You could corroborate the merkle hash with dozens of peers. This is much easier than corroborating each individual name lookup.

I assumed this is how this works. An SPV client boots up and gets a merkle root hash from the server; it then checks with peers to see if this is the correct tree. If the server gave you the wrong root hash, then all/most peers would need to be on the (incorrect) parallel namespace for them to say that this is the correct tree. We only need to ensure that this attack is very expensive to perform by having many public "full nodes" that are run by different organizations, etc. I could be missing something here.

@cpacia

cpacia commented Dec 15, 2014

I agree. Though it isn't completely trustless per se, and you would have to worry about sybil attacks in that scenario. But if you combine that with including the txid in the social media proofs, it would be extremely difficult to attack.

@shea256
Contributor Author

shea256 commented Dec 16, 2014

Yes, doing both would be more secure. I'm just trying to think of a way around including the TXID in the tweet. Seems so unwieldy and hard on the eyes.

@cpacia

cpacia commented Dec 16, 2014

I don't know if you've used keybase, but the process of linking a social media identity is pretty smooth. IIRC it just displayed what to post and asked you to copy and paste it.

For example:
@ChrisPacia
Verifying myself: I am chrispacia on Keybase.io. bmiKWszfTts6C5Ck7siupsIWPQgxJk_UVxXt

Pretty easy and since it's really only a machine that will read it, I don't think the looks matter that much.

@shea256
Contributor Author

shea256 commented Dec 16, 2014

Hm, interesting. The only thing is that it looks fairly ugly, and people are probably confused about what those characters at the end are. If people can get over that and just ignore it, then I think we're fine and it's a good measure to add in.

@jcnelson
Member

Building the consensus hash over the entire namespace for each block has proven very costly, even for a small number of users (in the 10,000 range). I am currently using a consensus hash generation algorithm that simply builds a Merkle tree over the name operations discovered in the current block, and includes the consensus hash of the previous block as one of the hashes. This not only lets two Blockstore peers prove that they have processed the same operations, but also that they have processed them in the same order. This is required by virtualchain, and since virtualchain allows there to be arbitrary side-effects to processing operations, we must allow the final state of the system to have the freedom to enforce a particular order in which operations are processed.
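
A simplified sketch of that construction (not the actual Blockstore code; the hashing details here are illustrative):

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def block_consensus_hash(prev_consensus_hash: bytes, block_ops: list) -> bytes:
    # Merkle tree over this block's operations (in the order they were
    # processed), with the previous block's consensus hash as an extra leaf.
    # Equal hashes imply the same operations were applied in the same order.
    level = [sha256d(op) for op in block_ops] + [prev_consensus_hash]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```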

As discussed elsewhere, non-idempotent operations (i.e. ones that conflict) that can be issued by more than one principal need to include the Merkle hash to guarantee at-most-once semantics and resolve conflicts. This would include preordering, updating, transferring, and namespace preordering.

Users shouldn't need to announce their Merkle roots; only their names (and only once the name has been registered). Once the name has been registered, Blockstore peers will have had to process the preorder, and will have had to agree on the same Merkle root at the time of the preorder.

All of this has been implemented, so I'm going to close this issue. Please re-open if we need to revisit the consensus hash algorithm.

@shea256
Contributor Author

shea256 commented Sep 16, 2015

Sounds good. We should open a new thread at some point around just SPV support.

@brandonrobertz

Namecoin sounds a lot easier at this point. SPV and consensus are solved problems.

@muneeb-ali
Member

SPV support has been added in Blockstore. Namecoin has its advantages and disadvantages; we summarized lessons from our year-long deployment on Namecoin here.

@muneeb-ali
Member

To be clear, the SPV-like support has been added in the develop branch and is not yet fully available in the last stable release. The code is there, but we're in the process of releasing the documentation. This implements a new Merkle skip-list-based solution. The documentation will be published here when it's ready. Stay tuned.
