Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix BitcoinJ Bloomfilter #487

Closed
ManfredKarrer opened this issue Jun 11, 2016 · 11 comments
Closed

Fix BitcoinJ Bloomfilter #487

ManfredKarrer opened this issue Jun 11, 2016 · 11 comments

Comments

@ManfredKarrer
Copy link
Member

ManfredKarrer commented Jun 11, 2016

Requirements for taking that bounty:

  • Expertise with BitcoinJ
  • Expertise with Bloomfilters
  • Expertise with the 2 research papers regarding BitcoinJ Bloomfilters

Here is a general overview:
#414

If you want to work on that bounty it requires close communication. I need to be sure that you fulfill the above requirements! Also the final scope of the bounty will be decided during work on it. The bounty amount is just for a first review of the current implementation not follow up improvements.

Update:
The fix is still not solving the privacy issue.
The bounty is NOT for a review but for fixing the BitcoinJ handling of pubKeys in the bloom filters to make the removal of the pubKey in the filter work.
I might increase the bounty as well if a qualified developer works on it. Please get in touch if you are interested.
See also: #414

@Giszmo
Copy link

Giszmo commented Jul 17, 2016

As I told you in person, bloomfilters that describe addresses that you are interested in, that are then in addition to that interconnected via change addresses or merging spends are broken beyond repair when you want privacy. Requesting 50% of all addresses' transactions currently weighs 40GB less than the whole blockchain yet still it weighs 40GB you have to download, only to fool the spy a tiny bit. If he finds chains 10 long that fit the filter, then he already has a high probability to have found a match and if you ever make a chain of length 20, which is for how long I tend to use my HD accounts, you certainly have your addresses revealed even if you took the compromise of downloading 40GB of noise.

I'll try to keep you posted on my reversed approach on BloomFilters.

@ManfredKarrer
Copy link
Member Author

Thanks for the write up. Your reversed idea was very interesting. Would be great if you could write that up for further discussion.

@Giszmo
Copy link

Giszmo commented Jul 20, 2016

I actually started working on my reverse idea. See this repo.

Playing with it, it seams to me it is crucial to have the bloom filter designed for the count of elements you then throw at it. This means, you get inferior performance if you take one for 100 entries and truncate it somewhat, trying to save bits for fewer entries. Or that you could just merge them for more entries. But that's no big issue I guess.

My next step would be to cover different scales, to quickly find relevant blocks and to also include the addresses that are being spent from, not only spent to.

So, the repository mentioned above chewed through the whole block chain now and only made bloom filters of about 500k P2PKH receiving addresses each. The result is 463 files of 418MB total or 903kB each, with 0.1% false positives. With the input addresses I expect it to be double that.

Due to the nature of Bloomfilters (they need about 10 bits per unique entry for 1% false positves) and the nature of the blockchain (entries are mostly unique in short ranges), you don't gain much by covering multiple blocks with one Bloom filter. On the contrary, if you have one filter per block, you can count entries first, then make the filter fit the entry count and thus get better adjusted filters overall. A 5k elements filter with 0.1% false positives weighs only 9kB and should be enough for 1MB blocks with inputs and outputs.

@ManfredKarrer
Copy link
Member Author

So the whole blockchain represented in bloomfilters would be about 1GB. As the client only needs the blocks created after he first started his app, it will be considerable lower. 144 blocks a day with 9kB would be about 1.3mB per day which is reasonable. Even if you did not sync for a month it's just 30mB to download to catch up. Of course it requires afterwards the block downloads as well but for a typical user there will not too many...

Sounds an interesting alternative between FullNode and BitcoinJ style SPV.
Would be great to get that as a service from full nodes so we don't require an extra server for delivering those filters. But that will prob. be difficult and not happen soon. But once it's implemented and we have some working prototype you might discuss it on the Bitcoin core dev mailing list.

Would be interested in feeback from @jonasnick

@Giszmo
Copy link

Giszmo commented Jul 30, 2016

@ManfredKarrer what is up with the mailing list? I mentioned my repository there, too and my mail is in moderation queue since two days!?!? WTH? Is the list dead and all doing slack only? Do I have to use slack? Really?

@ManfredKarrer
Copy link
Member Author

ManfredKarrer commented Jul 30, 2016

I think the Bitcoin ML got a lot of spam/social attacks so prob. they are more restrictive with acceess. But mail directly to one of the core devs or get in touch with @jonasnick.

Re Slack:
I hope not at least the core guys stay Slack-less. Find that pretty concerning like all change to commercial platforms.

@ManfredKarrer ManfredKarrer changed the title Bounty: Review Bitsquare's BitcoinJ Bloomfilter fixes Bounty: Fix BitcoinJ Bloomfilter Mar 8, 2017
@nopara73
Copy link

If you would like to go down the rabbit hole and make a perfect fix for bloom filter stuff, then consider full block downloading SPV wallets. That is the best I could find as of today.
I'm in the works of writing a full block downloading SPV wallet in C# on top of NBitcoin that is similar to BitcoinJ: GitHub
I am not sure the performance would be good enough for you, but here's a basic estimation.

So someone could take the time and get this bounty by translating my C# code to Java.

There are 2 alternatives:

  • Wait until Core merges its own full block SPV implementation (if ever) and use their RPC API.
  • At Stratis we'll probably have a local HTTP API to manage our implementation. I'm relucant to advise this though. Until it gets less experimental keeping other teams' wishes in mind would slow down our development.

@remyers
Copy link
Contributor

remyers commented Dec 2, 2017

Would one of these proposals be helpful? It seems like block digests are a good privacy preserving way to handle light clients.

  • "BIP Proposal: Compact Client Side Filtering for Light Clients" (June 2017)
  • "Block Filter Digest profiling" (June 2017)

@ghost
Copy link

ghost commented Dec 2, 2017

The bounty is NOT for a review but for fixing the BitcoinJ handling of pubKeys in the bloom filters to make the removal of the pubKey in the filter work.

It's impossible to delete elements from Bloom filter as I know. As an alternative there is Counting Quotient Filter - https://blog.acolyer.org/2017/08/08/a-general-purpose-counting-filter-making-every-bit-count/amp/

@ManfredKarrer
Copy link
Member Author

@remyers The client side filtering is indeed the most promising solution. But I am not aware that anyone is working on it on the Bitcoin side.

@dulanov I referred to not add the PubKey but only the PubKeyHash to the filter because there are basically no P2PK transactions anymore used (at least not in Bisq). So it was not about removing elements from an existing filter but to create it without the PubKey. I implemented it and it worked basically but we discovered that in some cases transactions did not get delivered. I assume that is related to other code parts in BitcoinJ where some shortcuts are done, but did not had time to investigate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants