-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix BitcoinJ Bloomfilter #487
Comments
As I told you in person, bloomfilters that describe addresses that you are interested in, that are then in addition to that interconnected via change addresses or merging spends are broken beyond repair when you want privacy. Requesting 50% of all addresses' transactions currently weighs 40GB less than the whole blockchain yet still it weighs 40GB you have to download, only to fool the spy a tiny bit. If he finds chains 10 long that fit the filter, then he already has a high probability to have found a match and if you ever make a chain of length 20, which is for how long I tend to use my HD accounts, you certainly have your addresses revealed even if you took the compromise of downloading 40GB of noise. I'll try to keep you posted on my reversed approach on BloomFilters. |
Thanks for the write up. Your reversed idea was very interesting. Would be great if you could write that up for further discussion. |
I actually started working on my reverse idea. See this repo. Playing with it, it seams to me it is crucial to have the bloom filter designed for the count of elements you then throw at it. This means, you get inferior performance if you take one for 100 entries and truncate it somewhat, trying to save bits for fewer entries. Or that you could just merge them for more entries. But that's no big issue I guess. My next step would be to cover different scales, to quickly find relevant blocks and to also include the addresses that are being spent from, not only spent to. So, the repository mentioned above chewed through the whole block chain now and only made bloom filters of about 500k P2PKH receiving addresses each. The result is 463 files of 418MB total or 903kB each, with 0.1% false positives. With the input addresses I expect it to be double that. Due to the nature of Bloomfilters (they need about 10 bits per unique entry for 1% false positves) and the nature of the blockchain (entries are mostly unique in short ranges), you don't gain much by covering multiple blocks with one Bloom filter. On the contrary, if you have one filter per block, you can count entries first, then make the filter fit the entry count and thus get better adjusted filters overall. A 5k elements filter with 0.1% false positives weighs only 9kB and should be enough for 1MB blocks with inputs and outputs. |
So the whole blockchain represented in bloomfilters would be about 1GB. As the client only needs the blocks created after he first started his app, it will be considerable lower. 144 blocks a day with 9kB would be about 1.3mB per day which is reasonable. Even if you did not sync for a month it's just 30mB to download to catch up. Of course it requires afterwards the block downloads as well but for a typical user there will not too many... Sounds an interesting alternative between FullNode and BitcoinJ style SPV. Would be interested in feeback from @jonasnick |
@ManfredKarrer what is up with the mailing list? I mentioned my repository there, too and my mail is in moderation queue since two days!?!? WTH? Is the list dead and all doing slack only? Do I have to use slack? Really? |
I think the Bitcoin ML got a lot of spam/social attacks so prob. they are more restrictive with acceess. But mail directly to one of the core devs or get in touch with @jonasnick. Re Slack: |
If you would like to go down the rabbit hole and make a perfect fix for bloom filter stuff, then consider full block downloading SPV wallets. That is the best I could find as of today. So someone could take the time and get this bounty by translating my C# code to Java. There are 2 alternatives:
|
It's impossible to delete elements from Bloom filter as I know. As an alternative there is Counting Quotient Filter - https://blog.acolyer.org/2017/08/08/a-general-purpose-counting-filter-making-every-bit-count/amp/ |
@remyers The client side filtering is indeed the most promising solution. But I am not aware that anyone is working on it on the Bitcoin side. @dulanov I referred to not add the PubKey but only the PubKeyHash to the filter because there are basically no P2PK transactions anymore used (at least not in Bisq). So it was not about removing elements from an existing filter but to create it without the PubKey. I implemented it and it worked basically but we discovered that in some cases transactions did not get delivered. I assume that is related to other code parts in BitcoinJ where some shortcuts are done, but did not had time to investigate. |
Requirements for taking that bounty:
Here is a general overview:
#414
If you want to work on that bounty it requires close communication. I need to be sure that you fulfill the above requirements! Also the final scope of the bounty will be decided during work on it. The bounty amount is just for a first review of the current implementation not follow up improvements.
Update:
The fix is still not solving the privacy issue.
The bounty is NOT for a review but for fixing the BitcoinJ handling of pubKeys in the bloom filters to make the removal of the pubKey in the filter work.
I might increase the bounty as well if a qualified developer works on it. Please get in touch if you are interested.
See also: #414
The text was updated successfully, but these errors were encountered: