Towards Measuring the Traceability of Cryptocurrencies #601
DanGould
started this conversation in
Study Club
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
https://arxiv.org/pdf/2211.04259
Merging vertices with identical addresses
could be extended to arbitrary clustering rules
Entropy is not of the edges but of the stationary distribution. Use linear algebra to approximate the stationary distribution.
Boltzmann Entropy
can be used to inform absorber probabilities, it's not in tension with Untraceability model.
Often subset sum can be calculated with linear solvers on consumer hardware (e.g. wasabi 1 50 input 50 output), which could inform a model that combined them both.
Defenders generate under-determined transactions. Adversaries need to compute e.g. subset sum on determined transactions that don't change.
Since some if it is
Compute
Bitcoin compute
1h for 1 week with Table 1 nodes & edges
Back to genesis
Ethereum
3-4 hours, since there were more cycles
academic journal
more clustering techniques can change clustering to crate loops
new cluster can make absorbers
main cost has to do with the sparsity of the matrix. Degree is bound by a constant on the transaction graph. If you don't do the cluster collapse set. With clusters emerging, you've got a non-sparse regime.
for every given coin you want to compute the entropy, so you can prune the graph and disregard stuff not in that coin's light coin. Then take the nth power of that coin's matrix. Google did this with page rank.so possible to parallelize, do in map-reduce way.
Distinguish between calculating probability distribution and whether or not it has high entropy
Calculate for every coin? have interested user pay for cost of query and then cache it? Prioritize based on payment or interest?
Eve Alice Eve threat model of CoinJoins.
You have a counterparty, small villiage with bitcoiners who use non-kyc atm but don't interact with each other. Alice has enough bitcoin from atm, needs to buy a boat to have a boating accident. Even if she coinjoins every coin, the ATM has a determined intersection attack because her coins from the past are determined to be in the set.
Istvan's paper characterizes this precisely.
If graph acts as expander graph or super-concentrater graph (in a random walk they behave like a connected graph).
If the blockchain has the structure of an expander graph then all coins' have fungibility. Q: Can we construct txs with expander graph probabilty: A:, Yuval assumes yes, "all graphs are expander graphs" (even though we care about super-concentrater graphs). There are random DAG models that care about what we're interested in. This is a global property of the graph
with log n clusters after coinjoin
Expander graphs are like irrational numbers: if you generate a large graph, with very high probability it will be an expander graph. Like if you point to a number line, with probability approaching 1 you will hit an irrational number.
super concentrator graph
Directed graphs with strong connectivity.
it's a statement about limits
randomness extraction
Tagged hash for discounting
incentivize people not to CoinJoin as urgently as possible. Resulting tx graph ends up with pseudorandom structure.
Imagined deep connection with differential privacy. We can quantify robustness of privacy, perhaps. Robust to edge removal even if other users make mistakes.
where you say markov process in distributional sense, change it to a one-shot probabilistic algorithm. Walk a single sat on the graph to some depth and see where that ends up. That'd be the diff priv algo. Δε-differential privacy. Think of edge between every might-be-linked coinjoin input and output with some probability assigned to it. If you remove those edges, can you bound the algo from outputting a probability of certain source coins? Compare the two distributions: Is the difference between before and after picture bounded by ε?
This quantifies the rate of decay of privacy for an adversary looking at a specific coin.
Perhaps this is more useful for researchers and protocol designers than for end users or even applications that can make specific actions.
Istvan: 2020 US census data published in differential private way. Sociologists/Psychologists were complaining that some districts in Chicago where some childrens' age was a negative number.
Census used to do elision and swapping but people complained it wasn't sufficiently private. Intersection with voter registration data was used to de-anonymize some people. That's how k-anonymity was proposed. Around 2005 bureau started using differential privacy. Which is done by adding laplacian noise to all data. Noise will dominate data point so that nobody can learn data about particular point but they can learn about statistics. So in the Chicago case the mean may still have been statistically useful, but a particular zip code may have been overwhelmed by noise. It sounds like a trade-off between useability of statistics and privacy. Seems possible to design for specific statistical analysis algorithms.
Cynthia Dwork started this work in math in the 90s. USG started with this to gather data on homosexuals in the military.
Next week
discuss detail of expander graph stuff and how it applies to coalition formation stuff.
subtx model one for next week? Or financial crypto '24 [Wicht [57] in Seres '25] paper with more general model w/o random walk.
Beta Was this translation helpful? Give feedback.
All reactions