
[Discussion] Raising the mandatory ringsize in the v6 hardfork, September 2017 #1673

Closed
olarks opened this Issue Feb 4, 2017 · 152 comments

Comments

@olarks

olarks commented Feb 4, 2017

TL;DR There is an edit at the bottom of this post highlighting what is going on because a lot has changed over many discussions, and better alternatives are being explored.

You may want to grab a drink for this one...

As we all know, there will be a hardfork in September 2017 that will raise the mandatory ringsize to 4 based on the recommendations of MRL-4, but that paper was published before RingCT was found to offer extraordinary savings in transaction size for transactions with larger ringsizes.

We have 7 months to explore alternatives, and I would like to discuss some viable options for raising the mandatory ringsize for the v5 hardfork, so that we do not need to change it again in a future hardfork unless certain circumstances arise.

It is important to note that raising the ringsize of a transaction will increase the time it takes nodes to verify transactions, but as far as I am aware this scales linearly(?). If so, it can be maintained to a reasonable degree, and if we assume processor speeds continue to improve, then this problem does not greatly impact the scalability of the Monero network compared to the relative gains in privacy for its users. Optimizations in ring signatures and rangeproofs will certainly ease these concerns.

There are two options I would like to explore:

  1. Simply raising the mandatory ringsize to a value greater than 4 that can take full advantage of RingCT's optimizations in transaction size without sacrificing very much performance. Ringsize 8, for example, could easily be used instead of 4 and still perform very well.

and the more interesting option

  2. Raising the mandatory ringsize and making it static, so that no other ringsizes are accepted on the Monero network. Static ringsizes 9, 12, or 15 would be good options, and I will explain below why those ringsizes specifically are special.

The first option would be simple to implement, and the optimizations of RingCT would be immediately realized at very little added cost. However, the second option would bolster privacy on the Monero network to a much higher degree. By enforcing only one valid ringsize for transactions, we remove any confusion and complexity for new users who do not understand why you would want to increase the ringsize of a transaction if all transactions in Monero are already private.

Since the protocol would enforce a static ringsize, all transactions would be homogeneous, which would thwart analysis by adversaries who may know, for example, that exchange x uses ringsize y for all their outgoing transactions, or that there is a user who regularly specifies atypical ringsizes like 41 for all their transactions. This removes a human element from transactions that could otherwise lead people to unintentionally reduce their own privacy, as well as other users' privacy.

In the future, businesses and services could be 'assigned' very specific ringsizes that would stick out on the blockchain, letting passive observers more easily identify their outputs in other ring signatures. This is only hypothetical, and one of many possible scenarios where irregular ringsizes could reduce privacy in the Monero network.

I believe that this is a very valuable and natural addition to Monero since the protocol already makes decisions to protect its users like dynamic fees and randomly choosing their ring partners according to specific parameters. A static ringsize would just further streamline the user experience while defending against many attacks via passive analysis of ring signatures.

In addition to the arguments I made for a static ringsize, I will also propose a new output selection algorithm to support it. The reason I specifically mentioned static ringsizes 9, 12, and 15 is this: if a third of ring partners are chosen randomly from the past five days and the remaining two thirds are chosen at random from the whole output set, then a few narrow edge cases that would threaten privacy are completely eliminated, or at the very least made much less impactful than with only a default ringsize of 4.
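To make the selection rule concrete, here is a minimal sketch of how a wallet might implement it. This is my own hypothetical illustration, not existing Monero code; `txo_ages` is assumed to be a list of output ages in days, and collisions between the two draws are ignored for simplicity.

```python
import random

# Hypothetical sketch of the proposed selection: with a static ringsize
# that is a multiple of 3, one third of the ring partners come from the
# last five days and the remaining two thirds from the whole output set.
# The caller adds the real spend separately.
def select_ring_partners(txo_ages, ringsize, recent_window=5.0):
    recent = [i for i, age in enumerate(txo_ages) if age <= recent_window]
    n_recent = ringsize // 3
    n_random = ringsize - n_recent
    partners = random.sample(recent, n_recent)                 # recent third
    partners += random.sample(range(len(txo_ages)), n_random)  # random rest
    return partners
```

With ringsize 12 this yields the four recent and eight random partners used in the scenarios below.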

I will demonstrate these claims with some hypothetical scenarios using a ringsize of 12 as an example.

Scenario 1: Alice sends Bob a transaction with an output that is two days old. In Alice's ring signature are four inputs from the past five days chosen by our output selection algorithm (plus Alice's real spend) and eight additional inputs chosen at random.

Alice has complete financial privacy, as her real output is masked by the other ring partners without any major inconsistencies in our output selection algorithm. Since there is a chance that at least one of those eight randomly chosen outputs could also be from the last five days, an additional recent output in the ring signature is not entirely suspicious. Passive observers will still take this into account, since a lot of people spend money soon after receiving it. Scenarios where this would definitely call the legitimacy of randomly chosen outputs into question and directly threaten privacy are discussed in more depth later.

Scenario 2: Same as Scenario 1 but instead Alice has an output that is one month old. Our output selection algorithm will again choose four random outputs from the past five days and eight outputs at random.

Now the problem is that Alice's output is older than the outputs selected from the last five days, and if none of the other eight random outputs happens to fall within the last five days, then only four recent outputs will exist in the ring signature. This is a direct indication that none of those four is the real spent output, because exactly four recent outputs would be chosen by our algorithm anyway, hence they must be fake. Luckily the other eight random outputs still mask Alice's real output, so this scenario does not greatly threaten her privacy: the transaction would effectively still have the strength of ringsize 8, down from 12.

Scenario 2 is less likely to occur if a larger timespan is chosen for recent outputs in our algorithm, since the randomly chosen outputs then have a higher chance of falling within that timespan. But since money is typically spent shortly after being received, widening the window would directly weaken the obfuscation provided by the recent outputs selected for a ring signature.

The output inconsistencies in Scenario 2 cannot be solved very well without adding additional formality to the output selection algorithm, which would make it easier to single out specific outputs that are not consistent with our selection rules, as in Scenario 2. With a ringsize of 12, at the very least an effective strength of ringsize 8 would remain for a transaction.

However, if an observer were able to find inconsistencies within a transaction with multiple inputs, where all or at least some rings contain more than four outputs younger than five days, it is incredibly unlikely that the eight random outputs for each input's ring signature would contain very recent outputs. This would call into question the legitimacy of the eight outputs older than five days, giving the transaction a hypothetical strength equal to the number of outputs in the ring signature younger than five days, or at the absolute bare minimum the strength of ringsize 4.

These scenarios will have to be accepted, and thankfully they do not completely destroy the privacy of a transaction even when the output selection algorithm chooses poor ring partners.

Increasing the static ringsize would decrease the likelihood of these contradictions in ring signatures, but it is a tradeoff with transaction sizes. Ideally most transactions would be able to take full advantage of ringsize 12, but the privacy of ringsizes 4 and 8 can still be obtained in worst case scenarios.

I personally prefer ringsize 12 because when the output selection is not in your favour, at the very least the strength of ringsize 4 can still be obtained, which is the original ringsize scheduled for the v5 hardfork. Ultimately any ringsize of 9 or greater that is a multiple of 3 can be used, and if we take into account optimizations of ring signatures and rangeproofs then even larger multiples of 3 could be considered.

In my opinion, the privacy that would be gained with a static ringsize 12 far outweighs any short term scalability issues that may arise, and it would be a boon to the privacy and security of the Monero network while providing good safeguards for maintaining a minimum privacy level of ringsize 4.

I hope valuable discussions can come from this post and that optimizations which further the security of the network can be discovered.

Thanks for taking the time to read my writeup. Any input is greatly appreciated.

EDIT: If anyone is just now finding this discussion and does not want to go through the entire backlog (I don't blame you ;) ), here is where things stand: I have been surveying bitcoin transactions for the age of outputs and categorizing them by the last day, week, month, year, and older than a year, to get the probability of outputs of those ages being spent and to help construct an output selection algorithm for Monero's ring signatures.

After long discussions with @iamsmooth, having only one static ringsize may be a bit extreme, as it does not give a user the freedom to increase their ringsize for transactions. @moneromooo-monero mentioned possibly having three static ringsizes for users to choose from. I have been favoring that option because of its similarity to the three fee options users can already choose from based on urgency: we could have three ringsize options based on paranoia.

So for the time being we are just trying to narrow down what an output selection algorithm would look like. I have to thank @iamsmooth for giving critical feedback and nitpicking different ideas. There is no strict need to increase the ringsize from 4 once we come to a conclusion on a suitable output selection; however, I still favor giving the mandatory ringsize a slight bump (to 6?) based on RingCT's savings in transaction size, but if there is no consensus for doing so, ringsize 4 is still a strong option.
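The survey described in this edit could be sketched as a simple bucketing pass over spent-output ages (my own illustration; ages are in days and the thresholds are the categories mentioned above):

```python
# Bucket a spent-output age into the categories used in the survey:
# last day, week, month, year, and older than a year.
def bucket_age(age_days):
    if age_days <= 1:
        return "day"
    if age_days <= 7:
        return "week"
    if age_days <= 30:
        return "month"
    if age_days <= 365:
        return "year"
    return "older"

# Count how many spends fall into each bucket; dividing by the total
# gives an empirical estimate of the spend-age distribution.
def age_histogram(ages):
    hist = {"day": 0, "week": 0, "month": 0, "year": 0, "older": 0}
    for a in ages:
        hist[bucket_age(a)] += 1
    return hist
```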

@ghost


ghost commented Feb 5, 2017

This is a great topic to highlight. To my knowledge, we haven't really had a decent post-RingCT discussion of ringsize. I assume keeping the size of the database smaller rather than bigger will be a part of this discussion, but personally I like your thinking, particularly the idea of a static ringsize.

@NanoAkron


Contributor

NanoAkron commented Feb 5, 2017

Is this inspired by the latest work by @knaccc describing the churn required for true anonymity in ring sets?

@olarks


olarks commented Feb 5, 2017

@NanoAkron I have not been around the last few weeks, so I must have missed that discussion. This writeup was inspired by looking deeper at ring signatures and output selection algorithms. That said, I am not aware of anything else that could fix these problems besides setting a suitable static ringsize such as 12, as discussed above. The problems are simply inherent to the output selection algorithm, and the best we can do is take those flaws into account with a ringsize that can still protect users.

@RandomRun


RandomRun commented Feb 5, 2017

Hey, I am glad to see input selection and ringsize values being put under the microscope to fix some information leaks that still exist. Indeed, if one looks at the high ringsize transactions, it is possible to trace back a user's inputs with some probability of success, assuming that the user is "the high ringsize guy" who uses the highest values in each ring.

I think that @olarks's approach is in the right direction, but I'd like to make a few suggestions. Before moving on to those, I just wanted to note that there are two different issues at play here: (i) input age correlation, and (ii) ringsize value correlation, both of which may leak information and help with traceability.

Inputs age:

If we knew the actual probability distribution of the ages of the inputs actually consumed in each new ring, then we could just sample that distribution as much as needed to build new transactions. AFAIK, we don't have that distribution yet, but it would be a worthy endeavour to try to obtain it by looking at the blockchains of some non-private coins, like Bitcoin. I bet that if someone did that, we would be looking at something that resembles a Zipf distribution, or otherwise some form of power law distribution. In that regard I think that the distribution suggested in the OP might not be the best fit, since it is the superposition of two uniform distributions.

Assume for the moment that the actual distribution is a Zipf. Then all the wallet has to do is pick a day according to it, and then, within that day, choose one input uniformly at random. Repeat this process until you get enough inputs to complete your ring. Add your own input, which by definition can be viewed as picked according to the actual distribution, and you are done.

[As a side note, if it turns out that that "the actual distribution" and hence the one we use to sample the inputs does follow a power law and has a long tail, it might make the blockchain a bit more prunable, in the sense that over time it would be very rare to see a transaction involving a very old input, real or decoy. And so, maybe not all nodes will have to keep the blockchain history older than say 5 years (although of course all the inputs would still have to exist somewhere). This would not be possible if we keep sampling inputs according to a uniform distribution, though.]
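A minimal sketch of this sampling procedure, assuming a Zipf law over day ages (the exponent `s` and the `outputs_by_day` index are hypothetical):

```python
import random

# Pick an age in days with probability proportional to 1/k^s (Zipf-like).
def sample_day(max_age_days, s=1.0):
    days = list(range(1, max_age_days + 1))
    weights = [1.0 / (k ** s) for k in days]
    return random.choices(days, weights=weights)[0]

# Build a ring: the real input first, then decoys drawn by picking a
# day per the Zipf law and one output uniformly within that day.
def build_ring(outputs_by_day, real_output, ringsize):
    ring = [real_output]
    while len(ring) < ringsize:
        day = sample_day(len(outputs_by_day))
        candidates = outputs_by_day[day - 1]  # outputs created `day` days ago
        if candidates:
            ring.append(random.choice(candidates))
    return ring
```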

Ringsize:

Monero has a good level of privacy, and that comes in part from the size of your anonymity set in each transaction, which in this case is the ring size, so the bigger the better.
I agree that we should make it easier for the user not to do something stupid everywhere we can, and that includes ringsize selection; and I like the idea of increasing the minimum ringsize well beyond 4, since that doesn't seem to be a bottleneck, at least in the foreseeable future. But on the other hand, I would prefer it to be done by way of selecting good default options rather than banning some ringsize values (except for values under the minimum, which is of course fine).

So instead of simply defaulting the wallets to ringsize 4 or 8, why not have the default be a random value as well? Perhaps let it be a uniform random value between 8 and 50, or a triangular distribution from 8 all the way up to 100, or even another power law. I don't have a completely well formed opinion about the choice of distribution for the ringsize, except that I really think it should be random by default. That way, if a user does decide to use a higher ringsize, hopefully they won't stand out as much as they do right now, when it is pretty much all ringsize 2 and 4...


@NanoAkron: Could you please link to @knaccc's work that you referenced?

@hyc


Contributor

hyc commented Feb 5, 2017

@knaccc's latest draft is here http://termbin.com/kplo

@JohnnyMnemonic22


JohnnyMnemonic22 commented Feb 5, 2017

As a possible solution to the scenario 2 issue, instead of the number of selected 5-day-old ring partners being fixed to ringsize/3, why not make it a random number between 0 and ringsize-1? Wouldn't that eliminate the possibility of clever deductions based on input age?

@kenshi84


Contributor

kenshi84 commented Feb 6, 2017

@RandomRun

So instead of simply defaulting the wallets to set ringsize to 4 or 8, why not have the default be a random value as well? Perhaps let it be a uniform random value between 8 and 50, or a triangular distribution from 8 all the way down to 100, or even another power law.

But the tx fee is proportional to the ringsize. For example, currently the fee for a 2-in/2-out transaction with ringsize of 8/50/100 would be 0.024/0.033/0.045, respectively. So I can imagine users will cancel and repeat the tx construction algorithm until the algorithm picks a smaller ringsize to reduce the fee. I think that was the point of the enforced minimum mixin.

@olarks


olarks commented Feb 6, 2017

@JohnnyMnemonic22 The main problem with a randomly variable output distribution is now your minimum privacy guarantee is random too. While this could make scenario 2 less likely to occur the minimum privacy level being guaranteed could be lower than ringsize 4, in some scenarios, which I think is undesirable. Having a concrete minimum ringsize that can be accepted to fall back on, like ringsize 4, when randomly chosen outputs are unfavorable for a user is the better alternative imo.

@RandomRun A Zipf distribution or other power laws could be possible, but wouldn't this add an additional uniform pattern to how outputs are chosen? Any output that does not fit the uniformity of the power law would likely be the real spent output, right?

The reason why I chose the output selection to have a completely random distribution, within two separate timeframes, is that it is less likely to be successfully scrutinized by an attacker sorting through randomness. By having the number of outputs from the last five days be ringsize/3, we can provide a 'minimum' functioning ringsize, ensuring that even if our randomly selected outputs are poorly chosen we have a safeguard in place that still provides good privacy.

I had also pondered having a random ringsize chosen for every transaction within a defined range, but this cannot be enforced across all wallet software. The network would still allow any ringsize in our chosen range to be used, which means not all transactions would be using random ringsizes; most would favor the minimum ringsize to save on transaction fees, as @kenshi84 points out.

I think a static ringsize would be invaluable for the level of homogeneity it would provide, because the more transactions stand out from each other, the more likely passive analysis can make a connection among them.

Thanks for your input @RandomRun

@Gingeropolous


Contributor

Gingeropolous commented Feb 6, 2017

@RandomRun @olarks, I'm digging the higher fixed ringsize. And hopefully we're all agreed that it's ringsize now. I've had a longstanding concern over input ages.

@iamsmooth posted something recently in IRC and I dunno if it was followed up, but it sounds like it's related, so I thought I'd post it here, and I hope smooth doesn't mind that I copied and pasted his words.

Jan 30 11:04:30 <smooth>        to pick some random recent transaction, pick a random input, and then use that (approx) age to pick your fake out
Jan 30 11:04:58 <smooth>        repeat until done
Jan 30 11:06:21 <smooth>        this will converge to fake outs having the same distribution as real outs, even though real outs cant be identified. statistical magic
Jan 30 11:10:53 <smooth>        there might be some reasonable refinements like picking a random recent transaction of the same or similar shape (in terms of number of inputs and outputs)
Jan 30 11:11:12 <smooth>        at the cost of needing even more volume for that to be reasonable
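If I read that right, the trick could be sketched roughly like this (the transaction layout here is purely hypothetical; each input's ring is just a list of member ages):

```python
import random

# To pick a decoy age: take a random recent transaction, take a random
# input (ring) of it, and reuse the age of a random ring member. Since
# the real member can't be identified, this samples a mixture of the
# real-spend age distribution and the decoy distribution, which the
# quoted argument says converges toward the real one when iterated.
def pick_decoy_age(recent_transactions):
    tx = random.choice(recent_transactions)
    ring_member_ages = random.choice(tx["input_rings"])
    return random.choice(ring_member_ages)
```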

what do you guys think?

@iamsmooth


Contributor

iamsmooth commented Feb 6, 2017

The discussion of a single fixed ring size is worth having, from the perspective of increasing homogeneity, and therefore offering a potential system-wide benefit. I'm generally not a fan overall of proposals to blanket increase the minimum ring size because of the accompanying increase in tx size, cost, and therefore likelihood of reduced usage (which in turn carries its own cost in reduced privacy in practice in addition to probably increased likelihood that Monero fails to reach a critical mass of adoption quickly enough or at all, and fails entirely).

It is certainly usually true that one can improve the privacy of individual transactions in various ways by increasing the ring size, but this is different in kind from the sorts of catastrophic cascading system-wide failures/attacks that motivated the MRL-0004 recommendations. The bar to mandating increased costs is higher when it is a more direct quality/cost trade-off and not a system-wide failure mode or benefit (acknowledging the potential that a fixed size might carry such system-wide benefits).

I expect there will be a variety of wallets and tools built on top of the Monero protocol that will improve privacy at higher cost or with different tradeoffs (such as delayed availability of funds) by managing transactions in various useful ways (and likewise some that will minimize costs within the protocol rules for users who care more about that). Those don't all need to be built into the Layer 1 core protocol. Unless there is demonstrated systemic effect, I'm reluctant to (attempt to) impose a more expensive one-size-fits-all solution upon everyone.

@olarks


olarks commented Feb 6, 2017

Another concern is that as the TXO set ages, each randomly chosen output becomes less effective, because in the future, if four year old outputs are being selected for ring signatures, it is likely those aren't the real outputs being spent. At the same time you don't want to favor only newer outputs for selection, because then spending older outputs will stick out and be suspect as the real output being spent.

There would need to be some kind of schedule for increasing the mandatory ringsize every x years to offset this problem, so that even when old outputs are selected there are enough ring partners for them not to look suspicious in the ring signature. Hopefully by then we will have good scaling solutions, so this increasing bound does not bog down the network.

@hyc


Contributor

hyc commented Feb 6, 2017

How about: for ring size N, we pick N/2 outputs around the same age as the real one, and N/2 outputs clustered around a randomly chosen one?

@olarks


olarks commented Feb 6, 2017

@hyc Overall I like your idea, but the group of outputs containing the real spend would stand out by having (N/2) + 1 (the real spend) outputs. So every transaction's effective ringsize would immediately be cut in half.

@hyc


Contributor

hyc commented Feb 6, 2017

Assuming this approach is worth pursuing, then we can (1) make sure that the total number of outputs N is always even, so the real spend group is (N/2)-1 random outs + the real out, and the other group is N/2 random outs. Or, (2) if the total number of outputs is odd, randomly choose whether the real group or the random group gets the odd random out.
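A sketch of this two-cluster selection with an even N (my own illustration; outputs are indexed by position in the global output set, `spread` is an arbitrary cluster width, and collisions with the real index are ignored for simplicity):

```python
import random

# Half the ring clusters around the real output, half around a random
# pivot, so an observer can't tell which cluster holds the real spend.
def two_cluster_ring(num_outputs, real_idx, ringsize, spread=100):
    assert ringsize % 2 == 0, "keep N even so both clusters look alike"

    def near(center, count):
        lo = max(0, center - spread)
        hi = min(num_outputs - 1, center + spread)
        return random.sample(range(lo, hi + 1), count)

    pivot = random.randrange(num_outputs)
    ring = [real_idx]
    ring += near(real_idx, ringsize // 2 - 1)  # real cluster: (N/2)-1 decoys
    ring += near(pivot, ringsize // 2)         # decoy cluster: N/2 outputs
    random.shuffle(ring)
    return ring
```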

@olarks


olarks commented Feb 6, 2017

@hyc To expand on your idea, the outputs could be selected based on whether or not the spent output is from the last five days. We could use a static ringsize 10 to ensure that both groups of outputs have the strength of ringsize 5 (4 ring partners + 1 potential spend).

If the spend is from the last five days, then four additional outputs are chosen at random in that timespan. For the remaining five outputs, you could randomly choose one output from the TXO set and use it as a base point for randomly choosing four additional outputs within five days of it.

If the spend is older than five days, the situation is much the same, but instead of choosing another random base point you would simply choose five outputs at random from the past five days. So in either scenario it is not obvious whether the real spend is in the recent group or the random group, because both situations would occur in every ring signature regardless of the age of the real spent output.

A major hole in this, though, is a spend output that is only a few days older than the randomly selected outputs from the past five days. It could happen that all the outputs randomly chosen within five days of the spend output are themselves less than five days old, which would directly expose the real spend output, since it would be the only output in the ring signature older than five days.

Choosing the extra four outputs randomly from the whole TXO set does not solve this problem very well either, because then spend outputs that are only a bit older than five days look out of place in the ring signature, as such outputs are unlikely to be randomly chosen very often, and you risk having the five outputs from the past five days discarded as potential spend outputs. That effectively gives the transaction only ringsize 4, with a threat of unmasking a spend output a little older than five days, because it is unlikely to be selected often among the four random outputs.

This goes back to my original proposal of a static ringsize 12: in the same scenario, only four outputs would be discarded, while giving a much higher level of plausible deniability for the real spend output, because you get 8 random outputs from the selection algorithm instead of 4. The suspect spend output is thus scrutinized to a much lesser degree, at a ringsize only slightly larger (12, up from 10) than this selection scheme.

@JamesCullum


JamesCullum commented Feb 6, 2017

I think in this discussion we should also keep in mind that a higher ringsize leads to a bigger transaction and therefore higher fees. We want to make sure that people use it for normal transactions, which don't necessarily require perfect privacy and are simply not supposed to threaten the legitimacy of the whole network, which is why MRL4 proposed the minimum ringsize. I agree with @iamsmooth that a higher ringsize could be suggested for people seeking better privacy, maybe including a different output picking algorithm, but increasing the ringsize this dramatically would have more downsides on the economic side than benefits on the privacy side.

Nobody wants to use a currency where the transaction fees are a major cost. If I want to buy a coffee with a cryptocurrency, I won't choose the one where I pay lots of money to make sure nobody knows I bought the coffee. If coffee is illegal in this country I may be willing to do so, hence I would increase the ringsize myself.

Mandatory ringsize of 4? Yes.
Mandatory ringsize of >4? No.

@hyc


Contributor

hyc commented Feb 6, 2017

@JamesCullum Your concern doesn't seem valid. The txn size (and thus the fee) is dominated by the number of outputs, not by the ringsize. http://monero.stackexchange.com/questions/3323/with-ringct-on-can-i-lower-mixin-safely-to-save-on-tx-fees

@hyc


Contributor

hyc commented Feb 6, 2017

@olarks I have no problem with your suggestion of ringsize 12 in this context.

@olarks


olarks commented Feb 6, 2017

@JamesCullum Thanks for your input. The cost of increasing the ringsize with RingCT transactions is very overstated, and I did also mention that MRL-4 did not take RingCT savings into account because that paper was written well before RingCT existed, hence why I want to discuss increasing the ringsize now.

This proposal is to make sure that at the bare minimum a transaction gets the security of ringsize 4 when under heavy analysis. Ringsize 12 is just the cost of being able to strongly maintain this minimum security.

Your point regarding coffee is not really valid here, because I am trying to make sure that when you do buy that coffee, your money is still fungible. We would not want you to be investigated by authorities because your money had been used in the past by someone else for 'suspicious' things, leaving you framed for it. This is what currently plagues Bitcoin and prevents it from ever being used in the real world as digital cash.

The results of this discussion and the proposal I made will allow Monero to resist strong analysis of ring signatures and to be real digital cash.

@JamesCullum


JamesCullum commented Feb 6, 2017

Oops, sorry then. If I recall correctly, before RingCT the ringsize had a bigger influence, hence my statement about keeping the fees low. If we can do that while at the same time having more privacy, we should do it.

Of course it is important to stay fungible, but we already are at the moment, and transaction age correlation would still leave sufficient plausible deniability, meaning that a filter to block tainted coins would hit a high false negative rate. So I think we are currently fungible (maybe not for long though, hence a higher ringsize than 2) and should work on maintaining that. 12 may be a little high, but if it doesn't cost much more, it's alright.

@RandomRun


RandomRun commented Feb 6, 2017

@kenshi84: I can see your point about users choosing to minimize their fees, and therefore the ringsize of most transactions converging to the minimum allowed, as already happens, which motivated OP's suggestion to fix the ringsize. I guess if that is the way fees are computed, we can't escape the scenario where only minimum ringsizes are used, unless we change how fees are calculated to subsidize higher ringsize transactions (something like: a flat fee of 0.033 for all transactions not bigger than the 2-in/2-out ringsize 100 you used as an example). But this is more me thinking out loud about your comment than giving a concrete suggestion. I actually don't fully understand how fees are calculated, or even why they are "fixed", as opposed to their price being defined by market forces on the network.

@olarks:

Zipf distribution or other power laws could be possible, but this would add an additional uniform pattern to how outputs are chosen wouldn't it?

I don't see how it would add any information. Zipf distribution or other power laws are biased distributions, and therefore not uniform, which is the reason why I was suggesting to use them, as I believe this is the way real outputs ages behave.

Any output that does not fit the uniformity of the power law would likely to be the real spent output right?

An output is just a possible outcome from the distribution. I believe it doesn't make sense to say that it doesn't fit its distribution. At best you could say that it is a rare outcome, but that is fine, as it would be rare both when it happens through a real spend, or through sampling.

The reason why I chose the output selection to have a completely random distribution

All the distributions suggested in this thread are completely random. I believe you are using "uniform" and "random" interchangeably, but those are not the same thing. In fact, my main point has been that the actual distribution of real spent outputs is a biased distribution, and therefore if we use a uniform distribution to create our rings, that will leak information.

Let me exaggerate the scale, and simplify your suggestion a bit, to hopefully make this more visible:

Let's say output ages do in fact follow a Zipf distribution, and that we are producing our rings by sampling outputs from the blockchain uniformly at random. Assume we have ten years of blockchain history and that users are choosing to produce rings of size 10.

The rings produced with this uniform sampler wouldn't be far from having roughly one output from each year. But as a user, by assumption, you would very likely be spending a very recent output, and it would be very unlikely that any of those 10 outputs chosen over the course of a decade would be more recent than yours. So an attacker could guess that your real input is the most recent one, and be correct with high probability.

Now you may shrink the interval to 5 days, but the effect is still there; in the end what you are suggesting is overlaying two samples obtained from two uniform distributions, which suffer from the same attack just described.
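The effect described above is easy to check numerically. Here is a quick simulation of my own (all parameters made up for illustration: ten years of history, an exponential real-spend age with mean 5 days standing in for the skewed distribution, ring size 10) showing how often a "guess the newest output" heuristic succeeds against uniformly sampled decoys:

```python
import random

# Toy model (assumed parameters, not actual Monero data): real spend
# ages are heavily recency-biased, while decoys are drawn uniformly
# over the whole chain history.
CHAIN_AGE_DAYS = 3650        # ten years of history
RING_SIZE = 10
TRIALS = 100_000

hits = 0
for _ in range(TRIALS):
    real = random.expovariate(0.2)  # exponential age, mean 5 days
    decoys = [random.uniform(0, CHAIN_AGE_DAYS) for _ in range(RING_SIZE - 1)]
    if real < min(decoys):          # "guess the newest output" heuristic
        hits += 1

print("newest-output guess succeeds %.1f%% of the time" % (100.0 * hits / TRIALS))
```

Under these assumptions the heuristic wins the overwhelming majority of the time, which is exactly the leak being described.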

At the same time you don't want to be only favoring newer outputs for selection because then spending older outputs will stick out and be suspect for the real output being spent.

They won't be suspect, because it would be equally rare to see them actually being spent or randomly selected as a decoy, since both events would occur according to the same distribution. You might feel exposed seeing your output alone as the oldest one in the ring, but that is only because you know a priori that it is your output. To everybody else, it will look just like any other ring in any other transaction, all with more recent outputs than older ones.

The main point I am trying to make is that within each ring, all outputs should have equal probability of being the real one, and that doesn't happen if you are using a uniform distribution to mask events that are modeled by a non-uniform distribution.

@Gingeropolous @iamsmooth: The problem with using outputs that are close in time to the real one is that it leaks information about the time of the real output being spent. And worse, the more decoys chosen according to that method, the better the estimate of that piece of information.
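To illustrate that last point with a toy model (my own sketch, with assumed parameters, not a claim about any specific selection algorithm): if the decoys cluster around the real spend time with some jitter, the mean age of the ring is an estimator of the real spend time, and the estimate tightens as the ring grows:

```python
import random
import statistics

# Toy model (assumed parameters): the real spend happens at time t and
# the decoys are drawn near it with Gaussian jitter. The ring's mean
# age then estimates t, with error shrinking as the ring grows.
t = 1000.0       # true spend time, arbitrary units
jitter = 50.0    # spread of the decoys around t
TRIALS = 2000

results = {}
for ring_size in (5, 10, 40):
    errors = []
    for _ in range(TRIALS):
        ring = [t] + [random.gauss(t, jitter) for _ in range(ring_size - 1)]
        errors.append(abs(statistics.mean(ring) - t))
    results[ring_size] = statistics.mean(errors)

for ring_size, err in sorted(results.items()):
    print("ring size %2d: mean estimation error %.1f" % (ring_size, err))
```

The error falls roughly like the square root of the ring size, so larger rings built this way sharpen, rather than blur, the attacker's estimate.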

@olarks

olarks commented Feb 6, 2017

@RandomRun OK, I understand what you mean a lot better now; I was getting caught up in semantics. We could try surveying Bitcoin, and maybe Litecoin and Dogecoin too, since they have good histories going back multiple years and still see good transaction volume to this day. Hopefully some insight can be gained from what their Zipf distributions look like, and hopefully it can be translated into an output selection algorithm Monero can take advantage of.

@luigi1111

Collaborator

luigi1111 commented Feb 6, 2017

BTW (I haven't read the entire thread): ring size in RingCT actually scales identically to pre-RCT. It scales very slightly worse with the number of inputs (this could be remedied if there were a strong need, which I do not believe is the case).

Now, as a % of base transaction size, it is of course smaller. It can still balloon with high input count with high ring size (using the two together is dubious anyway, due to likely real input correlation).

@kenshi84

Contributor

kenshi84 commented Feb 6, 2017

I also wanted to bring up the subject of temporal alignment of ring members across different rings, for example:

If this kind of temporal alignment happens in a tx with many inputs, and especially if it happens among the older ring members, then it becomes highly probable that those are the real spent outputs. knaccc expressed a similar concern on IRC. A possible remedy might be to artificially increase the chance of temporal alignments for even older outputs so that this kind of event is not too rare. I have no idea how to achieve that, though.

Edit:
A similar question already asked on StackExchange

@kenshi84

Contributor

kenshi84 commented Feb 6, 2017

@JamesCullum

We want to make sure that people are using it for normal transactions which don't necessarily require perfect privacy

I think your argument goes against the original idea of Monero, i.e., there's no such thing as 'perfect privacy' vs 'moderate privacy'. If the strongest possible privacy feature is not enforced on all participants by default at the protocol level, there's actually no privacy at all; look at Dash/Zcash etc. A small group of paranoid users choosing some 'more private' transaction method at higher cost won't help, because such transactions will stand out in the blockchain and be easy to analyze. The more uniform the transactions, the better the privacy.

@iamsmooth

Contributor

iamsmooth commented Feb 7, 2017

If the strongest possible privacy feature is not enforced to all participants by default at the protocol level, there's actually no privacy at all

This is not the original idea of Monero at all. The changes to apply a mandatory minimum ring size were made to address specific issues identified in MRL-0001 and MRL-0004 where individual users choosing lower ring size could severely compromise the privacy of other users, and make certain Sybil attack vectors much cheaper for an attacker. That justified imposing higher costs, because the alternative meant severe damage or vulnerability to the system as a whole.

There is no such case being strongly made here, at least nowhere near the degree to which the case was made in MRL-0001 and MRL-0004.

A small group of paranoid users choosing some 'more private' transaction method at higher cost won't help, because such transactions will stand out in the blockchain which would be easy to analyze

This needs further study and characterization to better clarify what is meant by "won't help". It certainly does help in the sense of increasing the immediate anonymity set of a particular transaction, including the strength of plausible deniability of having spent a particular output. This carries tradeoffs in terms of standing out from the crowd, but it is not clear (at least to me), without more rigorous analysis, how or when these tradeoffs apply, nor whether they impose damage or vulnerability on other users of the system (if a particular user finds the tradeoffs of a larger ring size unattractive, that user can choose not to use one).

@iamsmooth

Contributor

iamsmooth commented Feb 7, 2017

@kenshi84

I also wanted to bring up the subject about temporal alignments of ring members across different rings

That should probably be made into a different issue. Simply increasing the ring size a little won't solve that problem, at least not entirely.

Time alignment is somewhat addressed in MRL-0004 and the recommended solution is largely to avoid or limit how related outputs are combined and instead send multiple transactions, but that has some costs, including potentially increased costs from RingCT.

@iamsmooth

Contributor

iamsmooth commented Feb 7, 2017

@RandomRun

@Gingeropolous @iamsmooth: The problem with using outputs that are close in time to the real one ...

That wasn't my suggestion. Maybe you meant to direct that at someone else though; I think @hyc made a suggestion something like that.

@iamsmooth

Contributor

iamsmooth commented Feb 7, 2017

@hyc

The txn size (and thus the fee) is dominated by the number of outputs, not by the ringsize.

It isn't always dominated by the number of outputs. As @luigi1111 mentioned a few replies back, the signatures can still balloon and exceed the size of the range proofs if the ring size and number of inputs are both large (since they scale with n*m), assuming of course that the number of outputs isn't also large.

A large number of inputs is less common with RingCT but currently it still does happen (I've seen txs with >100 inputs recently). There might be other reasons to discourage that though.

@kenshi84

Contributor

kenshi84 commented Feb 7, 2017

@iamsmooth

This is not the original idea of Monero at all.

Well, maybe I just took the idea wrong and am being an extremist. But I do find it puzzling that users are currently allowed to use Monero with different degrees of privacy by changing the ringsize. I saw someone on IRC (I forget who) asking: "Oh, isn't it always private when I use Monero?" There have been quite a few questions asked about the degree of privacy w.r.t. ringsize, e.g.:

So at least this seems like a common point of question and possibly a source of confusion. Also, users might even damage their privacy by choosing a high mixin inappropriately.

I really wonder: does using higher mixin really provide more privacy than using default mixin (currently 4, possibly higher in the future)? If so, does that mean users using the default mixin are somehow risking their privacy?

I wish to be able to tell people simply that "your privacy is well protected if you're using Monero", instead of "your privacy is protected very well/so-so depending on how you configure your Monero wallet". In my opinion, MRL should continuously study the level of possible Sybil attacks and blockchain analysis at the present time to come up with a particular ringsize that provides good enough privacy for everyone, and enforce it at the protocol level at every hardfork.

The community decided to accept the increased cost of RingCT in favor of the improved privacy, despite some users complaining. Likewise, shouldn't we accept the increased cost of a fixed higher ringsize to ensure high enough privacy?

Those who want to sacrifice their privacy to save fees may switch to other coins. Or alternatively, some Lightning Network built on top of Monero that offers lower fees at the cost of privacy may be able to satisfy such demand in the future.

@JollyMort

Contributor

JollyMort commented Jul 11, 2017

I think there is still work to be done to determine...

I agree, but that is something we can't conclude here since it requires more research.

I believe that the required amount of churns could vary significantly depending on the method chosen to analyze it, especially if the output selection algorithm is taken into account.

As for the minimum ringsize: a month after the HF, there will be enough real data to analyze how many times the new outputs get picked :) Since 50% is chosen from the last day or so and the ringsize will be 10, I expect them to get picked quite often if network usage doesn't oscillate much. We'll see.

@lethos3

Contributor

lethos3 commented Jul 11, 2017

Fixed ring size of 10 looks good.

@iamsmooth

Contributor

iamsmooth commented Jul 11, 2017

How high the minimum ring size needs to be to ensure that most outputs will be used at least once as part of a decoy in another transaction

It is not hard to calculate this using the binomial distribution as a function of the number of outputs created per day and the ring size.
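A quick sketch of that calculation (all numbers here are illustrative, not taken from the network): if each of T transactions draws d decoys uniformly from a pool of N outputs, the chance a given output is never picked is the binomial zero-successes term:

```python
# Back-of-the-envelope sketch (assumed numbers, not network data):
# each of T transactions draws d decoys uniformly from a pool of N
# outputs; P(a given output is never picked as a decoy) is the
# binomial term for zero successes, (1 - d/N)^T.
def p_never_picked(N, T, d):
    return (1.0 - float(d) / N) ** T

N = 10_000   # outputs in the eligible pool, e.g. one day's worth
d = 9        # decoys per input at ring size 10
for T in (1_000, 5_000, 10_000):
    print("T = %5d txs: P(never a decoy) = %.4f" % (T, p_never_picked(N, T, d)))
```

With these hypothetical numbers, the never-picked probability drops quickly as daily transaction volume grows, which is the sense in which a larger ring size ensures most outputs appear as decoys somewhere.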

@fluffypony

Collaborator

fluffypony commented Jul 26, 2017

FYI: we're aiming for a code freeze, tag, and release on August 15th, so that we have a month before the September hard fork. This means we need to pick a magic number in the next two weeks, bearing in mind that we can also change it in the subsequent hf.

@luigi1111

Collaborator

luigi1111 commented Jul 27, 2017

I object to picking a "magic number" with this little lead time. I prefer leaving the currently/previously planned ring size 5 in place for September.

The wallet can absolutely be changed to use 10 as default.

In any case this does mean that the ringsize bump is the only thing going into this hard fork, though obviously a lot of non-forking changes will go into the release.

@iamsmooth

Contributor

iamsmooth commented Jul 27, 2017

The wallet default, in practice, seems to have limited effect. A large majority of transactions tend to be made using the minimum allowed. In fact it might be better for users to just let the default be the minimum, since that provides the best 'blend into the crowd' protection on the network anyway, and anyone wanting to do something else should perhaps be explicit (and informed, i.e. an 'advanced setting') about it.

However, I don't disagree with what @luigi1111 says about lead time and making the adjustment to the already planned minimum=5 now while considering, well in advance, another minimum or fixed ring size for a future fork. For example, this would be a reasonable time, now, to decide on an April fork.

@JollyMort

Contributor

JollyMort commented Jul 27, 2017

The wallet can absolutely be changed to use 10 as the minimum, even if it's not the consensus minimum.

@dnaleor

dnaleor commented Jul 27, 2017

I prefer the minimum to be equal to the default, as exchanges tend to use the minimum to save on fees (or maybe for regulatory reasons)

Poloniex for example used the minimum for a long time, making it easy for people to trace monero coming directly from them.

@peronero

peronero commented Jul 27, 2017

The September hardfork and mid-August freeze seem to have pretty awkward timing taking into account progress on other important work - is there any merit to pushing back the fork?

  1. Almost all TXs are already RCT so enforcing RCT at protocol level is of negligible benefit
  2. A minimum ringsize bump at this point in time seems almost arbitrary and ad hoc, as a bump below 10 is of negligible benefit but a bump above 20 would bloat the blockchain beyond what is reasonable with current rangeproofs (paraphrasing Surae; I think I got it right). Optimized rangeproofs could be ready in several months, and informed ringsize research could come a couple months after that.
  3. 0MQ and multisig are not ready to make it into the release either but could be in several months

Does it perhaps make sense to wait for 2 and 3 and have a well-informed ringsize bump with shiny new rangeproofs and feature-full release in 6 or so months as opposed to waiting until next spring?

On the other hand, a GUI release is long overdue, but maybe we could squeeze out an interim GUI release with a 0MQ- and MS-less daemon sometime this summer and have the hardfork release in the winter?

@iamsmooth

Contributor

iamsmooth commented Jul 27, 2017

  1. 0mq and multisig are not consensus, they can be released (either together or separately) on any schedule independent of forks.
  2. Agree completely about RCT.
  3. Not sure if the planned increase 3->5 is negligible or not. I'd prefer to see it happen, all else being equal, especially since it's been sort of part of the social contract for 2+ years. I'd consider all other enhancements, whatever their merits, to be in a somewhat different category.
@rusticbison

rusticbison commented Jul 31, 2017

The specification for the X Wallet is a fixed mixin of 4. At this level, the cost to use the wallet is already uncomfortably high.

On a related note: it would be a critical event if the network forks and enforces a higher minimum ring size. That would mean that our app would suddenly fail to send, trapping the user's money and creating a lot of stress. They would certainly lose trust in the technology at that point.

@Gingeropolous

Contributor

Gingeropolous commented Jul 31, 2017

The specification for the X Wallet is a fixed mixin of 4. At this level, the cost to use the wallet is already uncomfortably high.

I wouldn't fix the ringsize at 5. It should be whatever the protocol minimum is, at the least.

Also, increased ringsize doesn't increase the tx size that much, compared to what ringct has done.

@barnyardanimal

barnyardanimal commented Jul 31, 2017

I have read every comment above and will summarize my thoughts citing the following recent points:

I prefer the minimum to be equal to the default, as exchanges tend to use the minimum to save on fees (or maybe for regulatory reasons)

Not sure if the planned increase 3->5 is negligible or not. I'd prefer to see it happen all else being equal, especially since its been sort of part of the social contract for like 2+ years.

It's worth repeating what I think @moneromooo-monero pointed out: a 15% transaction size increase now will become a 30% increase in the future, because when the range proof sizes are eventually reduced by @luigi1111, the rest of the transaction will be smaller by comparison.
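A worked toy example of that effect (all byte counts invented for illustration; the actual 15%/30% figures depend on real transaction composition): a fixed absolute growth in signature bytes roughly doubles as a fraction of the transaction once the range-proof portion is halved.

```python
# Illustrative arithmetic with invented byte counts (the 15%/30%
# figures quoted above depend on the real transaction composition):
# a fixed absolute growth in ring-signature bytes becomes a much
# larger *fraction* of the transaction once range proofs shrink.
rangeproof_bytes = 10_000   # hypothetical
signature_bytes = 2_000     # hypothetical, at the old ring size
extra_sig_bytes = 1_800     # hypothetical growth from a larger ring

before = extra_sig_bytes / (rangeproof_bytes + signature_bytes)
after = extra_sig_bytes / (rangeproof_bytes / 2 + signature_bytes)
print("relative size increase: %.0f%% now, %.0f%% after smaller range proofs"
      % (100 * before, 100 * after))
```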

FYI: we're aiming for a code freeze, tag, and release on August 15th, so that we have a month before the September hard fork. This means we need to pick a magic number in the next two weeks, bearing in mind that we can also change it in the subsequent hf.

I think an increase from 3 to 5 seems fairly negligible. I agree with the (very loose) consensus of 10 as the network minimum AND the client default. Range proofs come soon enough, and a few months of transactions happening with a larger ring size before they are implemented won't be the end of the world.

Let's settle on 10 NOW, a few weeks before August 15, so that wallets now in development can plan accordingly @rusticbison

The specification for the X Wallet is a fixed mixin of 4. At this level, the cost to use the wallet is already uncomfortably high.

I have seen the cost projections above and strongly disagree with this statement. People who use Monero care about privacy and security, and 10 is a much better fit for the current Monero social contract. An increase in ring size has been long expected.

@olarks

olarks commented Aug 7, 2017

Hello everyone. I am glad to see discussion has continued on this topic in my absence. There is lots of catching up to do! I had fallen off the face of the earth for the past few months, but I am finally back. My last post mentioned continued surveying of the Bitcoin blockchain for the age of spent outputs; the results are mostly the same, but with a larger survey size.

| Total Outputs | Last Day | Last Week | Last Month | Last Year | > 1 Year |
| --- | --- | --- | --- | --- | --- |
| 7,029,369 | 4,533,277 (64%) | 1,274,759 (18%) | 672,399 (10%) | 467,780 (7%) | 81,154 (1%) |

Here is the improved python script that will dump the results to a json file in the same location as the python script. Feel free to play with it and modify it if there are other things you want to survey in the Bitcoin blockchain.

import urllib2
import json
import os
import time

# this takes over a day's worth of blocks to parse so parsing
# the same block twice is unlikely
target_blocks = 144

def fetch(url):
    # retries are for lazily dealing with HTTP 429 errors and other
    # transient errors from the API
    for delay in (0, 60, 900):
        time.sleep(delay)
        try:
            return urllib2.urlopen(url).read()
        except Exception:
            if delay == 900:
                raise

def save(data):
    # rewrite the whole file so a shrinking JSON can't leave stale bytes
    with open('survey.json', 'w') as writefile:
        json.dump(data, writefile)

def survey():
    if not os.path.isfile('survey.json'):
        save({'inputs': 0, 'transactions': 0, 'day': 0, 'week': 0,
              'month': 0, 'year': 0, 'old': 0})

    with open('survey.json', 'r') as readfile:
        data = json.load(readfile)

    current_hash = fetch('https://blockchain.info/q/latesthash')

    for _ in range(target_blocks):
        block_data = json.loads(fetch('https://blockchain.info/block/%s?format=json' % current_hash))
        block_height = block_data['height']

        for tx in block_data['tx'][1:]:  # skip the coinbase tx
            data['transactions'] += 1
            for tx_input in tx['inputs']:
                input_index = tx_input['prev_out']['tx_index']
                input_data = json.loads(fetch('https://blockchain.info/tx-index/%s?format=json' % input_index))
                age = block_height - input_data['block_height']
                data['inputs'] += 1
                if age <= 144:        # ~1 day of blocks
                    data['day'] += 1
                elif age <= 1008:     # ~1 week
                    data['week'] += 1
                elif age <= 4320:     # ~1 month
                    data['month'] += 1
                elif age <= 52560:    # ~1 year
                    data['year'] += 1
                else:
                    data['old'] += 1
                save(data)  # checkpoint after every input

        current_hash = block_data['prev_block']

while True:
    survey()
@olarks

olarks commented Aug 7, 2017

My thoughts quickly on the last few posts regarding the September hardfork and the chat logs of the recent meeting: if having a single static ringsize is back on the table, I am in full support of the idea, and if there is consensus for ringsize 10 as a candidate I support that too. A bump to ringsize 10 not interfering with the new adaptive blocksize algorithm is reassuring, and a mere 1 kB increase in 2-in/2-out transactions is a small price to pay for ringsize 10.

I am not aware of any recent improvements in the decoy output selection, but random selection in accordance with the approximate age of spent outputs still seems like a good step in the right direction. I am sure Surae and Sarang can improve further on this, though.

@moneromooo-monero

Contributor

moneromooo-monero commented Aug 9, 2017

I am not aware of any recent improvements in the decoy output selection

ac1aba9

It tries to more or less match the distribution from Miller et al, with no change in overall algorithm.
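For readers curious what such a biased selection might look like, here is a hedged sketch (this is NOT the code in the commit above): sample a spend age from a skewed distribution over log-age and map it back to a block height. The gamma-over-log-age shape and its parameters mirror what Monero later adopted, but are used here for illustration only.

```python
import math
import random

# Hedged sketch, not the referenced commit: bias decoy selection
# toward recent outputs by sampling a spend *age* in seconds from a
# log-gamma distribution (most mass on recent days, with a long tail
# reaching back years), then convert it to a block height.
SECONDS_PER_BLOCK = 120
CHAIN_BLOCKS = 1_400_000    # hypothetical chain height

def pick_decoy_height(tip_height):
    age = math.exp(random.gammavariate(19.28, 1.0 / 1.61))
    blocks_back = int(age // SECONDS_PER_BLOCK)
    return max(tip_height - blocks_back, 0)

heights = sorted(pick_decoy_height(CHAIN_BLOCKS) for _ in range(9))
print(heights)
```

Running it shows most picks landing within the last few thousand blocks, with occasional much older outliers, which is roughly the shape real spends seem to follow.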

@knaccc

knaccc commented Aug 14, 2017

Transaction fee calculator, to help with ring size decisions:

https://www.monero.how/monero-transaction-fee-calculator

@zhizhongzhiwai

zhizhongzhiwai commented Sep 8, 2017

When the cookie meets the blockchain: Privacy risks of web payments via cryptocurrencies
https://arxiv.org/pdf/1708.04748.pdf

@iamsmooth

Contributor

iamsmooth commented Sep 10, 2017

This issue should be closed, as the September 2017 hardfork is finalized. A new issue can be opened if there is a need for further discussion (referencing this one to avoid duplication).

@moneromooo-monero

Contributor

moneromooo-monero commented Sep 10, 2017

Seems simpler to continue here. If the date in the title seems a problem, just remove it.

@panopolis

panopolis commented Sep 10, 2017

Sorry to jump in randomly, but I've read almost all of this discussion and wanted to give my input. First, having a mandatory fixed ring size seems best, to make all transactions as indistinguishable as possible. Having a few set options, e.g. 10, 20, or 40, is also reasonable, but allowing users to choose arbitrary ring sizes appears to have zero benefit and possible downsides.

I also agree that whatever the minimum ring size is, it should also be the default option. I also like the idea of a churn button, as long as a clear explanation (perhaps in bubble text on mouse-over) is provided as to what it does and how/why one should use it. The average person will not have the technical knowledge to understand what's really happening when they press the churn button and will assume it makes them invincible, so it should be as idiot-proof as possible.

@moneromooo-monero

Contributor

moneromooo-monero commented Sep 22, 2017

Hi olarks, have you made any further progress with the Bitcoin usage analysis?

@Gingeropolous

Contributor

Gingeropolous commented Sep 26, 2017

hi @olarks, there we go, we pinged

@iamsmooth

Contributor

iamsmooth commented Aug 23, 2018

Once again I would point out that this is stale (almost a year old) and should be closed. Not only have multiple hard forks occurred since, but many of the assumptions made in this discussion no longer apply to the current protocol; the Bitcoin analysis, which seems to be an open subtopic, was already done by the MoneroLink paper (and others); and output selection is, for now, done with a gamma distribution, largely in accordance with that paper's recommendations.

Whatever issues remain can go into a new, better focused, issue. Any Bitcoin-based (or other) research which eventually gets done can stand on its own without this stale issue.

Closing the issue doesn't mean deleting it of course. It can still be referenced by new issues or even reopened if necessary.

@luigi1111 luigi1111 closed this Aug 23, 2018
