
[WIP] ZIP 308: Sprout to Sapling Migration #197

Open
wants to merge 7 commits into master from daira:zip-0308

Conversation

7 participants
@daira
Contributor

daira commented Dec 17, 2018

Co-authored-by: Eirik Ogilvie-Wigley eirik@z.cash
Co-authored-by: Daira Hopwood daira@jacaranda.org

daira and others added some commits Nov 27, 2018

WIP: Sprout->Sapling migration ZIP.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
Completed draft (with open questions) of ZIP 308.
Co-authored-by: Eirik Ogilvie-Wigley <eirik@z.cash>
Signed-off-by: Daira Hopwood <daira@jacaranda.org>

@daira daira added the review needed label Dec 17, 2018

daira added some commits Dec 17, 2018

Formatting improvements for ZIP 308.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
Additional formatting improvements for ZIP 308.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
setenablemigration true|false

It is intentional that there is no facility to change options associated with
the migration, since such options can distinguish users.


@str4d

str4d Jan 3, 2019

Contributor

This leaves it unclear as to where the migrated funds end up. I agree that minimizing the configuration is a good idea, but would it be useful to allow any Sapling address to be specified as the target? Or possibly any Sapling address with the proviso that it is in the local wallet? Or do we just say "migrated funds will end up in account 0 of the wallet"? In this case, we'd need to automatically generate the first Sapling address if the user had not done so already, but that seems reasonable.


@daira

daira Jan 21, 2019

Author Contributor

There's no privacy reason not to do this; I didn't include it because it is simpler to have no configuration.

Note that the wallet needs to be able to tell which transactions are migration transactions in order to implement the getmigrationstatus RPC. If the wallet generates the destination address itself then it's easy for it to identify transactions sent to it. However, there are other ways to do this.

We had consensus that we should do it, given the above caveat.


When migration is enabled, a node will send up to 5 transactions for inclusion
in each block with height a multiple of 500 (that is, they are sent immediately
after seeing a block with height 499, modulo 500).


@str4d

str4d Jan 3, 2019

Contributor

Is the intention that migrating nodes always send 5 transactions at once, until their available balance is less than 0.05 ZEC? Or that a wallet should generate as many transactions as it can for each window, modulo performance concerns? The former leaks information about when the migration process ends for each node (but also might have issues for very-low-performance nodes that can't generate 5 transactions within a 10-block window); the latter leaks information linking transaction bundles to node performance. It is probably a good idea to be more explicit about the intended node behaviour here.


@daira

daira Jan 21, 2019

Author Contributor

The intended behaviour was the former. I have no strong opinion here.
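
To make the intended ("former") behaviour concrete, here is a minimal sketch in Python (illustrative only; the wallet methods get_unmigrated_balance and send_migration_tx, and the 0.01 ZEC floor, are assumptions rather than zcashd APIs):

    BATCH_INTERVAL = 500           # blocks between migration batches
    MAX_TXS_PER_BATCH = 5          # upper bound per the draft text above
    MIN_MIGRATION_ZAT = 10**6      # 0.01 ZEC in zatoshis (assumed floor)

    def on_block_connected(height, wallet):
        # Trigger immediately after seeing a block at height 499 (mod 500),
        # so that the batch targets inclusion in the next multiple-of-500 block.
        if not wallet.migration_enabled:
            return
        if height % BATCH_INTERVAL != BATCH_INTERVAL - 1:
            return
        for _ in range(MAX_TXS_PER_BATCH):
            if wallet.get_unmigrated_balance() < MIN_MIGRATION_ZAT:
                break  # stop early once nothing worth migrating remains
            wallet.send_migration_tx(target_height=height + 1)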


The above strategy has several "magic number" parameters:

* the interval between batches (500 blocks)


@str4d

str4d Jan 3, 2019

Contributor

This magic number should take into account the block interval changes that will result from zcash/zcash#3672 and zcash/zcash#3690, so it may want to have two values, gated on Blossom activation.


@daira

daira Jan 21, 2019

Author Contributor

@jackgavigan says we should leave it as it is and update it when the shorter-block-interval ZIP is finalized. That seems reasonable to me.


@daira

daira Jan 21, 2019

Author Contributor

I think we have enough time until Blossom activation that this won't be a problem.

Nodes MUST maintain a boolean state variable recording whether migration is
enabled. This can be set or reset using the following RPC call::

setenablemigration true|false
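
As a rough illustration of the state this paragraph specifies (not zcashd's actual implementation; wallet_db is a placeholder persistent store), the RPC amounts to persisting a single flag:

    class MigrationState:
        """Sketch of the single boolean state variable the draft requires."""

        def __init__(self, wallet_db):
            self.wallet_db = wallet_db  # placeholder persistent key-value store

        def set_enable_migration(self, enabled):
            # Persist only the flag; by design there are no further options,
            # since extra configuration could distinguish users.
            self.wallet_db.write("migration_enabled", bool(enabled))

        def is_enabled(self):
            return self.wallet_db.read("migration_enabled", default=False)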


@adityapk00

adityapk00 Jan 3, 2019

Can we have the source (zc) and destination (zs) addresses be part of this RPC call? That will help users gain more control over where the funds end up.


@daira

daira Jan 21, 2019

Author Contributor

Setting which source addresses are migrated is kind-of a slippery slope toward greater complexity. Setting the destination address seems reasonable (see @str4d's comment above).


@daira

daira Jan 22, 2019

Author Contributor

@str4d, @Eirik0 and I decided that we would have a config parameter to (optionally) set the destination address, but not allow it to be set through the RPC.
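
A sketch of the destination-address resolution agreed here; the config option name migrationdestaddress and the wallet helper are placeholders, and the account-0 fallback follows the draft text quoted further below:

    def migration_destination_address(config, wallet):
        # Optional config parameter; the name used here is illustrative only.
        addr = config.get("migrationdestaddress")
        if addr is not None:
            return addr
        # Otherwise fall back to the address for Sapling account 0 with the
        # default diversifier (ZIP 32), generating it if it does not exist yet.
        return wallet.get_or_create_sapling_address(account=0)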

"unmigrated_amount": nnn.n,
"unconfirmed_migrated_amount": nnn.n,
"confirmed_migrated_amount": nnn.n,
"confirmed_migration_transactions": nnn,


@adityapk00

adityapk00 Jan 3, 2019

What exactly is confirmed_migration_transactions? Is it the number of transactions that have been executed, or the total that will be executed? Either way, it'd be great to have 2 fields: total_estimated_migration_steps and completed_migration_steps, so we can show a progress bar in the UI.


@daira

daira Jan 21, 2019

Author Contributor

It's intended to be the number of migration transactions for this wallet that are confirmed on the blockchain (currently, that have >= 10 confirmations).

"confirmed_migration_transactions": nnn,
"time_started": ttt, // Unix timestamp
"estimated_time_remaining_seconds": ttt
}


@adityapk00

adityapk00 Jan 3, 2019

Maybe the getmigrationstatus RPC can also return a list of TxIDs that are a part of this migration. It'll be nice to show this in the UI.


@daira

daira Jan 21, 2019

Author Contributor

This is implementable, and might help with troubleshooting. There's a minor privacy issue if the list were to leak, but no worse than for other existing RPC calls. I think this should be included (JSON list of txids).
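
For illustration, a getmigrationstatus result with the proposed txid list added might look like the following (all values are made up, and the migration_txids field name is only a suggestion):

    {
      "unmigrated_amount": 123.45,
      "unconfirmed_migrated_amount": 0.5,
      "confirmed_migrated_amount": 67.89,
      "confirmed_migration_transactions": 14,
      "time_started": 1548028800,
      "estimated_time_remaining_seconds": 1814400,
      "migration_txids": ["<txid>", "<txid>"]
    }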

@adityapk00


adityapk00 commented Jan 3, 2019

This is an excellent proposal, and overall I think it has everything I was hoping it would have. I had some comments on the RPC interface, but otherwise, this is excellent!

The design recovers from failed operations to the extent possible.

Each user sends a number of migration transactions that is not too small (to
obscure their total amount).


@tromer

tromer Jan 5, 2019

This should not by itself be a requirement. It may follow from "mitigate information leakage" above, but if we find a trick to avoid information leakage even with a single transaction, that would be great.


@daira

daira Jan 21, 2019

Author Contributor

I see your point that this is a design choice rather than a requirement.


@tromer

tromer Jan 30, 2019

☑ Fixed in the latest version.


Privacy is increased when the times at which to send transactions are
coordinated between nodes. We choose to send a batch of transactions at each
coordinated time. Sending multiple transactions in each batch ensures that:


@tromer

tromer Jan 5, 2019

One alternative is a randomized approach, where each node sends at random times (specifically: random block-depth delays drawn from an exponential distribution).
The proposed fixed-depth approach leaks exactly whether the node's migration process was started in time for the (n*500-5)-th block, and moreover risks congestion at the multiple-of-500 blocks. The randomized approach lessens the start-time leakage and avoids the congestion.


@daira

daira Jan 21, 2019

Author Contributor

I have no good intuition about this.


@daira

daira Jan 21, 2019

Author Contributor

It's very unlikely that there will be enough migration transactions to cause a mempool size overflow, I think.


@tromer

tromer Jan 30, 2019

The randomized approach is more elegant (it's stateless and temporally invariant), and offers slightly better privacy (because of the aforementioned timing uncertainty).

The 500-blocks version has the advantage of simplicity and predictability.

Overall I would weakly prefer the randomized version, but it's OK to keep the 500-block if need be.


@str4d

str4d Feb 14, 2019

Contributor

In another comment I mention that we might want a random distribution after a trigger height (a third option that draws from the above two). But I'm inclined to support whichever option ends up being the simplest to implement.
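
A sketch of the randomized alternative discussed in this thread (illustrative only; the mean delay of 250 blocks is an arbitrary placeholder, not a proposed parameter):

    import random

    MEAN_DELAY_BLOCKS = 250  # arbitrary placeholder mean, not a proposed value

    def blocks_until_next_batch():
        # Exponentially distributed block-depth delay, per the randomized
        # alternative above; memoryless, so it needs no persistent state and
        # does not synchronize all nodes on the same heights.
        return max(1, round(random.expovariate(1.0 / MEAN_DELAY_BLOCKS)))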

section. It includes the time spent waiting for the first batch to be sent.


How much to send in each transaction


@tromer

tromer Jan 5, 2019

Alternative approach: at each sending slot (e.g., every multiple-of-500 block), send the largest power-of-2 amount of zatoshis that's not larger than the Sprout balance, not smaller than 0.01 ZEC, and not larger than 1000 ZEC.

This trivial algorithm fulfills the given rationale (and moreover makes the Sapling funds available much faster), so additional rationale is needed to justify the more complex proposal.


@tromer

tromer Jan 5, 2019

A bit better, to avoid potentially-recognizable monotonically-decreasing sequences: at every sending slot, pick a random index i such that the i-th bit in the binary representation of the Sprout balance is 1, and migrate 2^i zatoshis. (With the obvious adjustments for too-small and too-large amounts.)


@daira

daira Jan 21, 2019

Author Contributor

I think this does leak more information, and the suggested mitigation for the recognizable-sequence problem is as complicated as the original proposal.


@tromer

tromer Jan 30, 2019

For one, I'm deeply suspicious of the magic numbers in the proposed algorithm (which is: draw from {1,...,99}*10^{6,...,9} until it's less than the balance). Why powers of 10, and why 2 digits in the mantissa?
Let's take it to the binary extreme: draw from 2^{20,...,36} until it's less than the balance. That makes reasoning easier, and strengthens the knapsack rationale. It also avoids the direct leakage of a bit value from my 2nd proposal.


@str4d

str4d Feb 14, 2019

Contributor

Using powers of 10 means that there is never any small change in the drawn amount; the potential random coins that can be drawn have a floor of 0.01 ZEC. Users use their funds in base 10, not base 2, so the distribution of the resulting Sapling notes is closer to what a user might actually have created, and thus less identifying information might be leaked down the line when those post-migration notes themselves are spent.

As for the two-digit mantissa, it makes the random coins at the edge of the distribution less likely than those in the middle; I'm not sure how I feel about that over a purely-flat distribution.

I do agree that a simpler algorithm here would be nice, but I'm currently more in favour of generating base-10 random coins than base-2.
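
For reference, a Python sketch of the base-10 drawing procedure as paraphrased in this thread (draw x*10^y zatoshis with x in 1..99 and y in 6..9, redrawing until the amount fits in the remaining balance); this is illustrative only and ignores fee handling:

    import random

    def draw_migration_amount(remaining_zatoshis):
        # Draw x * 10^y zatoshis with x in 1..99 and y in 6..9, i.e. amounts
        # between 0.01 ZEC and 990 ZEC, redrawing until the amount fits within
        # the remaining Sprout balance. Fee handling is ignored here.
        if remaining_zatoshis < 10**6:
            return None  # below the 0.01 ZEC floor; nothing left to migrate
        while True:
            x = random.randint(1, 99)
            y = random.randint(6, 9)
            amount = x * 10**y
            if amount <= remaining_zatoshis:
                return amount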

inference).


Other design decisions


@tromer

tromer Jan 5, 2019

What about selective disclosure, and in particular payment disclosure blobs, for these transactions? Are they (experimentally...) supported as for normal transactions?


@daira

daira Jan 21, 2019

Author Contributor

They are basically identical to normal transactions other than how they are triggered.


@tromer

tromer Jan 30, 2019

I expect so, but let's make it an explicit "should" in the ZIP.

"transparent value pool" [#transparent-value-pool]_. The primary motivation
for this is to allow detection of any overall inflation of the Zcash monetary
base, due to exploitation of possible vulnerabilities in the shielded
protocols or their implementation.


@bitcartel

bitcartel Jan 7, 2019

Contributor

mention mpc?

Sapling z-addresses, provided the node is working.

It should take a "reasonable" length of time to complete the transfer;
typically less than 2 months.


@bitcartel

bitcartel Jan 7, 2019

Contributor

What does "reasonable" mean to a user, given that migration could actually be performed in a few minutes or hours or days... why 2 months? Since a user might need to spend the funds and can't wait that long, I think it might be a good idea for implementations to provide users with a range of choices when considering the duration of the migration.


@daira

daira Jan 21, 2019

Author Contributor

My intuition (not strongly supported) is that giving users options significantly reduces their privacy. But maybe 2 months is too long (for all users)?


@daira

daira Jan 22, 2019

Author Contributor

See the table about expected migration times in the Rationale section; the requirement is for less than 2 months but the actual time will be significantly less.

There is sufficient information available to debug failed transactions that
are part of the migration.

The design recovers from failed operations to the extent possible.


@bitcartel

bitcartel Jan 7, 2019

Contributor

How does it recover? Just rerun the operation based on the current remaining balance to migrate?


@daira

daira Jan 21, 2019

Author Contributor

The wallet doesn't keep much state (apart from whether the migration is enabled), so it automatically recovers on the next batch.

@daira

Contributor Author

daira commented Jan 21, 2019

Some feedback from a person whom I don't yet have explicit permission to identify:

My questions and concerns are the following:

  • Should we somehow notify the user that block mod 500 is coming, so that they intentionally leave the wallet open, or should this be strictly random? I guess the average user will not know how it works, and not telling them that block mod 500 is about to happen might increase the anonymity of a transfer, because even the user will not know that a transfer is about to happen.

I don't think there's a need to notify the user. We do need to document that zcashd should be left running during a migration.

  • Every 500 blocks means basically every 21 hours, so it is fine; I was worried that it would repeat every 24 hours and might coincide with someone's sleeping pattern. Probably just a weird idea, but what if this were determined based on the previous block hash, making it random? The same may go for the amount distribution functions.

Yes, this was intentional. I think a random distribution is unnecessarily complicated, although @tromer did also suggest that.

  • Then to what address will the funds be sent? Can the user select the Sapling address the funds will be migrated to, or will it always be a new one (which I guess is better from an anonymity standpoint)? Will those transactions be somehow marked so we know that a transaction comes from the migration method, so we can notify the user about it? I guess that the “confirmed_migration_transactions”: nnn field in ‘getmigrationstatus’ will return an object of migration transactions currently happening/done in the current batch. But what about past transactions (assuming the user has a lot of ZEC)? This information should be stored just locally.

Several people suggested being able to configure the destination address, and I think we will include this.

  • An advantage of knowing the destination address prior to running the migration is in the case where something goes wrong and wallet.dat gets corrupted. Then there is a chance that the user has backed up the private key of the address before it got corrupted.

Yes, that's a good point.
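
For the record, the roughly-21-hours figure mentioned above checks out under the pre-Blossom 150-second target block spacing:

    PRE_BLOSSOM_TARGET_SPACING = 150  # seconds per block before Blossom

    batch_interval_hours = 500 * PRE_BLOSSOM_TARGET_SPACING / 3600
    print(batch_interval_hours)  # 20.83..., i.e. just under 21 hours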

Address review comments.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>

@daira daira force-pushed the daira:zip-0308 branch from d4ce847 to 316ffc2 Jan 23, 2019

daira added some commits Jan 23, 2019

Resolve an ambiguity concerning fees.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>
Resolve a TODO.
Signed-off-by: Daira Hopwood <daira@jacaranda.org>

@daira daira force-pushed the daira:zip-0308 branch from 316ffc2 to f825dfe Jan 23, 2019

@mms710


mms710 commented Jan 29, 2019

@str4d @tromer @bitcartel @adityapk00 Can you review the responses provided to your comments and let us know if you're okay with approving this PR?

@adityapk00


adityapk00 commented Jan 29, 2019

Looks good to me!

@gojomo


gojomo commented Jan 30, 2019

It's quite likely I'm overlooking something obvious, but it's not at all clear to me why there's the batching-of-transactions at a common (500-block) interval. The transactions will still look like "migration" transactions (filled with a certain value of sproutZEC, paying saplingZEC). And a hypothetical network-omniscient adversary will still see the IPs from which the batches of 1-5 transactions emerge. And I can't think of heuristics used by lesser adversaries that get confused by a bunch of transactions appearing at arbitrary intervals. What is the specific de-anonymization fear that this synchronization addresses?

@bitcartel

Contributor

bitcartel commented Jan 31, 2019

The ZIP currently does not mention specifying a from address (mentioned in an earlier comment); also:

If this option is not present then the migration destination address is
the address for Sapling account 0, with the default diversifier [#zip-0032]_

Thus it appears the ZIP has been written from the perspective of a node being used by just a single user (making it safe to migrate funds to the default diversifier address).

However, a node operator might have multiple users in the same wallet, and may have the requirement of migrating users on an ad-hoc basis, i.e. users who opt in.

To avoid service disruption, a node operator might want to reduce the time it takes to migrate (within reason), so perhaps there could be a range of confirmation / block magic values.

Also, in a multi-user environment, the node operator would probably want support for multiple (concurrent) migrations.

Some of this might be out of scope for the ZIP, but it would be worth considering how the ZIP would be interpreted by node operators in a multi-user environment.

@daira

Contributor Author

daira commented Feb 7, 2019

@bitcartel I'll add an explicit comment to the ZIP saying that the RPC has been designed assuming a single-user wallet. (The migration algorithm itself isn't dependent on that.)


The Zcash consensus rules prohibit direct transfers from Sprout to Sapling
z-addresses, unless the amount is revealed by sending it through the
"transparent value pool" [#transparent-value-pool]_. The primary motivation


@str4d

str4d Feb 14, 2019

Contributor

Probably worth clarifying that this does not require transparent addresses (i.e. funds can be sent through the dynamic transparent value pool inside a transaction, not just the static transparent value pool that is the funds resting in t-addresses).

height minus 1, then this could leak information. Therefore, for target
height N, implementations SHOULD start generating the transactions at around
height N-5. Each migration transaction SHOULD specify an anchor at height N-10
for each Sprout JoinSplit description.


@str4d

str4d Feb 14, 2019

Contributor
Suggested change:
-  for each Sprout JoinSplit description.
+  for each Sprout JoinSplit description, and SHOULD specify an expiry height of `N + DEFAULT_TX_EXPIRY_DELTA`.


@str4d

str4d Feb 14, 2019

Contributor

We do specify later on that the expiry height should be set as for other transactions, so maybe this change isn't necessary.
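
A small sketch of the height bookkeeping being discussed; the value of DEFAULT_TX_EXPIRY_DELTA is assumed to be zcashd's default of 20 blocks at the time, and the field names are illustrative:

    DEFAULT_TX_EXPIRY_DELTA = 20  # assumed zcashd default at the time

    def migration_heights(n):
        # For a batch targeting inclusion in the block at height n
        # (a multiple of 500), per the draft text and the suggestion above.
        return {
            "start_generating_at": n - 5,    # leave time for proving
            "sprout_anchor_height": n - 10,  # anchor for each Sprout JoinSplit
            "expiry_height": n + DEFAULT_TX_EXPIRY_DELTA,
        }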

In deciding the amount to send in each transaction, the strategy does not
take account of the values of individual Sprout notes, only the total amount
remaining to migrate. Can a strategy that is sensitive to individual note
values improve privacy?


@str4d

str4d Feb 14, 2019

Contributor

Given we are only requiring that migration transactions be hidden amongst themselves, what information can we infer from the number of inputs and outputs, knowing that it is part of a migration?

  • On the receive side, there will be a single Sapling output for every one of these transactions, so everything will be identical there.
  • On the spend side, we will have however many input notes are required to reach the desired target value, and at least one change note. If more than two notes are required to fund the selected amount, then we will have one JoinSplit per input. If the selected amount can be funded with one or two notes, we will have a single JoinSplit.

If the selection strategy is not sensitive to individual note values, then it is possible to end up creating a transaction that consumes most or all of the Sprout notes in the wallet. That combined with the amount revealed might be usable as an indicator that the revealed value is close-to-all of the (remaining) wallet balance. This might particularly affect e.g. wallet developers who received many small notes, which would appear as spikes in the number of inputs. On the other hand, the note-selection strategy in zcashd is largest-first, which tends to generate fragmentation (though I do not have any metrics on this).

Separately, there's a potential performance issue, where if the selected value requires many notes to fund, and each note requires a 20-second proof generation, the time to generate a single transaction may exceed the window this ZIP aims for, reducing the indistinguishability of the transaction (either because it appears later, or has to choose an earlier anchor).

So my feeling is that, while we don't necessarily need to be sensitive to the individual note values, we probably should be sensitive to the number of notes required to fill a given random coin. At a minimum, we should probably re-draw the coin if filling it would require an excessive number of notes.
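
A sketch of the re-draw safeguard suggested in the last paragraph, reusing the draw_migration_amount sketch from the earlier thread; the threshold and the wallet helper select_sprout_notes are placeholders:

    MAX_INPUT_NOTES = 10  # placeholder threshold; each JoinSplit adds ~20s of proving

    def draw_amount_with_note_limit(wallet, remaining_zatoshis, max_attempts=100):
        # Re-draw the random coin if funding it would need an excessive number
        # of Sprout notes (and hence JoinSplits and proving time).
        for _ in range(max_attempts):
            amount = draw_migration_amount(remaining_zatoshis)
            if amount is None:
                return None
            if len(wallet.select_sprout_notes(amount)) <= MAX_INPUT_NOTES:
                return amount
        return None  # skip this batch rather than miss the sending window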

by a subset of nodes that are performing migrations, in order to cause those
nodes to send migration batches at a different time, so that they may be
distinguished. Is there anything further we can do to mitigate this
vulnerability?


@str4d

str4d Feb 14, 2019

Contributor

Assuming we aren't concerned here about a 50%+1 adversary, the problematic attacks here involve network partitioning, BGP route manipulation, etc. in order to delay block propagation. This would exacerbate an existing problem, which is the inherent propagation time across the network. We do not assume network privacy, so I'm not sure there's much we can do here.

If we, for example, started broadcasting transactions at some randomised interval after observing a boundary block, that would obscure the exact time we detected the block (to within some delta), with the downside that there is less time for our transactions to propagate and potentially get mined into the target block. I'm not sure the delta can be large enough to thwart a motivated network adversary, without compromising the ability to get transactions mined promptly. But I don't have any hard numbers for this, and maybe it is possible. It seems more complicated to implement, regardless.

iff it has:
* one or more Sprout JoinSplits with nonzero vpub_new field; and
* no Sapling Spends; and
* one or more Sapling Outputs.


@str4d

str4d Feb 14, 2019

Contributor

When would a migration transaction have more than one Sapling output?
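
The identification rule quoted above is mechanical enough to state as a predicate (the field accessors here are placeholders for whatever the wallet exposes):

    def is_migration_tx(tx):
        # One or more Sprout JoinSplits with nonzero vpub_new; no Sapling
        # Spends; and one or more Sapling Outputs.
        sprout_reveals_value = any(js.vpub_new != 0 for js in tx.joinsplits)
        return (sprout_reveals_value
                and len(tx.sapling_spends) == 0
                and len(tx.sapling_outputs) >= 1)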

