Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wallet2: fix rescanning tx via scan_tx [release] #8566

Merged
merged 1 commit into from Jun 27, 2023

Conversation

j-berman
Copy link
Collaborator

@j-berman j-berman commented Sep 14, 2022

Issues the PR solves with scan_tx

  • Calling scan_tx on already processed received txs causes duplicates. (scan_tx causes extra transfers with fee=0 #8531)
  • Scanning txs out of order can corrupt the wallet and render it unable to recover from a reorg correctly.
  • Coinbase txs don't get scanned as miner txs via scan_tx.
  • If a user provides a tx with height > the wallet's known scan height, get_transfers will return inaccurate num confirmation data for that tx until the wallet refreshes.
  • If a wallet has already processed an incoming tx in the pool, the tx is then mined, and then the user calls scan_tx with that tx before the wallet refreshes, the tx will be in a state of both "unconfirmed" and "confirmed" until the wallet refreshes.

UX changes the PR brings

  • scan_tx now requires a trusted daemon edit: if the user attempts to scan a tx that is not their most recent tx. I chose to add this requirement because if a user passes a txid into scan_tx that has a height lower than any txs the wallet already processed, then the wallet will re-request all of the wallet's already processed txs from the daemon. This reveals more of a user's txs to the daemon than the user might otherwise expect. Given the way the code is structured, it would seem the simplest solution to this is to require the feature be used with a trusted daemon.
  • If a user provides a tx with height > the wallet's known scan height, the wallet will continue scanning from that height. If a user misses scanning a tx because of this, they either have to scan the missed tx manually with scan_tx, or use rescan_blockchain.

Future considerations

  • Assume tx2 spends an output received in tx1. tx2 is a sweep to some other wallet's address. The user calls scan_tx(tx2) first before scan_tx(tx1). As is, the wallet is unable to figure out that tx2 spends from tx1 in this case (the user has to re-scan tx2 after tx1). In order to solve this, the wallet would need to query the daemon to see if tx1's associated key image(s) have been spent and in what tx(s).
  • In get_transfers, if a daemon is connected, query the daemon to retrieve the latest height and block reward rather than use the wallet's latest state.
  • When the user provides a tx to scan_tx with height > the wallet's known scan height, don't call refresh inside scan_tx. In this case, scan_tx waits until the wallet is sync'd or the user stops the wallet before responding. Use fast_refresh instead, or just don't do this at all once get_transfers uses the daemon's latest height and block reward.

@plowsof
Copy link
Contributor

plowsof commented Sep 20, 2022

I built the GUI with this PR. I confirmed the issue (duplicate entries in transactions with "Unknown" amount displayed after scanning a tx multiple times). This PR fixes the problem in the GUI, (also when tx is pending :) ) Thanks!

Copy link

@afungible afungible left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested a few cosmetic & improvement comments.

@@ -2217,10 +2217,20 @@ void wallet2::process_new_transaction(const crypto::hash &txid, const cryptonote
}
else if (m_transfers[kit->second].m_spent || m_transfers[kit->second].amount() >= tx_scan_info[o].amount)
{
bool rescanning_tx = txid == m_transfers[kit->second].m_txid;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool rescanning_tx = (txid == m_transfers[kit->second].m_txid);

( ) helps readability.

@@ -2230,6 +2240,21 @@ void wallet2::process_new_transaction(const crypto::hash &txid, const cryptonote
THROW_WALLET_EXCEPTION_IF(amount_iterator == tx_amounts_this_out.end(),
error::wallet_internal_error, "Unexpected values of new and old outputs");
tx_amounts_this_out.erase(amount_iterator);
if (rescanning_tx)
{
if (tx_amounts_this_out.empty())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tx_amounts_this_out - the instance name is strange. Good to be clear as its not self-descriptive.

{
transfer_details &td = m_transfers[kit->second];
LOG_PRINT_L0("Received money: " << print_money(td.m_amount) << ", with tx: " << txid);
if (0 != m_callback)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing with NULL is a better practice, as we're comparing an instance.
and adding a braces even on a single IF is no harm, avoids confusion.

@@ -2230,6 +2240,21 @@ void wallet2::process_new_transaction(const crypto::hash &txid, const cryptonote
THROW_WALLET_EXCEPTION_IF(amount_iterator == tx_amounts_this_out.end(),
error::wallet_internal_error, "Unexpected values of new and old outputs");
tx_amounts_this_out.erase(amount_iterator);
if (rescanning_tx)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under the same "else if" block, you do..
else if (m_transfers[kit->second].m_spent || m_transfers[kit->second].amount() >= tx_scan_info[o].amount)
{
..
if (!rescanning_tx) // 2221
{
}
else
{
}
..
if (rescanning_tx) // 2243
{
}

rescanning_tx is being assigned only once. I see line 2243-2257 is independent & code could be optimized by moving it to line 2221, like:

if (rescanning_tx)
{
..
}
else // place code for if (!rescanning_tx) case
{
LOG_ERROR("Public key " << epee::string_tools::pod_to_hex(kit->first) // line 2223
..
}
.. // line 2234-2242 (common)

basically, the code could be better optimized imo.

@moneromooo-monero
Copy link
Collaborator

tbh I'd skip the entire thing, partly processing seems like a recipe for trouble. Especially if the new scan results do not agree with the previous ones: I see you test amount, but it's not the only thing that could be different. And if it is different, it's a bug somewhere which isn't helped by the testing anyway.

And return some bool to the higher level caller for "already known" so the user knows.

@woodser
Copy link
Contributor

woodser commented Sep 22, 2022

I'm no longer seeing extra transfers with this PR, so that's good.

I do see a couple differences between scanned txs (left) versus regularly synced txs (right). Otherwise the balance is tallying correctly.

image

image

Looks like the only things missing are the number of confirmations (which should be knowable since the wallet has a connection to the daemon), and detecting if it's a miner tx.

@j-berman
Copy link
Collaborator Author

tbh I'd skip the entire thing, partly processing seems like a recipe for trouble

@moneromooo-monero this was my first thought too. When I looked into it, I think that will be fairly involved to do correctly unless I'm missing something. AFAICT will need to check m_payments, m_unconfirmed_payments, m_confirmed_txs, m_unconfirmed_txs to make sure the tx in the wallet's state matches what the daemon says the tx's state is, and if not, re-process. I also think this means the pool needs to be re-processed the same as it's done when scanning, but without re-querying for everything in the pool (I think some factored functions from #8076 would be re-used for this).

This PR seemed like the best quick fix that gets us to a better spot, although the concern for partly processing causing trouble is warranted I think. It seems the best move is to just do it all correctly now.


@woodser seems these are existing issues. Going to take me a bit to get to it but what I have so far in case someone wants to pick it up:

  1. The height used to set confirmations is the wallet's highest scanned block height:

set_confirmations(entry, m_wallet->get_blockchain_current_height(), m_wallet->get_last_block_reward(), pd.m_unlock_time);

uint64_t get_blockchain_current_height() const { return m_light_wallet_blockchain_height ? m_light_wallet_blockchain_height : m_blockchain.size(); }

m_blockchain.push_back(bl_id);

And then confirmations are set to 0 if the tx's height >= blockchain_height:

if (entry.height >= blockchain_height || (entry.height == 0 && (!strcmp(entry.type.c_str(), "pending") || !strcmp(entry.type.c_str(), "pool"))))
entry.confirmations = 0;

As suggested, probably makes sense to use the daemon's height instead to set confirmations.

  1. To pick up miner txs, process_new_transaction inside scan_tx can pass cryptonote::is_coinbase(tx) as the param for bool miner_tx instead of false (evidently miner txs weren't implemented for this feature). Might not be sufficient to solve miner txs, but that's a start.

@woodser
Copy link
Contributor

woodser commented Sep 25, 2022

seems these are existing issues. Going to take me a bit to get to it but what I have so far in case someone wants to pick it up:

Hoping this can be worked out, because it's quite a useful feature.

But either way, this PR makes the existing scan_tx stable as far as I can tell.

@j-berman
Copy link
Collaborator Author

j-berman commented Sep 28, 2022

Approach from @moneromooo-monero to avoid trouble from calling process_new_transaction with an already scanned tx:

1: If a user tries to call scan_tx on a tx with a height lower than the wallet's highest scanned height, they get an error.
2: The highest manually scanned tx's height becomes the wallet's highest scanned height.

The UX downside is that a user may end up skipping their txs, and would need to do a full wallet rescan to fix it:

  • assume a user's wallet has scanned up to block n.
  • the user has received 2 txs: tx1 at block n+1, and tx2 at block n+2.
  • the user calls scan_tx on tx2.
  • the issue: the wallet will have skipped tx1. User must do a full wallet rescan to scan tx1.

A future PR could add the ability for a user to call scan_tx on a tx at a height lower than the last scanned height, so that they don't necessarily have to do a full wallet rescan if they know tx1's hash. Pseudo-implementation: inside scan_tx, pop already processed txs with height >= the tx to be scanned, then re-process all txs in order.

@selsta
Copy link
Collaborator

selsta commented Sep 28, 2022

1: If a user tries to call scan_tx on a tx with a height lower than the wallet's highest scanned height, they get an error.

Doesn't this make scan_tx basically useless?

@j-berman
Copy link
Collaborator Author

j-berman commented Sep 28, 2022

I see what you mean. Sounds like this would have to be the way to go as part of the approach then:

inside scan_tx, pop already processed txs with height >= the tx to be scanned, then re-process all txs in order.

@j-berman
Copy link
Collaborator Author

j-berman commented Sep 28, 2022

Clarifying the approach:

1: If a user calls scan_tx on a tx with a height <= any txs already scanned, pop those already scanned txs and then re-process all in order.
2: The wallet's highest scanned height becomes the max of [highest manually scanned tx height, the wallet's existing highest scanned height]

And to clarify the trouble the approach is seeking to avoid: scanning txs out of order messes up the m_transfers container, which is supposed to be ordered by a tx's height. Example: reorg handling expects transfers to be ordered by height:

auto it = std::find_if(m_transfers.begin(), m_transfers.end(), [&](const transfer_details& td){return td.m_block_height >= height;});

This approach both prevents incorrect duplicate insertions when re-scanning, and ensures a correctly ordered transfers container.

@afungible
Copy link

afungible commented Sep 28, 2022

trouble the approach is seeking to avoid: scanning txs out of order messes up the m_transfers container, which is supposed to be ordered by a tx's height.

I haven't deeply looked at this code. But from a quick look, can't the m_transfers container, std::vector<transfer_details> be sorted forcefully by tx_height at the end of each "scan_tx" irrespective. Could this also be an approach? One downside I can think of is std::sort(..) slowing down with too many tx in container (I doubt though).

@j-berman
Copy link
Collaborator Author

To be honest, process_new_transaction and its complete impact on wallet2's state is such a headache to assess that I can't tell with absolute certainty if it's safe to send in either duplicates or out-of-order txs and then try to fix doing that after. I understand @moneromooo-monero's hesitation sending anything but new (and correctly ordered transactions) into the function. The "scorched earth" strategy of popping first and then re-processing seems easier to sleep at night with.

@j-berman
Copy link
Collaborator Author

A better answer: assume tx2 spends an output from a prior tx1. When we call process_new_transaction on tx2, it's supposed to set tx1's output to spent via set_spent:

monero/src/wallet/wallet2.cpp

Lines 2315 to 2355 in fc907a9

// check all outputs for spending (compare key images)
for(auto& in: tx.vin)
{
if(in.type() != typeid(cryptonote::txin_to_key))
continue;
const cryptonote::txin_to_key &in_to_key = boost::get<cryptonote::txin_to_key>(in);
auto it = m_key_images.find(in_to_key.k_image);
if(it != m_key_images.end())
{
transfer_details& td = m_transfers[it->second];
uint64_t amount = in_to_key.amount;
if (amount > 0)
{
if(amount != td.amount())
{
MERROR("Inconsistent amount in tx input: got " << print_money(amount) <<
", expected " << print_money(td.amount()));
// this means:
// 1) the same output pub key was used as destination multiple times,
// 2) the wallet set the highest amount among them to transfer_details::m_amount, and
// 3) the wallet somehow spent that output with an amount smaller than the above amount, causing inconsistency
td.m_amount = amount;
}
}
else
{
amount = td.amount();
}
tx_money_spent_in_ins += amount;
if (subaddr_account && *subaddr_account != td.m_subaddr_index.major)
LOG_ERROR("spent funds are from different subaddress accounts; count of incoming/outgoing payments will be incorrect");
subaddr_account = td.m_subaddr_index.major;
subaddr_indices.insert(td.m_subaddr_index.minor);
if (!pool)
{
LOG_PRINT_L0("Spent money: " << print_money(amount) << ", with tx: " << txid);
set_spent(it->second, height);
if (0 != m_callback)
m_callback->on_money_spent(height, txid, tx, amount, tx, td.m_subaddr_index);
}
}

Therefore process_new_transaction has to process tx1 before processing tx2, in order to know that tx1's output has been spent in tx2 when processing tx2.

@j-berman j-berman force-pushed the rescan-tx branch 5 times, most recently from de8e6ec to 8b9a833 Compare October 5, 2022 01:49
Copy link
Collaborator

@moneromooo-monero moneromooo-monero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing pool txes seem odd, otherwise looks good.

for (size_t i = slice; i < slice + ntxes; ++i)
req.txs_hashes.push_back(epee::string_tools::pod_to_hex(txids[i]));
req.txs_hashes.push_back(epee::string_tools::pod_to_hex(txids_no_dups[i]));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a bit unfortunate to add half a dozen lines of pointless rename.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was trying to de-dup in here since I noticed the RPC server doesn't de-dup, updated to just use an unordered_set for txids passed to scan_tx instead

// sort tx_entries by height from lowest height to highest; pool txs to the back
auto cmp_tx_entry = [](const cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry& l, const cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry& r)
{ return l.block_height < r.block_height && !l.in_pool; };
std::sort(sorted_tx_entries.tx_entries.begin(), sorted_tx_entries.tx_entries.end(), cmp_tx_entry);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You seem to be allowing txes not on the chain. Why ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From DM's:

moneromooo: Why do you allow pool txes ?
jberman: Why not? It allows em today too
moneromooo: Because it seems like complication for nohting to me.
moneromooo: But if it works, fine.
moneromooo: I mean, they'd be scanned right after or before already. And there could be a race here if you're not careful.
jberman: Seems reasonable. I don't mind just not processing pool txids and limiting the complexity. It seems the primary use case for the feature was/is restore wallet from higher height -> scan_tx all old txs
moneromooo: Well, if it works now, I'm fine with it. It just seemed odd so I mentioned it.

I don't have a strong opinion on this. I figure trying to maintain the API's behavior is reasonable here and I don't see what exactly is wrong with it as is. Going to keep it unless there's more push back

return sorted_tx_entries;
}
//----------------------------------------------------------------------------------------------------
std::vector<cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry> merge_sorted_tx_entries(std::vector<cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry> l, std::vector<cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry> r)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is no need for the copying.

self.wallet[i] = Wallet(idx = i)
# close the wallet if any, will throw if none is loaded
try: self.wallet[i].close_wallet()
except: pass
res = self.wallet[i].restore_deterministic_wallet(seed = seeds[i])
res = self.wallet[i].restore_deterministic_wallet(seed = SEEDS[i])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seriously ? :D

@j-berman
Copy link
Collaborator Author

Fixed an issue in my code where sweeps wouldn't re-process when calling scan_tx with a single txid that took place at a height below the sweep. Will rebase in a few days.

@j-berman
Copy link
Collaborator Author

Squashed and rebased. Also noting #8617 is dependent on this code.

//----------------------------------------------------------------------------------------------------
std::vector<cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry> merge_sorted_tx_entries(const std::vector<cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry> &l, const std::vector<cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry> &r)
{
std::vector<cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry> merged_txs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merged_txs.reserve(l.size() + r.size());

Comment on lines 5930 to 5950
cryptonote::block_header_response wallet2::get_block_header_by_height(uint64_t height)
{
cryptonote::COMMAND_RPC_GET_BLOCK_HEADER_BY_HEIGHT::request req = AUTO_VAL_INIT(req);
cryptonote::COMMAND_RPC_GET_BLOCK_HEADER_BY_HEIGHT::response res = AUTO_VAL_INIT(res);

bool r;
{
const boost::lock_guard<boost::recursive_mutex> lock{m_daemon_rpc_mutex};
req.height = height;
uint64_t pre_call_credits = m_rpc_payment_state.credits;
req.client = get_client_signature();
r = net_utils::invoke_http_json_rpc("/json_rpc", "getblockheaderbyheight", req, res, *m_http_client, rpc_timeout);
if (r && res.status == CORE_RPC_STATUS_OK)
check_rpc_cost("getblockheaderbyheight", res.credits, pre_call_credits, COST_PER_BLOCK_HEADER);
}

THROW_WALLET_EXCEPTION_IF(!r, tools::error::no_connection_to_daemon, "getblockheaderbyheight");
THROW_WALLET_EXCEPTION_IF(res.status == CORE_RPC_STATUS_BUSY, tools::error::daemon_busy, "getblockheaderbyheight");
THROW_WALLET_EXCEPTION_IF(res.status != CORE_RPC_STATUS_OK, tools::error::wallet_generic_rpc_error, "getblockheaderbyheight", m_trusted_daemon ? res.status : "daemon error");
return res.block_header;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind putting this as a method in NodeRPCProxy instead of here please? See #8605 if you want an example of what I mean. I'm trying to centralize all the raw RPC invocation code into one place.

@j-berman
Copy link
Collaborator Author

Development should be on master, and only finalized code should be PR'd to release branches.

Will stick to this rule on all my PR's going forward

@plowsof
Copy link
Contributor

plowsof commented Oct 25, 2022

When testing with restore deterministic wallet / height == 'now': passing the wallets entire in/out history (~30 txids) in a list to scan_tx i sometimes encounter this error:

Error: '{'error': {'code': -1, 'message': 'deque::_M_new_elements_at_back'}, 'id': 8, 'jsonrpc': '2.0'}'

The times this didn't happen , the get_transfers of the original / restored version are indeed identical

@jeffro256
Copy link
Contributor

jeffro256 commented Oct 25, 2022

Looks like some std::deque ran out of space. The error is thrown in this file at line 832.

Edit 1: The only std::deque declared in wallet2.h is the m_blockchain field of hashchain. Could have something to do with that?

Edit 2: There is a std::deque named output_found in the method process_new_transaction(). This is probably it.

@jeffro256
Copy link
Contributor

@plowsof This didn't happen to be a multisig wallet, did it?

@plowsof
Copy link
Contributor

plowsof commented Oct 25, 2022

edit: jberman has found the cause and has/is implementing a fix
edit2: the error i had is gone / fixed, Thank you!

@jeffro256 just a normal wallet which has 1 unspent output and 33 txids (in and out payments) example here https://paste.debian.net/1258259/ load existing wallet -> get list of txids -> attempt to scan them all. sometimes line 69 (first scan) will not complain - sometimes it will - if the 2nd scan throws the error (even though the first scan completed) it makes my balance 0 at the end. hopefully someone can reproduce, ive tried messing with ordering of txids but doesn't seem to make a difference.

@j-berman j-berman force-pushed the rescan-tx branch 3 times, most recently from ef42c8b to 304e64c Compare October 26, 2022 23:27
@j-berman
Copy link
Collaborator Author

j-berman commented Oct 26, 2022

The issue was: when re-scanning a tx at a height lower than the wallet's latest saved checkpoint m_blockchain.offset(), it would underflow when calling m_blockchain.crop -> m_blockchain.resize(height - m_offset), which then throws the deque error because the underflowed value to resize to is larger than the max size of the underlying container (line 831 in that file you linked @jeffro256).

Fixed by making sure to only call m_blockchain.resize(height - m_offset) when height >= m_offset.


I also noticed an off-by-1 bug where if the height of the tx to re-scan was the same as the wallet's latest saved checkpoint, after re-processing, it wouldn't re-attach the lowest detached block back to the locally saved m_blockchain.


Good bugs to catch @plowsof

@j-berman
Copy link
Collaborator Author

In discussion with @selsta and @plowsof, one of the common use cases for scan_tx today is a GUI simple mode user whose wallet missed a recent tx when scanning (for a wallet to miss a tx when scanning, either there is an unknown bug somewhere in the scanner, or the user was connected to a malicious node withholding txs). The user can quickly ingest the tx via scan_tx without needing to restore from their wallet's creation height and re-do the entire sync process. As such, scan_tx has been a successful support tool for users whose wallets miss recent txs.

Since I added the requirement that the feature be used with trusted daemons in this PR, GUI simple mode users would lose a valuable support feature.

Proposed solution for untrusted daemons

  • If the wallet is connected to an untrusted daemon:
    • has 0 txs after the one the user requests to scan, then scan_tx will continue to completion.
    • has 1+ txs after the one the user requests to scan, scan_tx will fail and the GUI will suggest the user restore their wallet instead.

This maintains scan_tx's value as a support tool as it seems to be used today, and maintains the same privacy risk as the user understands it today.


Additional detail

The reason I added the requirement of trusted nodes in the first place in this PR is because I decided to re-request all of a user's txs from the daemon after the one being scanned in order to re-process all txs in order. The reason I needed to request those txs from the daemon in order to re-process them is because the wallet throws away transaction data after processing a tx and saving the tx to the wallet (specifically RCT data no longer needed like the encrypted amount). Thus, in order to re-process these transactions via process_new_transaction to ensure all txs are re-processed in order when calling scan_tx on an earlier tx, I re-requested the txs from the daemon to get that data back.

Longer term, a larger re-write could seek to avoid needing to do all the re-processing inside process_new_transaction (obviously it's sub-optimal to need to e.g. decrypt amounts again and re-check outputs for ownership etc.). But the cost/benefit of doing such a re-write as part of this fix to scan_tx does not seem worth it to me.

@jeffro256
Copy link
Contributor

jeffro256 commented Oct 28, 2022

has 0 txs after the one the user requests to scan, then scan_tx will continue to completion.

Doesn't this still give away to the untrusted daemon which transaction you're asking to scan, with the implication that you're probably involved with it (likely receiver)?

@j-berman
Copy link
Collaborator Author

Yep, here's how the feature is exposed to a GUI user today:

GUI warning

The implications of exposing txs beyond just the single tx the user requests to scan are more severe and less intuitive. A user could in theory reveal their entire transaction history to a daemon just by requesting to re-scan the first tx in their wallet for example.

@j-berman
Copy link
Collaborator Author

Updated to allow scanning the wallet's most recent tx when connected to an untrusted daemon.

@plowsof
Copy link
Contributor

plowsof commented Oct 29, 2022

Tested the new changes. Confirmed:

  • Latest transaction can be scanned from an untrusted daemon
  • Older transaction from untrusted displays a warning / works with trusted.

Thank you! 👏 here is my test script

@Gingeropolous
Copy link
Collaborator

Gingeropolous commented Dec 21, 2022

couldn't the privacy concerns (of using this with a remote node) be assuaged by using a block height instead of a specific transaction hash? First, there's the basket of potential outputs in the block itself, plus you could double-down and request 10 other random blocks to scan. Or 15. So its like a ring-signature of blocks. But not really, because its not. But you know what i mean.

Sure, it's gonna cost more in processing and bandwidth than 1 specific tx. But its a whole lot less than "hey lets just scan the whole blockchain!"

@j-berman
Copy link
Collaborator Author

j-berman commented Dec 21, 2022

I think using heights instead of hashes in that way would improve the privacy profile of the feature when using an untrusted daemon. Plus it's arguably a UX improvement: as @hMihaiDavid mentioned in #7291, it's easier to write block heights on a paper backup together with a seed, versus writing hashes (though on the downside, it may be a bit harder to cleanly communicate the difference between the "restore height" and a feature to "rescan tx at just this height" in the UI).

As it relates to this PR, I would prefer to keep this PR as is since:

  1. It fixes bugs with scan_tx via hashes today.
  2. It makes sense not to remove scan_tx via hashes even if we were to introduce scanning via heights. Scanning via hashes is already supported in the RPC wallet.
  3. I don't think this PR exposes users to additional privacy risk than what they would already expect in the feature today. In the expectedly rare case where this PR would introduce additional privacy risk, scan_tx requires a trusted daemon.

I think it's worth re-considering adding the ability to scan txs via height(s) in the future, and moving away from using hashes in the GUI for future PR(s).

@woodser
Copy link
Contributor

woodser commented Feb 23, 2023

Just checking on this. Anything preventing progress to merge?


test = 'Checking scan_tx on outgoing pool tx'
for attempt in range(2): # test re-scanning
print(test + ' (attempt ' + str(attempt+1) + ')')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover debug trace

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified this print statement in the latest. I put a print statement here on purpose because the first attempt is testing that scan_tx works on first attempt, then the second attempt is testing that re-scanning works, so the print statement helps clarify it's testing different things


test = 'Checking scan_tx on incoming pool tx'
for attempt in range(2): # test re-scanning
print(test + ' (attempt ' + str(attempt+1) + ')')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover debug trace

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

// sort tx_entries by height from lowest height to highest; pool txs to the back
auto cmp_tx_entry = [](const cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry& l, const cryptonote::COMMAND_RPC_GET_TRANSACTIONS::entry& r)
{ return l.block_height < r.block_height && !l.in_pool; };
std::sort(sorted_tx_entries.tx_entries.begin(), sorted_tx_entries.tx_entries.end(), cmp_tx_entry);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only sort partially, if two of the txes are in the same block (or are both in the txpool). Do you care ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any code that relies on processing txs in the order they appear in a block (or in the pool), just by "height order." So this should suffice

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could screw up cold signing if the m_transfers vector is ordered differently between hot and cold wallets (inputs are refered to by index IIRC). Possibly same for multisig.
All theoretical though. But having m_transfers shuffle itself a bit seems like the kind of bad idea that comes back to bite us later. I suppose it's nothing that can't be fixed later if it does come back though so not a show stopper I suppose.

Copy link
Collaborator Author

@j-berman j-berman Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I missed that. Re-ordering txs like this can cause importing key images from cold wallet to hot wallet to fail (I haven't checked pool/multisig behavior).

This is unfortunate. I don't see a nice way to handle this here without the daemon returning additional data that enables the client to sort txs in the same block (or client re-fetches entire blocks). Checking output_indices seems it would only work for RCT txs, not pre-RCT. Do you see a nice way to handle this?

EDIT: probably what you were thinking: if a tx is re-scanned, I could be sure to keep its position in m_transfers static. Still thinking on this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multisig has the same issue, you were correct there too (get_multisig_composite_key_image relies on m_transfers to be in order).

For a 2/2 multisig, assume wallet A has scanned up to chain tip normally, and wallet B restores at a height > restore height and then uses scan_tx([tx2, tx1]) to scan txs. If tx2 and tx1 are in the same block and wallet B calls scan_tx with the txs out of order, then wallet B will be borked.

Only way to prevent wallet B from getting borked here is for both wallets to order the txs in a consistent way, which goes back to my point above:

I don't see a nice way to handle this here without the daemon returning additional data that enables the client to sort txs in the same block (or client re-fetches entire blocks)

Curious if you've got ideas here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think you can. Keeping existing m_transfers order only works if you've not scanned those before (and different multisig participants might end up with different orders even if you keep previous order within a wallet).
However, getting more data from the daemon isn't a problem since you already get the txes, so getting the blocks they're in does not leak any more information, and it should not be too much extra data.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so getting the blocks they're in does not leak any more information, and it should not be too much extra data.

Done in the latest

const wallet2::payment_container &payments, const serializable_unordered_map<crypto::hash, wallet2::confirmed_transfer_details> &confirmed_txs)
{
for (const auto &td : transfers)
if (td.m_block_height >= height && expected_txids.find(td.m_txid) == expected_txids.end())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has >= for a function named "above", intended ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to has_unexpected_tx_at_height_or_above


// If the highest scan_tx height > the wallet's known scan height, then the
// wallet scanner should continue scanning from that tx's height and "pretend"
// it scanned all txs up to that height. This gurantees txs are processed
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"guarantees"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated as per this suggestion

process_scan_txs(txs_to_scan, txs_to_reprocess, tx_hashes_to_reprocess, dbd);
reattach_blockchain(m_blockchain, dbd);

// If the highest scan_tx height > the wallet's known scan height, then the

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment mostly, but not entirely, conveys the rationale for skipping forward in the blockchain. I suggest:

If the highest scan_tx height exceeds the wallet's known scan height, then the wallet should skip ahead to the scan_tx's height in order to service the request in a timely manner. Skipping unrequested transactions avoids generating sequences of calls to process_new_transaction which process transactions out-of-order, relative to their order in the blockchain, as the process_new_transaction implementation requires transactions to be processed in blockchain order. 

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator

@moneromooo-monero moneromooo-monero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of renaming and otherwise spammy changes in that new patch, but the core seems to do what it says, so approved given I was fine with the previous one except the sorting.

- Detach & re-process txs >= lowest scan height
- ensures that if a user calls scan_tx(tx1) after scanning tx2,
the wallet correctly processes tx1 and tx2
- if a user provides a tx with a height higher than the wallet's
last scanned height, the wallet will scan starting from that tx's
height
- scan_tx requires trusted daemon iff need to re-process existing
txs: in addition to querying a daemon for txids, if a user
provides a txid of a tx with height *lower* than any *already*
scanned txs in the wallet, then the wallet will also query the
daemon for all the *higher* txs as well. This is likely
unexpected behavior to a caller, and so to protect a caller from
revealing txid's to an untrusted daemon in an unexpected way,
require the daemon be trusted.
@plowsof
Copy link
Contributor

plowsof commented Mar 14, 2023

sanity checked the behaviour of adding missing "latest" / "oldest" / scanned multiple tx's - and 'it works' after squashing still :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet