Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Logging and persist async operation for Sapling migration #4002

Merged
merged 3 commits into from May 13, 2019

Conversation

6 participants
@Eirik0
Copy link
Contributor

commented May 9, 2019

Currently zcashd will automatically remove the last async migration operations when it reaches the height where it sends the transactions it just made. This is not in alignment with other async operations, which are not removed until a node is restarted or a user calls z_getoperationresult. This PR removes the calls to pop the operations so that they can be accessed and reviewed later. In this PR I also correct the operation's amount_migrated field to exclude the transaction fee (this field existed for debugging purposes, but should be consistent with z_getmigrationstatus), and have included the list of migration txids in the operation's result (this is similar to the async rpcs such as z_sendmany).

Documentation: none needed.

Test plan:

  • After migration transactions have been created, list the opids and call z_getoperationresult on them.
  • Check that the operations' amount_migrated fields exclude the fee.

@Eirik0 Eirik0 changed the title Do not automatically remove async migration operations and return txids Add Logging and persist async operation for Sapling migration May 9, 2019

@daira

daira approved these changes May 10, 2019

Copy link
Contributor

left a comment

ut(ACK+cov)

@zcash zcash deleted a comment from Eirik0 May 10, 2019

@daira daira added this to Needs Prioritization in Arborist Team via automation May 10, 2019

@@ -67,6 +67,7 @@ void AsyncRPCOperation_saplingmigration::main() {
}

bool AsyncRPCOperation_saplingmigration::main_impl() {
LogPrint("zrpcunsafe", "Beginning AsyncRPCOperation_saplingmigration. id=%s\n", getId());

This comment has been minimized.

Copy link
@bitcartel

bitcartel May 10, 2019

Contributor

%s should be at the start of the log message, to follow the convention used by other async operations. Also makes it easier to visually identify when examining log file or perform parsing based on opid.

@@ -126,21 +140,30 @@ bool AsyncRPCOperation_saplingmigration::main_impl() {
builder.AddSaplingOutput(ovkForShieldingFromTaddr(seed), migrationDestAddress, amountToSend - FEE);
CTransaction tx = builder.Build().GetTxOrThrow();
if (isCancelled()) {
LogPrint("zrpcunsafe", "%s: Canceled. Stopping.\n");

This comment has been minimized.

Copy link
@bitcartel

bitcartel May 10, 2019

Contributor

Runtime error will occur. Format specifier without any data to output. I guess you want a getId() here.

int noteIndex = 0;
CCoinsViewCache coinsView(pcoinsTip);
do {
CAmount amountToSend = chooseAmount(availableFunds);
auto builder = TransactionBuilder(consensusParams, targetHeight_, pwalletMain, pzcashParams, &coinsView, &cs_main);
LogPrint("zrpcunsafe", "%s: Beginning creating transaction with Sapling output amount=%s\n", getId(), amountToSend - FEE);

This comment has been minimized.

Copy link
@bitcartel

bitcartel May 10, 2019

Contributor

Possible runtime error here. Format specifier is %s, but passing in CAmount. Missing FormatMoney() here?

@@ -108,6 +113,15 @@ bool AsyncRPCOperation_saplingmigration::main_impl() {
}
availableFunds -= fromNoteAmount;
for (const CSproutNotePlaintextEntry& sproutEntry : fromNotes) {
std::string data(sproutEntry.plaintext.memo().begin(), sproutEntry.plaintext.memo().end());
LogPrint("zrpcunsafe", "%s: Adding Sprout note input (txid=%s, vjoinsplit=%d, ciphertext=%d, amount=%s, memo=%s)\n",

This comment has been minimized.

Copy link
@bitcartel

bitcartel May 10, 2019

Contributor

ciphertext= is incorrect, as the uint8_t index of the joinsplit output is passed in here. Should be jsoutindex= (and in other asyncrpcoperation classes, where this line was copied from).

asyncrpcoperation_mergetoaddress.cpp:648:            LogPrint("zrpcunsafe", "%s: spending note (txid=%s, vjoinsplit=%d, ciphertext=%d, amount=%s, height=%d, confirmations=%d)\n",
asyncrpcoperation_saplingmigration.cpp:117:            LogPrint("zrpcunsafe", "%s: Adding Sprout note input (txid=%s, vjoinsplit=%d, ciphertext=%d, amount=%s, memo=%s)\n",
asyncrpcoperation_sendmany.cpp:787:            LogPrint("zrpcunsafe", "%s: spending note (txid=%s, vjoinsplit=%d, ciphertext=%d, amount=%s, height=%d, confirmations=%d)\n",
asyncrpcoperation_sendmany.cpp:1051:        LogPrint("zrpcunsafe", "%s: found unspent Sprout note (txid=%s, vjoinsplit=%d, ciphertext=%d, amount=%s, memo=%s)\n",

This comment has been minimized.

Copy link
@bitcartel

bitcartel May 10, 2019

Contributor

@Eirik0 commit 5600562 is missing the fix for mergetoaddress and one (of the two) in sendmany.

This comment has been minimized.

Copy link
@Eirik0

Eirik0 May 10, 2019

Author Contributor

Fixed and squashed in to the same commit.

++numTxCreated;
amountMigrated += amountToSend;
amountMigrated += amountToSend - FEE;

This comment has been minimized.

Copy link
@bitcartel

bitcartel May 10, 2019

Contributor

Why was the FEE excluded previously, but now included?

This comment has been minimized.

Copy link
@Eirik0

Eirik0 May 10, 2019

Author Contributor

This is to prevent confusion between this value and the amount_migrated returned by z_getmigrationstatus. It probably should have excluded the FEE originally.

@@ -604,7 +604,7 @@ void CWallet::RunSaplingMigration(int blockHeight) {
// height N-5
if (blockHeight % 500 == 495) {
std::shared_ptr<AsyncRPCQueue> q = getAsyncRPCQueue();
std::shared_ptr<AsyncRPCOperation> lastOperation = q->popOperationForId(saplingMigrationOperationId);
std::shared_ptr<AsyncRPCOperation> lastOperation = q->getOperationForId(saplingMigrationOperationId);
if (lastOperation != nullptr) {
lastOperation->cancel();

This comment has been minimized.

Copy link
@bitcartel

bitcartel May 10, 2019

Contributor

So now the base class cancel() will be invoked. Is that the intended behaviour? Will the cancel state be set correctly i.e. is the operation currently in READY state to transition to CANCEL state?

This comment has been minimized.

Copy link
@bitcartel

bitcartel May 10, 2019

Contributor

Also note that the operation will still exist in an internal job map used by AsyncRPCQueue. That means, the queue will still try and process the operation. If the cancel state has been set correctly, this is fine. If not, the operation will be executed. Popping the operation removes the operation from the internal job map, which means the queue will have nothing to execute.

/**
 * Return the operation for a given operation id and then remove the operation from internal storage.
 */
std::shared_ptr<AsyncRPCOperation> AsyncRPCQueue::popOperationForId(AsyncRPCOperationId id) {
    std::shared_ptr<AsyncRPCOperation> ptr = getOperationForId(id);
    if (ptr) {
        std::lock_guard<std::mutex> guard(lock_);
        // Note: if the id still exists in the operationIdQueue, when it gets processed by a worker
        // there will no operation in the map to execute, so nothing will happen.
        operation_map_.erase(id);
    }
    return ptr;
}

...
void AsyncRPCQueue::run(size_t workerId) {
   ...
        if (!operation) {
            // cannot find operation in map, may have been removed
        } else if (operation->isCancelled()) {
            // skip cancelled operation
        } else {
            operation->main();
        }

This comment has been minimized.

Copy link
@Eirik0

Eirik0 May 10, 2019

Author Contributor

So now the base class cancel() will be invoked. Is that the intended behaviour? Will the cancel state be set correctly i.e. is the operation currently in READY state to transition to CANCEL state?

The intent of this is to cancel the operation if it is already running, this however does not work, and the fix does not seem to be simple (see #3988). What will currently happen is if a node is not finished generating migration transactions by the time we try to send them, then we will send what we have, but it will continue to generate transactions after the fact. These extra transactions will then be discarded in the next round of migrations.

This comment has been minimized.

Copy link
@Eirik0

Eirik0 May 10, 2019

Author Contributor

Also note that the operation will still exist in an internal job map used by AsyncRPCQueue. That means, the queue will still try and process the operation. If the cancel state has been set correctly, this is fine. If not, the operation will be executed. Popping the operation removes the operation from the internal job map, which means the queue will have nothing to execute.

This has been changed from pop to get

@Eirik0 Eirik0 requested a review from bitcartel May 10, 2019

@Eirik0 Eirik0 force-pushed the Eirik0:migration-status-info branch from 5600562 to 7f3b00b May 10, 2019

@LarryRuane

This comment has been minimized.

Copy link
Contributor

commented May 10, 2019

Suggestion, we can probably catch many of the LogPrint() format string errors with this modification:

$ git diff
diff --git a/qa/rpc-tests/sprout_sapling_migration.py b/qa/rpc-tests/sprout_sapling_migration.py
index cd67f9ea8..3f9c0c3d3 100755
--- a/qa/rpc-tests/sprout_sapling_migration.py
+++ b/qa/rpc-tests/sprout_sapling_migration.py
@@ -45,9 +45,10 @@ class SproutSaplingMigration(BitcoinTestFramework):
         # Add migration parameters to nodes[0]
         extra_args[0] = extra_args[0] + [
             '-migration',
-            '-migrationdestaddress=' + SAPLING_ADDR
+            '-migrationdestaddress=' + SAPLING_ADDR,
+            '-debug=zrpcunsafe',
         ]
-        assert_equal(4, len(extra_args[0]))
+        assert_equal(5, len(extra_args[0]))
         assert_equal(2, len(extra_args[1]))
         return start_nodes(4, self.options.tmpdir, extra_args)
 

With this change, running the test (without Eirik's latest commit) generates the expected error:

=== Running testscript sprout_sapling_migration.py ===
Initializing test directory /tmp/testQ9vGJ_
Running test using '-migrationdestaddress'...
Mining blocks...
waiting for async operation opid-289167a8-8aba-471f-a6f2-6405b54497ee
Mining to block 494 % 500...
migration operation: [{u'status': u'failed', u'creation_time': 1557521612, u'id': u'opid-0d6361c0-108c-431c-a703-9cb85139bcbc', u'target_height': 500, u'error': {u'message': u'runtime error: tinyformat: Too many conversion specifiers in format string', u'code': -1}, u'method': u'saplingmigration'}]
waiting for async operation opid-0d6361c0-108c-431c-a703-9cb85139bcbc
Assertion failed: (left == right)
  left: <None>
 right: <runtime error: tinyformat: Too many conversion specifiers in format string>
  File "/home/larry/zcash/qa/rpc-tests/test_framework/test_framework.py", line 121, in main
    self.run_test()
  File "/home/larry/zcash/qa/rpc-tests/sprout_sapling_migration.py", line 162, in run_test
    self.run_migration_test(self.nodes[0], sproutAddr0, SAPLING_ADDR, 500)
  File "/home/larry/zcash/qa/rpc-tests/sprout_sapling_migration.py", line 86, in run_migration_test
    result = wait_and_assert_operationid_status_result(node, operationstatus[0]['id'])
  File "/home/larry/zcash/qa/rpc-tests/test_framework/util.py", line 412, in wait_and_assert_operationid_status_result
    assert_equal(in_errormsg, errormsg)
  File "/home/larry/zcash/qa/rpc-tests/test_framework/util.py", line 362, in assert_equal
    raise AssertionError("(left == right)%s\n  left: <%s>\n right: <%s>" % (message, str(expected), str(actual)))
Stopping nodes
Cleaning up
Failed
!!! FAIL: sprout_sapling_migration.py !!!

The only downside is that more lines get written to the test's /tmp/testQ9vGJ_/node0/regtest/debug.log but that temp directory gets deleted anyway at the end of the test.

I'd even suggest enabling all debug logging (for all RPC tests), and run the full test suite -- I bet we'd uncover many more of these format string bugs. I may just try that.

@Eirik0

This comment has been minimized.

Copy link
Contributor Author

commented May 10, 2019

@LarryRuane I like your suggestion. I will add this.

@Eirik0 Eirik0 force-pushed the Eirik0:migration-status-info branch from 7f3b00b to 42daf46 May 10, 2019

@Eirik0

This comment has been minimized.

Copy link
Contributor Author

commented May 10, 2019

The error found by @LarryRuane was in the TransactionBuilder that is now fixed too.

@bitcartel
Copy link
Contributor

left a comment

ACK with sprout-to-sapling test passing.

@bitcartel

This comment has been minimized.

Copy link
Contributor

commented May 10, 2019

I rebased locally and there were no conflicts.

@LarryRuane
Copy link
Contributor

left a comment

utACK (well... some testing, only ran the python test)

@daira

daira approved these changes May 10, 2019

Copy link
Contributor

left a comment

re-utACK

@mms710 mms710 added this to the v2.0.5 milestone May 10, 2019

@mms710 mms710 moved this from Needs Prioritization to Bugs/Security Issues/TechDebt Backlog in Arborist Team May 10, 2019

@mms710 mms710 moved this from Bugs/Security Issues/TechDebt Backlog to PRs That Need Review + Their Associated Issues in Arborist Team May 10, 2019

@mms710 mms710 moved this from PRs That Need Review + Their Associated Issues to In Progress in Arborist Team May 10, 2019

@mms710 mms710 moved this from In Progress to In Review in Arborist Team May 10, 2019

@mms710 mms710 moved this from In Review to Merge Queue in Arborist Team May 10, 2019

@Eirik0 Eirik0 force-pushed the Eirik0:migration-status-info branch from 42daf46 to 23d9826 May 10, 2019

@Eirik0

This comment has been minimized.

Copy link
Contributor Author

commented May 13, 2019

@zkbot r+

@zkbot

This comment has been minimized.

Copy link
Collaborator

commented May 13, 2019

📌 Commit 23d9826 has been approved by Eirik0

@zkbot

This comment has been minimized.

Copy link
Collaborator

commented May 13, 2019

⌛️ Testing commit 23d9826 with merge 4880676...

zkbot added a commit that referenced this pull request May 13, 2019

Auto merge of #4002 - Eirik0:migration-status-info, r=Eirik0
Add Logging and persist async operation for Sapling migration

Currently zcashd will automatically remove the last async migration operations when it reaches the height where it sends the transactions it just made. This is not in alignment with other async operations, which are not removed until a node is restarted or a user calls `z_getoperationresult`. This PR removes the calls to pop the operations so that they can be accessed and reviewed later. In this PR I also correct the operation's `amount_migrated` field to exclude the transaction fee (this field existed for debugging purposes, but should be consistent with `z_getmigrationstatus`), and have included the list of migration txids in the operation's result (this is similar to the async rpcs such as `z_sendmany`).

Documentation: none needed.

Test plan:
* After migration transactions have been created, list the opids and call `z_getoperationresult` on them.
* Check that the operations' `amount_migrated` fields exclude the fee.
@daira

daira approved these changes May 13, 2019

Copy link
Contributor

left a comment

Rereviewed from scratch after force push. utACK.

@zkbot

This comment has been minimized.

Copy link
Collaborator

commented May 13, 2019

☀️ Test successful - pr-merge
Approved by: Eirik0
Pushing 4880676 to master...

@zkbot zkbot merged commit 23d9826 into zcash:master May 13, 2019

1 check passed

homu Test successful
Details

Arborist Team automation moved this from Merge Queue to Released (Merged in Master) May 13, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.