
Prevent concurrent savemempool #12842

Merged
merged 1 commit into bitcoin:master from promag:2018-03-concurrent-savemempool on Oct 24, 2018

Conversation

9 participants
@promag (Member) commented Mar 30, 2018

Follow up of #12172, this change prevents calling savemempool RPC concurrently.

@promag (Member Author) commented Mar 30, 2018

In order to test the new check and error, add:

--- a/src/rpc/blockchain.cpp
+++ b/src/rpc/blockchain.cpp
@@ -1619,6 +1619,8 @@ UniValue savemempool(const JSONRPCRequest& request)
         throw JSONRPCError(RPC_MISC_ERROR, "Currently dumping the mempool");
     }

+    MilliSleep(10000);
+
     if (!g_is_mempool_loaded) {
         throw JSONRPCError(RPC_MISC_ERROR, "The mempool was not loaded yet");
     }

Then run

bitcoin-cli -regtest savemempool & bitcoin-cli -regtest savemempool

The second should wait for the first to terminate.

src/rpc/blockchain.cpp Outdated
@@ -1614,6 +1614,11 @@ UniValue savemempool(const JSONRPCRequest& request)
        );
    }

    TxMempoolDumpReserver reserver(mempool);
    if (!reserver.reserve()) {
        throw JSONRPCError(RPC_MISC_ERROR, "Currently dumping the mempool");

@conscott (Contributor) Apr 2, 2018

Nit: maybe instead,

"savemempool already in progress"

@promag (Member Author) Apr 2, 2018

Sounds better.
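
For context, the reserver referenced in the hunk above is essentially an RAII guard around a "dump in progress" flag: reserve() claims it, and the destructor releases it. The actual TxMempoolDumpReserver class is not shown in this thread, so the following is only a hypothetical sketch of that pattern (the flag and its placement are assumed):

    #include <atomic>

    class TxMempoolDumpReserver
    {
        std::atomic<bool>& m_in_progress; // assumed "dump in progress" flag owned by the mempool
        bool m_reserved{false};

    public:
        explicit TxMempoolDumpReserver(std::atomic<bool>& in_progress) : m_in_progress(in_progress) {}

        // Atomically claim the dump slot; fails if another dump already holds it.
        bool reserve()
        {
            bool expected = false;
            m_reserved = m_in_progress.compare_exchange_strong(expected, true);
            return m_reserved;
        }

        // Release on scope exit so an early return or exception never leaves the flag stuck.
        ~TxMempoolDumpReserver()
        {
            if (m_reserved) m_in_progress = false;
        }
    };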

@conscott (Contributor) left a comment

Tested ACK 7f722a1f0c82e1cc8dd9f6af9159bf7d7b8f2626

@laanwj (Member) commented Apr 3, 2018

After #12863 I'm even more convinced that this might be the wrong approach. Protecting access using a mutex instead of returning this kind of 'busy error' would avoid having to implement retry-poll loops client-side (including in the tests).

If you want to stick with this you'd at least need to define a new error code that means 'transient error', similar to RPC_IN_WARMUP. But I think pushing this responsibility to the client is unnecessary.

@promag (Member Author) commented Apr 3, 2018

@laanwj indeed it's debatable. I think it's preferable to have the client retry than to reserve resources on the server side. IMO having the client wait is also not desirable: the client can wait indefinitely or hit a timeout (though it can raise the timeout), and we would process the request anyway (?). Getting an error instead of waiting is also more informative and lets the client explicitly log and retry whenever it sees fit.

@laanwj (Member) commented Apr 3, 2018

OTOH

  • Then the API wouldn't be changed. Users of savemempool can rely on the call working, or fatally failing.
  • Hanging indefinitely is a bug no matter what. If it takes too long, something is wrong.
  • Interrupt semantics usually require fewer resources than polling. In this case, sleeping on a mutex requires virtually no resources, whereas repeated RPC requests take up CPU cycles and I/O.

@MarcoFalke (Member) commented Apr 3, 2018

I think we had reports that loading the mempool took 1.5 hours or so, just noting without further comment.

@promag (Member Author) commented Apr 3, 2018

Then the API wouldn't be changed. Users of savemempool can rely on the call working, or fatally failing.

The caller should always have error handling, and this new error would fall under its "unknown error" case.

Hanging indefinitely is a bug no matter what. If it takes too long, something is wrong.

We have calls that can hang for a long time, for instance when the wallet is really big (as in lots of keys, lots of transactions).

Interrupt semantics usually require less resources than polling.

Right, but at least a thread and a socket are on hold. And when the mutex is acquired, it might do something that is no longer relevant - in this case, dumping the mempool again.

We have this semantic in rescanblockchain; do you think we should change that too?

And regarding #12863, having the client blindly wait for 1.5 hours should not be an option (?)

@sipa (Member) commented Apr 3, 2018

I'm also more a fan of blocking until the existing save has finished. Given that no modifications to the mempool can happen in the meantime anyway (I think?), we could even just wait until the existing save operation finishes and then return from both RPC calls (without an additional save).

@promag (Member Author) commented Apr 3, 2018

Given that no modifications to the mempool can happen in the meantime anyway

Actually it can: DumpMempool takes a copy of the mempool contents and then releases the lock before writing to file. So, while writing to disk, the mempool can change.
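
To make that concrete, the structure being described is roughly the following (a simplified paraphrase for illustration, not the exact DumpMempool code): the mempool lock covers only the snapshot, and the file write happens after it is released.

    bool DumpMempool()
    {
        std::map<uint256, CAmount> mapDeltas;
        std::vector<TxMempoolInfo> vinfo;

        {
            LOCK(mempool.cs);                       // lock held only while copying
            for (const auto& i : mempool.mapDeltas) {
                mapDeltas[i.first] = i.second;      // snapshot fee deltas
            }
            vinfo = mempool.infoAll();              // snapshot transaction info
        }                                           // mempool.cs released here

        // From this point on the mempool can change again; the snapshot is
        // written to mempool.dat.new and then renamed over mempool.dat.
        // ... file I/O elided ...
        return true;
    }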

@promag (Member Author) commented Jun 7, 2018

What should be done here? I have to address @conscott's comment above, but I guess I should wait for more concept ACKs.

@luke-jr (Member) commented Jun 12, 2018

I agree with @laanwj - it should just hold a lock on the file to avoid problems, but otherwise succeed.

@promag promag force-pushed the promag:2018-03-concurrent-savemempool branch 3 times, most recently Jun 14, 2018

@promag (Member Author) commented Jun 14, 2018

Rebased and updated to block/wait while there is another dump.

@promag promag force-pushed the promag:2018-03-concurrent-savemempool branch Jun 14, 2018

@promag (Member Author) commented Jun 15, 2018

@luke do you think the current code is enough?

@MarcoFalke (Member) commented Jun 15, 2018

utACK 35bdaf7489cf33d120b3c66f87067c2698ed2e8c

@DrahtBot (Contributor) commented Aug 10, 2018

The last travis run for this pull request was 57 days ago and is thus outdated. To trigger a fresh travis build, this pull request should be closed and re-opened.

@DrahtBot DrahtBot closed this Aug 10, 2018

@DrahtBot DrahtBot reopened this Aug 10, 2018

src/txmempool.h Outdated
@@ -485,6 +485,7 @@ class CTxMemPool

    mutable CCriticalSection cs;
    indexed_transaction_set mapTx GUARDED_BY(cs);
    std::mutex m_dump_mutex;

@ajtowns (Contributor) Sep 18, 2018

(a) Shouldn't this be a global mutex, since it's specific to the "mempool.dat.new" file whose path is based on the (global) GetDataDir()?
(b) Why a std::mutex instead of a CCriticalSection?

@MarcoFalke (Member) Sep 18, 2018

Why a CCriticalSection instead of a Mutex?

@promag (Member Author) Sep 18, 2018

a) Not sure why that matters. To reduce exposure, the mutex could be a static variable in the DumpMempool() function;
b) We don't need a recursive mutex.

@ajtowns (Contributor) Sep 19, 2018

a) Yeah, a static variable in/next to the DumpMempool() function was what I was thinking. I don't think it matters in practice though.
b) More about trying to use just one set of locking primitives; Mutex/LOCK per MarcoFalke's suggestion would make more sense to me.

src/validation.cpp Outdated
@@ -4777,6 +4777,8 @@ bool DumpMempool(void)
    std::map<uint256, CAmount> mapDeltas;
    std::vector<TxMempoolInfo> vinfo;

    std::lock_guard<std::mutex> lock(mempool.m_dump_mutex);

@MarcoFalke (Member) Sep 19, 2018

Could you rebase on master, make the type Mutex, and use LOCK(...) here?

@MarcoFalke MarcoFalke modified the milestone: 0.18.0 Sep 19, 2018

@promag promag force-pushed the promag:2018-03-concurrent-savemempool branch to 585b47c Oct 20, 2018

@promag (Member Author) commented Oct 20, 2018

@MarcoFalke @achow101 rebased on master and changed to Mutex/LOCK and static variable in DumpMempool.
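
For illustration, a minimal sketch of what "Mutex/LOCK and a static variable in DumpMempool" could look like (an assumed shape only; the merged commit 585b47c may differ in detail):

    bool DumpMempool()
    {
        static Mutex dump_mutex;  // function-local, so its scope is limited to DumpMempool
        LOCK(dump_mutex);         // a second concurrent savemempool blocks here until the
                                  // ongoing dump finishes, then performs its own dump

        // ... snapshot the mempool contents under its lock, release it, write to disk ...
        return true;
    }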

@MarcoFalke (Member) commented Oct 24, 2018

re-utACK 585b47c

@laanwj (Member) commented Oct 24, 2018

I really like this clean and simple solution.

utACK 585b47c

@laanwj laanwj merged commit 585b47c into bitcoin:master Oct 24, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed

laanwj added a commit that referenced this pull request Oct 24, 2018

Merge #12842: Prevent concurrent savemempool
585b47c rpc: Prevent concurrent savemempool (João Barbosa)

Pull request description:

  Follow up of #12172, this change prevents calling `savemempool` RPC concurrently.

Tree-SHA512: 4759a7107658a9794f5c6ab7e3e3002276abadd64996010be67a2791914d284db6fe0377c071a8d6a42387bfb0178f219b73aeec077ce5c4fe5c634a30b3e081

@laanwj laanwj added the Backport label Oct 26, 2018

@laanwj laanwj added this to the 0.17.1 milestone Oct 26, 2018
