
Exception: "FAILED TO FIND PREVIOUS TX" #141

Closed
Francisco-DAnconia opened this issue Nov 6, 2022 · 36 comments
Labels
BTC: Related to Bitcoin BTC & Bitcoin Core
question: Further information is requested
Requires Investigation: Not clear if bug here or bug outside of Fulcrum

Comments

@Francisco-DAnconia

I started to see a few of these exceptions (Mempool.cpp #190).

```cpp
// Uh oh. If it wasn't in the mempool or in the db.. something is very wrong with our code...
// (or there maybe was a race condition and a new block came in while we were doing this).
// We will throw if missing, and the synch process aborts and hopefully we recover with a reorg
// or a new block or somesuch.
```

Followed up by "Failed to synch blocks and/or mempool..."

In all the cases I looked into, the %2 transaction was in the block immediately following the %1 transaction.

Can these safely be ignored (no DB integrity issues)?

Or is it something that should be investigated (on my node OR within Fulcrum)?

@cculianu
Owner

cculianu commented Nov 6, 2022

Does it happen often? Also, what coin are you on, and what daemon make and version? (E.g. Bitcoin Core v22.0.0, Bitcoin Cash Node v24.0.0, etc.)

Usually if it resolves itself immediately after, it's safe to ignore, but I've never seen these.

This may happen on Bitcoin Core, though, perhaps due to the RBF and private fee market shenanigans .. maybe.

But if it immediately resolved itself, that's fine.

@Francisco-DAnconia
Author

I'll frequently spot-check the Fulcrum terminal window - but nothing exhaustive. I never noticed this while running v22.0.0 & Fulcrum 1.7.0 (which I did for several months).

I upgraded Bitcoin Core to v23.0.0 a few weeks ago.

I saw 4 of these recently. One on 11/1 and three on 11/4 (blocks 761,343, 761,675, 761,772 & 761,777).

Oops, I forgot to mention that all of these exceptions are immediately preceded by the warning:

<SynchMempool> Synch mempool expected to drop ####, but in fact dropped #### -- retrying getrawmempool

(The second number is always greater than the first.)

Likely unrelated, but since you mentioned RBF: I do see these warnings too, though they don't seem connected to the exceptions.

<SynchMempool> Tx dropped out of mempool (possibly due to RBF): ######...#### (error response: No such mempool or blockchain transaction. Use gettransaction for wallet transactions.), ignoring mempool tx .

@cculianu
Owner

cculianu commented Nov 6, 2022

Hmm.. looking over the code briefly, I think it can happen if the mempool is very full and Fulcrum is busy downloading new txns, and a new block comes in at that time. There is a bit of a race condition there: you ask bitcoind "what is the latest block hash?" and it replies "Block XXXX", and you are like "cool, great, let me get the mempool now please", and as you download it, bitcoind receives a block, and new txns appear in the mempool that spend from that block. You are in the middle of that process, confused as to why a txn that was there a moment ago is no longer there, or worse, a txn that references another txn you haven't seen (because it's in a block you haven't seen yet) is now there.

That's my hypothesis about what is happening.. most likely.. and it's not really a bug. If it happens about once every few days, that's fine. I think the problem is particularly acute or likely to occur on BTC, where you have full mempools and lots of RBF, so lots of re-downloading of essentially the same txn, and it's very chatty and very noisy to keep the mempool synched.. all that jazz in the wonderful system they created makes this more likely to occur, is my hypothesis.

So it is what it is. I'd say this is safe to ignore unless it happens quite often.
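A minimal sketch of that race, assuming python-bitcoinrpc against a local node (this is not Fulcrum's actual C++ code, just the shape of the problem):

```python
# Sketch only: the chain tip can move between these two RPC calls.
from bitcoinrpc.authproxy import AuthServiceProxy  # assumes python-bitcoinrpc

rpc = AuthServiceProxy("http://user:pass@127.0.0.1:8332")  # placeholder credentials

tip_before = rpc.getbestblockhash()
mempool_txids = rpc.getrawmempool()   # slow when the mempool is large
tip_after = rpc.getbestblockhash()

if tip_before != tip_after:
    # A block arrived mid-synch: the mempool may now contain txns whose
    # parents are in a block we haven't processed yet, which manifests as
    # "FAILED TO FIND PREVIOUS TX". Abort, synch the new block, retry.
    print("tip moved during mempool synch; retrying")
```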


EDIT: The fact that it almost never happened before and is happening more frequently now is worthy of investigation. I haven't upgraded my bitcoind since 22.0.0 so maybe they "broke" something in 23.0.0 where previous assumptions don't hold as often as they did before. It is worth investigating -- so thanks for the heads-up.

@cculianu added the question, BTC, and Requires Investigation labels Nov 6, 2022
@Francisco-DAnconia
Author

Understood.
Thanks for looking into it!

@hMsats

hMsats commented Nov 24, 2022

I run Bitcoin Core 23.0 and have similar problems. When I connect Electrum only to my public Fulcrum server I get the following message:

```
W/i | interface.[electrum.bitcoinserver.nl:50514] | disconnecting due to RPCError(2, "daemon error: DaemonError({'code': -5, 'message': 'No such mempool or blockchain transaction. Use gettransaction for wallet transactions.'})")
E | asyncio | Task exception was never retrieved
future: <Task finished name='Task-145' coro=<Synchronizer._get_transaction() done, defined at /home/user/Electrum-4.3.2/electrum/synchronizer.py:216> exception=RPCError(2, "daemon error: DaemonError({'code': -5, 'message': 'No such mempool or blockchain transaction. Use gettransaction for wallet transactions.'})")>
Traceback (most recent call last):
  File "/home/user/Electrum-4.3.2/electrum/synchronizer.py", line 220, in _get_transaction
    raw_tx = await self.interface.get_transaction(tx_hash)
  File "/home/user/Electrum-4.3.2/electrum/interface.py", line 971, in get_transaction
    raw = await self.session.send_request('blockchain.transaction.get', [tx_hash], timeout=timeout)
  File "/home/user/Electrum-4.3.2/electrum/interface.py", line 171, in send_request
    response = await asyncio.wait_for(
  File "/usr/lib/python3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/home/user/Electrum-4.3.2/packages/aiorpcx/session.py", line 540, in send_request
    return await self._send_concurrent(message, future, 1)
  File "/home/user/Electrum-4.3.2/packages/aiorpcx/session.py", line 512, in _send_concurrent
    return await future
```

Maybe it has something to do with this, from the v23.0 release notes:

    • The -deprecatedrpc=addresses configuration option has been removed. RPCs
      gettxout, getrawtransaction, decoderawtransaction, decodescript,
      gettransaction verbose=true and REST endpoints /rest/tx, /rest/getutxos,
      /rest/block no longer return the addresses and reqSigs fields, which
      were previously deprecated in 22.0. (#22650)

When I remove .electrum (to start from scratch) and restart Electrum connected to my hardware wallet (Trezor) and only to my Fulcrum server, it will only show the balance of a few addresses, and thus a balance that's too low. When I connect to my Electrum Personal Server everything is fine, and likewise if I run Electrum allowing it to automatically connect to other servers.

I only found this out when I was trying to help someone with his Trezor and reinstalled .electrum from scratch. It may not be obvious or show up otherwise.

@cculianu
Owner

It wouldn't have anything to do with the verbose option removing addresses -- Fulcrum doesn't use the verbose output of any of the RPCs -- it does its own decoding of raw hex.

I am having a hard time understanding the issue you have, actually -- the symptom is that the Electrum synchronizer doesn't find new addresses? But somehow this works with Electrum Personal Server but on Fulcrum it's bad?

I don't think this has anything to do with Core deprecating RPCs.. something else is afoot here.. I also don't even think it's a bug in Fulcrum necessarily.. something else is going on.
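For context, a hedged sketch of the two call forms, assuming python-bitcoinrpc (the txid is the well-known block-170 transaction). Fulcrum consumes only the first, non-verbose form, so fields dropped from the verbose form in v23.0 cannot affect it:

```python
from bitcoinrpc.authproxy import AuthServiceProxy  # assumes python-bitcoinrpc

rpc = AuthServiceProxy("http://user:pass@127.0.0.1:8332")  # placeholder credentials
txid = "f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16"

raw_hex = rpc.getrawtransaction(txid)        # raw hex; what Fulcrum decodes itself
verbose = rpc.getrawtransaction(txid, True)  # decoded dict; 'addresses'/'reqSigs'
                                             # were removed from this form in v23.0
```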

@hMsats

hMsats commented Nov 24, 2022

the symptom is that the Electrum synchronizer doesn't find new addresses? But somehow this works with Electrum Personal Server but on Fulcrum it's bad?

Thanks for the immediate reply. Yes, that's what's going on. Very strange indeed. Also, other servers (probably ElectrumX) can find the balances of all addresses (instead of just a few).

@hMsats

hMsats commented Nov 24, 2022

Probably my hint was wrong but the error message from Electrum is of course real.

@cculianu
Owner

What error message do you get in Fulcrum when the Electrum error is triggered? Anything in the log? Maybe start with the -d option or debug=true in the conf file to see more verbose logging.

It seems to me that some previous tx is missing and this is odd indeed.
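For reference, the conf-file equivalent of the -d flag is a one-line setting (shown here as an assumed excerpt):

```
# Fulcrum.conf excerpt: enable verbose debug logging (same effect as -d)
debug = true
```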

@cculianu
Owner

What happens if you connect to my Fulcrum server at: blackie.c3-soft.com SSL port 57002? Do you see all your addresses then, or same issue? (Hopefully you are not afraid I will log your wallet scripthashes.. I promise you I delete all logs daily).

@cculianu
Owner

Also you are running bitcoind with the -txindex=1 option... right?1?!?

@hMsats

hMsats commented Nov 24, 2022

I will have a look after dinner :-)

@hMsats

hMsats commented Nov 24, 2022

Your server is working just fine. I don't see the error message and get the right balance. I'm going to have a look at the -d option.

@hMsats

hMsats commented Nov 24, 2022

Also you are running bitcoind with the -txindex=1 option... right?1?!?

ha ha yes

@hMsats

hMsats commented Nov 24, 2022

You can find the output of the -d option here. My Fulcrum server is electrum.bitcoinserver.nl, SSL port 50514. I only see the most recent 3 out of 19 transactions. With your server I see all 19 transactions.

@hMsats

hMsats commented Nov 24, 2022

To emphasize: this problem isn't very visible once .electrum has been created with the help of other Electrum servers. After that, everything seems to work just fine and there is no error message when I connect only to my server (with that existing .electrum).

@cculianu
Owner

cculianu commented Nov 24, 2022

Yeah, so what's happening is bitcoind somehow lost that txn from its txindex. Either your txindex is corrupted or.. something else is going on. Once the .electrum folder is built locally for your wallet.. you have the txn cached and you never ask for it again. It's only asked for once, when "verifying" a txn (using the SPV mechanism to verify you aren't being lied to by the server).

So I think it's possible the txindex on your bitcoind is corrupted somehow. The reason why this works on Electrum Personal Server connected to the existing (possibly-corrupted) bitcoind is that EPS doesn't rely on txindex -- it actually tells bitcoind what block height to use to look up a txn. This call path is seemingly great, right? Except it's not. You are asking bitcoind to deserialize a whole block each time you do that, in order to find a single transaction.. which is slow as molasses and not scalable.

Fulcrum explicitly requires txindex so that bitcoind doesn't have to do this.. to be fast. You ask for a txn, and using txindex bitcoind knows exactly what byte offset it's at on the filesystem, so it doesn't have to read in a whole block to get a single txn. This is scalable. But it does require a txindex that works and is not corrupted.

But a corrupted txindex would explain everything here. I wonder how one can rebuild the txindex on an existing node? What happens if you rename your ~/.bitcoin/indexes/txindex to something else and make a new, empty ~/.bitcoin/indexes/txindex folder? Does bitcoind realize it needs to rebuild the index, or does it just die?

Either way you probably have a corrupt txindex on the bitcoind side in some way, is my hypothesis. It is the only thing I can think of that would explain the observables.
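A sketch of the two lookup paths described above, assuming python-bitcoinrpc; both RPC forms are real Bitcoin Core calls, and the txid/height pair is the well-known block-170 transaction:

```python
from bitcoinrpc.authproxy import AuthServiceProxy  # assumes python-bitcoinrpc

rpc = AuthServiceProxy("http://user:pass@127.0.0.1:8332")  # placeholder credentials
txid = "f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16"

# Fulcrum-style: requires -txindex=1; the index maps txid -> file offset,
# so bitcoind can seek straight to the txn.
raw = rpc.getrawtransaction(txid)

# EPS-style: no txindex needed, but the caller must already know the block,
# and bitcoind reads and deserializes the whole block to find one txn.
blockhash = rpc.getblockhash(170)
raw = rpc.getrawtransaction(txid, False, blockhash)
```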

@hMsats

hMsats commented Nov 24, 2022

Thanks a lot, I will look into my txindex via a backup. I can always start bitcoind with -reindex but that would take a long time. For EPS scalability is less important because it only needs to get a small number of transactions but indeed even then it's slow. I'll report back in a few days. Thanks a lot for all your help and hard work!

@cculianu
Owner

cculianu commented Nov 24, 2022

Yeah, for sure, for EPS -- the lack of scalability is in some sense a feature rather than a bug or anything. They save on disk space by having a smaller db, and they also don't require txindex, which eats space. And yes, for a single user the way it works is fine.

Yeah let me know if you solve it. I'm like 97% sure it's a txindex issue on the bitcoind side. Fulcrum is doing what it can -- if bitcoind can't find the txn, it can't find the txn... and the only way that happens is if there's corruption in the txindex...

@Francisco-DAnconia
Author

(Hopefully you are not afraid I will log your wallet scripthashes.. I promise you I delete all logs daily).

Where are these logs stored?
Are scripthashes and I guess other information only captured if you turn on Fulcrum's "debug" setting?

@cculianu
Owner

Oh yeah true. Correct. And I don't log that info. And on top of that, I even have my journalctl stuff set to clear logs every day...

@hMsats

hMsats commented Nov 28, 2022

@cculianu You were right! My txindex was corrupt, and so was my blockchain. I have rebuilt everything from scratch and now Fulcrum is working as it should. So maybe @Francisco-DAnconia has a similar issue (or maybe not). It had indeed nothing to do with Bitcoin Core v23.0. @cculianu, thanks again for all your hard work and suggestions.

Btw, I did something smart to speed up the (painful) blockchain download/repair (to a new SSD). I started another full node (say node B) on the same server (different port) next to my usual full node (say node A). I then connected node B only to node A via an internal IP address, which would serve blocks very fast. Because node A had a corrupted blockchain, it couldn't serve all the blocks, so sometimes node B wouldn't continue. Then I added an external node (say node C) to node B via the bitcoin-rpc addnode <ip> add command. Node B would then get a number of correct blocks from node C (but slower), after which I would remove node C via bitcoin-rpc addnode <ip> remove followed by bitcoin-rpc disconnectnode <ip>, and node B would continue getting blocks only from node A (very fast).
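(A recap of that sequence as bitcoin-cli invocations; datadirs and IPs below are placeholders, and "bitcoin-rpc" above is presumably a local alias for bitcoin-cli:)

```sh
bitcoin-cli -datadir=/data/nodeB addnode "192.168.1.10:8333" add    # node A: fast, internal
# when node B stalls on a block that corrupted node A cannot serve:
bitcoin-cli -datadir=/data/nodeB addnode "203.0.113.7:8333" add     # node C: external, slower
# once node B is past the bad range, drop node C and resume from node A:
bitcoin-cli -datadir=/data/nodeB addnode "203.0.113.7:8333" remove
bitcoin-cli -datadir=/data/nodeB disconnectnode "203.0.113.7:8333"
```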

@cculianu
Owner

Ah wow so blockchain corruption too? Yeesh. Glad you found a creative solution to rebuild it fast.

@hMsats

hMsats commented Nov 30, 2022

Oh, maybe it's a good idea for Fulcrum to give a warning about a possibly corrupted txindex when bitcoind returns the error:

No such mempool or blockchain transaction. Use gettransaction for wallet transactions.

@cculianu
Owner

It can't differentiate that from a genuine error where the client is crazy and is really asking for a non-existent tx. There are hundreds of millions of txns; Fulcrum doesn't know what is real or fake if bitcoind doesn't know what is real or fake.

You may want to contact the Bitcoin Core developers and complain to them that their amazing txindex can go corrupt and it doesn't complain ever.. this is a bitcoind issue.

@hMsats

hMsats commented Nov 30, 2022

txindex can go corrupt and it doesn't complain ever.. this is a bitcoind issue.

Agreed. I'll think about it.

@Francisco-DAnconia
Author

@cculianu You were right! My txindex was corrupt, and so was my blockchain. I have rebuilt everything from scratch and now Fulcrum is working as it should. So maybe @Francisco-DAnconia has a similar issue (or maybe not).

I've not noticed any problems with my node - though it sounds like bitcoind doesn't perform sanity checks. So, FWIW.

These exceptions occur a few times a week.

I did start up Fulcrum with "checkdb = true" recently. It was fine.

@hMsats

hMsats commented Dec 10, 2022

@Francisco-DAnconia, last question and extra check.

Are your blocks, chainstate and txindex about this big in kB?

491217448 blocks/
4998388 chainstate/
39442976 indexes/txindex/

@cculianu
Owner

Hmm this is what I have as of the time I type this:

492497601 blocks/
4977780 chainstate/
39468704 indexes/txindex/

@Francisco-DAnconia
Author

Are your blocks, chainstate and txindex about this big in kB?

Yes, ATM
/blocks: 503100820382 bytes, 491309395 kB
/chainstate: 5097478344 bytes, 4978006 kB
/indexes: 49192957612 bytes, 48039997 kB

I'm not sure what you're after here.

Wouldn't block size (4096 in my case) be a factor, and potentially the filesystem/OS?
Or are you looking for some larger order divergence?

@hMsats

hMsats commented Dec 10, 2022

Or are you looking for some larger order divergence?

Yes, that's what I had, and it was a first indication that something was wrong. My indexes folder was a lot smaller.
Both your numbers and those of @cculianu seem correct.

Edit: although it's strange that your (@Francisco-DAnconia) indexes is so much bigger than that of @cculianu and mine, while blocks and chainstate are almost equal.

@Francisco-DAnconia
Author

Edit: although it's strange that your (@Francisco-DAnconia) indexes is so much bigger than that of @cculianu and mine, while blocks and chainstate are almost equal.

Sorry, that previous number of mine was /indexes - I missed the sub-folder designation.

/indexes/txindex: 40372685112 bytes, 39426450 kB
/indexes/blockfilter: 8820273660 bytes, 8613548 kB

@hMsats

hMsats commented Dec 11, 2022

@cculianu

txindex can go corrupt and it doesn't complain ever.. this is a bitcoind issue.

Agreed. I'll think about it.

I thought about it :-) The way I test my txindex (and indirectly Fulcrum) now is by using these addresses, which I found on the Internet:

https://www.blockchain.com/btc/address/18zuLTKQnLjp987LdxuYvjekYnNAvXif2b
https://www.blockchain.com/btc/address/1NT1jtYLNwFXLztD4U4B9sLizdYatirhWW
https://www.blockchain.com/btc/address/1f1miYFQWTzdLiCBxtHHnNiW7WAWPUccr
https://www.blockchain.com/btc/address/12tkqA9xSoowkzoERHMWNKsTey55YEBqkv
https://www.blockchain.com/btc/address/1KbrSKrT3GeEruTuuYYUSQ35JwKbrAWJYm
https://www.blockchain.com/btc/address/198aMn6ZYAczwrE5NvNTUMyJ5qkfy4g3Hi

which contain transactions from every year, from 2009 to 2022. They sum up to:

8020.70182731 + 2500.00155992 + 10009.26282042 + 28151.05862779 + 10000.01672481 + 8000.00430382 = 66681.04586407

Then move your .electrum to another filename and start Electrum as:

run_electrum --oneserver --server <electrumserver>:<port>:s

and choose "Import Bitcoin addresses or private keys" and paste the addresses:

18zuLTKQnLjp987LdxuYvjekYnNAvXif2b 1NT1jtYLNwFXLztD4U4B9sLizdYatirhWW 1f1miYFQWTzdLiCBxtHHnNiW7WAWPUccr 12tkqA9xSoowkzoERHMWNKsTey55YEBqkv 1KbrSKrT3GeEruTuuYYUSQ35JwKbrAWJYm 198aMn6ZYAczwrE5NvNTUMyJ5qkfy4g3Hi

I got exactly the above total: 66681.04586407, so everything is (now) working fine!
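For anyone who prefers to script the same check, here is a sketch that queries an Electrum-protocol server directly via the standard blockchain.scripthash.get_balance method; the server name and plain-TCP port are placeholders, and the expected total assumes nothing has moved since this thread:

```python
import hashlib, json, socket

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def p2pkh_scripthash(addr):
    # Base58-decode to 25 bytes (version + hash160 + checksum); checksum not verified.
    n = 0
    for c in addr:
        n = n * 58 + B58.index(c)
    h160 = n.to_bytes(25, "big")[1:21]
    # P2PKH scriptPubKey: OP_DUP OP_HASH160 <h160> OP_EQUALVERIFY OP_CHECKSIG
    script = b"\x76\xa9\x14" + h160 + b"\x88\xac"
    # Electrum-protocol scripthash: sha256(scriptPubKey), byte-reversed, hex
    return hashlib.sha256(script).digest()[::-1].hex()

ADDRS = [
    "18zuLTKQnLjp987LdxuYvjekYnNAvXif2b", "1NT1jtYLNwFXLztD4U4B9sLizdYatirhWW",
    "1f1miYFQWTzdLiCBxtHHnNiW7WAWPUccr", "12tkqA9xSoowkzoERHMWNKsTey55YEBqkv",
    "1KbrSKrT3GeEruTuuYYUSQ35JwKbrAWJYm", "198aMn6ZYAczwrE5NvNTUMyJ5qkfy4g3Hi",
]

total = 0
with socket.create_connection(("electrum.example.com", 50001)) as s:  # placeholder server
    f = s.makefile("rw")
    for i, addr in enumerate(ADDRS):
        f.write(json.dumps({"id": i, "method": "blockchain.scripthash.get_balance",
                            "params": [p2pkh_scripthash(addr)]}) + "\n")
        f.flush()
        total += json.loads(f.readline())["result"]["confirmed"]

print(total)  # expect 6668104586407 satoshis == 66681.04586407 BTC
```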

@cculianu
Owner

cculianu commented Dec 11, 2022

Yeah I got the same balance for those addresses. Well, I guess that's a good way to get random txns from all over. Thanks for posting that.

@Francisco-DAnconia
Author

I got exactly the above total: 66681.04586407, so everything is (now) working fine!

Me too. Thanks, that is interesting.

@cculianu
Owner

cculianu commented Nov 22, 2023

Note that there was a bug recently exposed that I fixed just now. There is a rare corner case where a block-only txn can be referenced by a mempool txn, and if the block arrives right before we synch the mempool, Fulcrum would get very confused and angry. See #214.

That may explain some issues seen that look like this error message (although in this issue the likely culprit was lack of txindex).

I am going to do a Fulcrum release likely tomorrow that should finally close the corner-case bug, though. Some of you may have been seeing this bug crop up occasionally.. I felt the need to announce it here as well. Look out for Fulcrum 1.9.7 to be released tomorrow.

Anyway, closing this issue.
