[rpc] Allow fetching tx directly from specified block in getrawtransaction #10275

Open: kallewoof wants to merge 7 commits

Contributor

kallewoof commented Apr 25, 2017 edited

[Reviewer hint: use ?w=1 to avoid seeing a bunch of indentation changes.]

Presuming a user knows the block hash of the block containing a given transaction, this PR allows them to fetch the raw transaction, even without -txindex. It also enables support for getting transactions that are in orphaned blocks.

Note that supplying a block hash will override mempool and txindex support in GetTransaction. The rationale behind this is that a transaction may be in multiple places (orphaned blocks) and if the user supplies an explicit block hash it should be adhered to.

$ # a41.. is a tx inside an orphan block ..3c6f.. -- first try getting it normally
$ ./bitcoin-cli getrawtransaction a41e66ee1341aa9fb9475b98cfdc1fe1261faa56c0a49254f33065ec90f7cd79 1
error code: -5
error message:
No such mempool transaction. Use -txindex to enable blockchain transaction queries. Use gettransaction for wallet transactions.
$ # now try with block hash
$ ./bitcoin-cli getrawtransaction a41e66ee1341aa9fb9475b98cfdc1fe1261faa56c0a49254f33065ec90f7cd79 1 0000000000000000003c6fe479122bfa4a9187493937af1734e1e5cd9f198ec7
{
  "hex": "01000000014e7e81144e42f6d65550e59b715d470c9301fd7ac189[...]90488ac00000000",
  "inMainChain": false,
  "txid": "a41e66ee1341aa9fb9475b98cfdc1fe1261faa56c0a49254f33065ec90f7cd79",
  "hash": "a41e66ee1341aa9fb9475b98cfdc1fe1261faa56c0a49254f33065ec90f7cd79",
  "size": 225,
[...]
}
$ # another tx 6c66... in block 462000
$ ./bitcoin-cli getrawtransaction 6c66b98191e9d6cc671f6817142152ebf6c5cab2ef008397b5a71ac13255a735 1 00000000000000000217f2c12922e321f6d4aa933ce88005a9a493c503054a40
{
  "hex": "0200000004d157[...]88acaf0c0700",
  "inMainChain": true,
  "txid": "6c66b98191e9d6cc671f6817142152ebf6c5cab2ef008397b5a71ac13255a735",
  "hash": "6c66b98191e9d6cc671f6817142152ebf6c5cab2ef008397b5a71ac13255a735",
  "size": 666,
[...]
}
$ 
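
The lookup precedence described above (an explicit block hash overrides both the mempool and txindex) can be modelled roughly as follows. This is an illustrative Python sketch, not the C++ in GetTransaction; `lookup_tx` and the dictionary-based mempool, txindex and block storage are hypothetical stand-ins.

```python
from typing import Dict, Optional


def lookup_tx(txid: str,
              mempool: Dict[str, str],
              txindex: Optional[Dict[str, str]],
              blocks: Dict[str, Dict[str, str]],
              blockhash: Optional[str] = None) -> Optional[str]:
    """Toy model: return the raw hex for txid, or None if it is not found.

    mempool  : txid -> hex (hypothetical stand-in for the mempool)
    txindex  : txid -> hex, or None when -txindex is disabled
    blocks   : block hash -> {txid -> hex}, including orphaned blocks
    """
    if blockhash is not None:
        # An explicit block hash wins: only that block is searched,
        # even if the transaction is also in the mempool or txindex.
        block = blocks.get(blockhash)
        if block is None:
            raise KeyError("Block hash not found")
        return block.get(txid)
    if txid in mempool:
        return mempool[txid]
    if txindex is not None:
        return txindex.get(txid)
    return None


# Toy data: one block assumed to be on the main chain, one orphaned block.
blocks = {"main1": {"t1": "aa"}, "orphan1": {"t2": "bb"}}
without_hash = lookup_tx("t2", {}, None, blocks)
with_hash = lookup_tx("t2", {}, None, blocks, "orphan1")
```

With no txindex, `without_hash` is `None`, while `with_hash` finds the transaction in the orphaned block.
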
src/rpc/rawtransaction.cpp
+ uint256 hash;
+ int64_t blockHeight = -1;
+ bool inMainChain = true;
+ auto p = request.params[0].get_str().find(':');
@laanwj

laanwj Apr 25, 2017 edited

Owner

This functionality could come in useful.

As for the API, I prefer not to do any string combining/parsing here; in my experience it makes the API less clean to work with. I'd prefer to add an optional (can be null or missing) fromblock argument.

@gmaxwell

Concept ACK. I've really wanted this before.

Allowing it to work on orphan blocks is an interesting idea. I'm a little less sure about that-- I think it could allow it to return transactions that have never been validated, which would be somewhat surprising.

Member

jonasschnelli commented Apr 25, 2017

Nice feature!
I agree with @laanwj about the string parsing. The <block>:<txid> scheme looks good at first sight, but we are using JSON and should stay with JSON rather than add another form of key/value encoding.

Contributor

kallewoof commented Apr 26, 2017

Fair enough, I'll add a blockhash argument instead. I was kind of toying with the idea of a new standard for referencing transactions which included the block height (not hash) so everyone could always find a tx presuming they had the block in question, and that thought sort of seeped in here.

Contributor

kallewoof commented Apr 26, 2017 edited

@gmaxwell

Allowing it to work on orphan blocks is an interesting idea. I'm a little less sure about that-- I think it could allow it to return transactions that have never been validated, which would be somewhat surprising.

I just realized my logic was flawed on this. I am passing only the block height to the GetTransaction method, which means it will always pick the active-chain block at the given height. Either I throw when the block is not in the main chain (i.e. no support for orphaned blocks) or I move the height-determination logic over to validation.cpp. I am leaning towards the latter, but feedback is welcome.

Edit: passing CBlockIndex to GetTransaction seems like a great way to do this. Going with that.
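
To illustrate why a height alone cannot identify an orphaned block, here is a small Python model. The names (`BlockIndexEntry`, `block_at_height`, `in_active_chain`) and the toy chain are hypothetical; the real code works with CBlockIndex and the active chain in C++.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockIndexEntry:
    """Hypothetical stand-in for a block index entry."""
    block_hash: str
    height: int


# Toy active chain: the list index is the block height.
active_chain = ["genesis", "a1", "a2"]

# An orphaned block that competes with "a2" at height 2.
orphan = BlockIndexEntry(block_hash="o2", height=2)


def block_at_height(height: int) -> str:
    """Resolving by height can only ever return the active-chain block."""
    return active_chain[height]


def in_active_chain(entry: BlockIndexEntry) -> bool:
    """True if the entry is the block the active chain has at its height."""
    return (0 <= entry.height < len(active_chain)
            and active_chain[entry.height] == entry.block_hash)


resolved_by_height = block_at_height(orphan.height)  # "a2", not the orphan
orphan_is_active = in_active_chain(orphan)           # False
```

Passing the index entry itself keeps the identity of the block, and the same comparison gives the value reported for the main-chain flag.
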

Contributor

kallewoof commented Apr 26, 2017 edited

The code now works as advertised (see updated OP).

History:

src/rpc/rawtransaction.cpp
- fVerbose = true;
- }
+ fVerbose = (request.params[1].get_int() != 0);
+ } else if (request.params[1].isBool()) {
@jnewbery

jnewbery May 2, 2017

Member

nit: You can replace this and the next three lines with:

} else {
    fVerbose = (request.params[1].get_bool());
}

since get_bool() does the type testing for you and throws the JSONRPCError if the type isn't a VBOOL.

Up to you whether you think that's clearer.

@kallewoof

kallewoof May 4, 2017 edited

Contributor

Ohh, good point! Thanks.
Edit: I don't want it to throw for null values though.

@kallewoof

kallewoof May 4, 2017

Contributor

So I end up with

        if (request.params[1].isNum()) {
            fVerbose = (request.params[1].get_int() != 0);
        } else if (!request.params[1].isNull()) {
            fVerbose = (request.params[1].get_bool());
        }
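
For readers less familiar with UniValue, the accepted inputs can be modelled in Python roughly as below. This is only a sketch: `parse_verbose` is a hypothetical name, the default of false is an assumption about the surrounding code, and only integers are treated as numbers.

```python
def parse_verbose(param=None):
    """Toy model of the fVerbose parsing; None stands for a null or missing value."""
    verbose = False  # assumed default when the parameter is null or missing
    if isinstance(param, int) and not isinstance(param, bool):
        # Numeric branch: any non-zero integer means verbose.
        verbose = (param != 0)
    elif param is not None:
        # Boolean branch: anything that is not a bool is rejected,
        # mirroring get_bool() throwing on a type mismatch.
        if not isinstance(param, bool):
            raise TypeError("JSON value is not a boolean as expected")
        verbose = param
    return verbose


results = [parse_verbose(), parse_verbose(0), parse_verbose(1), parse_verbose(True)]
```

Here `results` is `[False, False, True, True]`; a string such as `"yes"` raises instead of being silently accepted.
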
Member

jnewbery commented May 2, 2017

utACK, but I think this deserves a new functional test case.

Contributor

kallewoof commented May 4, 2017

@jnewbery I agree. Will get to work on that.

[...]:

Contributor

kallewoof commented May 8, 2017 edited

@jnewbery Added some tests to rawtransactions.py for the included blockhash variant (a6b8461).

[...]:

@jnewbery

tested ACK the integration test in a6b8461 with a couple of nits.

test/functional/rawtransactions.py
+ # make a tx by sending then generate 2 blocks; block1 has the tx in it,
+ # presumably
+ tx = self.nodes[2].sendtoaddress(self.nodes[1].getnewaddress(), 1)
+ [ block1, block2 ] = self.nodes[2].generate(2)
@jnewbery

jnewbery May 15, 2017

Member

nit: no need for brackets here. The following should do:

block1, block2 = self.nodes[2].generate(2)

test/functional/rawtransactions.py
+ [ block1, block2 ] = self.nodes[2].generate(2)
+ self.sync_all()
+ # We should be able to get the raw transaction by providing the correct block
+ assert self.nodes[0].getrawtransaction(tx, True, block1)
@jnewbery

jnewbery May 15, 2017

Member

nit: I think it's better to assert on the actual value here (ie verify that the getrawtransaction returned the correct transaction rather than returned anything). The following should do that:

assert_equal(self.nodes[0].getrawtransaction(tx, True, block1)['txid'], tx)

@kallewoof

kallewoof May 16, 2017

Contributor

@jnewbery Good points, thanks. Addressed!

Owner

sipa commented May 15, 2017

Is this still needed after #8704?

Contributor

kallewoof commented May 16, 2017

@sipa The use cases are quite different I think. #8704 lets you see all transactions in a given block. This lets you grab a transaction directly from a block without indexing.

Owner

sipa commented May 16, 2017

@kallewoof #8704 does not require indexing either

Contributor

kallewoof commented May 16, 2017

Yeah, sorry, I meant that this is a way to get a specific transaction if you know the block hash, whereas #8704 shows you all transactions in the entire block. You get the info, but you have to wade through stuff to get it.

Owner

laanwj commented May 16, 2017

In both cases the whole block has to be loaded from disk, and parsed, and searched linearly. The difference is whether the linear search step happens on the server or client.

I think both #8704 and this can be useful, but have to agree there's only a superficial difference.

Contributor

maaku commented May 16, 2017

The difference between the two is the serialization and transmission and parsing of ~5MB of JSON data vs a few hundred bytes of hex encoded data. That's a 1,000x difference on the client side, and the same absolute improvement on the server -- although as a multiplier it'd be less since as you note the server still has to parse the block from disk. That's a nontrivial performance difference.

(Also, this would have greatly helped me in the past so another +1 from me.)
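
A toy Python comparison of the two approaches discussed above: the linear search is the same either way, but what crosses the wire differs. The block layout, field names and sizes here are made up for illustration and are not the actual RPC output.

```python
import json


def make_block(n_tx):
    """Build a toy block with n_tx transactions of 250 bytes (500 hex chars) each."""
    return {"hash": "blk",
            "tx": [{"txid": f"tx{i:04d}", "hex": "00" * 250} for i in range(n_tx)]}


def search_block(block, txid):
    """Linear search over the block's transactions, as either side has to do."""
    for tx in block["tx"]:
        if tx["txid"] == txid:
            return tx
    return None


block = make_block(2000)

# Variant 1: the server serializes the whole block, the client parses and searches it.
whole_block_payload = json.dumps(block)
client_result = search_block(json.loads(whole_block_payload), "tx1234")

# Variant 2: the server searches and serializes only the matching transaction.
single_tx_payload = json.dumps(search_block(block, "tx1234"))
server_result = json.loads(single_tx_payload)

payload_ratio = len(whole_block_payload) / len(single_tx_payload)
```

Both variants return the same transaction; `payload_ratio` shows how much more data the first variant serializes, transmits and parses for this toy block.
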

Member

jnewbery commented May 16, 2017

Agree with @maaku - simply encoding the data into 5MB of JSON could be time-consuming, during which time the cs_main lock is held. Having a command to return just a single transaction from a block seems very useful.

src/rpc/rawtransaction.cpp
"\nNOTE: By default this function only works for mempool transactions. If the -txindex option is\n"
- "enabled, it also works for blockchain transactions.\n"
+ "enabled, it also works for blockchain transactions. If the block hash is known, it can be provided\n"
+ "for nodes without -txindex.\n"
@luke-jr

luke-jr Jun 3, 2017

Member

Should probably mention that it MUST be in that block in this case...

src/rpc/rawtransaction.cpp
"\nResult (if verbose is not set or set to false):\n"
"\"data\" (string) The serialized, hex-encoded data for 'txid'\n"
"\nResult (if verbose is set to true):\n"
"{\n"
+ " \"inMainChain\": b, (bool) Whether transaction is in the main chain or not. Only visible when specifying block hash\n"
@luke-jr

luke-jr Jun 3, 2017

Member

Whether the block specified is in the main chain or not... This could be false with the tx being still in the main chain!

src/rpc/rawtransaction.cpp
+ if (!blockHash.IsNull()) {
+ BlockMap::iterator it = mapBlockIndex.find(blockHash);
+ if (it == mapBlockIndex.end()) {
+ throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Block hash not found in chain");
@luke-jr

luke-jr Jun 3, 2017

Member

Remove "in chain". Maybe "in database"?

@luke-jr luke-jr added 5 commits to luke-jr/bitcoin that referenced this pull request Jun 3, 2017

e096099 kallewoof + luke-jr: [rpc] Allow getrawtransaction to take optional blockhash to fetch transaction from a block directly. (Github-Pull: #10275, Rebased-From: 4d006e7)
99dd1fc kallewoof + luke-jr: [rpc] Fix fVerbose parsing (remove excess if cases and ensure null is accepted). (Github-Pull: #10275, Rebased-From: 5b389a8)
57cb225 kallewoof + luke-jr: [test] Updated rawtransactions.py to assert for adjusted exception. (Github-Pull: #10275, Rebased-From: 9f980f6)
c188d00 kallewoof + luke-jr: [test] Add tests for getrawtransaction with block hash. (Github-Pull: #10275, Rebased-From: a6b8461)
675c1fe kallewoof + luke-jr: f'test nits (Github-Pull: #10275, Rebased-From: 8f84e8c)

Member

luke-jr commented Jun 3, 2017

Suggested message fix on my gettx-with-blockhash-0.14 branch (cherry-pick).

Owner

sipa commented Jun 4, 2017

Needs rebase.

Contributor

kallewoof commented Jun 5, 2017

@luke-jr Thanks for the review! I cherry-picked your commit.

[...]:

@sipa Rebased.

@TheBlueMatt

You may wish to squash "[rpc] Fix fVerbose parsing (remove excess if cases and ensure null is…" and "[test] Updated rawtransactions.py to assert for adjusted exception.".
Generally we try to make sure that after each individual commit, at least it builds and all tests pass.

src/rpc/rawtransaction.cpp
"\nNOTE: By default this function only works for mempool transactions. If the -txindex option is\n"
- "enabled, it also works for blockchain transactions.\n"
+ "enabled, it also works for blockchain transactions. If the block hash is known, it can be provided\n"
+ "for nodes without -txindex, in which case the transaction will only be found if it is in that\n"
@TheBlueMatt

TheBlueMatt Jun 7, 2017

Contributor

grammar nit: this reads funny to me, and could be a bit more explicit. Maybe:

"If the block which contains the transaction is known, its hash can be provided even for nodes without -txindex."
"Note that if a blockhash is provided, only it will be searched and if the transaction is in mempool, other blocks, or if this node does not have the given block available, the transaction will not be found."

@kallewoof

kallewoof Jun 8, 2017

Contributor

Thanks, that looks better yeah. Adding with minor tweaks.

            "\nNOTE: By default this function only works for mempool transactions. If the -txindex option is\n"
            "enabled, it also works for blockchain transactions. If the block which contains the transaction\n"
            "is known, its hash can be provided even for nodes without -txindex. Note that if a blockhash is\n"
            "provided, only that block will be searched and if the transaction is in the mempool or other\n"
            "blocks, or if this node does not have the given block available, the transaction will not be found.\n"
src/rpc/rawtransaction.cpp
"\nResult (if verbose is not set or set to false):\n"
"\"data\" (string) The serialized, hex-encoded data for 'txid'\n"
"\nResult (if verbose is set to true):\n"
"{\n"
+ " \"inMainChain\": b, (bool) Whether specified block is in the main chain or not\n"
@TheBlueMatt

TheBlueMatt Jun 7, 2017

Contributor

Hmm, maybe say "if blockhash is specified" or otherwise mention this won't appear unless a blockhash is provided. Even better, fill it out if GetTransaction returns the blockhash because it found it via UTXO/txindex.

@kallewoof

kallewoof Jun 8, 2017

Contributor

I was sure I did, but guess not:

            "  \"inMainChain\": b,     (bool) Whether specified block is in the main chain or not (only present with explicit \"blockhash\" argument)\n"

I like the idea of including when able but will keep it out of this PR for now.

src/rpc/rawtransaction.cpp
+ }
+
+ if (request.params.size() > 2 && !request.params[2].isNull()) {
+ uint256 blockHash = ParseHashV(request.params[2], "parameter 3");
@TheBlueMatt

TheBlueMatt Jun 7, 2017

Contributor

Hmm? Shouldn't we use the parameter's name here instead of "parameter 3"?

@kallewoof

kallewoof Jun 8, 2017

Contributor

The general tendency seems to be to identify the parameter index so I stuck with that. I agree it may be better to be more descriptive though...

src/rpc/rawtransaction.cpp
- if (!GetTransaction(hash, tx, Params().GetConsensus(), hashBlock, true))
- throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, std::string(fTxIndex ? "No such mempool or blockchain transaction"
+ if (!GetTransaction(hash, tx, Params().GetConsensus(), hashBlock, true, blockIndex))
+ throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, std::string(fTxIndex || blockIndex ? "No such mempool or blockchain transaction"
@TheBlueMatt

TheBlueMatt Jun 7, 2017

Contributor

Maybe further update this error message, eg if (blockIndex) "No such transaction found in the provided block".

@kallewoof

kallewoof Jun 8, 2017 edited

Contributor

Yeah, I wanted to avoid ?:?:. Rewritten as:

    if (!GetTransaction(hash, tx, Params().GetConsensus(), hashBlock, true, blockIndex)) {
        std::string errmsg;
        if (blockIndex) {
            errmsg = "No such transaction found in the provided block";
        } else {
            errmsg = fTxIndex
              ? "No such mempool or blockchain transaction"
              : "No such mempool transaction. Use -txindex to enable blockchain transaction queries";
        }
        throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, errmsg + ". Use gettransaction for wallet transactions.");
    }
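
The message selection above, restated as a small Python function to make the three cases explicit. `not_found_message` and its parameters are hypothetical names; the strings are the ones from the C++ snippet.

```python
def not_found_message(block_given: bool, txindex_enabled: bool) -> str:
    """Pick the error text for a failed lookup, mirroring the branches above."""
    if block_given:
        msg = "No such transaction found in the provided block"
    elif txindex_enabled:
        msg = "No such mempool or blockchain transaction"
    else:
        msg = ("No such mempool transaction. "
               "Use -txindex to enable blockchain transaction queries")
    # Every variant ends with the same hint about wallet transactions.
    return msg + ". Use gettransaction for wallet transactions."


with_block = not_found_message(block_given=True, txindex_enabled=False)
```

Note that the block-specific message takes precedence regardless of whether txindex is enabled.
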
Contributor

kallewoof commented Jun 8, 2017 edited

@TheBlueMatt Thanks for the review!

Generally we try to make sure that after each individual commit, at least it builds and all tests pass.

We try to keep tests as separate commits though, so that would assume tests and code changes come in pairs (tests will fail before test commit or after test commit and before change commit, obv). That was my intention with the split here. I may have screwed up. I'll double check and/or squash as appropriate.

Edit: I noticed the order was off (two fixes then two tests). Rearranged them. The commit/test pairs now pass make check individually (i.e. a43ec61 and 8f2ce52).

Edit 2: [...]:

@luke-jr luke-jr added 4 commits to bitcoinknots/bitcoin that referenced this pull request Jun 15, 2017

5dfa87a kallewoof + luke-jr: [rpc] Fix fVerbose parsing (remove excess if cases and ensure null is accepted). (Github-Pull: #10275, Rebased-From: 4c41073)
f080d5f kallewoof + luke-jr: [test] Updated rawtransactions.py to assert for adjusted exception. (Github-Pull: #10275, Rebased-From: a43ec61)
633d334 kallewoof + luke-jr: [rpc] Allow getrawtransaction to take optional blockhash to fetch transaction from a block directly. (Github-Pull: #10275, Rebased-From: 8485239)
c35815a kallewoof + luke-jr: [test] Add tests for getrawtransaction with block hash. (Github-Pull: #10275, Rebased-From: 8f2ce52)

Contributor

TheBlueMatt commented Jun 22, 2017

utACK 8f2ce52

Member

jonasschnelli commented Jul 14, 2017

Needs rebase.

Contributor

kallewoof commented Jul 14, 2017

Rebased.

Member

jonasschnelli commented Jul 14, 2017

a) Is there a reason why main-chain height is not supported as an alternative to the blockhash? Possibly with a safeguard of only accepting heights at least a hundred blocks below the tip as reorganisation protection (but I'd prefer not to add this protection).

b) @kallewoof the idea about the standard for a transaction reference has already been worked into a BIP: bitcoin/bips#555 (maybe we can support this – if we agree on that BIP to be worth implementing – also via getrawtransaction).

Contributor

kallewoof commented Jul 14, 2017 edited

@jonasschnelli Regarding height, I chose not to include it as it could potentially cause unexpected results when a reorg happens, but if people don't think that's an issue it should be fairly straightforward to allow for both.

Edit: as for the standard, that looks exciting for sure. If it matures enough and this PR isn't merged already I may take a stab at it.

src/rpc/rawtransaction.cpp
"\nResult (if verbose is not set or set to false):\n"
"\"data\" (string) The serialized, hex-encoded data for 'txid'\n"
"\nResult (if verbose is set to true):\n"
"{\n"
+ " \"inMainChain\": b, (bool) Whether specified block is in the main chain or not (only present with explicit \"blockhash\" argument)\n"
@jtimon

jtimon Jul 17, 2017 edited

Member

This may be confused with Params().NetworkIDString() == "main" (kind of like we do with "testnet" in getinfo).
Can we rename to "inActiveChain" or something of the sort?

- txOut = ptx;
- return true;
- }
+ if (!blockIndex) {
@jtimon

jtimon Jul 17, 2017

Member

Perhaps the diff can be less disruptive by moving everything inside if (!blockIndex) {...} to a new static function defined right above instead of indenting all of it?

@kallewoof

kallewoof Jul 18, 2017

Contributor

I'm not a huge fan of changing code just to make diffs smaller, and as I mention above, you can put ?w=1 to get diff without indentation changes.

Member

jtimon commented Jul 17, 2017

utACK 3ec2d28 besides small nits.

kallewoof added some commits May 6, 2017

d2c2f8a [rpc] Fix fVerbose parsing (remove excess if cases and ensure null is accepted).
7ba57b3 [rpc] Allow getrawtransaction to take optional blockhash to fetch transaction from a block directly.
4d8bf80 [test] Updated rawtransactions.py to assert for adjusted exception.
950ae53 [test] Add tests for getrawtransaction with block hash.
45d371a f'inMainChain -> inActiveChain
1152f81 f'Added support for block height instead of block hash..
6325671 f'Added tests for block height.

Contributor

kallewoof commented Jul 18, 2017

Addressed @jtimon nits.
@jonasschnelli I added support for supplying block height instead of block hash.

Member

jtimon commented Jul 18, 2017

I'm not sure about allowing the height.
One can always call getblockhash if they know the txid and block height but somehow not the block hash.
That's not the case for the BIP proposal @jonasschnelli refers to, since they don't know the txid anyway (only block height and tx position). I suggest we leave the BIP discussion out of this PR, which in my opinion makes sense on its own. And also leave the height option for later (or never).
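
The two-step alternative mentioned above (resolve the height with getblockhash, then pass the hash to getrawtransaction) might look like this from a client. `FakeNode` is a hypothetical in-memory stand-in for an RPC connection, and `get_tx_by_height` is an illustrative helper; the optional third argument is the blockhash parameter proposed in this PR.

```python
class FakeNode:
    """In-memory stand-in for an RPC connection (not a real client library)."""

    def __init__(self, chain, blocks):
        self.chain = chain    # list of block hashes; index = height
        self.blocks = blocks  # block hash -> {txid -> raw hex}

    def getblockhash(self, height):
        return self.chain[height]

    def getrawtransaction(self, txid, verbose=False, blockhash=None):
        # Toy behaviour: only the block-hash lookup path is modelled here.
        tx_hex = self.blocks[blockhash][txid]
        return {"txid": txid, "hex": tx_hex} if verbose else tx_hex


def get_tx_by_height(node, txid, height):
    """Resolve the height to a hash first, then fetch from that block."""
    blockhash = node.getblockhash(height)
    return node.getrawtransaction(txid, True, blockhash)


node = FakeNode(chain=["h0", "h1"], blocks={"h0": {}, "h1": {"t1": "aa"}})
tx = get_tx_by_height(node, "t1", 1)
```

The caller pins the lookup to a specific hash, so a reorg between the two calls changes which hash is returned rather than silently changing which block is searched.
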

Contributor

kallewoof commented Jul 19, 2017

@jtimon It feels like a convenience thing that, IMO, getblock should have as well (i.e. allow both height number and block hash). I don't have a strong opinion on the subject though, and will drop the last 2 commits (height support) whenever this is ready for merging unless someone argues for block height support.

Member

jtimon commented Jul 20, 2017

Well, I don't have a strong opinion on adding the block height either; if more people like it, let's keep it.
Or perhaps it can be proposed after this for both getrawtransaction and getblock at the same time?
