New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix displaying of invalid and non-minimal small pushes as numbers #8598

Closed
wants to merge 4 commits into
base: master
from

Conversation

Projects
None yet
4 participants
@fivepiece
Contributor

fivepiece commented Aug 26, 2016

Currently, commands like decodescript that return an assembly representation of the script will show seemingly valid integers for a number of incorrectly pushed int values of sizes 0 to 4. The purpose of this change is to allow ScriptToAsmStr to return a more accurate representation of the assembly.

In addition, ScriptToAsmStr will not fail on what are considered invalid pushes altogether. This change will make ScriptToAsmStr return an '[error]' string, much like other invalid pushes.

I believe this makes more sense when displaying decoded script.
As opposed to showing the valid representation of an int for an incorrectly pushed value, the value is returned either as a number if both the push is minimal and the encoding is correct, or as hex if the push is not minimal. Invalid pushes are shown as '[error]' and end the decoding.

Examples:

Here the first two pushes are valid numbers. The rest are shown as hex data.

$ for i in 52 0152 025200 03520000 0452000000 055200000000; do echo "hex script: ${i}"; bitcoin-cli decodescript ${i} | grep asm; done

hex script: 52
  "asm": "2",
hex script: 0152
  "asm": "82",
hex script: 025200
  "asm": "5200",
hex script: 03520000
  "asm": "520000",
hex script: 0452000000
  "asm": "52000000",
hex script: 055200000000
  "asm": "5200000000",

Pushing -1,[1,16] incorrectly, after and before the patch.

$ for i in 81 {01..09} 0A 0B 0C 0D 0E 0F 10; do echo "hex script: 01${i}"; bitcoin-cli decodescript 01${i} | grep asm; done

After:                             Before:

hex script: 0181                   hex script: 0181
  "asm": "[error]",                     "asm": "-1",
hex script: 0101                   hex script: 0101
  "asm": "[error]",                     "asm": "1",
hex script: 0102                   hex script: 0102
  "asm": "[error]",                     "asm": "2",
hex script: 0103                   hex script: 0103
  "asm": "[error]",                     "asm": "3",
hex script: 0104                   hex script: 0104
  "asm": "[error]",                     "asm": "4",
hex script: 0105                   hex script: 0105
  "asm": "[error]",                     "asm": "5",
hex script: 0106                   hex script: 0106
  "asm": "[error]",                     "asm": "6",
hex script: 0107                   hex script: 0107
  "asm": "[error]",                     "asm": "7",
hex script: 0108                   hex script: 0108
  "asm": "[error]",                     "asm": "8",
hex script: 0109                   hex script: 0109
  "asm": "[error]",                     "asm": "9",
hex script: 010A                   hex script: 010A
  "asm": "[error]",                     "asm": "10",
hex script: 010B                   hex script: 010B
  "asm": "[error]",                     "asm": "11",
hex script: 010C                   hex script: 010C
  "asm": "[error]",                     "asm": "12",
hex script: 010D                   hex script: 010D
  "asm": "[error]",                     "asm": "13",
hex script: 010E                   hex script: 010E
  "asm": "[error]",                     "asm": "14",
hex script: 010F                   hex script: 010F
  "asm": "[error]",                     "asm": "15",
hex script: 0110                   hex script: 0110
  "asm": "[error]",                     "asm": "16",

Actual example of how '0' and '0x00' are different. Both pushes are currently shown as '0'. After the patch, the first will be shown as '0' and the second as '00', making the distinction more obvious.

$ bitcoin-cli decodescript 00A820E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B85587 | grep asm

  "asm": "0 OP_SHA256 e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 OP_EQUAL",

$ bitcoin-cli signrawtransaction \
0100000001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa00000000252400a820e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b85587ffffffff0100000000000000001976a914bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb88ac00000000 \
'[{"txid":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA","vout":0,"scriptPubKey":"A914D227B69A4FE6FED60B20CFC31AED65027CF3DAC287","redeemScript":"00A820E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B85587","amount":0}]'

...
  "complete": true

vs

bitcoin-cli decodescript 0100A8206E340B9CFFB37A989CA544E6BB780A2C78901D3FB33738768511A30617AFA01D87 | grep asm

  "asm": "0 OP_SHA256 6e340b9cffb37a989ca544e6bb780a2c78901d3fb33738768511a30617afa01d OP_EQUAL",

bitcoin-cli signrawtransaction \
0100000001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa0000000026250100a8206e340b9cffb37a989ca544e6bb780a2c78901d3fb33738768511a30617afa01d87ffffffff0100000000000000001976a914bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb88ac00000000 \
'[{"txid":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA","vout":0,"scriptPubKey":"A914CF527A41C9908B23B16F854A15F2B4ECB18D8A7187","redeemScript":"0100A8206E340B9CFFB37A989CA544E6BB780A2C78901D3FB33738768511A30617AFA01D87","amount":0}]'

...
  "complete": true
Fix displaying of invalid and non-minimal small pushes as numbers
Currently, commands like decodescript that return an assembly representation
of the script will show seemingly valid integers for a number of incorrectly
pushed int values of sizes 0 to 4. The purpose of this change is to allow
ScriptToAsmStr to return a more accurate representation of the assembly.

In addition, ScriptToAsmStr will not fail on what are considered invalid
pushes altogether. This change will make ScriptToAsmStr return an '[error]'
string, much like other invalid pushes.
@jonasschnelli

This comment has been minimized.

Show comment
Hide comment
@jonasschnelli

jonasschnelli Aug 26, 2016

Member

Would you mind adding some unit tests for this? There are already some ScriptToAsmStr() in script_tests.cpp?

Member

jonasschnelli commented Aug 26, 2016

Would you mind adding some unit tests for this? There are already some ScriptToAsmStr() in script_tests.cpp?

@fivepiece

This comment has been minimized.

Show comment
Hide comment
@fivepiece

fivepiece Aug 26, 2016

Contributor

Sure, I'll give it a go.

Contributor

fivepiece commented Aug 26, 2016

Sure, I'll give it a go.

@luke-jr

This comment has been minimized.

Show comment
Hide comment
@luke-jr

luke-jr Aug 27, 2016

Member
hex script: 0181                   hex script: 0181
  "asm": "[error]",                     "asm": "-1",
hex script: 0101                   hex script: 0101
  "asm": "[error]",                     "asm": "1",

Am I missing something? Why are these errors? Shouldn't they be valid non-minimally-encoded numbers to display as hex?

Also, it seems this creates ambiguity between number and hex literals in asm? "10" asm could be either "5a" or "0110" now, right?

Member

luke-jr commented Aug 27, 2016

hex script: 0181                   hex script: 0181
  "asm": "[error]",                     "asm": "-1",
hex script: 0101                   hex script: 0101
  "asm": "[error]",                     "asm": "1",

Am I missing something? Why are these errors? Shouldn't they be valid non-minimally-encoded numbers to display as hex?

Also, it seems this creates ambiguity between number and hex literals in asm? "10" asm could be either "5a" or "0110" now, right?

@fivepiece

This comment has been minimized.

Show comment
Hide comment
@fivepiece

fivepiece Aug 27, 2016

Contributor

Thanks for the feedback!

As I understand it, "wasteful" data pushes are themselves invalid. It's fine to push a non-minimally encoded number and then not do math operations on it, but pushing a value with an a "larger" opcode would fail. So in the case of 0x81 and [0x01,0x10], not using 1NEGATE or OP_1-OP_16 would cause the script to fail :

$ bitcoin-cli decodescript 51A8204BF5122F344554C53BDE2EBB8CD2B7E3D1600AD631C385A5D7CCE23C7785459A87 | grep asm

  "asm": "1 OP_SHA256 4bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a OP_EQUAL",

$ bitcoin-cli signrawtransaction \
0100000001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa00000000252451a8204bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a87ffffffff0100000000000000001976a914bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb88ac00000000 \
'[{"txid":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA","vout":0,"scriptPubKey":"A91405FCBC74F0D5EAE9229CDED69AF137CAD27834F587","redeemScript":"51A8204BF5122F344554C53BDE2EBB8CD2B7E3D1600AD631C385A5D7CCE23C7785459A87","amount":0}]'

...
  "complete": true

vs

$ bitcoin-cli decodescript 0101A8204BF5122F344554C53BDE2EBB8CD2B7E3D1600AD631C385A5D7CCE23C7785459A87 | grep asm

  "asm": "1 OP_SHA256 4bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a OP_EQUAL",

$ bitcoin-cli signrawtransaction \
0100000001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa0000000026250101a8204bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a87ffffffff0100000000000000001976a914bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb88ac00000000 \
'[{"txid":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA","vout":0,"scriptPubKey":"A914120FD81D49579E40E968EEDD5215E92644BBD48A87","redeemScript":"0101A8204BF5122F344554C53BDE2EBB8CD2B7E3D1600AD631C385A5D7CCE23C7785459A87","amount":0}]'

{
  "hex": "0100000001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa0000000026250101a8204bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a87ffffffff0100000000000000001976a914bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb88ac00000000",
  "complete": false,
  "errors": [
    {
      "txid": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "vout": 0,
      "scriptSig": "250101a8204bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a87",
      "sequence": 4294967295,
      "error": "Data push larger than necessary"
    }
  ]
}

As for the ambiguity, specifically wrt 10 in asm, after this change, the only case where you'd see a 10 is if you executed 5a. Currently you will see 10 either by using 5a or 010a (though this is not a valid push). To actually push the byte 0x10 you would have to use OP_16, 60.

I do agree that numbers in base 10 and short hex are ambigous in asm (like the example in OP with 52), but that is already the case. This change only helps in the sense that values that would not pass as numbers, are not displayed as such.

You have given me more food for thought, what about larger invalid push operations? Those still do not appear as errors after this change.
For example this script would also fail;
4C0175A8200BFE935E70C321C7CA3AFC75CE0D0CA2F98B5422E008BB31C00C6D7F1F1C0AD687

If I'm wrong and these "larger than necessary" are not invalid, then you are correct that the '[error]'s should be removed and the original functionality restored.

Contributor

fivepiece commented Aug 27, 2016

Thanks for the feedback!

As I understand it, "wasteful" data pushes are themselves invalid. It's fine to push a non-minimally encoded number and then not do math operations on it, but pushing a value with an a "larger" opcode would fail. So in the case of 0x81 and [0x01,0x10], not using 1NEGATE or OP_1-OP_16 would cause the script to fail :

$ bitcoin-cli decodescript 51A8204BF5122F344554C53BDE2EBB8CD2B7E3D1600AD631C385A5D7CCE23C7785459A87 | grep asm

  "asm": "1 OP_SHA256 4bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a OP_EQUAL",

$ bitcoin-cli signrawtransaction \
0100000001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa00000000252451a8204bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a87ffffffff0100000000000000001976a914bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb88ac00000000 \
'[{"txid":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA","vout":0,"scriptPubKey":"A91405FCBC74F0D5EAE9229CDED69AF137CAD27834F587","redeemScript":"51A8204BF5122F344554C53BDE2EBB8CD2B7E3D1600AD631C385A5D7CCE23C7785459A87","amount":0}]'

...
  "complete": true

vs

$ bitcoin-cli decodescript 0101A8204BF5122F344554C53BDE2EBB8CD2B7E3D1600AD631C385A5D7CCE23C7785459A87 | grep asm

  "asm": "1 OP_SHA256 4bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a OP_EQUAL",

$ bitcoin-cli signrawtransaction \
0100000001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa0000000026250101a8204bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a87ffffffff0100000000000000001976a914bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb88ac00000000 \
'[{"txid":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA","vout":0,"scriptPubKey":"A914120FD81D49579E40E968EEDD5215E92644BBD48A87","redeemScript":"0101A8204BF5122F344554C53BDE2EBB8CD2B7E3D1600AD631C385A5D7CCE23C7785459A87","amount":0}]'

{
  "hex": "0100000001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa0000000026250101a8204bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a87ffffffff0100000000000000001976a914bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb88ac00000000",
  "complete": false,
  "errors": [
    {
      "txid": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "vout": 0,
      "scriptSig": "250101a8204bf5122f344554c53bde2ebb8cd2b7e3d1600ad631c385a5d7cce23c7785459a87",
      "sequence": 4294967295,
      "error": "Data push larger than necessary"
    }
  ]
}

As for the ambiguity, specifically wrt 10 in asm, after this change, the only case where you'd see a 10 is if you executed 5a. Currently you will see 10 either by using 5a or 010a (though this is not a valid push). To actually push the byte 0x10 you would have to use OP_16, 60.

I do agree that numbers in base 10 and short hex are ambigous in asm (like the example in OP with 52), but that is already the case. This change only helps in the sense that values that would not pass as numbers, are not displayed as such.

You have given me more food for thought, what about larger invalid push operations? Those still do not appear as errors after this change.
For example this script would also fail;
4C0175A8200BFE935E70C321C7CA3AFC75CE0D0CA2F98B5422E008BB31C00C6D7F1F1C0AD687

If I'm wrong and these "larger than necessary" are not invalid, then you are correct that the '[error]'s should be removed and the original functionality restored.

@fivepiece

This comment has been minimized.

Show comment
Hide comment
@fivepiece

fivepiece Aug 28, 2016

Contributor

I've added a check that happens if the push opcode is PUSHDATA1 - PUSHDATA4.
A temporary script is created that only pushes the data, then we compare our own push op with the one the script has.
I'm guessing this does cost some to execute, but I think it's better to eventually return an "[error]" instead of some wasteful push that will produce errors on actual script evaluation.

Thoughts?

Contributor

fivepiece commented Aug 28, 2016

I've added a check that happens if the push opcode is PUSHDATA1 - PUSHDATA4.
A temporary script is created that only pushes the data, then we compare our own push op with the one the script has.
I'm guessing this does cost some to execute, but I think it's better to eventually return an "[error]" instead of some wasteful push that will produce errors on actual script evaluation.

Thoughts?

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa Sep 4, 2016

Member

MINIMALDATA is not a consensus rule, and transactions that use non-minimal pushes or non-minimally encoded numbers are perfectly valid in blocks. I don't think we should fail to decode them. I don't really have good advice here, as I'm fine with somehow marking these as different, but they shouldn't fail to decode.

Member

sipa commented Sep 4, 2016

MINIMALDATA is not a consensus rule, and transactions that use non-minimal pushes or non-minimally encoded numbers are perfectly valid in blocks. I don't think we should fail to decode them. I don't really have good advice here, as I'm fine with somehow marking these as different, but they shouldn't fail to decode.

@fivepiece

This comment has been minimized.

Show comment
Hide comment
@fivepiece

fivepiece Sep 4, 2016

Contributor

Seems like I was fixing a problem that did not exist in the first place. I will be closing this PR for the sake of resubmitting another one which should reflect non-standard use of Script in a warning fasion rather than error and terminate.

Both non-minimally encoded numbers /and/ wasteful pushes are valid in blocks. Thanks @sipa for clarifying.

Contributor

fivepiece commented Sep 4, 2016

Seems like I was fixing a problem that did not exist in the first place. I will be closing this PR for the sake of resubmitting another one which should reflect non-standard use of Script in a warning fasion rather than error and terminate.

Both non-minimally encoded numbers /and/ wasteful pushes are valid in blocks. Thanks @sipa for clarifying.

@fivepiece fivepiece closed this Sep 4, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment