Skip to content

Latest commit

 

History

History
628 lines (454 loc) · 29.7 KB

ch06.asciidoc

File metadata and controls

628 lines (454 loc) · 29.7 KB

Script

The ability to lock and unlock coins is the mechanism by which we transfer Bitcoin. Locking is giving some Bitcoins to some entity. Unlocking is spending some Bitcoins that you have received.

In this chapter we examine this locking/unlocking mechanism, which is often called a smart contract. Elliptic Curve Cryptography ([chapter_elliptic_curve_cryptography]) is used to Script to validate that the transaction was properly authorized ([chapter_tx_parsing]). Script essentially allows people to be able to prove that they have the right to spend certain UTXOs. We’re getting a little ahead of ourselves, though, so let’s start with how Script works and go from there.

Mechanics of Script

If you are confused about what a "smart contract" is, don’t worry. "Smart contract" is a fancy way of saying "programmable" and the "smart contract language" is simply a programming language. In Bitcoin, Script is the smart contract language, or the programming language used to express the conditions under which bitcoins are spendable.

Bitcoin has the digital equivalent of a contract in Script. Script is a stack-based language similar to Forth. Script is intentionally limited in the sense that Script avoids certain features. Specifically, Script avoids any mechanism for loops and is therefore not Turing complete.

Note
Why Bitcoin isn’t Turing Complete

Turing completeness in a programming language essentially means that the program has the ability to loop. Loops are a useful construct in programming, so you may be wondering at this point why Script doesn’t have the ability to loop.

There are a lot of reasons for this, but let’s start with program execution. Anyone can create a Script program that every full node on the network executes. If Script were Turing Complete, it would be possible for the loop to go on executing forever. This would validating node to enter and never leave that loop. This would be an easy way to attack the network through what would be called a Denial of Service attack (DoS). A single Script that has an infinite loop could take down Bitcoin! This would be a large systematic vulnerability and protecting against this vulnerability is one of the major reasons why Turing-completeness is avoided. Ethereum, which has Turing Completeness in its smart contract language, Solidity, handles this problem by forcing contracts to pay for program execution with something called "gas". An infinite loop will exhaust whatever gas is in the contract as by definition, it will run an infinite number of times.

There are other reasons to avoid Turing Completeness and that’s because smart contracts with Turing Completeness are very difficult to analyze. A Turing Complete smart contract’s execution conditions are very difficult to enumerate and thus it’s easy to create unintended behavior, causing bugs. Bugs in a smart contract mean that it’s vulnerable to being unintentionally spent, which means the contract participants could lose money. Such bugs are not just theoretical: this was the major problem in the DAO (Decentralized Autonomous Organization), a Turing Complete smart contract which ended with the Ethereum Classic hard fork.

Transactions assign Bitcoins to a locking Script. The locking Script is what’s specified in ScriptPubKey (see [chapter_tx_parsing]). Think of this as the lockbox where some money is deposited which only a particular key can open. The money inside, of course, can only be accessed by the owner who has the key.

The unlocking of the lockbox is done in the ScriptSig field (see [chapter_tx_parsing]) and proves ownership of the locked box which authorizes spending of the funds.

How Script works

Script is a programming language and like most programming languages, processes one instruction at a time. The instructions operate on a stack of elements. There are two possible types of instructions: elements and operations.

Elements are data. Technically, they’re a push operation onto the stack of that element. They are byte strings of length 1 to 520. A typical element might be a DER signature or a SEC pubkey (Figure 6-1).

Script Elements
Figure 1. Elements

Operations do something to the data (Figure 6-2). They consume zero or more elements from the processing stack and push zero or more elements back on the stack.

Script Operations
Figure 2. Operations

A typical operation is OP_DUP (Figure 6-3), which will duplicate the top element (consuming 0) and pushing the new element to the stack (pushing 1).

OP_DUP
Figure 3. OP_DUP duplicates the top element

At the end of evaluating all the instructions, the top element of the stack must be non-zero for the Script to resolve as valid. Having no elements in the stack or the top element being zero would resolve as invalid. Resolving as invalid means that the transaction which includes the unlocking Script is not accepted on the network.

Example Operations

There are many other operations besides OP_DUP. OP_HASH160 (Figure 6-4) does a sha256 followed by a ripemd160 (aka hash160) to the top element of the stack (consuming 1) and pushing a new element to the stack (pushing 1). Note in the diagram that y = hash160(x).

OP_HASH160
Figure 4. OP_HASH160 does a sha256 followed by ripemd160 to the top element

Another very important operation is OP_CHECKSIG (Figure 6-5). OP_CHECKSIG consumes 2 elements from the stack, the first being the pubkey, the second being a signature, and examines if the signature is good for the given pubkey. If so, OP_CHECKSIG pushes a 1 to the stack, otherwise pushes a 0 to the stack.

OP_CHECKSIG
Figure 5. OP_CHECKSIG checks if the signature for the pubkey is valid or not

Coding opcodes

We can now code OP_DUP, given a stack. OP_DUP simply duplicates the top element of the stack.

link:code-ch06/op.py[role=include]
...
OP_CODE_FUNCTIONS = {
...
    118: op_dup,  # (3)
...
}
  1. We have to have at least one element to duplicate, otherwise, we can’t execute this opcode.

  2. This is how we duplicate the top element of the stack.

  3. 118 = 0x76 which is the code for OP_DUP

Note that we return a boolean with this opcode, as a way to tell whether the operation was successful. A failed operation automatically fails Script evaluation.

Here’s another one for OP_HASH256. This opcode will consume the top element, perform a hash256 operation on it and push the result on the stack.

link:code-ch06/op.py[role=include]
...
OP_CODE_FUNCTIONS = {
...
    170: op_hash256,
...
}

Parsing the Script fields

Both ScriptPubKey and ScriptSig are parsed the same way. If the byte is between 0x01 and 0x4b (which we call n), we read the next n bytes as an element. Otherwise, the byte represents an operation, which we have to look up. Here are some operations and their byte codes:

  • 0x00 - OP_0

  • 0x51 - OP_1

  • 0x60 - OP_16

  • 0x75 - OP_DUP

  • 0x93 - OP_ADD

  • 0xa9 - OP_HASH160

  • 0xac - OP_CHECKSIG

Note
Longer than 75-byte elements

You might be wondering what would happen if you had an element that’s greater than 0x4b (75 in decimal). There are 3 specific opcodes for handling elements with length greater than 75, namely, OP_PUSHDATA1, OP_PUSHDATA2 and OP_PUSHDATA4. OP_PUSHDATA1 means that the next byte contains how many bytes we need to read for the element. OP_PUSHDATA2 means that the next 2 bytes contain how many bytes we need to read for the element. OP_PUSHDATA4 means that the next 4 bytes contain how many bytes we need to read for the element.

Practically speaking, this means if we have an element that’s between 76 and 255 bytes inclusive, we use OP_PUSHDATA1 <1-byte length of the element> <element>. For anything between 256 bytes and 520 bytes inclusive, we use OP_PUSHDATA2 <2-byte length of the element in Little-Endian> <element>. Anything larger than 520 bytes is not allowed on the network, so OP_PUSHDATA4 is unnecessary, though OP_PUSHDATA4 <4-byte length of the element in Little-Endian, but value less than or equal to 520> <element> is still legal.

It is possible to encode a number below 76 using OP_PUSHDATA1 or a number below 256 using OP_PUSHDATA2 or even any number 520 and below using OP_PUSHDATA4. These are considered non-standard transactions, meaning most bitcoin nodes (particularly those running Bitcoin Core software) will not relay them.

There are many more opcodes, which are coded in op.py and the full list can be found at http://wiki.bitcoin.it.

Coding a Script parser and serializer

Now that we know how Script works, we can write a Script parser.

link:code-ch06/script.py[role=include]
...
link:code-ch06/script.py[role=include]
  1. Each instruction is either an opcode to be executed or an element to be pushed onto the stack.

  2. Script serialization always starts with the length of the entire Script.

  3. We parse until the right amount of bytes are consumed.

  4. The byte determines if we have an opcode or element.

  5. This converts the byte into an integer in Python.

  6. For a number between 1 to 75, we know the next n bytes are an element

  7. 76 is OP_PUSHDATA1, so the next byte tells us how many bytes to read

  8. 77 is OP_PUSHDATA2, so the next two bytes tell us how many bytes to read

  9. We have an opcode that we store.

  10. Script should have consumed exactly the length of bytes we expected, otherwise we raise an error.

We can similarly write a Script serializer.

class Script:
...
link:code-ch06/script.py[role=include]
  1. If the instruction is an integer, we know that’s an opcode.

  2. If the byte is between 1 and 75 inclusive, we encode the length as a single byte

  3. For any element with length from 76 to 255, we put OP_PUSHDATA1 first, then encode the length as a single byte and finally, the element.

  4. For anything from 256 to 520, we put OP_PUSHDATA2 first, then encode the length as two bytes in little Endian and finally, the element

  5. Any element longer than 520 bytes cannot be serialized.

  6. Script serialization starts with the length of the entire Script.

Note that both the parser and serializer were used in [chapter_tx_parsing] parsing/serializing the ScriptSig and ScriptPubKey fields.

Combining the Script fields

The Script object represents the instruction set that requires evaluation. To evaluate Script, we need to combine the ScriptPubKey and ScriptSig fields. The lockbox (ScriptPubKey) and the unlocking (ScriptSig) are in different transactions. Specifically, the lockbox is where the bitcoins are received, the unlocking is where the bitcoins are spent. The input in the spending transaction points to the receiving transaction. Essentially, we have a situation like Figure 6-6:

ScriptPubKey and ScriptSig
Figure 6. ScriptPubKey and ScriptSig

Since ScriptSig unlocks ScriptPubKey, we need a mechanism by which the two Scripts combine. To evaluate the two together, we take the instructions from ScriptSig and ScriptPubKey and combine them as above. The instructions from the ScriptSig go on top of all the instructions from ScriptPubKey. Instructions are processed one at a time until no instructions are left to be processed or if the Script fails early.

Coding the combined instruction set

The evaluation of Script requires that we take the ScriptSig and ScriptPubKey, combine them into a single instruction set and execute the instructions. In order to do this, we require a way to combine the Scripts.

class Script:
...
link:code-ch06/script.py[role=include]
  1. We are combining the instruction set to create a new, combined Script object.

We will use this ability to combine Scripts for evaluation later in this chapter.

Standard Scripts

There are many types of standard Scripts in Bitcoin including the following:

  • p2pk - Pay-to-pubkey

  • p2pkh - Pay-to-pubkey-hash

  • p2sh - Pay-to-Script-hash

  • p2wpkh - Pay-to-witness-pubkey-hash

  • p2wsh - Pay-to-witness-Script-hash

Addresses are data that fit into known Script templates like these. Wallets know how to interpret various address types (p2pkh, p2sh, p2wpkh) and create the appropriate ScriptPubKey. All of the above have a particular type of address format (base58, bech32) so wallets can pay to them.

To show exactly how all this works, we’ll start with one of the original Scripts, pay-to-pubkey.

p2pk

Pay-to-pubkey (p2pk) was used largely during the early days of bitcoin. Most coins thought to belong to Satoshi are in p2pk UTXOs. That is, transaction outputs whose ScriptPubKeys have the p2pk form. There are some limitations that we’ll discuss later, but first, let’s look at how p2pk works.

Back in [chapter_elliptic_curve_cryptography], we learned both ECDSA signing and verification. In order to verify an ECDSA signature, we need the message, z, the public key, P and the signature, r and s. In p2pk, Bitcoins are sent to a public key and the owner of the private key can unlock, or spend the Bitcoins by creating a signature. The ScriptPubKey of a Transaction puts the assigned Bitcoins under the control of the private key owner.

Specifying where the Bitcoins go is the job of the ScriptPubKey. ScriptPubKey is the lockbox that receives the bitcoins. The p2pk ScriptPubKey looks like Figure 6-7:

P2PK ScriptPubKey
Figure 7. Pay-to-pubkey (p2pk) ScriptPubKey

Note the OP_CHECKSIG, as that will be very important. The ScriptSig is the part that unlocks the received bitcoins. The pubkey can be compressed or uncompressed, though early on in Bitcoin’s history when p2pk was more prominent, uncompressed was the only one being used (see [chapter_serialization]).

For p2pk, the ScriptSig required to unlock the corresponding ScriptPubKey is just the signature as shown in Figure 6-8:

P2PK ScriptSig
Figure 8. Pay-to-pubkey (p2pk) ScriptSig

The ScriptPubKey and ScriptSig combine to make an instruction set that looks like Figure 6-9:

P2PK Combination
Figure 9. p2pk Combined

The two columns below are instructions of Script and the elements stack. At the end of this processing, the top element of the stack must be non-zero to be considered a valid ScriptSig. The Script instructions are processed one instruction at a time. We start with the instructions as combined above (Figure 6-10):

P2PK Start
Figure 10. p2pk Start

The first instruction is the signature, which is an element. This is data that is pushed to the stack (Figure 6-11).

P2PK Step 1
Figure 11. p2pk Step 1

The second instruction is the pubkey, which is also an element. This is again, data is pushed to the stack (Figure 6-12).

P2PK Step 2
Figure 12. p2pk Step 2

OP_CHECKSIG consumes 2 stack instructions (pubkey and signature) and determines if they are valid for this transaction. OP_CHECKSIG will push a 1 to the stack if the signature is valid, 0 if not. Assuming that the signature is valid for this public key, we have this situation (Figure 6-13):

P2PK End 1
Figure 13. p2pk Step 3

We’re finished processing all the instructions of Script and we’ve ended with a single element on the stack. Since the top element is not zero, (1 is definitely not 0), this Script is valid.

If this transaction instead had an invalid signature, the result from OP_CHECKSIG would be zero, ending our Script processing like Figure 6-14:

P2PK End 2
Figure 14. p2pk End

We end with a single element on the stack which is zero. Since the top element is zero, the combined Script is invalid and a transaction with this ScriptSig in the input is invalid.

The combined Script will validate if the signature is valid, but fail if the signature is invalid. The ScriptSig will only unlock the ScriptPubKey if the signature is valid for that public key. In other words, only someone with knowledge of the private key can produce a valid ScriptSig.

Incidentally, we can see where ScriptPubKey got its name. The public key in uncompressed SEC format is the main instruction in ScriptPubKey for p2pk (the other instruction being OP_CHECKSIG). Similarly, ScriptSig is named as such because the ScriptSig for p2pk is a single element: the DER signature.

Coding Script Evaluation

We now code a way to evaluate Script. This requires us to go through each instruction and evaluate whether the Script is valid. What we want to be able to do is this:

link:code-ch06/examples.py[role=include]
  1. p2pk ScriptPubkey is the SEC format pubkey followed by OP_CHECKSIG which is 0xac or 170.

  2. We can do this because of the add method we created above.

  3. We want to evaluate the instructions and see if the Script validates.

Here is the method that we’ll use for the combined instruction set (combination of the ScriptPubKey of the previous transaction and the ScriptSig of the current transaction).

from op import OP_CODE_FUNCTIONS, OP_CODE_NAMES
...
class Script:
...
link:code-ch06/script.py[role=include]
  1. As the instructions list will change, we make a copy

  2. We execute until the instructions list is empty

  3. The function that executes the opcode is in the OP_CODE_FUNCTIONS array (e.g. OP_DUP, OP_CHECKSIG, etc)

  4. 99 and 100 are OP_IF and OP_NOTIF respectively. They require manipulation of the instructions array based on the top element of the stack.

  5. 107 and 108 are OP_TOALTSTACK and OP_FROMALTSTACK respectively. They move stack elements to/from an "alternate" stack, which we call altstack.

  6. 172, 173, 174 and 175 are OP_CHECKSIG, OP_CHECKSIGVERIFY, OP_CHECKMULTISIG, and OP_CHECKMULTISIGVERIFY, which all require the signature hash, z, from [chapter_elliptic_curve_cryptography] for signature validation.

  7. If the instruction is not an opcode, it’s an element, so we push that element to the stack.

  8. If the stack is empty at the end of processing all the instructions, we fail the Script by returning False.

  9. If the stack’s top element is an empty byte string (which is how the stack stores a 0), then we also fail the Script by returning False.

  10. Any other result means that the Script has validated.

Warning
Making Script Evaluation Safe

The code shown here is a little bit of a cheat as the combined Script is not executed this way exactly. The ScriptSig is evaluated separately from the ScriptPubKey as to not allow operations from ScriptSig to affect the ScriptPubKey instructions.

Specifically, the stack after all the ScriptSig instructions are evaluated are stored and then the ScriptPubkey instructions are evaluated on their own with the stack from the first execution.

Stack elements under the hood

It may be confusing that the stack elements are sometimes numbers like 0 or 1 and other times byte-strings like a DER signature or SEC pubkey. Under the hood, they’re all bytes, but some are interpreted as numbers for certain opcodes. For example, 1 is stored on the stack as the 01 byte, 2 is stored as the 02 byte, 999 as the e703 and so on. Any byte-string is interpreted as a Little-Endian number for arithmetic opcodes. The integer 0 is not stored as the 00 byte, but as the empty byte-string.

The code in op.py can clarify what’s going on:

link:code-ch06/op.py[role=include]

Numbers being pushed to the stack are encoded into bytes and decoded from bytes when the numerical value is needed.

Problems with p2pk

Pay-to-pub-key is intuitive in the sense that there is a public key that anyone can send Bitcoins to and a signature that can only be produced by the owner of the private key. This works well, but there are some problems.

First, the public keys are long. We know from [chapter_serialization] that secp256k1 public points are 33 bytes in compressed SEC and 65 bytes in uncompressed SEC. Unfortunately, humans can’t interpret 33 or 65 raw bytes easily. Most character encodings don’t render certain byte ranges as they are control characters, newlines or similar. The SEC format is typically encoded instead in hexadecimal, doubling the length (hex encodes 4 bits per character instead of 8). This makes the compressed and uncompressed SEC formats 66 and 130 characters respectively, which is bigger than most identifiers (your username on a website, for instance, is usually less than 20). To compound this, early Bitcoin transactions didn’t use the compressed versions so the hexadecimal addresses were 130 characters each! This is not fun or easy for people to transcribe, much less communicate by voice.

That said, the original use-cases for p2pk were for IP-to-IP payments and mining outputs. For IP-to-IP payments, IP addresses were queried for their public keys, so communicating the public keys were done machine-to-machine, which meant that human communication wasn’t necessarily a problem. Use for mining outputs also don’t require human communication. Incidentally, this IP-to-IP payment system was phased out as it’s not secure and prone to man-in-the-middle attacks.

Why did Satoshi use the uncompressed SEC format?

It seems the uncompressed SEC format doesn’t make sense for Bitcoin given that block space is at a premium, so why did Satoshi use it? Satoshi was using the OpenSSL library to do the SEC format conversions and the OpenSSL library at the time Satoshi wrote Bitcoin (circa 2008) did not document the compressed format very well. It’s speculated this is why Satoshi used the uncompressed SEC format.

When Pieter Wuille discovered that the compressed SEC format existed in OpenSSL, more people started using the compressed SEC format in Bitcoin.

Second, because the public keys are long, this causes a more subtle problem. The UTXO set becomes bigger since this large public key has to be kept around and indexed to see if it’s spendable. This requires more resources on the part of full nodes.

Third, because we’re storing the public key in the ScriptPubKey field, it’s known to everyone. That means should ECDSA someday be broken, these outputs could be stolen. This is not a very big threat since ECDSA is used in a lot of applications besides Bitcoin and would affect all of those things, too. For example, quantum computing has the potential to reduce the calculation times significantly for RSA and ECDSA, so having something else in addition to protect these outputs would be more secure.

Solving the problems with p2pkh

Pay-to-pubkey-hash (p2pkh) is an alternative Script that has advantages over p2pk:

  1. The addresses are shorter.

  2. It’s additionally protected by sha256 and ripemd160.

Addresses are shorter due to the use of the sha256 and ripemd160 hashing algorithms. We do both in succession and call that hash160. The result of hash160 is 160-bits or 20 bytes, which are encoded into an address.

The result is what you may have seen on the Bitcoin network and coded in [chapter_serialization]:

1PMycacnJaSqwwJqjawXBErnLsZ7RkXUAs

This address encodes within 20 bytes that look like this in hexadecimal:

f54a5851e9372b87810a8e60cdd2e7cfd80b6e31

These 20 bytes are the result of doing a hash160 operation on this (compressed) SEC public key:

0250863ad64a87ae8a2fe83c1af1a8403cb53f53e486d8511dad8a04887e5b2352

Given p2pkh is shorter and more secure, p2pk use declined significantly after 2010, though it’s still fully supported today.

p2pkh

Pay-to-pubkey-hash was used during the early days of bitcoin, though not as much as p2pk.

The p2pkh ScriptPubKey, or locking Script, looks like Figure 6-15:

P2PKH ScriptPubKey
Figure 15. Pay-to-pubkey-hash (p2pkh) ScriptPubKey

Like p2pk, OP_CHECKSIG is here and OP_HASH160 makes an appearance. Unlike p2pk, the SEC pubkey is not here but a 20 byte hash is. There is also a new opcode here: OP_EQUALVERIFY.

The p2pkh ScriptSig, or the unlocking Script, looks like Figure 6-16:

P2PKH ScriptSig
Figure 16. Pay-to-pubkey-hash (p2pkh) ScriptSig

Like p2pk, the ScriptSig has the DER signature. Unlike p2pk, the ScriptSig also has the SEC pubkey. The main difference between p2pk and p2pkh ScriptSigs is that the SEC pubkey has moved from ScriptPubKey to ScriptSig.

The ScriptPubKey and ScriptSig combine to a list of instructions that looks like Figure 6-17:

P2PKH Combination
Figure 17. p2pkh Combined

At this point, the Script is processed one instruction at a time. We start with the instructions as above (Figure 6-18).

P2PKH Start
Figure 18. p2pkh Start

The first two instructions are elements, so they are pushed to the stack (Figure 6-19).

P2PKH Step 1
Figure 19. p2pkh Step 1

OP_DUP duplicates the top element, so the pubkey gets duplicated (Figure 6-20):

P2PKH Step 2
Figure 20. p2pkh Step 2

OP_HASH160 takes the top element and performs the hash160 operation on it (sha256 followed by ripemd160), creating a 20-byte hash (Figure 6-21):

P2PKH Step 3
Figure 21. p2pkh Step 3

The 20-byte hash is an element and is pushed to the stack (Figure 6-22).

P2PKH Step 4
Figure 22. p2pkh Step 4

We are now at OP_EQUALVERIFY. This opcode consumes the top two elements and checks if they’re equal. If they are equal, the Script continues execution. If they are not equal, the Script stops immediately and fails. We assume here that they’re equal, leading to Figure 6-23:

P2PKH Step 5
Figure 23. p2pkh Step 5

We are now exactly where we were during the OP_CHECKSIG part of processing p2pk. Once again, we assume that the signature is valid (Figure 6-24):

P2PKH End
Figure 24. p2pkh End

There are two ways this Script can fail. If the ScriptSig provides a public key that does not hash160 to the 20-byte hash in the ScriptPubKey, the Script will fail at OP_EQUALVERIFY (Figure 6-22). The other failure condition is if the ScriptSig has a public key that hash160s to the 20-byte hash in the ScriptPubKey, but has an invalid signature. That would end the combined Script evaluation with a 0, ending in failure.

This is why we call this type of Script pay-to-pubkey-hash. The ScriptPubKey has the 20-byte hash160 of the public key and not the public key itself. We are locking Bitcoins to a hash of the public key and the spender is responsible for revealing the public key as part of constructing the ScriptSig.

The major advantage is that the ScriptPubKey is shorter (just 25 bytes) and a thief would not only have to solve the Discrete Log problem in ECDSA, but also figure out a way to find pre-images of both ripemd160 and sha256.

Scripts can be Arbitrarily Constructed

Note that Scripts can be any arbitrary program. Script is a smart contract language and can lock Bitcoins in many different ways. Figure 6-25 is an example ScriptPubKey:

Example 1 ScriptPubKey
Figure 25. Example ScriptPubKey

Figure 6-26 is a ScriptSig that will unlock the the ScriptPubKey from Figure 6-25.

Example 1 ScriptSig
Figure 26. Example ScriptSig

The combined Script is shown in Figure 6-27:

Example 1 Combination
Figure 27. Example Combined

Script evaluation will start like Figure 6-28:

Example 1 Start
Figure 28. Example Start

OP_4 will push a 4 to the stack (Figure 6-29).

Example 1 Step 1
Figure 29. Example Step 1

OP_5 will likewise push a 5 to the stack (Figure 6-30).

Example 1 Step 2
Figure 30. Example Step 2

OP_ADD will consume the top two elements of the stack, add them together and push the sum to the stack (Figure 6-31).

Example 1 Step 3
Figure 31. Example Step 3

OP_9 will push a 9 to the stack (Figure 6-32).

Example 1 Step 4
Figure 32. Example Step 4

OP_EQUAL will consume 2 elements and push a 1 if equal, 0 if not (Figure 6-33).

Example 1 End
Figure 33. Example End

Note that the ScriptSig isn’t particularly hard to figure out and contains no signature. As a result, ScriptPubKey is vulnerable to being taken by anyone who can solve it. Think of this ScriptPubKey as a lockbox with a very flimsy lock that anyone can break into. It is for this reason that most transactions have a signature requirement in the ScriptSig.

Once a UTXO has been spent, included in a block and secured by proof-of-work, the coins are locked to a different ScriptPubKey and no longer as easily spendable. Someone attempting to spend already spent coins would have to provide proof-of-work, which is expensive (see [chapter_blocks]).

Utility of Scripts

The previous exercise was a bit of a cheat as OP_MUL is no longer allowed on the Bitcoin network. Version 0.3.5 of Bitcoin disabled a lot of different opcodes as anything that had even a little bit of potential to create vulnerabilities on the network were disabled.

This is just as well since most of the functionality in Script is actually not used much. From a software maintenance standpoint, this is not a great situation as the code has to be maintained despite its lack of usage. Simplifying and getting rid of certain capabilities can be seen as a way to make Bitcoin more secure.

This is in stark contrast to other projects which try to expand their smart contract languages, often increasing the attack surface along with new features.

Sha1 Piñata

In 2013, Peter Todd created a Script very similar to the Script in Exercise 4 and put some Bitcoins into it to create an economic incentive for people to find hash collisions. The donations reached 2.49153717 BTC and when Google actually found a hash collision for sha1 in February of 2017 (https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html), this Script was promptly redeemed. The transaction output was 2.48 Bitcoins which was $2848.88 USD at the time.

Peter created more piñatas for sha256, hash256 and hash160, which add economic incentives to find collisions for these hashing functions.

Conclusion

We’ve covered Script and how it works. We can now proceed to the creation and validation of transactions.