# Bitcoin Transaction Overview

## Introduction
Transactions are perhaps the most important part of the bitcoin protocol. All the other aspects of the bitcoin protocol ultimately serve to verify, secure, and agree upon the order of bitcoin transactions. 

The first bitcoin transactions created by Satoshi are very different from the typical bitcoin transactions you see on the blockchain today. Over the years there have been a number of upgrades to the bitcoin protocol that have resolved issues such as transaction malleability, and introduced new features such as timelocks and Schnorr signatures.

To understand how various types of bitcoin transactions work, it's helpful to understand how the earliest transaction types worked, and how new features were introduced. Taken at face value, the protocol can seem strangely written. The design makes more sense once you appreciate that upgrades have been deployed as soft forks to preserve backwards compatibility.

With this view in mind, we'll start by looking at the very first transaction types, then work our way chronologically through the various upgrades to arrive at the latest transaction types and features used today. 

## The first bitcoin transaction

Legacy bitcoin transactions have four main components: the version, inputs, outputs, and locktime. To illustrate each of these fields and what they do, we'll go through an example using the first ever bitcoin payment from one person to another. On January 11, 2009 at 7:30 PM PST, Satoshi Nakamoto transferred 10 BTC to Hal Finney.

This transaction spent a UTXO that was mined directly by Satoshi, where the block reward was 50 BTC. The transaction had two outputs, one to Hal Finney for 10 BTC, and a change output for 40 BTC.

Here is the raw serialized transaction in hex, as you'd see it in the blockchain:

<font color='#32CD32'>01000000</font><font color='#800000'>01</font><font color='blue'>c997a5e56e104102fa209c6a852dd90660a20b2d9c352423edce25857fcd3704</font><font color='#ADD8E6'>00000000</font><font color='800000'>48</font><font color='#46966b'>47304402204e45e16932b8
af514961a1d3a1a25fdf3f4f7732e9d624c6c61548ab5fb8cd410220181522ec8eca07de4860a4acdd12909d831cc56cbbac46220822
21a8768d1d0901</font><font color='#30b5a8'>ffffffff</font><font color='#800000'>02</font><font color='#FFD700'>00ca9a3b00000000</font><font color='#800000'>43</font><font color='ff6f1d'>4104ae1a62fe09c5f51b13905f07f06b99a2f7159b2225f374cd378d71302fa28414e7
aab37397f554a7df5f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84cac</font><font color='#FFD700'>00286bee00000000</font><font color='#800000'>43</font><font color='ff6f1d'>410411db93e1dcdb8a016b498
40f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac</font><br><font color='#A9A9A9'>00000000</font>

### Version
- <font color='#32CD32'>01000000</font> - Version number (4-byte signed integer, little endian)

The first four bytes of a transaction represent the version number. This number is a way for the transaction to signal what features or consensus rules the transaction may be using. To date, the only consensus feature that uses the version field is relative timelocks (BIP68). A relative timelock will only be enforced if the version field is set to 2 or higher. We'll demonstrate this in the chapter on relative timelocks.

In the future, version numbers can be used to signal new features that don't currently exist.
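
As a small illustration of the little-endian encoding, the sketch below decodes the four version bytes from the example transaction. The variable names are chosen here for illustration and are not part of any bitcoin library.

```python
# The version bytes exactly as they appear at the start of the raw transaction
version_bytes = bytes.fromhex("01000000")

# Little endian: the least significant byte comes first
version = int.from_bytes(version_bytes, byteorder="little", signed=True)
print("version:", version)

# Reading the same bytes as big endian would give a very different number
misread = int.from_bytes(version_bytes, byteorder="big", signed=True)
print("misread as big endian:", misread)
```

The same `int.from_bytes(..., "little")` pattern applies to the other fixed-width integer fields in a transaction.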

### Inputs

<font color='#800000'>01</font><font color='blue'>c997a5e56e104102fa209c6a852dd90660a20b2d9c352423edce25857fcd3704</font><font color='#ADD8E6'>00000000</font><font color='#800000'>48</font><font color='#46966b'>47304402204e45e16932b8af51496
1a1d3a1a25fdf3f4f7732e9d624c6c61548ab5fb8cd410220181522ec8eca07de4860a4acdd12909d831cc56cbbac4622082221a87<br>68d1d0901</font><font color='#30b5a8'>ffffffff</font>


Breaking it down further:
- <font color='#800000'>01</font> - Number of inputs (1-9 byte variable integer)

This lets us know how many transaction inputs to expect in this transaction. 

- For each input:
    - outpoint
        - <font color='blue'>c997a5e56e104102fa209c6a852dd90660a20b2d9c352423edce25857fcd3704</font> - Input txid (32-byte hash, serialized in the byte order produced by the hash function; `decoderawtransaction` and block explorers display it byte-reversed)
        - <font color='#ADD8E6'>00000000</font> - Index (4-byte unsigned integer, little endian)
    The outpoint uniquely identifies the transaction output being spent by this input. The txid identifies the previous transaction, and the index indicates which of its outputs is being spent.

    - scriptSig
        - <font color='#800000'>48</font> - scriptSig length (1-9 byte variable integer)
        - <font color='#46966b'>47304402204e45e16932b8af514961a1d3a1a25fdf3f4f7732e9d624c6c61548ab5fb8cd410220181522ec8eca07de48
    60a4acdd12909d831cc56cbbac4622082221a8768d1d0901</font> - scriptSig (arbitrary length)   
    Since the length of the scriptSig will vary depending on the transaction, it is prepended with its length. This lets us know where the scriptSig ends and where the next field begins.
    The scriptSig (aka unlocking script) proves ownership of the input. To evaluate if an input is valid, a node will combine the scriptSig with the scriptPubKey of the output referenced by the outpoint. If the script is successfully evaluated and the top stack item is non-zero, the input is considered valid. 
    - sequence 
        - <font color='#30b5a8'>ffffffff</font> - (4 bytes)  
       In Satoshi's initial version of bitcoin, the sequence field was intended to allow unconfirmed transactions to be replaced, for use in something like a payment channel. The solution didn't work however, and since then the field has been repurposed. It now has two uses. The first is to signal 'replace by fee' (RBF): any value less than `0xfffffffe` signals RBF. The second is to set a relative timelock. For more on this, review the chapter on transaction-level and script-level timelocks.
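
The walk through the input fields above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a full parser: `read_varint` is a helper name invented here, and the code assumes well-formed data with no error handling.

```python
def read_varint(data: bytes, pos: int):
    """Read a 1-9 byte variable integer; return (value, new position)."""
    first = data[pos]
    if first < 0xfd:
        return first, pos + 1
    # 0xfd, 0xfe and 0xff prefix a 2-, 4- or 8-byte little-endian integer
    width = {0xfd: 2, 0xfe: 4, 0xff: 8}[first]
    value = int.from_bytes(data[pos + 1 : pos + 1 + width], "little")
    return value, pos + 1 + width

# The inputs portion of the example transaction, split field by field
inputs_hex = (
    "01"                                                                # number of inputs
    "c997a5e56e104102fa209c6a852dd90660a20b2d9c352423edce25857fcd3704"  # previous txid
    "00000000"                                                          # previous output index
    "48"                                                                # scriptSig length
    "47304402204e45e16932b8af514961a1d3a1a25fdf3f4f7732e9d624c6c61548ab5fb8cd41"
    "0220181522ec8eca07de4860a4acdd12909d831cc56cbbac4622082221a8768d1d0901"
    "ffffffff"                                                          # sequence
)
data = bytes.fromhex(inputs_hex)

pos = 0
n_inputs, pos = read_varint(data, pos)

# Reverse the txid bytes to match the form shown by decoderawtransaction
prev_txid = data[pos : pos + 32][::-1].hex()
pos += 32
prev_index = int.from_bytes(data[pos : pos + 4], "little")
pos += 4

script_len, pos = read_varint(data, pos)
script_sig = data[pos : pos + script_len]
pos += script_len

sequence = int.from_bytes(data[pos : pos + 4], "little")
pos += 4

# Per the rule above, any sequence below 0xfffffffe signals RBF
signals_rbf = sequence < 0xfffffffe

print("inputs:", n_inputs)
print("outpoint:", prev_txid, prev_index)
print("scriptSig length:", script_len)
print("sequence:", hex(sequence), "signals RBF:", signals_rbf)
```

Because every variable-length field is preceded by its length, the parser always knows how many bytes to consume next, and `pos` ends exactly at the end of the input data.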

### Outputs

<font color='#800000'>02</font><font color='#FFD700'>00ca9a3b00000000</font><font color='#800000'>43</font><font color='ff6f1d'>4104ae1a62fe09c5f51b13905f07f06b99a2f7159b2225f374cd378d71302fa28414e7aab37397f554a7df5
    f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84cac</font><font color='#FFD700'>00286bee00000000</font><font color='#800000'>43</font><font color='ff6f1d'>410411db93e1dcdb8a016b49840f8c53bc1eb68
a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac</font>



Breaking it down:
- <font color='#800000'>02</font> - Number of outputs (1-9 byte variable integer)

Similar to the number of inputs, it lets us know how many outputs to expect in this transaction.

- For each output:
    - First output:
        - <font color='#FFD700'>00ca9a3b00000000</font> - amount in satoshis (8-byte signed integer little endian)
        - <font color='#800000'>43</font> - scriptPubkey length (1-9 byte variable integer)
        - <font color='#ff6f1d'>4104ae1a62fe09c5f51b13905f07f06b99a2f7159b2225f374cd378d71302fa28414e7aab37397f554a7df5f142c21c1b7
    303b8a0626f1baded5c72a704f7e6cd84cac</font> - scriptPubkey (arbitrary length)  
     The scriptPubkey (aka locking script) is what secures the output and the amount associated with it. This very first bitcoin transaction uses the outdated 'Pay to pubkey' output type. The scriptPubkey is `0x41` (an opcode that pushes the next 65 bytes), then an uncompressed 65-byte pubkey, followed by `0xac` which corresponds to `OP_CHECKSIG`. For more on bitcoin script see the chapter on bitcoin script.
    - Second output:
        - <font color='#FFD700'>00286bee00000000</font> - amount in satoshis (8-byte signed integer little endian)
        - <font color='#800000'>43</font> - scriptPubkey length (1-9 byte variable integer)
        - <font color='#ff6f1d'>410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160b
    fa9b8b64f9d4c03f999b8643f656b412a3ac</font> - scriptPubkey
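
To check the output fields by hand, the sketch below decodes both amounts and takes the first scriptPubkey apart. The variable names and the `SATS_PER_BTC` constant are introduced here for illustration only.

```python
SATS_PER_BTC = 100_000_000

# Amounts are 8-byte little-endian integers denominated in satoshis
amount_1 = int.from_bytes(bytes.fromhex("00ca9a3b00000000"), "little", signed=True)
amount_2 = int.from_bytes(bytes.fromhex("00286bee00000000"), "little", signed=True)
print("output 0:", amount_1, "sats =", amount_1 / SATS_PER_BTC, "BTC")
print("output 1:", amount_2, "sats =", amount_2 / SATS_PER_BTC, "BTC")

# First scriptPubkey: push opcode, uncompressed pubkey (04 || x || y), OP_CHECKSIG
script_pubkey = bytes.fromhex(
    "41"                                                                # push 65 bytes
    "04"                                                                # uncompressed pubkey marker
    "ae1a62fe09c5f51b13905f07f06b99a2f7159b2225f374cd378d71302fa28414"  # x coordinate
    "e7aab37397f554a7df5f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84c"  # y coordinate
    "ac"                                                                # OP_CHECKSIG
)

push_len = script_pubkey[0]
pubkey = script_pubkey[1 : 1 + push_len]
opcode = script_pubkey[1 + push_len]

print("scriptPubkey length:", len(script_pubkey), "=", hex(len(script_pubkey)))
print("pubkey length:", len(pubkey))
print("final opcode:", hex(opcode))
```

The scriptPubkey length of 67 bytes matches the `43` length prefix that precedes it in the raw transaction.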

### Locktime

- <font color='#A9A9A9'>00000000</font> - Locktime (4 bytes)

The final four bytes are the locktime. The locktime is used to set an absolute timelock on the transaction. The timelock can be expressed as a block height (values below 500,000,000) or as a unix timestamp (values of 500,000,000 and above). Only once this timelock has expired can the transaction be included in a block. A locktime of 0, as in this transaction, means there is no timelock. We will cover locktime in detail in the chapter on timelocks.
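
The following sketch shows how a locktime value might be interpreted. It relies on the consensus threshold of 500,000,000 that separates block heights from unix timestamps. The function name `describe_locktime` and the two non-zero sample values are made up for illustration.

```python
LOCKTIME_THRESHOLD = 500_000_000  # below: block height, at or above: unix time

def describe_locktime(locktime: int) -> str:
    """Return a human-readable interpretation of a locktime value."""
    if locktime == 0:
        return "no timelock"
    if locktime < LOCKTIME_THRESHOLD:
        return f"block height {locktime}"
    return f"unix time {locktime}"

# The locktime of the example transaction, decoded from its last four bytes
locktime = int.from_bytes(bytes.fromhex("00000000"), "little")
print(describe_locktime(locktime))

# Two hypothetical values, one on each side of the threshold
print(describe_locktime(800_000))
print(describe_locktime(1_700_000_000))
```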

## Transaction metadata

To make more sense of this transaction, we can decode it using the Bitcoin Core RPC command `decoderawtransaction` (for example through `bitcoin-cli`), or an online web app such as https://btc.com/tools/tx/decode.

```
{
    "txid": "f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16",
    "hash": "f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16",
    "version": 1,
    "size": 275,
    "vsize": 275,
    "weight": 1100,
    "locktime": 0,
    "vin": [
        {
            "txid": "0437cd7f8525ceed2324359c2d0ba26006d92d856a9c20fa0241106ee5a597c9",
            "vout": 0,
            "scriptSig": {
                "asm": "304402204e45e16932b8af514961a1d3a1a25fdf3f4f7732e9d624c6c61548ab5fb8cd410220181522ec8eca07de4860a4acdd12909d831cc56cbbac4622082221a8768d1d09[ALL]",
                "hex": "47304402204e45e16932b8af514961a1d3a1a25fdf3f4f7732e9d624c6c61548ab5fb8cd410220181522ec8eca07de4860a4acdd12909d831cc56cbbac4622082221a8768d1d0901"
            },
            "sequence": 4294967295
        }
    ],
    "vout": [
        {
            "value": 10,
            "n": 0,
            "scriptPubKey": {
                "asm": "04ae1a62fe09c5f51b13905f07f06b99a2f7159b2225f374cd378d71302fa28414e7aab37397f554a7df5f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84c OP_CHECKSIG",
                "hex": "4104ae1a62fe09c5f51b13905f07f06b99a2f7159b2225f374cd378d71302fa28414e7aab37397f554a7df5f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84cac",
                "type": "pubkey"
            }
        },
        {
            "value": 40,
            "n": 1,
            "scriptPubKey": {
                "asm": "0411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3 OP_CHECKSIG",
                "hex": "410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac",
                "type": "pubkey"
            }
        }
    ]
}
```

You'll notice that bitcoind outputs some fields at the top which are not defined in the original transaction. We'll go through each field and explain what it is and how it's calculated.

#### txid
Transactions do not explicitly contain their own transaction ID (txid). Instead, the txid is calculated by hashing the serialized transaction twice with SHA256. Note that transaction ids are displayed in little endian format, so we'll need to reverse the byte order after performing the hashing operations.

```python
# import hashlib for sha256
import hashlib

# store the transaction as bytes
raw_tx = bytes.fromhex('0100000001c997a5e56e104102fa209c6a852dd90660a20b2d9c352423edce25857fcd3704000000004847304402204e45e16932b8af514961a1d3a1a25fdf3f4f7732e9d624c6c61548ab5fb8cd410220181522ec8eca07de4860a4acdd12909d831cc56cbbac4622082221a8768d1d0901ffffffff0200ca9a3b00000000434104ae1a62fe09c5f51b13905f07f06b99a2f7159b2225f374cd378d71302fa28414e7aab37397f554a7df5f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84cac00286bee0000000043410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac00000000')

# first round of sha256
hash1 = hashlib.sha256(raw_tx).digest()

# second round of sha256 gives us the txid
hash2 = hashlib.sha256(hash1).digest()

print("Two rounds of SHA256 on the raw tx gives us: ", hash2.hex())

# We can use the python shorthand '[::-1]' to reverse the bytes, giving us the output in little endian notation
txid = hash2[::-1]
print("Reversing the bytes to little endian: ", txid.hex())
```

```
Two rounds of SHA256 on the raw tx gives us:  169e1e83e930853391bc6f35f605c6754cfead57cf8387639d3b4096c54f18f4
Reversing the bytes to little endian:  f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16
```


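Note that for segwit transactions (v0 and taproot), the txid is calculated on the raw transaction but without the segwit-specific fields (marker, flag, witness). The `hash` field in the decoded output is the hash of the full serialization including those fields (the wtxid); for a legacy transaction like this one, it is identical to the txid.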

#### size, vsize, and weight

'size' refers to the size of the transaction in bytes. We can get this in python with `len(raw_tx)`. Knowing the size of a transaction is helpful for calculating a fee rate based on satoshis/byte. 

Since the introduction of segwit, rather than blocks being limited to 1MB, blocks are instead limited to 4 million _weight units_. With this new rule, each byte in the witness counts as one weight unit, and each byte in the legacy fields counts as 4 weight units. Since our transaction is a legacy transaction, every byte counts as 4 weight units, making the total weight of the transaction `size*4`. 'vsize' is a term introduced with segwit meaning 'size in virtual bytes'. It is calculated by dividing the weight by 4 and rounding up, so that one can calculate a satoshi/vbyte fee rate in units comparable to legacy transactions.

```python
import math

size = len(raw_tx)
print("size: ", size)

weight = size*4
print("weight: ", weight)

vsize = math.ceil(weight/4) # Note that vsize/vbytes will round up
print("vsize: ", vsize)
```

```
size:  275
weight:  1100
vsize:  275
```
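
For a transaction that does carry witness data, the two kinds of bytes are weighted differently. The sketch below applies the rule described above. The function name `weight_and_vsize` and the segwit byte counts are hypothetical values chosen only to show the rounding, not measurements of a real transaction.

```python
import math

def weight_and_vsize(non_witness_bytes: int, witness_bytes: int):
    """Apply the segwit rule: 4 weight units per legacy byte, 1 per witness byte."""
    weight = non_witness_bytes * 4 + witness_bytes
    vsize = math.ceil(weight / 4)  # virtual size rounds up to a whole vbyte
    return weight, vsize

# Our legacy transaction: all 275 bytes are non-witness bytes
legacy_weight, legacy_vsize = weight_and_vsize(275, 0)
print("legacy:", legacy_weight, legacy_vsize)

# A hypothetical segwit transaction with 110 non-witness and 109 witness bytes
segwit_weight, segwit_vsize = weight_and_vsize(110, 109)
print("segwit (hypothetical):", segwit_weight, segwit_vsize)
```

In the hypothetical case the transaction is 219 bytes on the wire but only 138 vbytes for fee purposes, which is why witness data is cheaper per byte.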


## Things that have changed

Over the years a number of upgrades and improvements have been made to the bitcoin protocol, allowing for more sophisticated transactions and user friendliness. Below is a preview of some of the main changes that we'll cover in later chapters.

- pubkeys - This first ever transaction used an uncompressed 65-byte pubkey. These days 33-byte compressed pubkeys are more common, and for taproot we use 32-byte pubkeys.   
- P2PK - The outputs of this transaction were 'Pay to Pubkey' outputs, a type which is generally considered obsolete. The standard transaction output types involve either paying to a public key hash, a script hash, or a taproot pubkey.
- Addresses - The newer standard transaction types have conventions for converting the scriptPubkey into a human readable address. This makes the process of sending and receiving bitcoins more user friendly and less vulnerable to errors.
- Script opcodes - New script opcodes allow for more sophisticated logic in how bitcoin outputs can be locked and spent.
- Segwit - Segwit introduced a new witness field, which holds the signatures and other unlocking data for segwit inputs.
- Taproot - Taproot introduced Schnorr signatures, providing a host of privacy and scaling benefits.

## Coinbase transactions

All bitcoins originate from coinbase transactions. The coinbase transaction is the first transaction in every block. Its outputs can claim up to the sum of the block subsidy and all the transaction fees included in the block. As well as being the first transaction in a block, coinbase transactions are unique in that they have a single 'input' that does not spend an existing output: its txid consists of all zeros and its index is `0xffffffff`.

Below is the first ever coinbase transaction. 

```
01000000010000000000000000000000000000000000000000000000000000000000000000ffffffff4d04ffff001d0104455468652054696d65732030332f4a616e2f32303039204368616e63656c6c6f72206f6e206272696e6b206f66207365636f6e64206261696c6f757420666f722062616e6b73ffffffff0100f2052a01000000434104678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5fac00000000
```

Decoded:
```
{
    "txid": "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    "hash": "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    "version": 1,
    "size": 204,
    "vsize": 204,
    "weight": 816,
    "locktime": 0,
    "vin": [
        {
            "coinbase": "04ffff001d0104455468652054696d65732030332f4a616e2f32303039204368616e63656c6c6f72206f6e206272696e6b206f66207365636f6e64206261696c6f757420666f722062616e6b73",
            "sequence": 4294967295
        }
    ],
    "vout": [
        {
            "value": 50,
            "n": 0,
            "scriptPubKey": {
                "asm": "04678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5f OP_CHECKSIG",
                "hex": "4104678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5fac",
                "type": "pubkey"
            }
        }
    ]
}
```
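
The all-zero input can be checked directly from the first bytes of the raw transaction. The sketch below assumes a legacy serialization with a one-byte input count, so the outpoint starts at byte 5. The helper name `has_null_outpoint` is invented for this example.

```python
def has_null_outpoint(raw: bytes) -> bool:
    """Check whether the first input has an all-zero txid and index 0xffffffff.

    Assumes a legacy transaction whose input count fits in one byte.
    """
    prev_txid = raw[5:37]    # after 4 version bytes and 1 input-count byte
    prev_index = raw[37:41]
    return prev_txid == bytes(32) and prev_index == b"\xff" * 4

# First 41 bytes of the coinbase transaction shown above
coinbase_prefix = bytes.fromhex("01000000" + "01" + "00" * 32 + "ffffffff")

# First 41 bytes of the Satoshi to Hal Finney transaction, for comparison
regular_prefix = bytes.fromhex(
    "01000000" + "01"
    + "c997a5e56e104102fa209c6a852dd90660a20b2d9c352423edce25857fcd3704"
    + "00000000"
)

print("coinbase tx:", has_null_outpoint(coinbase_prefix))
print("regular tx: ", has_null_outpoint(regular_prefix))

# The single output amount: 50 BTC, the original block subsidy
subsidy = int.from_bytes(bytes.fromhex("00f2052a01000000"), "little")
print("subsidy in sats:", subsidy)
```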

# Questions

1. Deconstructing the bitcoin transaction above, you will notice that there are no special characters to mark the beginning or end of the different fields. How is it possible then to parse raw bitcoin transactions?

2. While the txid in the coinbase transaction's input contains all zeros, the scriptSig seems to have some data in it. What does this data represent? Clue - try decoding the data in a hex to ascii converter.

# Answers

1. The fields of a transaction either have a fixed length (e.g. `version` and `locktime` are 4 bytes each), or they are preceded by a variable integer that denotes how many items or bytes follow (e.g. 'Number of inputs' and 'scriptSig length').

2. Satoshi used this field to encode some arbitrary data with the message: 'The Times 03/Jan/2009 Chancellor on brink of second bailout for banks'
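
The second answer can be reproduced in Python instead of an online converter. In this sketch the `coinbase` hex from the decoded transaction is split into its push operations. The comments on the first two pushes describe only their byte layout, and the variable names are ours.

```python
coinbase = bytes.fromhex(
    "04" "ffff001d"                    # push 4 bytes
    "01" "04"                          # push 1 byte
    "45"                               # push 69 bytes: the message
    "5468652054696d657320"             # "The Times "
    "30332f4a616e2f3230303920"         # "03/Jan/2009 "
    "4368616e63656c6c6f7220"           # "Chancellor "
    "6f6e206272696e6b206f6620"         # "on brink of "
    "7365636f6e64206261696c6f757420"   # "second bailout "
    "666f722062616e6b73"               # "for banks"
)

# The message starts after the first two pushes (5 + 2 bytes) and its own length byte
message_len = coinbase[7]
message = coinbase[8 : 8 + message_len].decode("ascii")

print("coinbase length:", len(coinbase), "=", hex(len(coinbase)))
print("message:", message)
```

The total length of 77 bytes matches the `4d` scriptSig length prefix in the raw coinbase transaction.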