# Transactions Through an Electronic Ledger
Alice, Bob, and Christie understand that they will need a central electronic ledger to store all transactions. This means that they need a database or just a spreadsheet. In its simplest form, their database has just one table with four columns. The first one captures the date and time (`timestamp`), the second one is the person who initiates a transfer of cash (`from_account`), the third one is the person who receives the cash (`to_account`), and the last column contains the amount transferred (`amount`). Each row in this table represents a transaction (i.e., an update in the database) and specifies that on a given day and time, money was transferred from one account to another.

In the following paragraphs, we will use R to demonstrate how the exchange among Alice, Bob, and Christie could be enabled through a central electronic ledger. 

## A Time Stamped Transaction
The equivalent of a database table or a spreadsheet in R is a data frame (`data.frame`). In the following example, we create a data frame with a single row that captures the time-stamped transaction in which Bob pays Alice 900.


In [3]:
txn_example  <- data.frame(timestamp = Sys.time(), 
                            to_account = 'Alice',
                            from_account = 'Bob',
                            amount = 900)

In [2]:
txn_example

timestamp,to_account,from_account,amount
2018-08-09 17:13:52,Alice,Bob,900


## A Function to Generate Transactions on Ledger
In our database, we require that exactly four values describe every transaction: `timestamp`, `from_account`, `to_account`, and `amount`. To enforce this, we create a function that takes the two accounts and the amount as inputs, stamps the current time with `Sys.time()`, and returns a well-formed transaction. In addition to creating a data frame, our function can also validate the inputs. In the case below, the function verifies that `amount` is a numeric value.

In [2]:
transaction <- function(from_account, to_account, amount) {
   if (is.numeric(amount)) {
     new_txn <- data.frame(timestamp    = Sys.time(), 
                           from_account = from_account,
                           to_account   = to_account, 
                           amount       = amount,
                           stringsAsFactors = FALSE)
     return(new_txn)
   } else {
     writeLines('Amount must be numeric.')
   }
 }
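
As written, `transaction` only prints a message when `amount` is not numeric and then returns `NULL`, so a caller could pass the result on to later steps without noticing. The following is a minimal sketch of a stricter variant, assuming we would rather halt with an error. The name `transaction_strict` and the extra checks (a single, non-missing, positive value) are illustrative assumptions and not part of the original design.

```r
# Hypothetical stricter variant of `transaction` (illustrative only).
transaction_strict <- function(from_account, to_account, amount) {
  # Stop with an error instead of printing a message.
  if (!is.numeric(amount) || length(amount) != 1 || is.na(amount) || amount <= 0) {
    stop('Amount must be a single positive number.')
  }
  data.frame(timestamp    = Sys.time(),
             from_account = from_account,
             to_account   = to_account,
             amount       = amount,
             stringsAsFactors = FALSE)
}

# A well-formed call returns a one-row data frame.
ok <- transaction_strict('Bob', 'Alice', 900)

# A malformed call raises an error, which we capture here as text.
bad <- tryCatch(transaction_strict('Bob', 'Alice', '900'),
                error = function(e) conditionMessage(e))
bad
```

Stopping with an error makes it harder for a malformed transaction to reach the ledger silently, at the cost of requiring the caller to handle the error.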

## Genesis Block on Ledger
The sequence of transactions starts with Bob's payment to Alice. However, for this to be possible, Bob needs to own funds before he can pay Alice. We arrange this by creating an original transaction that distributes initial funds to him. When a blockchain is first initialized, one such transaction exists, and it is contained in the first block, called the *Genesis Block*.
The Genesis Block does not follow any other transaction, since it is the first one in the chain of transactions. With the following script, we use the function `transaction` to create the genesis block and name it `ldgr_txn_genesis`. We then initialize our `ledger` by assigning `ldgr_txn_genesis` to it.

In [4]:
ldgr_txn_genesis <- transaction(
    from_account = 'Genesis Endowment',
    to_account = 'Bob',
    amount = 2000)
ledger <- ldgr_txn_genesis
ledger

timestamp,from_account,to_account,amount
2018-08-09 17:39:10,Genesis Endowment,Bob,2000


## Account Balances on Ledger
By querying the electronic ledger, we can establish the total deposits into an account, the total withdrawals from it, and the resulting balance. Before we add a new transaction to the `ledger`, such as Bob's payment to Alice for the purchase of the car, we need to verify that there are sufficient funds in his account. The following function, `get_balance_ldgr`, is designed to perform this task.

In [5]:
get_balance_ldgr <- function(ledger, account) {
   # Sum of all amounts received by the account
   deposits <- ledger[ledger$to_account == account, 'amount']
   total_deposits <- sum(deposits, na.rm = TRUE)
   
   # Sum of all amounts sent from the account
   withdrawals <- ledger[ledger$from_account == account, 'amount']
   total_withdrawals <- sum(withdrawals, na.rm = TRUE)
   
   # Return the balance
   balance <- total_deposits - total_withdrawals
   return(balance)
}

Use the function to check the balance in Bob's account before he buys the car from Alice.

In [6]:
get_balance_ldgr(ldgr_txn_genesis, 'Bob')

Given that the balance is higher than the amount that he wants to transfer, the transaction with Alice is approved.

In [8]:
ldgr_txn_bob2alice <- transaction(
    from_account = 'Bob', 
    to_account = 'Alice', 
    amount = 900)
ldgr_txn_bob2alice

timestamp,from_account,to_account,amount
2018-08-09 18:05:24,Bob,Alice,900
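
The approval step described above is done by eye: we look at the balance and compare it with the amount. A minimal sketch of how that comparison could be automated is given here. The helper name `has_sufficient_funds` and the demo object `ledger_demo` are assumptions introduced for illustration only.

```r
# Balance helper, repeated in compact form so the sketch runs on its own.
get_balance_ldgr <- function(ledger, account) {
  sum(ledger[ledger$to_account == account, 'amount'], na.rm = TRUE) -
    sum(ledger[ledger$from_account == account, 'amount'], na.rm = TRUE)
}

# Hypothetical helper: TRUE only if the sender can cover the amount.
has_sufficient_funds <- function(ledger, account, amount) {
  get_balance_ldgr(ledger, account) >= amount
}

# A ledger holding only the genesis transaction.
ledger_demo <- data.frame(timestamp    = Sys.time(),
                          from_account = 'Genesis Endowment',
                          to_account   = 'Bob',
                          amount       = 2000,
                          stringsAsFactors = FALSE)

has_sufficient_funds(ledger_demo, 'Bob', 900)    # Bob holds 2000
has_sufficient_funds(ledger_demo, 'Alice', 300)  # Alice holds nothing yet
```

With only the genesis transaction recorded, Bob can cover a payment of 900, while Alice cannot yet send 300.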


## Create a Block on Ledger
In a blockchain, multiple transactions are packaged together to form a *block*. To generate a similar scenario with our ledger, we group transactions using `rbind` to create a single data frame that collects two transactions. Given that we used a function that enforces the shape of the data frames, we know that `rbind` will receive consistent data frames that can be joined together without errors.

With the following script we update the ledger to include the new transaction, and review the content of the `ledger`.

In [9]:
ledger <- rbind(ledger, ldgr_txn_bob2alice)
ledger

timestamp,from_account,to_account,amount
2018-08-09 17:39:10,Genesis Endowment,Bob,2000
2018-08-09 18:05:24,Bob,Alice,900
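
To illustrate why consistently shaped data frames matter for `rbind`, the following sketch uses small data frames built by hand; the timestamp column is left out for brevity, and the object names are illustrative. For data frames, `rbind` matches columns by name rather than by position, and it refuses to combine data frames whose column names differ.

```r
# Two one-row data frames with the same columns in a different order.
first  <- data.frame(from_account = 'Bob', to_account = 'Alice', amount = 900,
                     stringsAsFactors = FALSE)
second <- data.frame(to_account = 'Christie', from_account = 'Alice', amount = 300,
                     stringsAsFactors = FALSE)

# Columns are matched by name, so the rows line up correctly.
combined <- rbind(first, second)
combined$from_account

# A data frame with a different column name cannot be appended.
wrong <- data.frame(sender = 'Bob', to_account = 'David', amount = 500,
                    stringsAsFactors = FALSE)
result <- tryCatch(rbind(first, wrong), error = function(e) 'rbind failed')
result
```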


In [16]:
get_balance_ldgr(ledger, 'Alice')
get_balance_ldgr(ledger, 'Bob')

Are there sufficient funds in Alice's account to allow a 300 transfer to Christie? If yes, we can approve the following transaction, update the ledger, and generate account balances as follows:

In [12]:
ldgr_txn_alice2christie <- transaction(
    from_account = 'Alice', 
    to_account = 'Christie', 
    amount = 300)

In [13]:
ledger <- rbind(ledger, ldgr_txn_alice2christie)
ledger

timestamp,from_account,to_account,amount
2018-08-09 17:39:10,Genesis Endowment,Bob,2000
2018-08-09 18:05:24,Bob,Alice,900
2018-08-09 18:08:50,Alice,Christie,300


In [17]:
get_balance_ldgr(ledger, 'Alice')
get_balance_ldgr(ledger, 'Bob')
get_balance_ldgr(ledger, 'Christie')
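
Instead of querying one account at a time, we could compute every balance in a single pass. The sketch below assumes a hypothetical helper named `all_balances` and re-enters the three transactions from the text directly (without timestamps) so that it runs on its own.

```r
# The three transactions from the text, entered directly for a standalone example.
ledger_demo <- data.frame(
  from_account = c('Genesis Endowment', 'Bob', 'Alice'),
  to_account   = c('Bob', 'Alice', 'Christie'),
  amount       = c(2000, 900, 300),
  stringsAsFactors = FALSE)

# Hypothetical helper: balance of every account that appears in the ledger.
all_balances <- function(ledger) {
  accounts <- unique(c(ledger$from_account, ledger$to_account))
  sapply(accounts, function(account) {
    sum(ledger$amount[ledger$to_account == account]) -
      sum(ledger$amount[ledger$from_account == account])
  })
}

balances <- all_balances(ledger_demo)
balances
```

Bob ends with 1100, Alice with 600, and Christie with 300. The endowment account shows -2000, and all balances sum to zero because every transfer credits one account and debits another by the same amount.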

## Practice Problem
Add a new transaction in which Bob wants to transfer 500 to David.

In [None]:
# Enter your answer here.



# Transactions Through Blockchain
While transactions through a central electronic ledger show that cashless transactions are possible, one of the original issues remains: we still need a trusted third party to maintain the single copy of the ledger (database). The most common scenario is that a bank is the trusted third party and keeper of the ledger. However, this trust in the financial system was challenged during the financial crisis. This problem can be resolved by providing all participants with a copy of the ledger (hence the term *distributed ledger*).

However, this solution solves a problem (i.e., need for trusted third party) by creating a new one (i.e., how to update and achieve a consistent distribution of the ledger across all network participants). In the case of a central ledger, a single  entity has been trusted with the task of maintaining the ledger. In the case of a distributed ledger, each network participant plays a role in making sure that the ledger is properly updated. Therefore, to the extent that blockchain provides a solution to this problem, blockchain is a technology for shared databases.

In a blockchain, a new transaction is not retransmitted across the network immediately after it is entered. Instead, transactions are grouped and packaged into blocks. When a block reaches a certain size, it is transmitted to the rest of the network. Each block contains timestamped transactions, the previous block's hash, and a proof of work. As we will see in the following paragraphs, it is through this process that blockchain provides a robust solution to the shared database problem.

## Blockchain Genesis Block
To enable subsequent transactions, a blockchain needs to start with an initial distribution of money (the *genesis block*). To maintain consistency with our central ledger example, we create such a block that endows Bob with 2000. The main difference between the genesis block on the ledger and the one on the blockchain is that the latter contains a hash and a proof of work. Notice that the initial hash and proof of work are set to `NULL` and zero, respectively.

In [18]:
genesis_block <- list(
    transactions = transaction(
        from_account = 'Genesis Endowment',
        to_account = 'Bob', 
        amount = 2000),
    prev_hash = NULL,
    proof_of_work = 0)

We initiate our blockchain, i.e., create the first block, by simply adding the genesis block. In our example, we assume that each block contains exactly one transaction. This is an oversimplification that allows us to focus on the general concepts that we want to introduce. It means that the initial `blockchain` is just the `genesis_block`.

In [19]:
blockchain <- list(genesis_block)
blockchain

timestamp,from_account,to_account,amount
2018-08-09 18:15:08,Genesis Endowment,Bob,2000


Subsequent transactions, such as the transaction between Bob and Alice, will form another block, which is attached to the `genesis_block` and becomes the second block of the blockchain. The transaction between Alice and Christie will form the third block, and so on.

## Hash Function
In a blockchain, we need a method for chaining blocks together. To achieve this, we embed information in the blocks such that their order is maintained. For this purpose, we use a hash function. In cooking, to hash is to break ingredients apart and put them back together, as in making hash browns out of shredded potatoes. A hash function does something similar: it takes a value and maps it to another, ideally random-looking, value.

In R, hash functions are available in the package `digest`. In our application, we will use the SHA-256 algorithm. To do so, we call the `digest` function and specify the hash algorithm after the string that we want to hash.

In [20]:
library(digest)
digest('banana', 'sha256')

There are two features to notice about hash functions. First, the hash value (the value returned by the hash function) always has the same length regardless of the input; in this case, it is 64 characters long. Second, small variations in the input produce significant variations in the output. In the following example, we have changed the input from singular (banana) to plural (bananas) and from all lower case (banana) to a capitalized first letter (Banana). These small changes produce completely different hash values.

In [21]:
digest('bananas', 'sha256')
digest('Banana', 'sha256')
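
Both properties can be checked programmatically. The following is a small sketch; the vector `inputs` and its fourth, longer entry are made up for illustration.

```r
library(digest)

inputs <- c('banana', 'bananas', 'Banana',
            'a much longer sentence about bananas and other fruit')

# Hash each input separately with SHA-256.
hashes <- sapply(inputs, digest, algo = 'sha256', USE.NAMES = FALSE)

nchar(hashes)            # length of every hash value
length(unique(hashes))   # number of distinct hash values
```

Every hash is 64 characters long, and the four inputs yield four distinct values. Note that `digest` by default hashes R's serialized representation of the object; passing `serialize = FALSE` hashes the characters of the string itself, which is what other SHA-256 tools would compute for the same text.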

Random (or pseudo-random) mapping is essential so that clusters of similar initial values are mapped to ending values that are distinct and neither coincide nor look alike (Knuth, 1998). This type of randomness is important since we require a function that is not easily invertible. If the function were invertible, then one would be able to obtain the original value from the ending value.

The randomness of hash values allows us to view them as if they were fingerprints of blocks of transactions. Given that they are of a given length, 64 characters in our case, we can map a block of arbitrary size to its fingerprint, which is a hash of fixed size.

## Hashing the Genesis Block
Using the function `digest` and the `genesis_block` as input, we can create the hash value `genesis_hash` of the `genesis_block` as follows:

In [23]:
genesis_hash <- digest(genesis_block, 'sha256')
genesis_hash
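
To see how a stored hash chains blocks together, the sketch below builds a second block whose `prev_hash` is the hash of a genesis block, and then checks what happens if the genesis block is altered afterwards. The objects `genesis_demo`, `block_2`, and `tampered`, the fixed timestamp, and the `NA` placeholder for the proof of work are all assumptions made for this illustration.

```r
library(digest)

# Fixed timestamp so the example is reproducible (an assumption for this sketch).
fixed_time <- as.POSIXct('2018-08-09 18:15:08', tz = 'UTC')

genesis_demo <- list(
  transactions = data.frame(timestamp = fixed_time,
                            from_account = 'Genesis Endowment',
                            to_account = 'Bob',
                            amount = 2000,
                            stringsAsFactors = FALSE),
  prev_hash = NULL,
  proof_of_work = 0)

# The second block stores the hash of the block before it.
block_2 <- list(
  transactions = data.frame(timestamp = fixed_time + 60,
                            from_account = 'Bob',
                            to_account = 'Alice',
                            amount = 900,
                            stringsAsFactors = FALSE),
  prev_hash = digest(genesis_demo, 'sha256'),
  proof_of_work = NA)  # placeholder; not computed in this sketch

# Altering the genesis block changes its hash and breaks the link.
tampered <- genesis_demo
tampered$transactions$amount <- 5000

link_intact <- identical(block_2$prev_hash, digest(genesis_demo, 'sha256'))
link_after_tampering <- identical(block_2$prev_hash, digest(tampered, 'sha256'))
c(link_intact, link_after_tampering)
```

The stored hash matches the untouched genesis block but no longer matches once the amount has been changed, which is how a later block can reveal tampering with an earlier one.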

## Practice Problem
Will the value of `genesis_hash` remain the same if we re-run the following sequence of commands?

1. `genesis_block`
2. `blockchain`
3. `genesis_hash`

Explain why.

## Proof of Work and Validation
Proof of work and validation (verification) play a very important role. They ensure that the blockchain, which is visible to all participants, is also virtually immutable. The proof of work is just evidence that some computation has occurred. The computation must be hard, in the sense that it takes time to complete, but its result must be easy to verify.

Blockchain uses hash functions to implement the validation and proof of work. We know that a hash function is not invertible, so the only way to find an input value that yields a hash value with specific characteristics is to guess. In our example, we seek hash values that end with three zeros. This means that the proof-of-work search takes the proof of work from the previous block (initially the genesis block) and appends a new number to it. The hash of the combined string should end in three zeros.

## Validation Function with R
Assuming that the proof of work has already been done, the verification takes the proof of work from the previous block, appends the proposed proof of work, and verifies that the hash of the resulting string ends with `000`. In R, we create a new function named `is_valid_proof` that automates this process as follows:

In [24]:
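is_valid_proof <- function(last_proof, this_proof) {
   # Concatenate the previous and the proposed proof of work
   guess <- paste0(last_proof, this_proof)
   # Hash the concatenated value with SHA-256
   guess_hash <- digest(guess, 'sha256')
   # TRUE if the hash ends with three zeros
   test <- grepl('0{3}$', guess_hash)
   return(test)
}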

The function as a whole takes the previous proof of work (`last_proof`) and a proposed proof of work (`this_proof`), concatenates them, hashes the resulting value, and verifies whether the hash ends with three zeros. Let's take a closer look at each step in this function.

1. Create a new variable called `guess`. The new variable is created by concatenating the previous proof of work (`last_proof`) and a proposed proof of work (`this_proof`). In R, we use the function `paste0` to concatenate strings. For example, `paste0('apple', 'wood')` will result in `applewood`.
2. Create a new variable called `guess_hash`. The variable `guess_hash` uses the function `digest` to generate the hash value of the variable `guess`.
3. Create a variable called `test`, which takes the value `TRUE` or `FALSE`.
    + Use the command `grepl` to check whether a certain pattern, written as a regular expression, appears in a string. In particular, the term `0{3}$` means that we are searching for the character `0`, we want it to appear three times in a row (`{3}`), and we want this sequence of three zeros to appear at the end of the string (`$`).
    + The command `grepl` returns the value `TRUE` if the specified pattern appears in the string. Otherwise, it returns the value `FALSE`.
4. The function ends by returning the value of the variable `test`.
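
The two building blocks in the steps above, `grepl` and `paste0`, can be tried on their own. The strings in `samples` are made-up stand-ins for hash values, not real hashes.

```r
# Sample strings standing in for hash values (not real hashes).
samples <- c('ab3f000', 'ab3f00', '000ab3f', 'ab30000')

# TRUE only where the string ends with three zeros.
ends_in_three_zeros <- grepl('0{3}$', samples)
ends_in_three_zeros

# paste0 concatenates without a separator; numbers are converted to text first.
guess <- paste0(0, 7975)
guess
```

Only the first and last samples match: the second has just two trailing zeros, and in the third the zeros are at the start. The last sample has four trailing zeros, which still satisfies the pattern because its final three characters are zeros. The concatenation of `0` and `7975` is the string `07975`.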

With this function, we can now construct another function that will search for a proof of work by guessing. 

## Proof of Work Function with R
The aim of proof of work is to find a new number that has the following property. If we create a string by concatenating the `last_proof` and this new number, the hash value of this string should end with three zeros.

The following R function (`proof_of_work`) starts by creating a new variable (`candidate_proof`) and assigning it an arbitrary starting value, in this case 0. It then tests this candidate using the function `is_valid_proof`. If the candidate passes, the function returns the value of `candidate_proof`. If it does not pass, the function adds one to `candidate_proof` and repeats until the candidate passes the validity test.

In [26]:
proof_of_work <- function(last_proof) {
   candidate_proof <- 0
   while(!is_valid_proof(last_proof, candidate_proof)) {
     candidate_proof <- candidate_proof + 1
   }
   return(candidate_proof)
 }

With the line `while(!is_valid_proof(last_proof, candidate_proof))`, we instruct R to repeat the commands that follow for as long as the condition in parentheses is true. This line could be read as "repeat while the candidate proof is not valid," bearing in mind that `!` means *not* in R.

Using this function, we can search for a valid proof of work for the genesis block.


In [27]:
genesis_proof <- proof_of_work(genesis_block[['proof_of_work']])
genesis_proof

The suggested proof is 7975. This means that if we take the previous proof of work (`last_proof` = 0) and the proposed proof of work (`this_proof` = 7975), concatenate them, and hash the resulting string, the hash value should end with three zeros. In other words, the function `is_valid_proof` should return the value `TRUE`.

In [28]:
is_valid_proof(0,7975)

We can further verify this by running the `digest` function on the concatenated values and see that the resulting hash value ends with three zeros.

In [29]:
digest(paste0(0,7975), 'sha256')
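
The number of trailing zeros controls how hard the search is. If we assume that each of the 16 hexadecimal characters is equally likely in the last positions of a hash, a single guess ends in three zeros with probability 1/16^3 = 1/4096, so we would expect a few thousand guesses on average. The sketch below generalizes the two functions with a hypothetical `difficulty` parameter; the names `is_valid_proof_n` and `proof_of_work_n` are assumptions introduced for illustration.

```r
library(digest)

# Hypothetical generalization: `difficulty` is the number of trailing zeros required.
is_valid_proof_n <- function(last_proof, this_proof, difficulty = 3) {
  guess_hash <- digest(paste0(last_proof, this_proof), 'sha256')
  grepl(paste0('0{', difficulty, '}$'), guess_hash)
}

# Same guessing loop as proof_of_work, with the difficulty passed along.
proof_of_work_n <- function(last_proof, difficulty = 3) {
  candidate_proof <- 0
  while (!is_valid_proof_n(last_proof, candidate_proof, difficulty)) {
    candidate_proof <- candidate_proof + 1
  }
  candidate_proof
}

# Lower difficulties are satisfied after fewer guesses.
proof_1 <- proof_of_work_n(0, difficulty = 1)
proof_2 <- proof_of_work_n(0, difficulty = 2)
c(proof_1, proof_2)
```

Because any hash that ends in two zeros also ends in one zero, the proof found for difficulty 1 can never be larger than the one found for difficulty 2. Each additional required zero multiplies the expected number of guesses by 16 under the uniformity assumption, while checking a proposed proof still takes a single hash.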