# Concurrency 1 and Databases 4

## Threads, the GIL, transactions and functional programming

### Threads

Some code examples here are taken from the Python Cookbook, freely available at http://chimera.labs.oreilly.com/books/1230000000393/index.html. This is an excellent book and will help you with your Python programming and with your project.

In [5]:
def fib(n):
    return fib(n - 1) + fib(n - 2) if n > 1 else n

In [33]:
from threading import Thread
from time import sleep
t = Thread(target=fib, args=(35,)) 
t.start()
while t.is_alive(): 
    print('Still running')
    sleep(1)
print("done")

Still running
Still running
Still running
Still running
Still running
done


But running something computationally intensive in a thread is a bad idea, because Python has a global interpreter lock (GIL). Thus, Python allows only one thread to run in the interpreter at any given time.

This makes Python threads better suited to concurrent execution where there is a lot of waiting: waiting for stuff from a file, from the network, from a database, sleeping, etc.

#### What is concurrency, exactly?

 A concurrent program has multiple logical threads, which may or may not run in parallel.
 
>Rob Pike puts it: Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once.

Concurrency is when your program needs to handle multiple simultaneous **events**: multiple network connections to a database, a REPL connection, some other process, etc.

- Concurrent programs are often non-deterministic: the order of execution will depend on the precise timing of events. So it is best to handle independent streams of execution, or to communicate between these streams in loosely coupled ways such as queues.
- Concurrency is critical to responsiveness. Your web browser pulls in many resources together, some in separate threads. But even within the main process it might need to handle a mouse event and an image draw or network fetch at the same time.


### I/O bound code vs cpu bound code

In [26]:
from threading import Thread
from time import sleep
from time import time


def sleepy(context): #like io
    i=0
    while i < 10:
        print("context {}, {} -- {} Sleepy!".format(context, i, int(time())), flush=True)
        #you could replace this by a request to a website in a crawler
        sleep(1)
        i += 1


def cpuy(context):
    for i in range(35):
        val = fib(i)
        print("{} cpuy fib({}) is {}".format(context, i, val))

        
def main_cpu():
    # Run the CPU-bound work twice in series and time it.
    start = time()
    cpuy("serial 1")
    cpuy("serial 2")
    print("serial cpu elapsed:", time() - start)
    # Now run one copy in a second thread while the main thread runs the other.
    start = time()
    t2 = Thread(target=cpuy, args=("threaded 2",))
    t2.start()
    cpuy("threaded 1")
    t2.join()  # wait for the second thread so the timing covers both copies
    print("thread cpu elapsed:", time() - start)

def main_sleep():
    # Run the sleepy (I/O-like) work twice in series and time it.
    start = time()
    sleepy("serial 1")
    sleepy("serial 2")
    print("serial sleep elapsed:", time() - start)
    # Now run one copy in a second thread while the main thread runs the other.
    start = time()
    t = Thread(target=sleepy, args=("threaded 2",))
    t.start()
    sleepy("threaded 1")
    t.join()  # wait for the second thread so the timing covers both copies
    print("thread sleep elapsed:", time() - start)

In [27]:
main_cpu()

serial 1 cpuy fib(0) is 0
serial 1 cpuy fib(1) is 1
serial 1 cpuy fib(2) is 1
serial 1 cpuy fib(3) is 2
serial 1 cpuy fib(4) is 3
serial 1 cpuy fib(5) is 5
serial 1 cpuy fib(6) is 8
serial 1 cpuy fib(7) is 13
serial 1 cpuy fib(8) is 21
serial 1 cpuy fib(9) is 34
serial 1 cpuy fib(10) is 55
serial 1 cpuy fib(11) is 89
serial 1 cpuy fib(12) is 144
serial 1 cpuy fib(13) is 233
serial 1 cpuy fib(14) is 377
serial 1 cpuy fib(15) is 610
serial 1 cpuy fib(16) is 987
serial 1 cpuy fib(17) is 1597
serial 1 cpuy fib(18) is 2584
serial 1 cpuy fib(19) is 4181
serial 1 cpuy fib(20) is 6765
serial 1 cpuy fib(21) is 10946
serial 1 cpuy fib(22) is 17711
serial 1 cpuy fib(23) is 28657
serial 1 cpuy fib(24) is 46368
serial 1 cpuy fib(25) is 75025
serial 1 cpuy fib(26) is 121393
serial 1 cpuy fib(27) is 196418
serial 1 cpuy fib(28) is 317811
serial 1 cpuy fib(29) is 514229
serial 1 cpuy fib(30) is 832040
serial 1 cpuy fib(31) is 1346269
serial 1 cpuy fib(32) is 2178309
serial 1 cpuy fib(33) is 3524578
se

In [28]:
main_sleep()

context serial 1, 0 -- 1478101306 Sleepy!
context serial 1, 1 -- 1478101307 Sleepy!
context serial 1, 2 -- 1478101308 Sleepy!
context serial 1, 3 -- 1478101309 Sleepy!
context serial 1, 4 -- 1478101310 Sleepy!
context serial 1, 5 -- 1478101311 Sleepy!
context serial 1, 6 -- 1478101312 Sleepy!
context serial 1, 7 -- 1478101313 Sleepy!
context serial 1, 8 -- 1478101314 Sleepy!
context serial 1, 9 -- 1478101315 Sleepy!
context serial 2, 0 -- 1478101316 Sleepy!
context serial 2, 1 -- 1478101317 Sleepy!
context serial 2, 2 -- 1478101318 Sleepy!
context serial 2, 3 -- 1478101319 Sleepy!
context serial 2, 4 -- 1478101320 Sleepy!
context serial 2, 5 -- 1478101321 Sleepy!
context serial 2, 6 -- 1478101322 Sleepy!
context serial 2, 7 -- 1478101323 Sleepy!
context serial 2, 8 -- 1478101324 Sleepy!
context serial 2, 9 -- 1478101325 Sleepy!
serial sleep elapsed: 20.031989812850952
context threaded 2, 0 -- 1478101326 Sleepy!
context threaded 1, 0 -- 1478101326 Sleepy!
context threaded 1, 1 -- 147810

### How to write threads

In [37]:
#do this
# Code to execute in an independent thread
import time
def counter(nmax):
    n = 0
    while n <= nmax:
        print('n', n)
        n += 1
        time.sleep(1)  # use the module imported in this cell
# Create and launch a thread
from threading import Thread
t = Thread(target=counter, args=(5,)) 
t.start()

n 0
n 1
n 2
n 3
n 4
n 5


In [35]:
#dont do this
from threading import Thread
from time import sleep
class Counter(Thread): 
    def __init__(self, nmax): 
        super().__init__()
        self.nmax = nmax 
        self.n=0
    def run(self):
        while self.n <= self.nmax:
            print('n', self.n) 
            self.n += 1 
            sleep(1)
c = Counter(5)
c.start()

n 0
n 1
n 2
n 3
n 4
n 5


In [38]:
#because you might want to run it in a new process!
import multiprocessing
p = multiprocessing.Process(target=counter, args=(5,)) 
p.start()

n 0
n 1
n 2
n 3
n 4
n 5


### Communicating with threads

In [14]:
from queue import Queue, Empty
import random
import time
from threading import Thread

# A thread that produces data
def producer(out_q):
    counter = 5
    while counter > 0:
        # Produce some data
        data = random.random()
        print("put data", data, counter)
        out_q.put(data)
        time.sleep(2)
        counter -= 1

# A thread that consumes data
def consumer(in_q):
    counter = 0
    while True:
        try:
            # Get some data; give up if nothing arrives within 5 seconds
            data = in_q.get(timeout=5)
        except Empty:
            break
        # Process the data ...
        counter += 1
        print("got data", data, counter)

# Create the shared queue and launch both threads
q = Queue()
t1 = Thread(target=consumer, args=(q,))
t2 = Thread(target=producer, args=(q,))
t1.start()
t2.start()

put data 0.06013972309645832 5
got data 0.06013972309645832 1
put data 0.9194857108251694 4
got data 0.9194857108251694 2
put data 0.8370650677184625 3
got data 0.8370650677184625 3
put data 0.21314236649834428 2
got data 0.21314236649834428 4
put data 0.6169153817399435 1
got data 0.6169153817399435 5
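The consumer above stops only because its `get` times out. A common alternative is to have the producer put a special "sentinel" object on the queue to signal that no more data is coming. The following is a minimal sketch of that idea; the names `produce_items`, `consume_items` and `_SENTINEL` are made up for illustration, and doubling each item is just a stand-in for real processing.

```python
from queue import Queue
from threading import Thread

_SENTINEL = object()  # hypothetical marker meaning "no more data"

def produce_items(out_q, items):
    for item in items:
        out_q.put(item)
    out_q.put(_SENTINEL)  # tell the consumer we are done

def consume_items(in_q, results):
    while True:
        data = in_q.get()  # blocks until something is available
        if data is _SENTINEL:
            break
        results.append(data * 2)  # stand-in for real processing

q = Queue()
results = []
t1 = Thread(target=consume_items, args=(q, results))
t2 = Thread(target=produce_items, args=(q, [1, 2, 3]))
t1.start()
t2.start()
t1.join()
t2.join()
print(results)
```

Because the queue is first-in first-out and there is a single consumer, the results come out in the order the items were produced.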


### Process models: especially in databases

I first saw database systems characterized along the lines we have talked about in this course in http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf . This is an incredible review paper, if a little bit dated, and I strongly recommend you read it now: you will get a lot out of it. Between this paper, "Designing Data-Intensive Applications" from O'Reilly, and what you have learnt, you will be able to read all modern database papers.

We need concurrency because you are writing a multi-user database: it must take more than one request at a time.

You must plan how to handle these requests. Let's talk in terms of a unit of execution, something used to handle one user, or to do one transaction. Then the question arises: what do these map to?

- an operating system process (multiprocessing). Scheduled by the OS kernel, with state maintained in a private, per-process address space and execution context. Really slow switching.
- an operating system thread (multithreading). Scheduled by the OS, but does not have its own address space and context; it shares them with other threads within the same process. Fast switching.
- an inside-process thread (lightweight thread or coroutine) with asynchronous I/O: one process, so the address space and context are shared with the other "user" threads or **coroutines**. Any long-running I/O can block the process, so it must be done asynchronously and perhaps in another OS thread. Super fast switching.

We'll call the unit of execution in the case of a DBMS the DBMS worker. In general, this will map onto one client request, or one transaction.

#### Process per DBMS worker model

This really is two models: spawn a process every time a new request comes in, or use a process pool from which a process is chosen to handle the request. Once the request is done, the process goes back to the pool.
This is easy to implement, and many databases were first written when OS thread support was poor. Thus they used this model.

The complication here is sharing memory structures, especially the lock table and the buffer pool. (The buffer pool is a memory pool into which database pages are mapped for use: these pages must be flushed to disk on transactions, but they also serve as a cache for gets.) Sharing is done with POSIX shared memory or System V IPC (see http://semanchuk.com/philip/sysv_ipc/).

Also note that memory-mapping can be used for this purpose. Memory maps map part of a file to memory, and let you address the file as if you were addressing a memory buffer. They take care of bringing parts of the file into memory under the hood. This implementation is used in LMDB and BoltDB to provide fast transactions.
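As a rough illustration of addressing a file as if it were a memory buffer, here is a minimal sketch using Python's standard `mmap` module. The temporary file and its contents are made up for the example; a real database would map its own data file and manage pages inside it.

```python
import mmap
import os
import tempfile

# Create a small file to stand in for a database file (hypothetical contents).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"hello database page")

with open(path, "r+b") as f:
    # A length of 0 maps the whole file.
    with mmap.mmap(f.fileno(), 0) as mm:
        first_word = mm[0:5]   # read through the map like a bytes buffer
        mm[0:5] = b"HELLO"     # write in place; the slice size must not change
        mm.flush()             # push the change back to the file

with open(path, "rb") as f:
    contents = f.read()
os.remove(path)
print(first_word, contents)
```

The write went through the mapping rather than through `f.write`, yet re-reading the file shows the new bytes.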

(Note that operating systems also provide buffer caches in which they load existing files into memory. But these are not shareable, and many databases prefer to manage this mapping themselves.)

Because of the overhead of context switching, process-per-DBMS-worker models tend not to be very scalable. But in Python, with the GIL, this might be the preferred model if there is any substantial CPU-bound processing, like a stored procedure. Still, one could run a thread pool, with the ability to use a process as a coprocessor. See http://chimera.labs.oreilly.com/books/1230000000393/ch12.html#_dealing_with_the_gil_and_how_to_stop_worrying_about_it .

The buffer pool is stored in shared memory that is shared by all these processes. It is important to use shared memory. You might be tempted to create the database structures in the memory of the main process and, after a fork, write to them from the children. This is a bad idea: memory semantics for children are copy-on-write, so you would be writing to the new process's memory space, and thus to a replica of the database, and not to the main database.

Process per worker model is supported by IBM DB2, PostgreSQL, and Oracle.

#### Worker per thread or threadpool model

Here the main thread listens for database connections, and each connection is then allocated a new thread. This model is really simple to start off with, but it has issues with deadlocks and the like: you must be careful. These challenges are, however, also present in the multiprocess model, because shared memory is used there and must be locked too.
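The thread pool variant of this model can be sketched with the standard library's `concurrent.futures.ThreadPoolExecutor`. This is only an illustration under simplifying assumptions: the `store` dictionary and `handle_request` function are hypothetical, requests are plain keys rather than network connections, and the requests are read-only so no locking is shown.

```python
from concurrent.futures import ThreadPoolExecutor

store = {"a": 1, "b": 2}  # hypothetical in-memory "database"

def handle_request(key):
    # Each request runs on a pool thread; here it is a read-only lookup.
    return key, store.get(key)

requests = ["a", "b", "missing"]
with ThreadPoolExecutor(max_workers=4) as pool:
    # map returns results in the order the requests were submitted
    responses = list(pool.map(handle_request, requests))
print(responses)
```

A pool bounds the number of worker threads, so a burst of connections queues up instead of creating an unbounded number of threads.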

The thread-per-worker model scales well to many concurrent connections, as long as you don't have the GIL; even with the GIL it works if the computation part is small. Note, though, that if you gained overall speed by using an in-memory database and thus minimizing I/O, most of your remaining overhead would be in the GIL, unless you used C code or Cython to access the memory and do some of the computation on it.

IBM DB2 has a worker-per-thread model, as do MS SQL Server, MySQL, Informix, and Sybase.

In this case, the buffer pool is simply a heap-resident data structure, whereas in the process-based model it is allocated in shared memory so that it is available to all processes.

When a thread or process needs a page to be read from disk, it generates an I/O request (read-into) with the disk address and the memory address. Working in fixed-size pages (where the size is optimized for the cache and the size of your data) makes this process fast.

The reverse is done to write (write-from). The lock table uses a similar implementation.
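To make the page-sized read-into and write-from requests concrete, here is a minimal sketch using ordinary file I/O. The 4096-byte `PAGE_SIZE` and the helper names `read_page` and `write_page` are assumptions for illustration; a real buffer pool would also track which pages are dirty and which are pinned.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size

def read_page(f, page_no, buf):
    # "read-into": disk address is page_no * PAGE_SIZE, memory address is buf
    f.seek(page_no * PAGE_SIZE)
    return f.readinto(buf)

def write_page(f, page_no, buf):
    # "write-from": copy the in-memory page back to its slot on disk
    f.seek(page_no * PAGE_SIZE)
    return f.write(buf)

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "r+b") as f:
    write_page(f, 0, bytes([1]) * PAGE_SIZE)
    write_page(f, 1, bytes([2]) * PAGE_SIZE)
    f.flush()
    page = bytearray(PAGE_SIZE)   # a reusable in-memory page frame
    n = read_page(f, 1, page)
os.remove(path)
print(n, page[0], page[-1])
```

Because every page has the same size, the disk offset is a simple multiplication and the same preallocated buffer can be reused for any page.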

#### Light weight threads or coroutines

In the past, when OS thread support was not so good, and now again with the resurgence of interest in asynchronous programming, many widely used databases implement their own lightweight threads. To quote the Stonebraker paper:

> These lightweight threads, or DBMS threads, replace the role of the OS threads described in the previous section. Each DBMS thread is programmed to manage its own state, to perform all potentially blocking operations (e.g., I/Os) via non-blocking, asynchronous interfaces, and to frequently yield control to a scheduling routine that dispatches among these tasks.

Sybase and Informix are examples of this mode of usage.
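The idea of tasks that keep their own state and frequently yield control to a scheduling routine can be sketched with plain Python generators. This is a toy round-robin scheduler, not how any particular database does it; the names `worker` and `run_round_robin` are made up, and a real system would also park tasks that are waiting on asynchronous I/O.

```python
from collections import deque

def worker(name, steps, log):
    # Local variables (name, i) are the task's own state between yields.
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler

def run_round_robin(tasks):
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task until its next yield
            ready.append(task)  # still has work: back of the queue
        except StopIteration:
            pass                # task finished: drop it

log = []
run_round_robin([worker("t1", 2, log), worker("t2", 3, log)])
print(log)
```

Switching between tasks here is just a function call inside one OS thread, which is why this style of switching is so cheap.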

To avoid the potential for deadlock, programs that use locks should be written in a way such that each thread is only allowed to acquire one lock at a time.

Here is a locked dictionary, a proxy for a very simple database if you like:

In [20]:
import threading
class LockableDict: 
    def __init__(self):
        self._d={}
        self._dlock={}
        
    def __getitem__(self, attr):
        return self._d[attr]
    
    def __setitem__(self, attr, val):
        if attr not in self._d:
            self._dlock[attr]=threading.Lock()
        with self._dlock[attr]:
            self._d[attr] = val

In [21]:
l = LockableDict()
l['a'] = 3
l['a']

3

We'll implement a database server to front this next week.

### Race conditions

![](https://dl.dropboxusercontent.com/u/75194/counterupdate.png)

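Our code is still problematic. In the counter below, `incr` reads the value, computes a new one, and writes it back as separate steps, so two threads can interleave and lose updates: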

In [77]:
import threading
class SharedCounter: 
    def __init__(self, initial_value = 0):
        self._value = initial_value
        
    def incr(self, context, delta=1):
        temp = self._value
        temp = temp + delta
        if temp %2 ==0:
            sleep(0.0001)
        self._value = temp
        print("{} {}".format(context, self._value))

In [78]:
def run_counter(n, c, context):
    for i in range(n):
        c.incr(context)

In [84]:
from threading import Thread
c = SharedCounter()
t = Thread(target=run_counter, args=(15,c,"thread")) 
t.start()
run_counter(15, c, "main" )

thread 1
main 2
main 3
thread 2
thread 3
main 4
main 5
thread 4
thread 5
main 6
main 7
thread 6
thread 7
main 8
main 9
thread 8
thread 9
main 10
main 11
thread 10
thread 11
main 12
main 13
thread 12
thread 13
main 14
main 15
thread 14
thread 15
main 16


In [85]:
import threading
class SharedCounterWithLock: 
    def __init__(self, initial_value = 0):
        self._value = initial_value
        self._value_lock = threading.Lock()
        
    def incr(self, context, delta=1):
        with self._value_lock:
            temp = self._value
            temp = temp + delta
#             if temp %2 ==0:
#                 sleep(0.0001)
            self._value = temp
            print("{} {}".format(context, self._value))

In [86]:
cl = SharedCounterWithLock()
t = Thread(target=run_counter, args=(15,cl,"thread")) 
t.start()
run_counter(15, cl, "main" )

thread 1
thread 2
thread 3
thread 4
thread 5
thread 6
thread 7
thread 8
thread 9
thread 10
thread 11
thread 12
thread 13
thread 14
thread 15
main 16
main 17
main 18
main 19
main 20
main 21
main 22
main 23
main 24
main 25
main 26
main 27
main 28
main 29
main 30


Only single opcodes (in the Python virtual machine) are **thread-safe**, which is the notion that shared structures and code do not need to be locked. Anything that spans several opcodes, like the read-modify-write in `incr` above, needs a lock. See https://en.wikipedia.org/wiki/Thread_safety . Ways to make things thread-safe include atomicity (discussed soon) and immutable data (also discussed soon).

## Concurrent access to databases

### Transactions

(Most diagrams and all quotes are from O'Reilly's "Designing Data-Intensive Applications": a book you should read if you, like me, enjoy this stuff.)

The general rules (we'll break these down soon...)
- The batch of operations is viewed as a single atomic operation, so all of the operations either succeed together or fail together.
- The database is in a valid state before and after the transaction.
- The batch update appears to be isolated; other queries should never see a database state in which only some of the operations have been applied.
 
Databases have a mechanism for wrapping one or more operations into a **transaction**. This means that the batch of operations either all happen (**commit**) or do not happen at all (**abort**, **rollback**). This is called **atomicity**.
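Here is a small sketch of commit and rollback using the `sqlite3` module from the standard library. The `accounts` table, the `transfer` helper and the simulated failure are all invented for the example. It relies on the documented behaviour that using a connection as a context manager commits if the block succeeds and rolls back if it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 500), ("bob", 500)])
conn.commit()

def transfer(conn, src, dst, amount, fail=False):
    try:
        with conn:  # commit on success, rollback on exception
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            if fail:
                raise RuntimeError("simulated crash between the two updates")
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except RuntimeError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

failed = transfer(conn, "bob", "alice", 100, fail=True)
after_fail = balances(conn)   # the half-done transfer was rolled back
ok = transfer(conn, "bob", "alice", 100)
after_ok = balances(conn)
print(failed, after_fail, ok, after_ok)
```

The failed transfer leaves no trace of its first `UPDATE`, which is exactly the all-or-nothing behaviour described above.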

In your career you will likely at some point have to evaluate which database to use for the job. Always ask: do I need "ACID" transactions? For OLTP you probably do; for OLAP the answer may be no.
(The NoSQL movement made transactions a casualty, since these databases were distributed and focused on replication and partitioning. Be very careful when evaluating transactional claims in NoSQL databases.)

The transactional guarantees are represented by the acronym `ACID`. The A is for atomicity, which we just described.

* C is for **Consistency**: data invariants must be true. This is really a property of the application: e.g. accounting tables must be balanced. Databases can help with foreign keys, but this is a property of the app. We won't discuss this one further.
* D is for **Durability**: once a transaction has committed successfully, the data it committed won't be forgotten. This requires persistent storage, or replication, or both.
* I is for **Isolation**. This is the most interesting of the lot, and critical to the sensible running of a database. The idea is that transactions should not step on each other. Each transaction should be able to pretend that it is the only one running on the database: in other words, as if the transactions were completely serialized. In practice this would make things very slow, so we try different transactional guarantees that fall short of explicit serialization except in the situations that really need serialization.

#### Single vs multi-object

Atomicity and Isolation are usually provided by most databases for single object writes, like a single set (or for that matter a single get) on one machine. As we shall see later, these are critical to prevent "lost updates" and are a big part of the transaction story, one that is implemented even in the new-fangled databases. 

But they are not the full story. We need multi-object transactions to

- deal with foreign keys
- deal with the denormalization seen in NoSQL databases like MongoDB
- deal with secondary indexes.

Let's look at all of these scenarios using a relational setup. We'll revisit these in Friday's lab when we create an append-only binary-tree key-value database embedded in a single index (i.e. trivially clustered).

We'll also revisit these concepts a few lectures later in the general context of programming, not just in databases.

## Why is isolation important?

It is hard to program without isolation. Isolation is what guarantees stability. It is what makes sure that there are no dirty reads and dirty writes.

From "Designing data intensive applications", here are the definitions:

Dirty reads

>One client reads another client’s writes before they have been committed. The **read committed isolation level** and stronger levels prevent dirty reads.


Dirty writes

>One client overwrites data that another client has written, but not yet committed. Almost all transaction implementations prevent dirty writes.

Clearly, the notions of isolation are really the notions of concurrency: these issues will also occur when two programs access any data, in memory or in a database. In both cases locks and other ideas must be used to make sure that there is only one mutator at a time, and that an object is not exposed in an inconsistent state.

### Read Committed

No dirty reads and no dirty writes.

Dirty reads mean that you can peek into the goings-on inside a transaction and get at intermediate values. This could mean you see a value that is later rolled back.

With dirty writes you could be overwriting an uncommitted value. When we write, we will use a lock to ensure no one else can write.

No dirty reads:

![](https://dl.dropboxusercontent.com/u/75194/nodirtyreads.png)


Dirty Writes in multiple objects: 

![](https://dl.dropboxusercontent.com/u/75194/dirtywrite.png)



For dirty writes:
In a regular Relational Database Management System (RDBMS) we use row-level locks. Another writer must wait until the lock is given up, and a reader will read the value from before the lock was set for as long as the write transaction is in progress.

For dirty reads:
We could use the same lock, and require the reader to briefly lock the object. But a write transaction may be a long one, and there might be the need to do multiple reads in the meanwhile.

Thus, in Read Committed mode, in essence we need to remember both the pre-transaction value (i.e. the old committed value) and the new value being set by the transaction which holds the current write lock.

**While this transaction runs, any other thread/transaction sees the old value**. Once this one commits, the values read by the other transactions are switched to the new one.

Read Committed is almost always implemented: it's the default in Oracle, Postgres, SQL Server, etc.
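The "remember both values" idea can be sketched as a single cell that keeps a committed value and, at most, one pending value. This is a toy under stated assumptions: `ReadCommittedCell` is a made-up class, transactions are identified by plain integers, a conflicting writer gets an error instead of waiting for the lock, and the writer is allowed to see its own pending value.

```python
class ReadCommittedCell:
    """Hypothetical single-value store that hides uncommitted writes."""
    _NO_WRITE = object()

    def __init__(self, value):
        self._committed = value
        self._pending = self._NO_WRITE
        self._writer = None  # id of the transaction holding the write lock

    def write(self, txid, value):
        if self._writer not in (None, txid):
            raise RuntimeError("another transaction holds the write lock")
        self._writer = txid
        self._pending = value

    def read(self, txid):
        # The writer sees its own pending value; everyone else sees the
        # last committed one.
        if txid == self._writer:
            return self._pending
        return self._committed

    def commit(self, txid):
        if txid == self._writer:
            self._committed = self._pending
            self._writer, self._pending = None, self._NO_WRITE

    def abort(self, txid):
        if txid == self._writer:
            self._writer, self._pending = None, self._NO_WRITE

cell = ReadCommittedCell(500)
cell.write(1, 600)
seen_by_other = cell.read(2)    # old committed value, no dirty read
seen_by_writer = cell.read(1)   # the writer's own pending value
cell.commit(1)
seen_after_commit = cell.read(2)
print(seen_by_other, seen_by_writer, seen_after_commit)
```

Aborting instead of committing would simply discard the pending value, so readers would never have observed it.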

### The Read Skew Problem and Snapshot Isolation

>Non-repeatable reads:
>A client sees different parts of the database at different points in time. This is most commonly prevented with snapshot isolation, which allows a transaction to read from a consistent snapshot at one point in time. It is usually implemented with multi-version concurrency control (MVCC).

![](https://dl.dropboxusercontent.com/u/75194/readskew.png)

If you look at the diagram above, it makes perfect sense if Alice's two reads are separate read transactions. But what if they are one transaction and we are in read-committed mode? Since the two reads were far apart in time and fell on either side of a perfectly valid transfer transaction, the result might give Alice a heart attack. This is a non-repeatable read: if Alice were to re-issue the read transaction, account 1 would now show 600. This problem is perfectly consistent with operations in read-committed mode: the second read gets the switched value because the write transaction has completed.

The fix for this would seem to be clear: if both the reads happen in one transaction, one must make sure that, because this transaction started before the write, it sees the older version of the accounts (500, 500).

These thoughts form the basis of snapshot isolation, which is usually implemented with multi-version concurrency control, or MVCC. This is needed for analytic queries, integrity checks, backups, etc.

The key principle is (from DDIA):
> readers never block writers, and writers never block readers

Here then are the rules:

> 1. At the start of each transaction, the database makes a list of all the other transactions which are in progress (not yet committed or aborted) at that time. Any writes made by one of those transactions are ignored, even if the transaction subsequently commits.
> 2. Any writes made by aborted transactions are ignored.
> 3. Any writes made by transactions with a later transaction ID (i.e. which started after the current transaction started) are ignored, regardless of whether that transaction has committed.
> 4. All other writes are visible to the application’s queries.
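The four visibility rules above can be sketched as a small function. This is a simplification for illustration: `is_visible` and `read_snapshot` are made-up names, each object is just a list of `(writer_txid, value)` versions ordered oldest first, and deletions are not modelled.

```python
def is_visible(writer_txid, reader_txid, in_progress_at_start, aborted):
    if writer_txid in in_progress_at_start:  # rule 1
        return False
    if writer_txid in aborted:               # rule 2
        return False
    if writer_txid > reader_txid:            # rule 3
        return False
    return True                              # rule 4

def read_snapshot(versions, reader_txid, in_progress_at_start, aborted):
    """Return the newest version of an object visible to the reader."""
    for writer_txid, value in reversed(versions):
        if is_visible(writer_txid, reader_txid, in_progress_at_start, aborted):
            return value
    return None

# Transaction 13 moves 100 from account 2 to account 1.
account1 = [(5, 500), (13, 600)]
account2 = [(5, 500), (13, 400)]

# Reader 12 started before 13, so it keeps seeing the old balances.
seen_by_12 = (read_snapshot(account1, 12, set(), set()),
              read_snapshot(account2, 12, set(), set()))
# Reader 14 started after 13 had committed.
seen_by_14 = read_snapshot(account1, 14, set(), set())
# Reader 15 started while 13 was still in progress.
seen_by_15 = read_snapshot(account1, 15, {13}, set())
print(seen_by_12, seen_by_14, seen_by_15)
```

Reader 12 sees a total of 1000 no matter when its two reads actually run, which is what removes the read skew.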

Here's what snapshot isolation using MVCC looks like:

![](https://dl.dropboxusercontent.com/u/75194/mvcc.png)

Snapshot isolation is supported by PostgreSQL, MySQL with the InnoDB storage engine, Oracle, SQL Server, and more, but it may not be the default, as it is less performant than read-committed isolation.

Indexes in a MVCC database are hard to get right...and affect performance as they need updating. We'll see some ways around this in the lab.

For bank accounts you need it, but for github you might not...

What about writes?

There is another problem we encounter here.

### The lost update problem

Consider an upsert, where we read a value, do something with it (like the balance change above), and then write it back. Consider two such upserting transactions, T1 and T2. Because we lock only writes, T1 might read a value V0, and T2, starting while T1 is still running, can also only read V0. Since T2 does not (yet) have the lock, its `get` returns the old value. This is good, as T1 is still running.

But here is the problem: T1 will create a new value, say V0+1, and then relinquish the lock. Now T2 gets the lock, operates on the V0 it read earlier, and writes V0+2.

Say V0 was 5 to start with. The intended sequence was 5 -> 6 -> 8, but that is not what happens: our final value will be 7, and T1's update is lost.

![](https://dl.dropboxusercontent.com/u/75194/lostupdate.png)

The fix to this is atomic updates/upserts. The `UPDATE` command in SQL does just that, and if we build a transactional facility into our database to do multiple gets, there is no reason not to extend it to an upsert. The entire upsert would grab the lock, and once that happens T1 and T2 are effectively serialized. Atomic updates are thus usually built by taking the lock when the object is read, or by forcing all atomic operations to run on one given thread.
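Here is a minimal sketch of the difference, using a made-up `TinyStore` class. Its `get` and `set` each take the lock separately, while `update` holds the lock across both the read and the write. The first part replays the lost update from the text step by step; the second part runs the two updates atomically from two threads.

```python
import threading

class TinyStore:
    """Hypothetical key-value store with one lock for all keys."""
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data[key]

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def update(self, key, fn):
        # The lock is held across the read AND the write.
        with self._lock:
            self._data[key] = fn(self._data[key])
            return self._data[key]

# Lost update: both "transactions" read 5 before either writes.
lossy = TinyStore()
lossy.set("v", 5)
r1 = lossy.get("v")
r2 = lossy.get("v")
lossy.set("v", r1 + 1)   # T1 writes 6
lossy.set("v", r2 + 2)   # T2 overwrites it with 7
lost = lossy.get("v")

# Atomic update: the two read-modify-write cycles cannot interleave.
store = TinyStore()
store.set("v", 5)
threads = [threading.Thread(target=store.update, args=("v", lambda x, d=d: x + d))
           for d in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final = store.get("v")
print(lost, final)
```

Whichever thread gets the lock first, the atomic version ends at 8, because each update sees the result of the previous one.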

So it seems we can add a transaction manager, have an explicit notion of read, write, and upsert transactions, and have a reasonable snapshot-isolated database.

But even this does not solve some problems...

### Write skew

![](https://dl.dropboxusercontent.com/u/75194/writeskew.png)

Alice and Bob are the last two doctors on call for a shift. They both want to leave, so they both go to their web portal and click on the "check-out" button. The app is using snapshot isolation. Since they both click at roughly the same time, before either check-out transaction has completed, they both "get" a count of two doctors on call and both transactions are allowed to proceed, leaving no doctors on call. Oops.

This anomaly is called **write-skew** and happens as the transactions here are updating different rows.

Another example is conflicting meetings being booked in a room at the same time.

The pattern here is that both transactions run a SELECT query whose result decides whether a write goes ahead, and the write made by one transaction would have changed the result of that query in the other. But these two transactions were touching two different rows or parts of the database and were thus allowed under MVCC. You can think of write skew as a generalization of lost update.

> This effect, where a write in one transaction changes the result of a search query in another transaction, is called a phantom. Snapshot isolation avoids phantoms in read-only queries, but in read-write transactions like the examples we discussed, phantoms can lead to particularly tricky cases of write skew.
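The doctors example can be replayed in a few lines. This is a toy model, not a database: the `on_call` dictionary stands in for the table, a snapshot is just a copy of it, and the `check_out` helper with a single global lock is a made-up stand-in for running the check and the write as one serialized step.

```python
import threading

def may_leave(snapshot):
    # The application rule: you may leave only if at least two doctors
    # are on call in the state you can see.
    return sum(snapshot.values()) >= 2

# Under snapshot isolation both transactions read before either writes.
on_call = {"alice": True, "bob": True}
snap_alice = dict(on_call)
snap_bob = dict(on_call)
if may_leave(snap_alice):
    on_call["alice"] = False   # touches only Alice's row
if may_leave(snap_bob):
    on_call["bob"] = False     # touches only Bob's row
skewed_count = sum(on_call.values())

# Serialized: the check and the write happen as one step.
serial_lock = threading.Lock()
on_call_serial = {"alice": True, "bob": True}

def check_out(db, doctor):
    with serial_lock:
        if may_leave(db):
            db[doctor] = False
            return True
        return False

outcomes = [check_out(on_call_serial, "alice"),
            check_out(on_call_serial, "bob")]
serial_count = sum(on_call_serial.values())
print(skewed_count, outcomes, serial_count)
```

In the first run nobody is left on call even though each transaction's own check passed; in the second, Bob's request is refused because it sees Alice's write.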

One could solve these problems in multiple ways. The most general solution is to use serialized isolation.

### Serialized Isolation

- One could actually run things in serial on one thread. With in-memory databases this has become faster. VoltDB/H-Store does this with replication and stored procedures (in Java and Groovy, a JVM language).

- Stored procedures run inside the database address space, the idea being to eliminate the overhead of serialization and de-serialization, network transfer, etc. (but a badly written stored procedure could be worse).

- If you want a multithreaded database, the old standby is two-phase locking (2PL). It is old but not gold: its slowness makes people dislike it.

- The ingredients of 2PL are shared read locks and exclusive write locks.

The process then is:

> - If a transaction wants to read an object, it must first acquire the lock in shared mode. Several transactions are allowed to hold the lock in shared mode simultaneously, but if another transaction already has an exclusive lock on the object, the transaction must wait.
> - If a transaction wants to write to an object, it must first acquire the lock in exclusive mode. No other transaction may hold the lock at the same time (neither in shared nor in exclusive mode), so if there is any existing lock on the object, the transaction must wait.
> - If a transaction first reads and then writes an object, it may upgrade its shared lock to an exclusive lock. The upgrade works the same as getting an exclusive lock directly.
> - After a transaction has acquired the lock, it must continue to hold the lock until the end of the transaction (commit or abort). This is where the name “two-phase” comes from: the first phase (while the transaction is executing) is when the locks are acquired, and the second phase (at the end of the transaction) is when all the locks are released.
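The locking rules above can be sketched as a small lock table. To keep the example short and deterministic, this hypothetical `LockTable` does not block: each acquire returns `True` if the lock is granted and `False` where a real implementation would make the transaction wait. Deadlock detection is not shown.

```python
from collections import defaultdict

class LockTable:
    """Toy 2PL lock table: shared and exclusive locks per object."""
    def __init__(self):
        self._shared = defaultdict(set)  # object -> txids holding it shared
        self._exclusive = {}             # object -> txid holding it exclusive

    def acquire_shared(self, txid, obj):
        owner = self._exclusive.get(obj)
        if owner is not None and owner != txid:
            return False  # someone else writes: would have to wait
        self._shared[obj].add(txid)
        return True

    def acquire_exclusive(self, txid, obj):
        owner = self._exclusive.get(obj)
        if owner is not None and owner != txid:
            return False  # someone else writes: would have to wait
        if self._shared[obj] - {txid}:
            return False  # other readers present: would have to wait
        self._exclusive[obj] = txid  # also covers upgrading our shared lock
        return True

    def release_all(self, txid):
        # Second phase: at commit or abort, drop every lock at once.
        for holders in self._shared.values():
            holders.discard(txid)
        for obj in [o for o, t in self._exclusive.items() if t == txid]:
            del self._exclusive[obj]

locks = LockTable()
steps = [
    locks.acquire_shared(1, "x"),      # T1 reads x
    locks.acquire_shared(2, "x"),      # T2 reads x too
    locks.acquire_exclusive(1, "x"),   # T1 cannot upgrade while T2 reads
]
locks.release_all(2)                   # T2 commits
steps.append(locks.acquire_exclusive(1, "x"))  # now the upgrade succeeds
steps.append(locks.acquire_shared(2, "x"))     # readers are now blocked
locks.release_all(1)                   # T1 commits
steps.append(locks.acquire_shared(2, "x"))
print(steps)
```

Note that locks are only ever released through `release_all`, which is what gives the scheme its two phases.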

### Functional programming: IMMUTABLE DATA

All of these are illustrations of simultaneous access to **mutable** data by concurrent threads. In a world of multi-core computers where lots of events must be handled, we should be thinking of ways to not have to deal with so much state.

Functional programming has this notion of **referential transparency**, where functions don't modify state. They are like maths functions: we can always substitute the answer in place of the function call, because no hidden "instance" variable encapsulating state is allowed to change a function's behaviour.

The key idea to take from this is immutability of bindings and data structures. We can use this idea in writing databases to simplify things and make implementing isolation levels easier by never overwriting data.

## Immutable BST

We'll study the binary search tree as a proxy for the B-tree. It's not the best choice, as rebalancing and moving keys around creates havoc with seeking on disk, rather than being localized to some pages, but it's much more understandable.

![](https://dl.dropboxusercontent.com/u/75194/immutablebst.png)

(From "Purely Functional Data Structures" by Okasaki: great book BTW, with examples of balanced trees as well.)
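Here is a minimal sketch of an immutable (persistent) binary search tree in Python. It is unbalanced and in-memory only, and the `Node`, `insert` and `lookup` names are chosen for this example. An insert never modifies an existing node: it copies the nodes on the path from the root to the insertion point and shares every other subtree with the old version.

```python
from collections import namedtuple

# namedtuples are immutable, so a node can never be changed in place.
Node = namedtuple("Node", ["key", "value", "left", "right"])

def insert(node, key, value):
    """Return the root of a NEW tree; the old tree is left untouched."""
    if node is None:
        return Node(key, value, None, None)
    if key < node.key:
        return node._replace(left=insert(node.left, key, value))
    if key > node.key:
        return node._replace(right=insert(node.right, key, value))
    return node._replace(value=value)  # same key: new node with new value

def lookup(node, key):
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None

root_v1 = None
for k in (5, 2, 8):
    root_v1 = insert(root_v1, k, str(k))

root_v2 = insert(root_v1, 9, "9")  # copies only the path 5 -> 8

print(lookup(root_v1, 9), lookup(root_v2, 9))
print(root_v2.left is root_v1.left)  # untouched subtree is shared
```

A reader holding `root_v1` keeps a consistent view of the old tree for as long as it likes, which is the property that makes snapshot-style reads cheap.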

- what is Append-Only

![](https://dl.dropboxusercontent.com/u/75194/append1.png)

- What is copy on write

![](https://dl.dropboxusercontent.com/u/75194/cow1.png)
![](https://dl.dropboxusercontent.com/u/75194/cow2.png)
![](https://dl.dropboxusercontent.com/u/75194/cow3.png)

See http://symas.com/mdb/20141120-BuildStuff-Lightning.pdf