In [None]:
import os
os.environ['JDBC_HOST'] = 'jrtest01-splice-hregion'

# Transactions in Splice Machine

In the *For Developers, Part II* course we showed you how transactions are processed and handled in Splice Machine, using Spark and Scala. In this notebook we'll take a deeper dive, and will explain the concept of transactions and how Splice Machine handles them.

First, let's define a _transaction_, which is a series of events that appear single-threaded to the user. A transaction consists of the events between a begin timestamp and a commit timestamp. A transaction can be in one of four states:

<table class="splicezep">
    <thead>
        <tr>
            <th>State</th>
            <th>Description</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Active</td>
            <td>The transaction always begins in this state.</td>
        </tr>
        <tr>
            <td>Rollback</td>
            <td>The transaction has been rolled back, and its changes have been discarded.</td>
        </tr>
        <tr>
            <td>Committed</td>
            <td>The transaction has been committed to the database.</td>
        </tr>
        <tr>
            <td>Error</td>
            <td>This is logically equivalent to the rolled back state, but an uncontrollable error has occurred and should be investigated.</td>
        </tr>
    </tbody>
</table>
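The four states can be sketched as a small state machine. This is only an illustration, not Splice Machine code: the `TxnState` and `transition` names are hypothetical, and the rule that the three non-active states are final is an assumption made for this sketch.

```python
from enum import Enum


class TxnState(Enum):
    ACTIVE = "Active"
    ROLLBACK = "Rollback"
    COMMITTED = "Committed"
    ERROR = "Error"


# Assumption for this sketch: a transaction starts Active and may move once
# to any of the other three states, which are then final.
ALLOWED = {
    TxnState.ACTIVE: {TxnState.ROLLBACK, TxnState.COMMITTED, TxnState.ERROR},
    TxnState.ROLLBACK: set(),
    TxnState.COMMITTED: set(),
    TxnState.ERROR: set(),
}


def transition(current, target):
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


state = TxnState.ACTIVE                         # every transaction begins here
state = transition(state, TxnState.COMMITTED)   # commit the transaction
```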

In Splice Machine, transactions are durably stored in the `SPLICE_TXN` table. This is a normal HBase table, __not__ a Splice Machine table. `SPLICE_TXN` is not governed by transactional semantics; instead, it relies on HBase's atomic row operations (`increment`, `compareAndSet`, and `put`). The rowkey is an 8-byte transaction ID with its bits reversed, so that consecutive transaction IDs do not land in sequential order.
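To see why reversing the bits breaks up sequential ordering, here is a minimal sketch. It assumes the transaction ID is an unsigned 64-bit integer stored big-endian; the function names are hypothetical, and the exact byte layout Splice Machine uses may differ.

```python
import struct


def reverse_bits64(value):
    """Reverse the bit order of an unsigned 64-bit integer."""
    return int(f"{value:064b}"[::-1], 2)


def txn_rowkey(txn_id):
    """Pack the bit-reversed ID as 8 big-endian bytes (an assumed layout)."""
    return struct.pack(">Q", reverse_bits64(txn_id))


# Consecutive IDs produce keys that are far apart in byte order.
for txn_id in (1, 2, 3):
    print(txn_id, txn_rowkey(txn_id).hex())
```

Because the lowest-order bit becomes the highest-order bit, IDs 1, 2, and 3 map to keys that start with `0x80`, `0x40`, and `0xc0`, which spreads consecutive writes across the key space instead of concentrating them at its end.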

## Snapshot Isolation

Splice Machine uses snapshot isolation, a form of multi-version concurrency control (MVCC). Writers do not block readers, which lets Splice Machine sustain high concurrency. Each transaction is defined by a begin timestamp and a commit timestamp. Overlapping transactions that write to the same row conflict. A read sees only data that was committed before the reading transaction's begin timestamp; changes committed later are invisible to it.
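The two timestamp rules can be written as small predicates. This is a sketch under stated assumptions, not Splice Machine's implementation: the function names are hypothetical, timestamps are assumed to be unique integers, and a commit timestamp of `None` stands for a transaction that has not committed.

```python
def is_visible(writer_commit_ts, reader_begin_ts):
    """A write is visible only if it committed before the reader began."""
    return writer_commit_ts is not None and writer_commit_ts < reader_begin_ts


def overlaps(begin_a, commit_a, begin_b, commit_b):
    """True if the lifetimes of two transactions overlap (None = still active)."""
    end_a = float("inf") if commit_a is None else commit_a
    end_b = float("inf") if commit_b is None else commit_b
    return begin_a < end_b and begin_b < end_a


def writes_conflict(row_a, row_b, begin_a, commit_a, begin_b, commit_b):
    """Two writes conflict when they hit the same row in overlapping transactions."""
    return row_a == row_b and overlaps(begin_a, commit_a, begin_b, commit_b)
```

For example, a write committed at timestamp 6 is visible to a transaction that began at 7, but not to one that began at 4.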

Here is a diagram that depicts how transactions are handled using snapshot isolation:

<img src="https://splice-training.s3.amazonaws.com/external/images/SIExample.png" width="640">
<br/>


<p style="padding-top:200px;">This diagram depicts 3 transactions:</p>

* Transaction `T1` starts at timeline `t2`
* 10 is added to Item A at timeline `t3` in the `T1` transaction
* Transaction `T2` starts at timeline `t4`
* 10 is added to Item A at timeline `t5` in the `T1` transaction
* Transaction `T1` is committed at timeline `t6`
* Transaction `T3` starts at timeline `t7`
* 10 is added to Item B at timeline `t9` in the `T2` transaction
* Item C is set to Item A + 10 at timeline `t10` in the `T2` transaction
* Transaction `T2` is committed at timeline `t11`
* Transaction `T3` attempts to add 10 to Item B at timeline `t13` but receives a write-write conflict because Item B was updated by transaction `T2` after transaction `T3` started
* Transaction `T3` is rolled back 
* A new transaction `T3'` is started at timeline `t16` 
* 10 is added to Item B at timeline `t17` in the `T3'` transaction
* Transaction `T3'` is committed at timeline `t18`
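The timeline above can be replayed with a toy in-memory model of snapshot isolation. This is a sketch for illustration only: the `Txn`, `SnapshotStore`, and `WriteConflict` names are hypothetical, the starting value of 0 for every item is an assumption (the list does not state initial values), and conflicts are checked at write time, which matches how `T3` is rejected in the list.

```python
class WriteConflict(Exception):
    """Raised when an overlapping transaction has already written the row."""


class Txn:
    def __init__(self, name, begin_ts):
        self.name = name
        self.begin_ts = begin_ts
        self.commit_ts = None       # None until the transaction commits
        self.rolled_back = False


class SnapshotStore:
    def __init__(self, initial):
        # row -> list of (writer, value); writer None marks pre-existing data
        self.versions = {row: [(None, value)] for row, value in initial.items()}

    def read(self, txn, row):
        # Newest version that is the reader's own or committed before it began.
        for writer, value in reversed(self.versions[row]):
            if writer is None or writer is txn:
                return value
            if writer.commit_ts is not None and writer.commit_ts < txn.begin_ts:
                return value
        raise KeyError(row)

    def write(self, txn, row, value):
        for writer, _ in self.versions[row]:
            if writer is None or writer is txn or writer.rolled_back:
                continue
            # Writer is still active, or committed after this transaction began.
            if writer.commit_ts is None or writer.commit_ts > txn.begin_ts:
                raise WriteConflict(f"{txn.name} conflicts with {writer.name} on {row}")
        self.versions[row].append((txn, value))


store = SnapshotStore({"A": 0, "B": 0, "C": 0})      # assumed initial values
t1 = Txn("T1", 2)
store.write(t1, "A", store.read(t1, "A") + 10)       # t3
t2 = Txn("T2", 4)
store.write(t1, "A", store.read(t1, "A") + 10)       # t5
t1.commit_ts = 6
t3 = Txn("T3", 7)
store.write(t2, "B", store.read(t2, "B") + 10)       # t9
store.write(t2, "C", store.read(t2, "A") + 10)       # t10: T2 began before T1 committed
t2.commit_ts = 11
try:
    store.write(t3, "B", store.read(t3, "B") + 10)   # t13
    conflict = None
except WriteConflict as exc:
    conflict = str(exc)
    t3.rolled_back = True
t3_retry = Txn("T3'", 16)
store.write(t3_retry, "B", store.read(t3_retry, "B") + 10)   # t17
t3_retry.commit_ts = 18

reader = Txn("reader", 19)
result = {row: store.read(reader, row) for row in "ABC"}
print(conflict)
print(result)
```

Under these assumptions, `T2` computes Item C from its own snapshot of Item A, which predates the commit of `T1`, so Item C ends at 10 even though Item A ends at 20. The retried transaction `T3'` succeeds because it begins after every earlier write to Item B has committed.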


## Where to Go Next
The next notebook in this class, [*Query Optimization*](./f.%20Query%20Optimization.ipynb), shows you advanced optimization techniques for boosting query performance.
