
TxGuide

Brad Bebee edited this page Feb 13, 2020 · 1 revision

Background

Bigdata uses Multi-Version Concurrency Control (MVCC) for transactions. MVCC belongs to the family of optimistic concurrency control algorithms. Bigdata does not obtain a lock when you start a transaction. Instead, it validates the transaction when you commit. The advantage of MVCC is that readers and writers never block one another, and writers succeed unless there is a conflict. This can yield higher concurrency than Two Phase Locking (2PL).

Timestamps are central to transaction processing in bigdata. There is a unique timestamp for each commit point and each transaction. When a transaction commits, it first validates each tuple in its write set and then annotates each tuple with the revision time for that transaction. A transaction will abort if there is a write-write conflict. This occurs when a concurrent transaction (one running at the same time) modified the same tuple and committed its changes first. A write-write conflict is detected when the revision timestamp on a tuple is more recent than the start time of the transaction that is trying to commit.
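The validation rule described above can be illustrated with a small, self-contained model. This is only a sketch under simplifying assumptions, not Blazegraph code: the class MvccSketch, its Tx type, and the single in-memory counter used as a clock are all hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of timestamp-based write-write conflict detection; not Blazegraph code. */
final class MvccSketch {

    // Committed state: the value of each tuple and the revision time of its last writer.
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> revisionTimes = new HashMap<>();

    // Hypothetical clock that hands out unique, increasing timestamps.
    private long clock = 0L;

    /** A transaction remembers its start time and buffers its writes privately. */
    final class Tx {
        final long startTime = ++clock;
        final Map<String, String> writeSet = new HashMap<>();

        void write(String key, String value) {
            writeSet.put(key, value);
        }
    }

    /** No lock is taken here; the transaction only receives a start time. */
    Tx begin() {
        return new Tx();
    }

    /** Validates the write set, then applies it. Returns false if the transaction must abort. */
    boolean commit(Tx tx) {
        for (String key : tx.writeSet.keySet()) {
            Long revised = revisionTimes.get(key);
            if (revised != null && revised > tx.startTime) {
                return false; // someone committed this tuple after we started
            }
        }
        long revisionTime = ++clock;
        for (Map.Entry<String, String> e : tx.writeSet.entrySet()) {
            values.put(e.getKey(), e.getValue());
            revisionTimes.put(e.getKey(), revisionTime);
        }
        return true;
    }

    String read(String key) {
        return values.get(key);
    }

    public static void main(String[] args) {
        MvccSketch db = new MvccSketch();
        Tx first = db.begin();
        Tx second = db.begin();
        first.write("Hello", "World!");
        second.write("Hello", "Mars!");
        System.out.println(db.commit(first));  // true
        System.out.println(db.commit(second)); // false: write-write conflict
    }
}
```

In this model a third transaction started after the first commit would succeed, because the tuple's revision time would then be older than its start time.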

Unisolated Indices

By default, Blazegraph registers indices that do NOT support transactions. Write operations on such indices are always "unisolated". Unisolated write operations provide a higher throughput since writes are not double-buffered, but writes on a given index will be serialized.

Against a Journal, unisolated writes can provide full ACID semantics with high performance.

In scale-out, unisolated writes provide shard-wise ACID semantics.

Note that read-only transactions with snapshot isolation are always supported, even when the indices are not configured to support full read/write transactions.

Registering an index that supports transactions

You MUST explicitly enable transaction support when you register an index. Transaction processing requires that the index maintains both per-tuple delete markers and per-tuple version identifiers. While scale-out indices always maintain per-tuple delete markers, neither local nor scale-out indices maintain the per-tuple version identifiers by default.

final IndexMetadata indexMetadata = new IndexMetadata("testIndex", UUID.randomUUID());

// this index will support transactions.
indexMetadata.setIsolatable(true);

// register the index.
store.registerIndex(indexMetadata);

Kinds of transactions

There are two kinds of transactions:

  • read-only transactions
  • read-write transactions

Read-only transactions are always supported. They provide extremely fast, highly concurrent snapshot isolation. You request a read-only transaction by telling the transaction service the commit point from which you want to read. The returned transaction identifier provides snapshot isolation with a fully consistent view of the state of the database as of that commit point.

Read-write transactions fully buffer writes on "isolated" indices, then validate those writes during the commit protocol, and will fail a transaction if the write set cannot be validated (due to intervening commits). Read-write transaction support must be configured when you create an index.

In addition to transactions, you can have unisolated operations. Unisolated operations are key to extremely high concurrency since they do not require any global coordination. Both the RDF database and the "row store" make extensive use of unisolated operations.

Local transaction support

Creating and using transactions with the Journal is straightforward.

Journal store = ...

// start a read-write transaction (passing ITx.UNISOLATED to newTx() requests a read-write tx).
final long txid = store.newTx(ITx.UNISOLATED);

// Obtain a view of a named index isolated by that transaction.
final IIndex isolatedBTree = store.getIndex("testIndex", txid);

// Write on the index.
isolatedBTree.insert("Hello", "World!");

// Commit the transaction.
store.commit(txid);

BigdataSail Update Transactions

The BigdataSail wraps the Journal. When wrapping the Journal, the index updates are fully ACID. The following pattern shows how to obtain a connection that supports mutation, do work on that connection, and then commit it. If anything goes wrong, the pattern rolls back the work performed on the connection. A similar pattern may be used with the BigdataSailRepository. That class is just a wrapper over the BigdataSail, and the connection objects that it returns are just wrappers over the BigdataSailConnection objects.

BigdataSailConnection conn = null;
boolean ok = false;
try {
   conn = sail.getConnection();
   doWork(conn);
   conn.commit();
   ok = true;
} finally {
   if (conn != null) {
      if (!ok) {
         conn.rollback();
      }
      conn.close();
   }
}

Transaction Logger

Recycling behavior depends critically on the close of open transactions. The MVCC architecture of Blazegraph means that data for the historical commit points cannot be recycled until there are no active transactions reading on those commit points. If you are holding open a transaction (either a read-only or a read-write transaction) while writing on the database, the database cannot recycle storage and will start to grow in size on the disk once it fills up the available allocations. See the page on RetentionHistory for more about this issue, including the specifics of the RWStore recycler behavior.

If you suspect a storage leak, you should turn on the following logger in the log4j configuration file:

 com.bigdata.txLog=INFO
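If the deployment uses a log4j 1.x properties file (an assumption; adjust for other configuration styles), logger levels in that format are set with the log4j.logger. prefix, so the entry would look roughly like this:

```properties
# Assumes a log4j 1.x style log4j.properties file.
log4j.logger.com.bigdata.txLog=INFO
```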

This will cause the following events to be logged:

| Event | Fields | Description |
|---|---|---|
| OPEN-JOURNAL | The UUID, file, and BufferMode of the Journal. | A Journal was opened. |
| CLOSE-JOURNAL | The UUID and file of the Journal. | A Journal was closed. |
| COMMIT | commitTime | The unisolated write set was committed. |
| OPEN | txId, readsOnCommitTime | A read-only or read-write transaction was opened. |
| CLOSE | txId, readsOnCommitTime | A read-only or read-write transaction was closed. |
| RECYCLER | lastCommitTime, latestReleasableTime, lastDeferredReleaseTime, activeTxCount | This is an information message generated when the recycler runs. The recycler cannot recycle allocations unless activeTxCount is ZERO (0). If the counter never becomes ZERO (0), then the RWStore will "leak storage". This is generally an application bug. |
| RECYCLED | fromTime, toTime, totalFreed, commitPointsRecycled, commitPointsRemoved | Deferred frees of allocations were released (recycled). Check totalFreed and commitPointsRemoved to see if anything was actually recycled. |
| ABORT | N/A | The unisolated write set of the Journal was discarded. |
| ROLLBACK | N/A | The state of the Journal was restored to the previous root block. |
| SAIL-CREATE-NAMESPACE | namespace | A new namespace was created (since 2.2.0). |
| SAIL-DESTROY-NAMESPACE | namespace | A namespace was destroyed (since 2.2.0). |
| SAIL-START-CONN | conn | A new BigdataSailConnection was created. |
| SAIL-NEW-TX | txId, conn | A new read/write transaction identifier was assigned to a BigdataSailConnection. This occurs when a read/write tx is created and each time you call rollback() or commit() on a read/write tx. |
| SAIL-COMMIT-CONN | commitTime, conn | commit() was invoked on a BigdataSailConnection. |
| SAIL-ROLLBACK-CONN | conn | rollback() was invoked on a BigdataSailConnection. |
| SAIL-CLOSE-CONN | conn | close() was invoked on a BigdataSailConnection. |
| REST-API-TASK-OPEN | task | A REST API task was created in response to an HTTP request (since 2.2). |
| REST-API-TASK-SUCCESS | task | A REST API task completed normally (since 2.2). |
| REST-API-TASK-ERROR | task, cause | A REST API task failed (since 2.2). |
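When hunting for a storage leak, the useful question is usually which transactions were opened and never closed. The following is a minimal sketch of that bookkeeping. It assumes the OPEN and CLOSE events have already been extracted from the log into (type, txId) pairs. The Event class and the TxLeakCheck helper are hypothetical, and the actual layout of the log lines is deliberately not modelled here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Reports transactions that were opened but never closed; a hypothetical helper. */
final class TxLeakCheck {

    /** An already-parsed log event. Only the event type and txId are used. */
    static final class Event {
        final String type;
        final long txId;

        Event(String type, long txId) {
            this.type = type;
            this.txId = txId;
        }
    }

    /** Returns the txIds with an OPEN event and no later CLOSE event, in open order. */
    static List<Long> unclosed(List<Event> events) {
        Set<Long> open = new LinkedHashSet<>();
        for (Event e : events) {
            if ("OPEN".equals(e.type)) {
                open.add(e.txId);
            } else if ("CLOSE".equals(e.type)) {
                open.remove(e.txId);
            }
            // Other event types (COMMIT, RECYCLER, ...) are ignored in this sketch.
        }
        return new ArrayList<>(open);
    }
}
```

Any txId reported by unclosed(...) is a candidate for the transaction that keeps activeTxCount above zero and so prevents the recycler from releasing storage.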