A variable length record, checksumming, append only rotating log implementation with graceful recovery
Switch branches/tags
Nothing to show
Pull request Compare This branch is 10 commits ahead of chirino:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.



HawtJournal is a journal storage implementation based on append-only rotating logs and checksummed variable-length records, with fixed concurrent reads, full concurrent writes, dynamic batching and "dead" logs cleanup.


HawtJournal APIs are very simple and intuitive.

First, create and configure the journal:

Journal journal = new Journal();

Here, we're basically disabling cleaned-up logs archiving, enabling checksum, setting up a max log file length of 1 megabyte and a max batch size of 10 kilobytes.

Then, open the journal:


And write some records:

for (int i = 0; i < writes; i++) {
    boolean sync = i % 2 == 0 ? true : false;
    journal.write(new String("DATA" + i).getBytes("UTF-8"), sync);

You can dynamically write either in async or sync mode: in async mode, writes are batched until either the max batch size is reached, the journal is manually synced or closed, or a sync write is executed.

Finally, replay the log by going through the Journal as an iterable object returning record locations:

for (Location location : journal) {
    byte[] record = journal.read(location);
    // do something

Eventually delete some record:


Optionally do a manual sync:


Compact logs:


And close it:


That's all!

About data durability

HawtJournal provides three levels of data durability: batch, sync and physical sync.

Batch durability provides the lowest durability guarantees but the greatest performance: data is collected in memory batches and then written on disk at "sync points", either when a sync write is explicitly requested, a sync call is explicitly executed, or when the max batch size is reached.

Sync durability happens during sync points, providing higher durability guarantees with some performance cost: here, memory batches are written on disk.

Finally, physical sync durability provides the highest durability guarantees at the expense of a greater performance cost, flushing on disk hardware buffers at every sync point: disabled by default, it can be enabled by setting Journal#setPhysicalSync true.


Distributed under the Apache Software License.