BigBase Row Cache

Where is BigTable's ScanCache?

ScanCache in Google's BigTable is responsible for caching hot row data.
It improves read performance when the size of a read operation is much smaller than the block size (blocks are cached in a block cache).
This feature is still missing in HBase (as of 0.98).
It is very hard to implement in Java, since it puts extreme pressure on the GC (when done on heap).
GC pauses grow beyond acceptable levels in live enterprise environments due to heap fragmentation.
Unpredictability of GC interruption hurts as well.
The maximum GC pause is roughly 0.5-2 sec per 1GB of heap. For a 64GB heap, the maximum stop-the-world GC pause can exceed 2 min (CMS garbage collector).
The G1 garbage collector in Java 7 does not resolve the problem (http://www.aioug.org/sangam12/Presentations/20155.pdf).
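
The pause figures above are simple multiplication. A minimal sketch of that arithmetic, assuming the pause scales linearly with heap size (the class and method names are illustrative only and not part of the project):

```java
// Back-of-the-envelope estimate only. The 0.5-2 s/GB range is the figure quoted above;
// linear scaling with heap size is an assumption made for illustration.
final class GcPauseEstimate {
    static final double MIN_SEC_PER_GB = 0.5;
    static final double MAX_SEC_PER_GB = 2.0;

    /** Returns {low, high} worst-case pause estimates in seconds for the given heap size. */
    static double[] pauseRangeSeconds(double heapGb) {
        return new double[] { heapGb * MIN_SEC_PER_GB, heapGb * MAX_SEC_PER_GB };
    }

    public static void main(String[] args) {
        double[] range = pauseRangeSeconds(64);
        // 64 * 0.5 = 32 s and 64 * 2 = 128 s, i.e. just over 2 minutes at the upper bound
        System.out.printf("64GB heap: %.0f-%.0f s%n", range[0], range[1]);
    }
}
```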

Fast off-heap ScanCache implementation for HBase (0.94+).
It is much more efficient than the block cache when the average read operation size is much smaller than the block size, because it does not pollute the block cache with data that is not needed.
Cache size: 100s of GBs to TBs of RAM with low, predictable query latency (tested up to 240GB).
Supports the following eviction policies: LRU, LFU, FIFO, Random. LRU is the default.
Pure, 100% compatible Java. The only native code is the Snappy/LZ4 compression codecs (supported on Windows, Linux x86 and Mac OSX). It can work on any Java platform with compression disabled.
Sub-millisecond latencies (on network), zero GC.
Implemented as a RegionObserver coprocessor.
Easy installation and configuration.
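
To illustrate what the default LRU eviction policy means, here is a minimal on-heap sketch built on the JDK's LinkedHashMap in access order. It is not the project's off-heap implementation, and the class name LruSketch is hypothetical; it only shows the eviction order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical on-heap LRU map; the real cache is off-heap and its internals are not shown here. */
final class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruSketch(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: iteration runs from least to most recently used
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insert; returning true drops the least recently used entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruSketch<String, String> cache = new LruSketch<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a", so "b" becomes the least recently used entry
        cache.put("c", "3"); // capacity exceeded: "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

With a capacity of 2, the entry that was not read recently ("b") is the one that is dropped, while the recently read "a" survives.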

It caches data on read (read through cache).
The cache key is table:rowkey:family (table name + row key + column family).
If a row has multiple CFs (column families), they are cached separately.
It caches the whole CF data (all column qualifiers + all versions) even if only part of it was requested.
Works best when the size of a CF is smaller than the block size. With the default block size of 64KB, we are talking about kilobytes, not megabytes.
Cached data is invalidated on every mutation for a particular rowkey:family. Every time the column family is updated or deleted for a given row key, the corresponding entry is evicted from the cache.
Make sure that your access pattern is read-mostly.
The ROW-CACHE can be enabled/disabled per table and per table:family (column family).
The setting is visible in the HBase shell and the HBase UI. There is a new ROW_CACHE attribute on a table and on table:column.
The table:column setting of ROW_CACHE overrides the table's setting.
One can enable both row cache and block cache on a table, but usually only one of the two needs to be enabled.
It is very convenient for testing: run a test with ROW_CACHE = false, then enable ROW_CACHE using the provided utility and re-run the test.
Tables with a random, read-mostly access pattern over small rows will benefit most from ROW-CACHE (make sure to disable the block cache).
Tables that are accessed in a more predictable, sequential way must have block cache enabled instead.
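
The behaviour described above (composite key, read-through loading of the whole column family, invalidation on mutation, and the table:column setting overriding the table setting) can be modelled in a few lines. This is a hypothetical on-heap sketch for illustration only: the class RowCacheSketch, its method names, the use of String row keys, and the "disabled unless set" default are all assumptions, not the coprocessor's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical on-heap model of the behaviour described above; not the project's actual code. */
final class RowCacheSketch {
    private final Map<String, byte[]> entries = new HashMap<>();
    private final Map<String, Boolean> tableFlag = new HashMap<>();  // ROW_CACHE per table
    private final Map<String, Boolean> familyFlag = new HashMap<>(); // ROW_CACHE per table:family

    void setTableFlag(String table, boolean enabled) {
        tableFlag.put(table, enabled);
    }

    void setFamilyFlag(String table, String family, boolean enabled) {
        familyFlag.put(table + ":" + family, enabled);
    }

    /** The table:family setting, when present, overrides the table setting (assumed off if unset). */
    boolean isEnabled(String table, String family) {
        Boolean perFamily = familyFlag.get(table + ":" + family);
        if (perFamily != null) {
            return perFamily;
        }
        return tableFlag.getOrDefault(table, false);
    }

    /** Cache key: table + row key + column family (assumes ':' does not occur in the parts). */
    static String key(String table, String rowKey, String family) {
        return table + ":" + rowKey + ":" + family;
    }

    /** Read-through: on a miss, load the whole column family once and cache it. */
    byte[] get(String table, String rowKey, String family, Supplier<byte[]> loadWholeFamily) {
        if (!isEnabled(table, family)) {
            return loadWholeFamily.get(); // caching disabled: always go to the store
        }
        return entries.computeIfAbsent(key(table, rowKey, family), k -> loadWholeFamily.get());
    }

    /** Any put or delete touching rowKey:family evicts the cached entry. */
    void onMutation(String table, String rowKey, String family) {
        entries.remove(key(table, rowKey, family));
    }
}
```

In this model a second read of the same table:rowkey:family is served from the map without calling the loader, and a call to onMutation forces the next read to load again. That is why the cache pays off only for read-mostly tables.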

GET (small rows < 100 bytes): 175K operations per second per Region Server (from cache).
MULTI-GET (small rows < 100 bytes): > 1M records per second (network limited) per Region Server.
LATENCY: 99% < 1ms (for GETs) with 100K ops on a small (4-node) cluster.
Vertical scalability: tested up to 240GB (the maximum available in Amazon EC2).
Horizontal scalability: limited by HBase scalability.

Caches the whole rowkey:family data even if only a subset is requested (not a big deal when rows are small).
Not suitable for large rows (10s of KB and above).
Invalidates a cache entry on each mutation operation that involves this entry (rowkey:family). Each time a rowkey:family is updated, the corresponding cache entry is deleted to maintain full consistency.
Cache compression support is limited. LZ4 codec - Linux x86 and Mac OSX x86. Snappy - Windows x86, Linux x86, Mac OSX x86.
No 32bit OS support.

Note: The Row Cache has not been tested with HBase 0.92.x, but should probably work.