others it has been converted to a more natural Java style. The plan is to
leave the code closer to the C++ original until the baseline performance has
been established.


## DB implementation

* Get, put, delete, batch writes and iteration implemented
* Snapshots implemented (needs testing)
* TX logging and recovery

## Storage

* MemTables implemented
* Read and write tables
* Read and write blocks
* Supports Snappy compression
* Supports CRC32c checksums
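
For reference, the CRC32c algorithm used for block checksums can be exercised with the JDK's `java.util.zip.CRC32C` class (available since Java 9; this port necessarily ships without it, so this only illustrates the checksum algorithm, not the code path used here). The value for the standard `"123456789"` check string is `0xE3069283`:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32C;

public class Crc32cExample {
    public static void main(String[] args) {
        // CRC-32C (Castagnoli polynomial), the checksum variant LevelDB uses.
        CRC32C crc = new CRC32C();
        crc.update("123456789".getBytes(StandardCharsets.UTF_8));
        System.out.println(Long.toHexString(crc.getValue())); // prints "e3069283"
    }
}
```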

## Compaction

* MemTable to Level0 compaction
* Version persistence and VersionSet management
* Read and write log files
* Level0 compaction
* Arbitrary range compaction
* Compaction scheduling

# Implementation Notes

## Iterators

The iterator-intensive design of this code comes directly from the C++ code.
LevelDB can most easily be described as follows:

* DB merge Iterator
* MemTable iterator
* Immutable MemTable iterator (the one being compacted)
* Version merge iterator
* Level0 merge iterator over files
* Table merge iterator
* Block iterator
* Level1 concat iterator over files
* Table merge iterator
* Block iterator
* ...
* LevelN concat iterator over files
* Table merge iterator
* Block iterator

As you can see, it is easy to get lost in these deeply nested data structures.
In addition to the iterators from the original C++ code, this code wraps the
DB merge iterator with a snapshot filtering iterator and, finally, a
transforming iterator that converts InternalKeys into user-space keys.
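The nesting above is easier to follow once the core building block is clear: each "merge iterator" repeatedly yields the smallest current key among a set of sorted child iterators. A minimal sketch of that pattern (illustrative only; the class and method names here are not the ones in this code base):

```java
import java.util.*;

// Hypothetical sketch of the merge-iterator pattern described above.
// A heap is keyed on each child's current element, mirroring how the DB
// merge iterator combines MemTable, level, and block iterators.
public class MergeIteratorSketch {

    static Iterator<String> merge(List<Iterator<String>> children) {
        PriorityQueue<PeekingIter> heap =
                new PriorityQueue<>(Comparator.comparing((PeekingIter p) -> p.head));
        for (Iterator<String> it : children) {
            if (it.hasNext()) heap.add(new PeekingIter(it));
        }
        return new Iterator<String>() {
            public boolean hasNext() { return !heap.isEmpty(); }
            public String next() {
                PeekingIter top = heap.poll();     // child with the smallest head
                String value = top.head;
                if (top.advance()) heap.add(top);  // re-insert with its new head
                return value;
            }
        };
    }

    // Wraps an iterator so its next element can be inspected without consuming it.
    static final class PeekingIter {
        final Iterator<String> it;
        String head;
        PeekingIter(Iterator<String> it) { this.it = it; this.head = it.next(); }
        boolean advance() {
            if (!it.hasNext()) return false;
            head = it.next();
            return true;
        }
    }

    public static void main(String[] args) {
        Iterator<String> merged = merge(Arrays.asList(
                Arrays.asList("a", "d").iterator(),
                Arrays.asList("b", "e").iterator(),
                Arrays.asList("c").iterator()));
        StringBuilder out = new StringBuilder();
        while (merged.hasNext()) out.append(merged.next());
        System.out.println(out); // prints "abcde"
    }
}
```

The real iterators also compare InternalKeys (user key plus sequence number) rather than plain strings, but the heap-driven merge is the same idea at every level of the stack above.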

## Buffers

Currently the code uses Netty ChannelBuffers internally, mainly because the
Java ByteBuffer interface is so unfriendly. ChannelBuffers are not really
ideal for this code either, and a custom solution needs to be considered.
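The "unfriendly" part of ByteBuffer is mostly its stateful position/limit cursor: you must `flip()` between writing and reading, and forgetting to do so is a classic bug. A small plain-JDK illustration (unrelated to this code base's internals):

```java
import java.nio.ByteBuffer;

// Illustrates the stateful ByteBuffer API mentioned above: the same cursor
// is used for writing and reading, so a flip() is required in between.
public class ByteBufferDance {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.put((byte) 1).put((byte) 2);     // position = 2, limit = 16
        buf.flip();                          // position = 0, limit = 2: readable
        System.out.println(buf.remaining()); // prints 2
        System.out.println(buf.get());       // prints 1 (and advances position)
    }
}
```

ChannelBuffers avoid this by keeping separate reader and writer indexes, which is one reason they were chosen here despite not being a perfect fit.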

## Thread safety

None of the locking code from the original C++ was translated into Java,
largely because the Java and C++ concurrency primitives and data structures
are so different. Once compaction is in place, it should be clearer how to
make the DB implementation thread safe and concurrent.

## Memory usage

Since the code is a fairly literal translation of the original C++, memory
usage is more restrained than in most Java code, but further work needs to be
done on cleaning up abandoned user objects (like Snapshots). The code also
sometimes makes extra copies of buffers due to the ChannelBuffer, ByteBuffer,
and byte[] impedance mismatch. Over time, the code should be tuned to reduce
GC impact (the most problematic code is the skip list in the memtable, which
may need to be rewritten). Of course, all of this must be verified in a
profiler.

# TODO

## Performance

There have been no performance tests yet.

* Port C++ performance benchmark to Java
* Establish performance base line against:
* C++ original
* Kyoto TreeDB
* SQLite3
* [LevelDB JNI](https://github.com/fusesource/leveldbjni)

## API

The user APIs have not really been started yet, but there are a few ideas on
the drawing board already.

* Factory/maker API for opening and creating databases (like Guava)
* Low-level simple buffer only API
* High-level java.util.Map like or full map wrapper with serialization support
* SPI for UserComparator
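
As a rough sketch of what the java.util.Map-like idea could look like (purely illustrative; none of these names exist in the code, and `SimpleDb` merely stands in for the low-level byte[] API):

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "high-level wrapper with serialization support"
// idea above. SimpleDb stands in for the low-level buffer-only API and
// StringView for the high-level wrapper.
public class MapWrapperSketch {

    interface SimpleDb {
        void put(byte[] key, byte[] value);
        byte[] get(byte[] key);
    }

    // In-memory stand-in for a real DB; byte[] keys are wrapped as ISO-8859-1
    // strings so HashMap equality works on the raw bytes.
    static SimpleDb inMemory() {
        Map<String, byte[]> backing = new HashMap<>();
        return new SimpleDb() {
            public void put(byte[] k, byte[] v) {
                backing.put(new String(k, StandardCharsets.ISO_8859_1), v);
            }
            public byte[] get(byte[] k) {
                return backing.get(new String(k, StandardCharsets.ISO_8859_1));
            }
        };
    }

    // The wrapper: String keys/values, with serialization handled inside.
    static final class StringView {
        private final SimpleDb db;
        StringView(SimpleDb db) { this.db = db; }
        void put(String key, String value) {
            db.put(key.getBytes(StandardCharsets.UTF_8),
                   value.getBytes(StandardCharsets.UTF_8));
        }
        String get(String key) {
            byte[] v = db.get(key.getBytes(StandardCharsets.UTF_8));
            return v == null ? null : new String(v, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        StringView view = new StringView(inMemory());
        view.put("Tampa", "rocks");
        System.out.println(view.get("Tampa")); // prints "rocks"
    }
}
```

A full `java.util.Map` implementation would additionally need iteration and size support, which is why a thinner wrapper like this may be the more practical first step.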

## Other

* Need logging interface
* Need iterator structure inspector for easier debugging
* All buffers must be in little-endian byte order

## API Usage

Recommended Package imports:

    import org.iq80.leveldb.*;
    import static org.iq80.leveldb.impl.Iq80DBFactory.*;
    import java.io.*;

Opening and closing the database.

    Options options = new Options();
    options.createIfMissing(true);
    DB db = factory.open(new File("example"), options);
    try {
        // Use the db in here....
    } finally {
        // Make sure you close the db to shutdown the
        // database and avoid resource leaks.
        db.close();
    }

Putting, Getting, and Deleting key/values.

    db.put(bytes("Tampa"), bytes("rocks"));
    String value = asString(db.get(bytes("Tampa")));
    db.delete(bytes("Tampa"));

Performing Batch/Bulk/Atomic Updates.

    WriteBatch batch = db.createWriteBatch();
    try {
        batch.delete(bytes("Denver"));
        batch.put(bytes("Tampa"), bytes("green"));
        batch.put(bytes("London"), bytes("red"));

        db.write(batch);
    } finally {
        // Make sure you close the batch to avoid resource leaks.
        batch.close();
    }

Iterating key/values.

    DBIterator iterator = db.iterator();
    try {
        for (iterator.seekToFirst(); iterator.hasNext(); iterator.next()) {
            String key = asString(iterator.peekNext().getKey());
            String value = asString(iterator.peekNext().getValue());
            System.out.println(key + " = " + value);
        }
    } finally {
        // Make sure you close the iterator to avoid resource leaks.
        iterator.close();
    }

Working against a Snapshot view of the Database.

    ReadOptions ro = new ReadOptions();
    ro.snapshot(db.getSnapshot());
    try {
        // All read operations will now use the same
        // consistent view of the data.
        ... = db.iterator(ro);
        ... = db.get(bytes("Tampa"), ro);
    } finally {
        // Make sure you close the snapshot to avoid resource leaks.
        ro.snapshot().close();
    }

Using a custom Comparator.

    DBComparator comparator = new DBComparator() {
        public int compare(byte[] key1, byte[] key2) {
            return new String(key1).compareTo(new String(key2));
        }
        public String name() {
            return "simple";
        }
        public byte[] findShortestSeparator(byte[] start, byte[] limit) {
            return start;
        }
        public byte[] findShortSuccessor(byte[] key) {
            return key;
        }
    };
    Options options = new Options();
    options.comparator(comparator);
    DB db = factory.open(new File("example"), options);

Disabling compression.

    Options options = new Options();
    options.compressionType(CompressionType.NONE);
    DB db = factory.open(new File("example"), options);

Configuring the cache.

    Options options = new Options();
    options.cacheSize(100 * 1048576); // 100MB cache
    DB db = factory.open(new File("example"), options);

Getting approximate sizes.

    long[] sizes = db.getApproximateSizes(
            new Range(bytes("a"), bytes("k")),
            new Range(bytes("k"), bytes("z")));
    System.out.println("Size: " + sizes[0] + ", " + sizes[1]);

Getting database status.

    String stats = db.getProperty("leveldb.stats");
    System.out.println(stats);

Getting informational log messages.

    Logger logger = new Logger() {
        public void log(String message) {
            System.out.println(message);
        }
    };
    Options options = new Options();
    options.logger(logger);
    DB db = factory.open(new File("example"), options);

Destroying a database.

    Options options = new Options();
    factory.destroy(new File("example"), options);

# Projects using this port of LevelDB

* [ActiveMQ Apollo](http://activemq.apache.org/apollo/): Defaults to using leveldbjni, but falls
back to this port if the jni port is not available on your platform.
