Skip to content
Find file
Fetching contributors…
Cannot retrieve contributors at this time
51 lines (35 sloc) 2.07 KB
We consider 0.3 most appropriate for someone who wants to evaluate
Cassandra without dealing with the highly variable degree of stability
that a nightly build offers. Here are the known issues you should
be most concerned about:
1. With enough and large enough keys in a ColumnFamily, Cassandra will
run out of memory trying to perform compactions (data file merges).
The size of what is stored in memory is (S + 16) * (N + M) where S
is the size of the key (usually 2 bytes per character), N is the
number of keys and M, is the map overhead (which can be guestimated
at around 32 bytes per key).
So, if you have 10-character keys and 1GB of headroom in your heap
space for compaction, you can expect to store about 17M keys
before running into problems.
2. Because fixing #1 requires a data file format change, 0.4 will not
be binary-compatible with 0.3 data files. A client-side upgrade
can be done relatively easily with the following algorithm:
for key in old_client.get_key_range(everything):
columns = old_client.get_slice or get_slice_super(key, all columns)
new_client.batch_insert or batch_insert_super(key, columns)
The inner loop can be trivially parallelized for speed.
3. Commitlog does not fsync before reporting a write successful.
Using blocking writes mitigates this to some degree, since all
nodes that were part of the write quorum would have to fail
before sync for data to be lost.
Additionally, row size (that is, all the data associated with a single
key in a given ColumnFamily) is limited by available memory, for
two reasons:
1. get_slice offsets are not indexed. Every time you do a get_slice,
Cassandra has to deserialize the entire ColumnFamily row into
memory. (This is already fixed in trunk.)
2. Compaction deserializes each row before merging.
Jump to Line
Something went wrong with that request. Please try again.