Add eleveldb:close #33

Merged
merged 9 commits into from Jul 18, 2012

4 participants

@jonmeredith

Added to fix the missing MANIFEST race around handoff / destroy / open.

This adds a mutex to the db_handle object to make sure it had not been closed from underneath any operations. All iterators created from the DB Ref are tracked and cleaned up when the database is closed. Any folds will throw exceptions if in progress and any other operation should return {error, einval} (modelled after the socket library).

As well as adding close, this also fixes a use-after-free bug to do with garbage collection. Although the NIF resources are released in the previous code, the order that the cleanup run is determined by the position in the heap by the garbage collector. The two examples show different ordering of cleanup.


11> spawn(fun() -> {ok, Db} = eleveldb:open("/tmp/t.db",[{create_if_missing, true}]), eleveldb:put(Db, <<"k">>,<<"v">>,[]), eleveldb:fold(Db, fun(_,_) -> lists:seq(1,10000) end, undefined, []) end).
**** db 0x7f8b220fda28 cleanup
**** iterator 0x1e261ae8 cleanup for db 0x7f8b220fda28
<0.54.0>
12> spawn(fun() -> {ok, Db} = eleveldb:open("/tmp/t.db",[{create_if_missing, true}]), eleveldb:put(Db, <<"k">>,<<"v">>,[]), eleveldb:fold(Db, fun(_,_) -> lists:seq(1,10000) end, undefined, []), garbage_collect() end).
**** iterator 0x7f8b220fca58 cleanup for db 0x7f8b220fc9c0
**** db 0x7f8b220fc9c0 cleanup

The code needs to be benchmarked before merging which we'll work on, but can be reviewed before then.

@jonmeredith jonmeredith commented on an outdated diff Jul 18, 2012
c_src/eleveldb.cc
+ {
+ // shutdown all the iterators - grab the lock as
+ // another thread could still be in eleveldb:fold
+ // which will get {error, einval} returned next time
+ for (std::set<eleveldb_itr_handle*>::iterator iters_it = db_handle->iters.begin();
+ iters_it != db_handle->iters.end();
+ ++iters_it)
+ {
+ eleveldb_itr_handle* itr_handle = *iters_it;
+ enif_mutex_lock(itr_handle->itr_lock);
+ free_itr(*iters_it);
+ enif_mutex_unlock(itr_handle->itr_lock);
+ }
+
+ // close the db
+ if (db_handle->db)
@jonmeredith
jonmeredith added a line comment Jul 18, 2012

No need to recheck - already know db_handle->db != NULL

@jonmeredith
jonmeredith added a line comment Jul 18, 2012

Fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@jonmeredith jonmeredith commented on an outdated diff Jul 18, 2012
c_src/eleveldb.cc
handle->options = opts;
+ handle->iters = std::set<struct eleveldb_itr_handle*>();
@jonmeredith
jonmeredith added a line comment Jul 18, 2012

Check with C++ expert this is a good way to initialize a class in allocated memory

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@jonmeredith jonmeredith commented on an outdated diff Jul 18, 2012
c_src/eleveldb.cc
@@ -496,17 +617,17 @@ ERL_NIF_TERM eleveldb_iterator_close(ErlNifEnv* env, int argc, const ERL_NIF_TER
eleveldb_itr_handle* itr_handle;
if (enif_get_resource(env, argv[0], eleveldb_itr_RESOURCE, (void**)&itr_handle))
{
+ enif_mutex_lock(itr_handle->db_handle->db_lock);
@jonmeredith
jonmeredith added a line comment Jul 18, 2012

Add comment to explain acquiring locks in the same order.

@jonmeredith
jonmeredith added a line comment Jul 18, 2012

Fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@jonmeredith jonmeredith commented on an outdated diff Jul 18, 2012
c_src/eleveldb.cc
@@ -645,7 +774,14 @@ static void eleveldb_itr_resource_cleanup(ErlNifEnv* env, void* arg)
{
delete itr_handle->itr;
@jonmeredith
jonmeredith added a line comment Jul 18, 2012

call free_itr instead of cleanup code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@slfritchie

+1 to merge. After chatting with Jon, my eyes think that this looks sane enough.

@jonmeredith jonmeredith was assigned Jul 18, 2012
@freerobby

Thanks for working on this. We got stung by this race condition this morning and it took down two nodes in our cluster. I'll likely upgrade to a 1.2 release candidate as soon as this is merged in.

@jonmeredith

@freerobby what platform/version were you running and how exactly did it fail? This definitely resolves a missing MANIFEST around handoff issue. We've had some coredumps related to memory management, but haven't been able to definitely pin it on this issue.

@freerobby

@jonmeredith We're running riak 1.1.2 on Ubuntu 12.04.

This was the top of our stack trace:

2012-07-18 17:28:47.287 [error] <0.560.0>@riak_kv_vnode:init:236 Failed to start riak_kv_eleveldb_backend Reason: {db_open,"IO error: /mnt/riak/leveldb/182687704666362864775460604089535377456991567872/MANIFEST-000002: No such file or directory"}

I checked by hand and verified the mentioned file was not on our file system. I was able to fix the node by deleting the /mnt/riak/leveldb/182687704666362864775460604089535377456991567872 directory. I suspect there is a better quick solution but I wanted to get the node back online quickly. All of our data has n=3 and we had a backup that was only one hour old, so I wasn't too concerned about data loss.

@jonmeredith

ok, looks very familiar. there should be a handoff entry for 182687704666362864775460604089535377456991567872 somewhat recently before the crash.

@jonmeredith jonmeredith merged commit 159dc6c into master Jul 18, 2012
@freerobby

Yep, indeed. Check the timestamps and partition IDs:

From console.log:

2012-07-18 08:13:26.298 [info] <0.29244.159>@riak_core_handoff_sender:start_fold:144 Handoff of partition riak_kv_vnode 182687704666362864775460604089535377456991567872 from 'riak@10.123.46.124' to 'riak@10.86.199.20' completed: sent 271350 objects in 269.95 seconds

From error.log:

2012-07-18 08:13:26.715 [error] <0.31447.159>@riak_kv_vnode:init:236 Failed to start riak_kv_eleveldb_backend Reason: {db_open,"IO error: /mnt/riak/leveldb/182687704666362864775460604089535377456991567872/MANIFEST-000002: No such file or directory"}

@jaredmorrow

@freerobby this change will make it into Riak 1.2.0 RC2 that will be out Friday 07/20

@seancribbs seancribbs deleted the jdm-add-close branch Apr 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment