Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Fix file descriptor leak after start() after AC power failure #61

Merged
merged 2 commits into from

2 participants

@slfritchie
Owner

A customer reports an eleveldb problem (emfile errors due to file
descriptor leaking) that was apparently the result of the twisted
flaming wreckage of:

  1. A large ring size on a small cluster of machines (exact ring
    size unknown, cluster size likely 5 or 7 nodes)

  2. Under an extremely heavy Riak CS workload, which was under a
    high stress workload involving exclusively files at 10+ MByte and
    thus writing much more data to Bitcask than a high stress workload
    involving 10 byte files.

  3. The power plug was pulled/tripped/popped on one of the Riak
    nodes.

  4. The machine promptly rebooted, checked file system state, and
    attempted to restart Riak.

  5. Riak restarted with most/all? Bitcask instances complaining about
    truncated data & hint files but appeared to start them successfully.

  6. However, file descriptor leaks caused eleveldb instances to crash
    with emfile errors.

This fix introduces a new parameter to scan_key_files(), telling
it to close files (or not) as it folds across each data file.

The two test functions, t0/0 and t1/0, are added to show the
effect of the file descriptor leakage. The former shows how
the leak at bitcask:start() is addressed by this patch. Leaks
via merge or delete as shown by t1/0 are not addressed. Both
functions are not really demos and not yet eunit tests (due to
relying on platform-specific utilities like seq and lsof).

@slfritchie slfritchie Fix file descriptor leak after start() after AC power failure
A customer reports an eleveldb problem (`emfile` errors due to file
descriptor leaking) that was apparently the result of the twisted
flaming wreckage of:

1. A large ring size on a small cluster of machines (exact ring
size unknown, cluster size likely 5 or 7 nodes)

2. Under an extremely heavy Riak CS workload, which was under a
high stress workload involving exclusively files at 10+ MByte and
thus writing much more data to Bitcask than a high stress workload
involving 10 byte files.

3. The power plug was pulled/tripped/popped on one of the Riak
nodes.

4. The machine promptly rebooted, checked file system state, and
attempted to restart Riak.

5. Riak restarted with most/all? Bitcask instances complaining about
truncated data & hint files but appeared to start them successfully.

6. However, file descriptor leaks caused eleveldb instances to crash
with `emfile` errors.

This fix introduces a new parameter to `scan_key_files()`, telling
it to close files (or not) as it folds across each data file.

The two test functions, `t0/0` and `t1/0`, are added to show the
effect of the file descriptor leakage.  The former shows how
the leak at `bitcask:start()` is addressed by this patch.  Leaks
via merge or delete as shown by `t1/0` are not addressed.  Both
functions are not really demos and not yet eunit tests (due to
relying on platform-specific utilities like `seq` and `lsof`).
c174af6
@gburd

There is a TODO in the NIF on close (check for EIO on the fd), maybe this is a good time to address that as it becomes more likely with more file handles closing for different reasons.

@gburd

Looks good as is. +1

@slfritchie slfritchie merged commit acf981e into from
@engelsanchez engelsanchez deleted the branch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Oct 17, 2012
  1. @slfritchie

    Fix file descriptor leak after start() after AC power failure

    slfritchie authored
    A customer reports an eleveldb problem (`emfile` errors due to file
    descriptor leaking) that was apparently the result of the twisted
    flaming wreckage of:
    
    1. A large ring size on a small cluster of machines (exact ring
    size unknown, cluster size likely 5 or 7 nodes)
    
    2. Under an extremely heavy Riak CS workload, which was under a
    high stress workload involving exclusively files at 10+ MByte and
    thus writing much more data to Bitcask than a high stress workload
    involving 10 byte files.
    
    3. The power plug was pulled/tripped/popped on one of the Riak
    nodes.
    
    4. The machine promptly rebooted, checked file system state, and
    attempted to restart Riak.
    
    5. Riak restarted with most/all? Bitcask instances complaining about
    truncated data & hint files but appeared to start them successfully.
    
    6. However, file descriptor leaks caused eleveldb instances to crash
    with `emfile` errors.
    
    This fix introduces a new parameter to `scan_key_files()`, telling
    it to close files (or not) as it folds across each data file.
    
    The two test functions, `t0/0` and `t1/0`, are added to show the
    effect of the file descriptor leakage.  The former shows how
    the leak at `bitcask:start()` is addressed by this patch.  Leaks
    via merge or delete as shown by `t1/0` are not addressed.  Both
    functions are not really demos and not yet eunit tests (due to
    relying on platform-specific utilities like `seq` and `lsof`).
  2. @slfritchie
This page is out of date. Refresh to see the latest.
Showing with 62 additions and 6 deletions.
  1. +62 −6 src/bitcask.erl
View
68 src/bitcask.erl
@@ -461,7 +461,8 @@ merge1(Dirname, Opts, FilesToMerge) ->
%% load all available data into the keydir in case another
%% reader/writer comes along in the same VM. Note that we
%% won't necessarily merge all these files.
- AllFiles = scan_key_files(readable_files(Dirname), LiveKeyDir, []),
+ AllFiles = scan_key_files(readable_files(Dirname), LiveKeyDir, [],
+ false),
%% Partition all files to files we'll merge and files we
%% won't (so that we can close those extra files once
@@ -767,9 +768,9 @@ put_state(Ref, State) ->
reverse_sort(L) ->
lists:reverse(lists:sort(L)).
-scan_key_files([], _KeyDir, Acc) ->
+scan_key_files([], _KeyDir, Acc, _CloseFile) ->
Acc;
-scan_key_files([Filename | Rest], KeyDir, Acc) ->
+scan_key_files([Filename | Rest], KeyDir, Acc, CloseFile) ->
{ok, File} = bitcask_fileops:open_file(Filename),
F = fun(K, Tstamp, {Offset, TotalSz}, _) ->
bitcask_nifs:keydir_put(KeyDir,
@@ -780,7 +781,12 @@ scan_key_files([Filename | Rest], KeyDir, Acc) ->
Tstamp)
end,
bitcask_fileops:fold_keys(File, F, undefined, recovery),
- scan_key_files(Rest, KeyDir, [File | Acc]).
+ if CloseFile == true ->
+ bitcask_fileops:close(File);
+ true ->
+ ok
+ end,
+ scan_key_files(Rest, KeyDir, [File | Acc], CloseFile).
%%
%% Initialize a keydir for a given directory.
@@ -800,12 +806,12 @@ init_keydir(Dirname, WaitTime) ->
%% the data from disk. Build a list of all the bitcask data files
%% and sort it in descending order (newest->oldest).
SortedFiles = readable_files(Dirname),
- ReadFiles = scan_key_files(SortedFiles, KeyDir, []),
+ _ = scan_key_files(SortedFiles, KeyDir, [], true),
%% Now that we loaded all the data, mark the keydir as ready
%% so other callers can use it
ok = bitcask_nifs:keydir_mark_ready(KeyDir),
- {ok, KeyDir, ReadFiles};
+ {ok, KeyDir, []};
{error, not_ready} ->
timer:sleep(100),
@@ -1661,4 +1667,54 @@ truncate_file(Path, Offset) ->
ok = file:truncate(FH),
file:close(FH).
+%% About leak_t0():
+%%
+%% If bitcask leaks file descriptors for the 'touch'ed files, output is:
+%% Res = 920
+%%
+%% If bitcask isn't leaking the 'touch'ed files, output is:
+%% Res = 20
+
+leak_t0() ->
+ Dir = "/tmp/goofus",
+ os:cmd("rm -rf " ++ Dir),
+
+ Ref0 = bitcask:open(Dir),
+ os:cmd("cd " ++ Dir ++ "; sh -c 'for i in `seq 100 1000`; do touch $i.bitcask.data $i.bitcask.hint; done'"),
+ Ref1 = bitcask:open(Dir),
+
+ Cmd = lists:flatten(io_lib:format("lsof -nP -p ~p | wc -l", [os:getpid()])),
+ io:format("Res = ~s\n", [os:cmd(Cmd)]).
+
+leak_t1() ->
+ Dir = "/tmp/goofus",
+ NumKeys = 300,
+ os:cmd("rm -rf " ++ Dir),
+
+ Ref = bitcask:open(Dir, [read_write, {max_file_size, 10}]),
+ Used = fun() ->
+ Cmd = lists:flatten(io_lib:format("lsof -nP -p ~p | wc -l", [os:getpid()])),
+ %% io:format("Cmd = ~s\n", [Cmd]),
+ os:cmd(Cmd)
+ end,
+ [bitcask:put(Ref, <<X:32>>, <<"it's a big, big world!">>) ||
+ X <- lists:seq(1, NumKeys)],
+ io:format("After putting ~p keys, lsof says: ~s", [NumKeys, Used()]),
+
+ DelKeys = NumKeys div 2,
+ [bitcask:delete(Ref, <<X:32>>) || X <- lists:seq(1, DelKeys)],
+ io:format("After deleting ~p keys, lsof says: ~s", [DelKeys, Used()]),
+
+ bitcask:merge(Dir),
+ io:format("After merging, lsof says: ~s", [Used()]),
+
+ io:format("Q: Are all keys fetchable? ..."),
+ [{ok, _} = bitcask:get(Ref, <<X:32>>) ||
+ X <- lists:seq(DelKeys + 1, NumKeys)],
+ [not_found = bitcask:get(Ref, <<X:32>>) || X <- lists:seq(1, DelKeys)],
+ io:format("A: yes\n"),
+ io:format("Now, lsof says: ~s", [Used()]),
+
+ ok.
+
-endif.
Something went wrong with that request. Please try again.