Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix: Skip invalid state manifests in GC bucket #964

Merged
merged 7 commits into from
Sep 9, 2014

Conversation

shino
Copy link
Contributor

@shino shino commented Sep 5, 2014

Addresses #827 partly.

At present, the root cause why active state manifests are entered in GC bucket.
This PR does not process such manifests but just skip and log in order to
GC can proceed to other manifests instead of abort.

It's noteworthy that if riak_cs_delete_fsm replies ok with the same total blocks and deleted blocks
to riak_cs_gc_worker, then the worker attempt to delete manifest entry in twop_set.
Doing it before deleting manfiests and blocks completely poses orphan manifests/blocks.
In this PR, riak_cs_delete_fsm replies with total blocks = 1 and deleted_blocks = 0.

To address #827 completely, further investigation on root cause needed. Information of logs
by this PR will help.

@shino shino added this to the 1.5.1 milestone Sep 5, 2014
@shino
Copy link
Contributor Author

shino commented Sep 5, 2014

Set milestone 1.5.1. If wrong, please correct me.

See #827

Logged information includes:

- Key in GC bucket
- UUID (in manifest)
- CS bucket and key (also in manifest)

It's noteworthy that if riak_cs_delete_fsm replies ok with the same
total blocks and deleted blocks to riak_cs_gc_worker, then the worker
attempt to delete manifest entry in twop_set.  Doing it before
deleting manfiests and blocks completely poses orphan
manifests/blocks.  In this PR, riak_cs_delete_fsm replies with total
blocks = 1 and deleted_blocks = 0.
@shino shino force-pushed the bugfix/skip-invalid-state-manifest-in-gc branch from 01cb351 to 7163919 Compare September 5, 2014 08:23
@shino shino force-pushed the bugfix/skip-invalid-state-manifest-in-gc branch from 7163919 to a0f62cc Compare September 5, 2014 08:30
@shino
Copy link
Contributor Author

shino commented Sep 5, 2014

This PR is to branch release/1.5, especially based on #949 .

If patch needed, another branch (cherry-picked and conflicts resolved) support/patch-skip-invalid-state-manifest might be better, depending on whether #949 is necessary or not.
Compare view: 1.5.0...support/patch-skip-invalid-state-manifest

…ests

Because GC buckets are divided among bags and distribution is
determined by manifests. Manifests at an certain bag goes to
riak-cs-gc bucket in the Riak cluster by delete.
@kuenishi
Copy link
Contributor

kuenishi commented Sep 9, 2014

I'm not sure why, but an eqc test didn't work for me:

...riak_cs_gc_single_run_eqc:84: eqc_test_...*failed*
::{function_clause,
    [{riak_cs_gc_single_run_eqc,dummy_start_delete_fsm,
         [testnode247683@nausicaa,
          [<0.27611.1>,
           {<<"UUIDno_error11">>,
            {lfs_manifest_v3,3,1048576,
                {<<"bucket">>,<<"no_e"...>>},
                undefined,"2014-09-09T03:50:22.000Z",undefined,...}},
           <0.27610.1>,<<"no_error1">>,[]]],
         [{file,"test/riak_cs_gc_single_run_eqc.erl"},{line,250}]},
     {riak_cs_delete_fsm_sup,start_delete_fsm,
         [testnode247683@nausicaa,
          [<0.27611.1>,
           {<<"UUIDno_error11">>,
            {lfs_manifest_v3,3,1048576,
                {<<"buck"...>>,<<...>>},
                undefined,
                [...],...}},
           <0.27610.1>,<<"no_error1">>,[]]],
         []},
     {riak_cs_gc_worker,initiating_file_delete,2,
         [{file,"src/riak_cs_gc_worker.erl"},{line,138}]},
     {gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,494}]},
     {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}

{error, notfound};
change_state_to_active(Pbc, TargetBKey, [GCKey|Rest]) ->
{ok, Obj0} = riakc_pb_socket:get(Pbc, ?GC_BUCKET, GCKey),
Manifests = twop_set:to_list(binary_to_term(riakc_obj:get_value(Obj0))),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we assume twop_set is in the path?

@shino
Copy link
Contributor Author

shino commented Sep 9, 2014

Nice catch 😄 .
Pushed two commits, which address two comments respectively.

total_blocks=BlockCount},
start_block_servers(NewState);
{error, invalid_state} ->
_ = lager:warning("Invalid state manifest in GC bucket at ~p: ~p",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking possible operations when this log line found, that might be backing up the object and try deleting it again. Or restore the whole historical manifests via inspector. Anyway I think it'd be more friendly to print out bucket and key name explicitly. UUID and Manifest could follow them or in a debug line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the bucket and key as S3 manner. GCKey should also be in the first log line.

@kuenishi
Copy link
Contributor

kuenishi commented Sep 9, 2014

Great work. Code itself looks pretty good to me.

borshop added a commit that referenced this pull request Sep 9, 2014
…-in-gc

Bugfix: Skip invalid state manifests in GC bucket

Reviewed-by: kuenishi
@shino
Copy link
Contributor Author

shino commented Sep 9, 2014

@borshop merge

@borshop borshop merged commit 24b1e05 into release/1.5 Sep 9, 2014
@shino shino deleted the bugfix/skip-invalid-state-manifest-in-gc branch September 9, 2014 07:04
borshop added a commit that referenced this pull request Sep 9, 2014
Prepare release notes (#964 is likely to merge)

Reviewed-by: shino
borshop added a commit that referenced this pull request Sep 10, 2014
Prepare release notes (#964 is likely to merge)

Reviewed-by: kuenishi
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants