Repair Command #107

Merged
merged 7 commits into from Jun 19, 2012

Conversation

Projects
None yet
2 participants
@rzezeski
Contributor

rzezeski commented May 17, 2012

Add the ability to repair a Search partition. If a partition's data
is lost or corrupted, this command may be used to rebuild it from
adjacent partitions. This is preferable to a list-keys + re-index
since it is more targeted, should complete more quickly, and should put
less strain on the cluster.

The command can currently only be executed by attaching to Riak.

riak_search_vnode:repair(Partition).
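
For example, from an attached console (the node name and partition index below are purely illustrative placeholders):

```
$ riak attach
(riak@127.0.0.1)1> Partition = 0.   %% index of the partition to rebuild
(riak@127.0.0.1)2> riak_search_vnode:repair(Partition).
```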

@ghost ghost assigned jtuple May 29, 2012

+ {ok, C} = riak:local_client(),
+ {ok, Keys} = C:list_keys(?SCHEMA_BUCKET),
+ [S || {ok,S} <- [get_schema(K) || K <- Keys]].
+

@jtuple

jtuple Jun 5, 2012

Contributor

Any issue here with cached schemas vs raw schemas? We do a listkeys over the raw schema bucket and call get_schema, which will return a cached schema or trigger a gen_server:call to this server to retrieve the raw schema, parse it, and write a cached version. So, all raw schemas not already in the cache will be parsed and loaded.

On a related note, could we augment the schema cache somehow to avoid needing to do the listkeys?


@rzezeski

rzezeski Jun 11, 2012

Contributor

I've since patched the set-schema API to clear the cache on all nodes on modification. If that multicall fails to reach a node, that node will still use the old version. Currently this is only used by repair to get the n_val, which really shouldn't change, but I suppose there is some very rare edge case where a user changes the n_val and some of the cache-clear calls fail. Search also updates the bucket props when the n_val is changed, so I could get it from there instead, but I would still need to do the list-keys.
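
A cluster-wide cache clear of the kind described can be sketched as below; the module and function names (`riak_search_config:clear_schema_cache/0`) are hypothetical stand-ins, not the actual patch:

```erlang
%% Illustrative sketch only: clear the schema cache on every node
%% running the riak_search service. Nodes in BadNodes missed the
%% clear and will keep serving the old cached schema.
clear_cache_everywhere() ->
    Nodes = riak_core_node_watcher:nodes(riak_search),
    {_Replies, BadNodes} =
        rpc:multicall(Nodes, riak_search_config, clear_schema_cache, []),
    BadNodes.
```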


src/riak_search_vnode.erl
#vstate{bmod=BMod,bstate=BState}=VState) ->
- bmod_response(BMod:fold(Fun, Acc, BState), VState).
+ {async, AsyncFoldFun} = BMod:fold(Fun, Acc, BState),
+ FinishFun =

@jtuple

jtuple Jun 5, 2012

Contributor

Should we play it safe and implement a backend capability approach similar to K/V? That way we could verify the backend is actually async-capable. Or perhaps just use async_fold here versus fold, with some try/catch magic. I worry about a non-async Search backend happily folding away here and blocking the vnode, only for the fold to return an accumulator that triggers a badmatch against the async tuple.

Related: the ETS backend is not async and will fail here.
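
One way to get the safety without try/catch is a case on the return shape; this is only a sketch of the suggestion, and `start_async_fold/2` is a hypothetical helper:

```erlang
%% Illustrative sketch: attempt an async fold, and fall back to the
%% original synchronous path for backends that do not return an
%% {async, Fun} tuple (e.g. the ETS backend).
fold(Fun, Acc, #vstate{bmod=BMod, bstate=BState}=VState) ->
    case BMod:fold(Fun, Acc, BState) of
        {async, AsyncFoldFun} ->
            start_async_fold(AsyncFoldFun, VState);  %% hypothetical helper
        SyncReply ->
            bmod_response(SyncReply, VState)
    end.
```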


@rzezeski

rzezeski Jun 11, 2012

Contributor

Eh, the only production backend for Search currently is merge_index. I just wanted to hard code that for now to save time. I think implementing capabilities like in KV might make sense going forward but don't think it is necessary for this first crack.


@jtuple

jtuple Jun 11, 2012

Contributor

Are you certain no one internal or external to Basho ever sets the config value to the ETS backend? Even just for testing? I'm fine with no capability system, but unless you update the ETS backend, it will be impossible to use with this change. It will crash on every fold request.


@rzezeski

rzezeski Jun 12, 2012

Contributor

I modified fold to work as before if the backend is not merge_index_backend.
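
A rough sketch of that behavior, using a function-head match on the backend module (the clause shapes and the `start_async_fold/2` helper are illustrative, not the exact patch):

```erlang
%% Only merge_index_backend takes the async path; every other backend
%% keeps the original synchronous fold and cannot hit the badmatch.
fold(Fun, Acc, #vstate{bmod=merge_index_backend=BMod, bstate=BState}=VState) ->
    {async, AsyncFoldFun} = BMod:fold(Fun, Acc, BState),
    start_async_fold(AsyncFoldFun, VState);  %% hypothetical helper
fold(Fun, Acc, #vstate{bmod=BMod, bstate=BState}=VState) ->
    bmod_response(BMod:fold(Fun, Acc, BState), VState).
```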


@jtuple

jtuple Jun 5, 2012

Contributor

As discussed over chat, a general issue here is that while the Search vnode now implements async folds, things still break down because merge_index_backend:async_fold_fun still calls merge_index:fold, which is implemented as a synchronous, blocking gen_server:call to mi_server. During an "async fold", other folds, index calls, and lookups block on mi_server calls.

You mentioned an iterator approach to this problem. That sounds like a great solution to me.
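
The iterator idea amounts to pulling results in chunks via a continuation rather than holding mi_server for the whole fold; a minimal sketch, with hypothetical names:

```erlang
%% Illustrative iterator-style drain: each call to the continuation
%% fetches the next chunk, so mi_server is free between chunks.
iterate(eof, _Fun, Acc) ->
    Acc;
iterate({Results, Continuation}, Fun, Acc) ->
    iterate(Continuation(), Fun, lists:foldl(Fun, Acc, Results)).
```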


rzezeski added some commits Mar 27, 2012

Add repair ability
Add the ability to repair a Search partition.  If a partition's data
is lost or corrupted this command may be used to rebuild it from
adjacent partitions.  This is preferable to a list-keys + re-index
since it is more targeted, should complete more quickly, and should put
less strain on the cluster.
Let vnode crash on exit signal
If a linked process, such as the merge index process or worker pool,
crashes, then restart the vnode since its pid values will be obsolete.
Let core extract the chash
The fact that the chash is needed is a core detail so keep it there.
Factor general filter code into Core
Factor out the repair filter code shared by Search & KV into Core.

@ghost ghost assigned rzezeski Jun 11, 2012

rzezeski added some commits Jun 12, 2012

Make async fold truly async
Need to use the iterator API for `merge_index` to prevent blocking the
`mi_server` instance.
@rzezeski

rzezeski Jun 12, 2012

Contributor

The following Merge Index PR is required for this to work: basho/merge_index#20


@rzezeski rzezeski merged commit 40ea394 into master Jun 19, 2012
