non-blocking iterator over entire index #20

Merged
merged 4 commits into master on Jun 19, 2012

Conversation

Contributor

rzezeski commented Jun 12, 2012

This is needed for async folding, which in turn is needed by the new repair functionality, which deals only with primary vnodes. You don't want repair to block a vnode from handling other requests.

rzezeski added some commits May 16, 2012

Add iterator API to iterate entire index in non-blocking fashion
This is used by async folds in order to allow repair without blocking
primary vnodes.

ghost assigned rzezeski Jun 12, 2012

rzezeski referenced this pull request in basho/riak_search Jun 12, 2012

Merged

Repair Command #107

src/mi_server.erl
+ %% build ordered iterator over all buffers + segments
+ %%
+ %% TODO: maybe don't bother with order here since this is for
+ %% repair?

slfritchie Jun 12, 2012

Contributor

Unless the order is extremely expensive computation-wise or I/O-wise, I suggest removing the TODO comment: other functions will probably use this feature in later releases.


rzezeski Jun 13, 2012

Contributor

Removed.

src/mi_server.erl
+    F2 = fun(Segment, Acc) ->
+            mi_locks:claim(mi_segment:filename(Segment), Acc)
+         end,
+    NewLocks1 = lists:foldl(F2, NewLocks, Segments),

slfritchie Jun 12, 2012

Contributor

This acquisition of buffer & segment locks happens multiple times in similar/identical code; seems like a DRY refactoring opportunity?

  • lookup
  • range
  • iterator (the new addition)

rzezeski Jun 13, 2012

Contributor

Been meaning to do this for a while, done.

Factor out buffer/segment locking code
There are multiple places where the buffers and segments need to be
locked.  Factor this code into a shared function.
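
For readers skimming the diff, here is a minimal sketch of what such a shared helper could look like. The name claim_all_locks, its arguments, and the use of mi_buffer:filename/1 are illustrative assumptions, not necessarily what the commit actually introduces.

    %% Hypothetical helper: claim a read lock on every open buffer and
    %% segment so none of them is deleted while a fold is in progress.
    %% Returns the updated lock table.
    claim_all_locks(Locks, Buffers, Segments) ->
        ClaimBuffer = fun(Buffer, Acc) ->
                          mi_locks:claim(mi_buffer:filename(Buffer), Acc)
                      end,
        ClaimSegment = fun(Segment, Acc) ->
                           mi_locks:claim(mi_segment:filename(Segment), Acc)
                       end,
        Locks1 = lists:foldl(ClaimBuffer, Locks, Buffers),
        lists:foldl(ClaimSegment, Locks1, Segments).
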
Contributor

slfritchie commented Jun 13, 2012

How worried should we be that the async fold worker/server's protocol for sending results to the client is strictly unidirectional and doesn't have any backpressure mechanism?

Contributor

rzezeski commented Jun 13, 2012

Scott, I think this is a concern since it's feeding the handoff sender, which has back-pressure and must send over the network. This same pipeline is used during queries, but at least there the pipeline probably flows better (I'm making an assumption here, as this can also traverse the network). If the index is large enough, and the disproportion between the handoff xfer and the local iterator is great enough, then the mailbox/process heap on the vnode worker will grow and potentially cause a crash.

I could modify iterate2 to only send N result chunks before waiting for a CONTINUE, and make a new result_iterator that sends a CONTINUE after each chunk it reads. The reason for N chunks is so that some data is preemptively loaded, so each call to the iterator doesn't have to wait on a msg req/response cycle. Does that make sense? What do you think?
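
A rough sketch of the producer side of that idea, to make the proposal concrete. The function names (stream/4, drain_continues/2), the message shapes ({Ref, {chunk, ...}}, {Ref, continue}, {Ref, done}), and the credit accounting are assumptions for illustration, not the actual merge_index code.

    %% Illustrative producer: stream result chunks to Pid, allowing at most
    %% N unacknowledged chunks in flight.  The iterator is represented as
    %% eof | {Chunk, NextFun} where NextFun() yields the next such value.
    stream(Pid, Ref, eof, _Credits) ->
        Pid ! {Ref, done},
        ok;
    stream(Pid, Ref, Iter, 0) ->
        %% All credits are used up; block until the consumer acknowledges
        %% a chunk with a continue msg.
        receive
            {Ref, continue} -> stream(Pid, Ref, Iter, 1)
        end;
    stream(Pid, Ref, {Chunk, NextFun}, Credits) ->
        Pid ! {Ref, {chunk, Chunk}},
        %% Absorb any acknowledgements that have already arrived.
        stream(Pid, Ref, NextFun(), drain_continues(Ref, Credits - 1)).

    %% Non-blocking sweep of the mailbox for pending continue messages.
    drain_continues(Ref, Credits) ->
        receive
            {Ref, continue} -> drain_continues(Ref, Credits + 1)
        after 0 ->
            Credits
        end.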

Contributor

slfritchie commented Jun 13, 2012

Ryan, I think that your proposal is a good one. It isn't necessary for iterate2 and the result iterator thing to be in lock-step with the backpressure from the handoff receiver & sender. But I think it is necessary for the MI iterator to eventually stop sending to the handoff sender if the handoff sender stops sending.

Contributor

rzezeski commented Jun 13, 2012

Yep, I agree that they don't need to be in lock-step. In fact, I'd consider back-pressure in MI to be orthogonal and good for all cases of iteration, including queries. However, for now I'll only update the iterator API and leave a TODO in the code about adding back-pressure to all MI APIs. That way I can have more time to adequately test for any performance regressions, whereas with repair/handoff I prefer robustness (queries already sorta have that since their scope is limited).

Add back pressure to iterator API
In order to prevent slower consumers from piling up result chunks, add
back pressure to the iterator API.  This is important because this API
is used for handoff & repair, both of which have back pressure and will
potentially not be able to consume data as fast as the local iterator
can produce it.

Currently, after every 4 result chunks sent, the MI server iterator
will wait until it sees a continue msg.  The number 4 was chosen
arbitrarily; the reason for not applying back pressure on a per-chunk
basis is to avoid some latency/overhead by reducing the number of msgs
that need to be sent.
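
On the consuming side, the result iterator described above might look roughly like the following. The names and message shapes match the producer sketch earlier in this conversation and are likewise assumptions, not the actual result_iterator implementation.

    %% Illustrative consumer: fold over chunks as they arrive, sending a
    %% continue after each one so the producer never has more than its
    %% credit limit (e.g. 4 chunks) outstanding in this process's mailbox.
    consume(Producer, Ref, Fun, Acc) ->
        receive
            {Ref, {chunk, Chunk}} ->
                Acc1 = Fun(Chunk, Acc),
                Producer ! {Ref, continue},
                consume(Producer, Ref, Fun, Acc1);
            {Ref, done} ->
                Acc
        end.

    %% Example wiring (illustrative): stream with 4 credits and collect
    %% the chunks into a list.
    %%   Ref = make_ref(),
    %%   Caller = self(),
    %%   Producer = spawn_link(fun() -> stream(Caller, Ref, Iterator, 4) end),
    %%   Results = consume(Producer, Ref, fun(C, Acc) -> [C | Acc] end, []).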

rzezeski merged commit 7261550 into master Jun 19, 2012
