
2i pagination support #540

Merged
merged 23 commits into from

5 participants

@russelldb
Owner

Adds a max_results option to the PB and HTTP 2i endpoints for limiting
the number of results returned for a 2i query. An opaque continuation
is returned if max_results results are returned. A
subsequent query that includes the continuation as a parameter will
return the next max_results results.

The results will be returned sorted ascending. In the case of $key
and $bucket queries, and equality queries, the results will be sorted by
primary key. In the case of range queries the results will be sorted by
index value, then primary key. Since the values must be returned from the backend for
sorting, there is a further option return_terms that will cause the
indexed terms to be returned to the user also, where relevant (range
queries on index fields).
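To make the ordering concrete, here is a small Python sketch of the two sort orders described above (illustrative data only, not the Erlang implementation):

```python
# Sketch of the 2i result ordering described above (illustrative data,
# not Riak's actual Erlang code).

# Range query results arrive as (index_value, primary_key) pairs and are
# sorted by index value first, then primary key.
range_results = [(2, "k3"), (1, "k2"), (2, "k1"), (1, "k9")]
range_sorted = sorted(range_results)  # tuples sort by term, then key

# $key/$bucket and equality queries return bare keys, sorted by primary key.
eq_results = ["k3", "k1", "k2"]
eq_sorted = sorted(eq_results)

print(range_sorted)  # [(1, 'k2'), (1, 'k9'), (2, 'k1'), (2, 'k3')]
print(eq_sorted)     # ['k1', 'k2', 'k3']
```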

Sorting is performed by the module sms.erl originally written by Reid
Draper. It is used as a buffer by riak_kv_index_fsm.
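The merge-sort buffering can be pictured like this hedged Python sketch: each coverage vnode streams its results already sorted, and the buffer merges them into one globally sorted stream. `heapq.merge` plays the role of the buffer here; sms.erl also tracks which sources are done and emits incrementally, which this glosses over.

```python
import heapq

# Two vnodes, each streaming (term, key) results in sorted order.
vnode_a = [(1, "k1"), (3, "k2")]
vnode_b = [(2, "k5"), (4, "k1")]

# Merge the pre-sorted streams into one globally sorted result stream.
merged = list(heapq.merge(vnode_a, vnode_b))
print(merged)  # [(1, 'k1'), (2, 'k5'), (3, 'k2'), (4, 'k1')]
```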

A new vnode function stop_fold has been added so the fsm can tell
the backend to break out of the fold when enough results have been
gathered.

The option to stream 2i results has also been added. Adding the
stream option to either HTTP or PB queries will cause the API
endpoints to return chunks of results as they are received from the
fsm. This can lead to much lower perceived latencies, as the application
receives the first results of a large set much sooner.

An example HTTP query:

curl "http://127.0.0.1:10018/buckets/2ibucket/index/field2_int/1/999?return_terms=true&max_results=100"

This will return the first 100 {value, key} pairs for the field2_int
index, sorted by value, then primary key. If 100 results are returned, it will
also return an opaque continuation that may be submitted to continue
the query:

curl "http://127.0.0.1:10018/buckets/2ibucket/index/field2_int/1/999?return_terms=true&continuation=some_gibberish_looking_value&max_results=100"

A further continuation may be returned with the results.
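The paging protocol above can be modelled in a few lines. This is a toy in-memory Python sketch; the names and data are illustrative, not Riak's actual API. The server returns at most max_results sorted results plus a continuation when the page is full; a follow-up query resumes strictly after it.

```python
# Toy model of max_results + continuation paging over a sorted result set.
DATA = sorted(f"key{i:03d}" for i in range(10))

def query(max_results, continuation=None):
    """Return at most max_results results after `continuation`, plus a
    new continuation when the page is full (i.e. there may be more)."""
    remaining = [k for k in DATA if continuation is None or k > continuation]
    page = remaining[:max_results]
    cont = page[-1] if len(page) == max_results else None
    return page, cont

# Page through the whole set, 4 results at a time.
collected, cont = [], None
while True:
    page, cont = query(4, cont)
    collected.extend(page)
    if cont is None:
        break
print(collected == DATA)  # True
```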


Riak tests can be found at https://github.com/basho/riak_test/tree/pt29-2i-pagination
Riak-Erlang-Client support at https://github.com/basho/riak-erlang-client/tree/pt29-2i-pagination

NOTE: requires
https://github.com/basho/riak_core/tree/pt29-2i-pagination
https://github.com/basho/riak_pb/tree/pt29-2i-pagination

russelldb added some commits
@russelldb russelldb Feature request, optionally return indexed values with keys
When a 2i query's results are a list of keys. This change adds
an optional request parameter to pb and http that, when set, will
cause pairs of `key:index_value` to be returned as the result of
a 2i query.

Has no effect on MapReduce. Uses capabilities for rolling upgrade
situations.
1b98b9a
@russelldb russelldb Add pagination to 2i results
Adds a `max_results` option to PB and HTTP 2i endpoints for limiting
the number of results returned for a 2i query.  Will return an opaque
`continuation` if there are at least `max_results` results returned. A
subsequent query that includes the `continuation` as a parameter will
return the next `max_results` results.

The results will be returned sorted ascending. In the case of $key
and $bucket queries, and equals queries, the results will be sorted by
primary key. In the case of range queries the results will be sorted by
index value. Since the values must be returned from the backend for
sorting, there is a further option `return_terms` that will cause the
indexed terms to be returned to the user also, where relevant (range
queries on index fields).

Sorting is performed by the module `sms.erl` originally written by Reid
Draper. It is used as a buffer by `riak_kv_index_fsm`.

A new vnode function `stop_fold` has been added so the fsm can tell
the backend to break out of the fold when enough results have been
gathered.

The option to stream 2i results has also been added. Adding the
`stream` option to either HTTP or PB queries will cause the API
endpoints to return chunks of results as they are received from the
fsm. This can lead to much lower perceived latencies as the application
receives the first results of a large set much sooner.

An example HTTP query: `curl
"http://127.0.0.1:10018/buckets/2ibucket/index/field2_int/1/100?return_terms=true"`
Will return the first 100 {value, key} pairs for the `field2_int`
index, sorted by value.  If there are 100 results returned, it will
also return an opaque `continuation` that may be submitted to continue
the query: `curl
"http://127.0.0.1:10018/buckets/2ibucket/index/field2_int/1/100?return_terms=true&continuation=some_gibberish_looking_value"`
A further `continuation` may be returned with the results.
b0bc9fb
@russelldb russelldb Remove lager info message
Spotted lager info message left over from dev, excised it.
81409cf
@russelldb russelldb Streaming 2i results for HTTP was broken
I have no idea how. Tests passed, then they didn't. I guess a
rebase went wrong.
95db737
@russelldb russelldb WIP fold objects over bucket for riak cs 65e1adb
@russelldb russelldb WIP add return_body support for riak_cs ed534dd
@russelldb russelldb WIP fix typo of indexes / index that meant $key/$bucket never match a8fcd50
@russelldb russelldb Don't populate index record with bucket name as start and end term
We use keys for the range on buckets
73af44b
@russelldb russelldb Add return body option for riak cs bucket folds 238a7d8
@russelldb russelldb was assigned
russelldb added some commits
@russelldb russelldb Remove stray debugging lager info message 58bdeb1
@russelldb russelldb Fix bug where legacy queries never match in eleveldb
The wrong parameter was used to match in the function head of
to_first_key
81e6c6d
@russelldb russelldb Upgrade legacy queries to cut down the number of `index` fun heads 8f31ae3
@russelldb russelldb Add cs return body support to `memory` backend
Even though not required, backend feature parity makes sense,
more so if the features are to be added to `backend_eqc`.

Refactor `index | object _key_in_range` funs for common use.
Removed stray debugging `lager:info/1` message.
ac1755d
@evanmcc evanmcc was assigned
@engelsanchez engelsanchez was assigned
@engelsanchez engelsanchez commented on the diff
include/riak_kv_index.hrl
@@ -0,0 +1,13 @@
+%% Index query records
@engelsanchez Collaborator

Don't forget to stick a licence header or the lawyers will get you!

src/riak_index.erl
((4 lines not shown))
+%% v2 2i queries
+%% @doc Create an index query of the current highest version supported by
+%% cluster capability.
+%% This 6 arity version is for the new (temporarily unsupported) `return_body' feature
+%% for CS.
+%% @see riak_kv_pb_csbucket
+to_index_query(IndexField, Args, Continuation, ReturnBody, {Start, StartInc}, {End, EndInc}) ->
+ BaseQuery = ?KV_INDEX_Q{return_body=ReturnBody,
+ start_key=Start, start_inclusive=StartInc,
+ end_term=End, end_inclusive=EndInc},
+ to_index_query(IndexField, Args, Continuation, BaseQuery).
+
+%% @doc Create an index quey of the current highest version supported by
+%% cluster capability.
+%% `IndexField' is either a user supplied term or one of
+%% the inbuilt indexes (`<<"$key">>' and <<"$bucket">>').
@engelsanchez Collaborator

A little mismatched edoc quote here around bucket.

src/riak_kv_app.erl
@@ -144,6 +145,10 @@ start(_Type, _StartArgs) ->
[v1, v0],
v0),
+ riak_core_capability:register({riak_kv, '2i_version'},
@engelsanchez Collaborator

Consider using an atom name that does not require quotes. It would be one more thing that requires Erlang knowledge when overriding this in app.config. You can see how that could be a pain in the neck for users. Also, not sure 2i is prominent as a name in the docs. secondary_index_version?

src/riak_index.erl
((17 lines not shown))
+%% cluster capability.
+%% `IndexField' is either a user supplied term or one of
+%% the inbuilt indexes (`<<"$key">>' and <<"$bucket">>').
+%% `Args' is a list or either a start and end for a range, or a single
+%% value for equality.
+%% `Continuation' is the opaque continuation that may have been
+%% returned from a previous query, or `undefined'.
+%% @see make_continuation/1
+to_index_query(IndexField, Args, Continuation) ->
+ to_index_query(IndexField, Args, Continuation, ?KV_INDEX_Q{}).
+
+to_index_query(IndexField, Args, Continuation, BaseQuery) ->
+ Version = riak_core_capability:get({riak_kv, '2i_version'}, v1),
+ Query = to_index_query(IndexField, Args),
+ case {Version, Query} of
+ {_Any, {error, _Reson}=Error} -> Error;
@engelsanchez Collaborator

Little harmless typo: _Reson -> _Reason

src/riak_index.erl
((130 lines not shown))
+
+%% @doc To enable pagination.
+%% returns an opaque value that
+%% must be passed to
+%% `to_index_query/3' to
+%% get a query that will "continue"
+%% from the given last result
+-spec make_continuation(list()) -> continuation().
+make_continuation([]) ->
+ undefined;
+make_continuation(L) ->
+ Last = lists:last(L),
+ base64:encode(term_to_binary(Last)).
+
+%% @doc decode a continuation received from the outside world.
+-spec decode_contintuation(continuation() | undefined) -> last_result() | undefined.
@engelsanchez Collaborator

Props for being consistent throughout with this typo! decode_contintuation -> decode_continuation

@russelldb Owner

:blush: Sorry.
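For reference, the encode/decode pair under discussion amounts to something like this Python analogue of make_continuation/decode_continuation, with pickle standing in for term_to_binary. Purely illustrative, not the riak_index code.

```python
import base64
import pickle

def make_continuation(results):
    """Base64-encode a serialized copy of the last result; an empty
    result set yields no continuation (Erlang's `undefined`)."""
    if not results:
        return None
    return base64.b64encode(pickle.dumps(results[-1])).decode("ascii")

def decode_continuation(cont):
    """Decode a continuation received from the outside world."""
    if cont is None:
        return None
    return pickle.loads(base64.b64decode(cont))

c = make_continuation([("term1", "key1"), ("term2", "key2")])
print(decode_continuation(c))  # ('term2', 'key2')
print(make_continuation([]))   # None
```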

src/riak_index.erl
((39 lines not shown))
+%% @doc upgrade a V1 Query to a v2 Query
+make_v2_query({eq, ?BUCKETFIELD, _Bucket}, Q) ->
+ Q?KV_INDEX_Q{filter_field=?BUCKETFIELD, return_terms=false};
+make_v2_query({eq, ?KEYFIELD, Value}, Q) ->
+ Q?KV_INDEX_Q{filter_field=?KEYFIELD, start_key=Value, start_term=Value,
+ end_term=Value, return_terms=false};
+make_v2_query({eq, Field, Value}, Q) ->
+ Q?KV_INDEX_Q{filter_field=Field, start_term=Value, end_term=Value, return_terms=false};
+make_v2_query({range, ?KEYFIELD, Start, End}, Q) ->
+ Q?KV_INDEX_Q{filter_field=?KEYFIELD, start_term=Start, start_key=Start,
+ end_term=End, return_terms=false};
+make_v2_query({range, Field, Start, End}, Q) ->
+ Q?KV_INDEX_Q{filter_field=Field, start_term=Start,
+ end_term=End};
+make_v2_query(V1Q, _) ->
+ {error, {invalid_v1_query, V1Q}}.
@engelsanchez Collaborator

The usual Erlang controversy: crash or return an error here. Notice how this is directly passed to apply_continuation above, which would end up in a continuation error. Also, the result of this function is sometimes directly passed to some folding functions with side effects quickly getting out of control. Consider throwing or letting it crash instead.

@russelldb Owner

This is just the existing behaviour, isn't it?

@engelsanchez Collaborator

make_v2_query is new. I'm talking about how it returns either a new query record or an error tuple, and it's used without matching in places like this: https://github.com/basho/riak_kv/pull/540/files#L2R306 or here through update_query https://github.com/basho/riak_kv/pull/540/files#L5R692
Versus returning something like {ok, NewVal} | {error, Reason} and matching on that and failing right at that point if something went wrong. Instead, the error would not crash immediately, but would be passed down and end up wrapped in some other error down the line.

@russelldb Owner

I see. {ok, Q} | {error, Reason} then {ok, Q} = make_v2_query(V1Q, V2Q) where it is used for bad match crashing?

@engelsanchez Collaborator

Yes, exactly. That would be more idiomatic and should fail with a message close to the original problem.

@russelldb Owner

Word! Will do.

src/riak_index.erl
((85 lines not shown))
+ when FF =:= ?KEYFIELD;
+ FF =:= ?BUCKETFIELD ->
+ true;
+return_body(_) ->
+ false.
+
+%% @doc is an index key in range for a 2i query?
+index_key_in_range({Bucket, Key, Field, Term}=IK, Bucket,
+ ?KV_INDEX_Q{filter_field=Field,
+ start_key=StartKey,
+ start_inclusive=StartInc,
+ start_term=StartTerm,
+ end_term=EndTerm})
+ when Term >= StartTerm,
+ Term =< EndTerm ->
+ in_range(gt(StartInc, Key, StartKey), true, IK);
@engelsanchez Collaborator

It might be worth doing some performance timing of this function and its use of in_range() and gt(). They sound like good candidates to be inlined or turned into macros, since they could be called quite a lot. I'll do some of that after reviewing the code, but I was wondering if you have already.

@russelldb Owner

I haven't, no. I did a little b_b on the whole PR, but no micro benching.

engelsanchez added some commits
@engelsanchez engelsanchez Add EQC test for sms (merge sort) module b568d76
@engelsanchez engelsanchez Fix 2 edge cases found during testing
* Allow add_results to be called repeatedly with done, it's harmless.
* Allow sms to be called when all sources are done without crashing.
cc4d295
src/riak_kv_vnode.erl
((13 lines not shown))
FilterVNodes, Sender, Opts, State);
false ->
{reply, {error, {indexes_not_supported, Mod}}, State}
end.
%% Convenience for handling both v3 and v4 coverage-based key fold operations
-handle_coverage_keyfold(Bucket, ItemFilter, ResultFun,
+handle_coverage_keyfold(Bucket, ItemFilter, Query,
+ FilterVNodes, Sender, State,
+ ResultFunFun) ->
+ handle_coverage_keyfold(fold_keys, Bucket, ItemFilter, Query,
+ FilterVNodes, Sender, State, ResultFunFun).
+
+%% Until a bit of a refactor can occur to better abstract
+%% index operations, allow the ModFun for folding to be declared
+%% to support index operations that can return objects
+handle_coverage_keyfold(FoldType, Bucket, ItemFilter, ResultFun,
@engelsanchez Collaborator

We now have a bit of a misnomer here. Consider changing to something like handle_coverage_fold instead now that we do both key only and key/value folds?

@russelldb Owner

I'm not sure. I get your meaning…but…hmmm…I think really a deeper refactor is needed. The query should dictate the fold type, probably the backend should decide. I see this as a hack / halfway house. Changing the name might even add to the confusion? I totally agree with you, but think there is some more discussion to be had.

src/riak_kv_wm_index.erl
@@ -45,9 +45,13 @@
client, %% riak_client() - the store client
riak, %% local | {node(), atom()} - params for riak client
bucket, %% The bucket to query.
- index_query %% The query..
+ index_query, %% The query..
+ max_results :: all | pos_integer(), %% maximum nunber of 2i results to return, the page size.
@engelsanchez Collaborator

Little typo here, nunber

@engelsanchez engelsanchez commented on the diff
src/riak_kv_wm_index.erl
@@ -89,24 +93,57 @@ malformed_request(RD, Ctx) ->
IndexField = list_to_binary(riak_kv_wm_utils:maybe_decode_uri(RD, wrq:path_info(field, RD))),
Args1 = wrq:path_tokens(RD),
Args2 = [list_to_binary(riak_kv_wm_utils:maybe_decode_uri(RD, X)) || X <- Args1],
+ ReturnTerms0 = wrq:get_qs_value(?Q_2I_RETURNTERMS, "false", RD),
+ ReturnTerms = normalize_boolean(string:to_lower(ReturnTerms0)),
@engelsanchez Collaborator

Here normalize_boolean considers anything not in (0, no, false) to be true. This easily ends up frustrating clients due to confusion. Consider having a whitelist, that is, allowing only "1", "yes", "true" as true values. I have not looked around to see what, if any, other boolean values we are using through HTTP here. Have you? It would be good to stay standard if we have any other parameters of the same kind.

@seancribbs Owner

:+1: @engelsanchez Please use a single value there. Our client interfaces are leaky enough already.

@engelsanchez Collaborator

Looking around, it seems we use strict "true", "false" (and a third "default") in riak_kv_wm_object (see riak_kv_wm_object:malformed_boolean_param) for booleans. It would be better if we stick to that I think.
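A strict parser along the lines suggested here might look like the following Python sketch of the proposed behaviour (not the Erlang code): accept only the exact strings "true" and "false", and reject everything else instead of silently treating it as true.

```python
def parse_boolean_param(value):
    """Whitelist boolean parsing: only "true"/"false" (case-insensitive)
    are accepted; anything else is rejected with an error."""
    normalized = value.strip().lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    raise ValueError(f'Invalid boolean parameter: "{value}"')

print(parse_boolean_param("TRUE"))   # True
print(parse_boolean_param("false"))  # False
try:
    parse_boolean_param("bum")       # rejected rather than treated as true
except ValueError as e:
    print(e)
```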

@engelsanchez engelsanchez commented on the diff
src/riak_kv_wm_index.erl
@@ -127,19 +164,121 @@ encodings_provided(RD, Ctx) ->
%% @spec produce_index_results(reqdata(), context()) -> {binary(), reqdata(), context()}
%% @doc Produce the JSON response to an index lookup.
produce_index_results(RD, Ctx) ->
- %% Extract vars...
+ case wrq:get_qs_value("stream", "false", RD) of
+ "true" ->
@engelsanchez Collaborator

Whereas here stream has to be true to work, and anything else is considered false. We should try to stay consistent.

src/riak_kv_wm_index.erl
((46 lines not shown))
+ {ReqID, {results, Results}} ->
+ %% JSONify the results
+ JsonResults = encode_results(ReturnTerms, Results),
+ Body = ["\r\n--", Boundary, "\r\n",
+ "Content-Type: application/json\r\n\r\n",
+ JsonResults],
+ LastResult1 = last_result(Results),
+ Count1 = Count + length(Results),
+ {iolist_to_binary(Body),
+ index_stream_helper(ReqID, Boundary, ReturnTerms, MaxResults, LastResult1, Count1)};
+ {ReqID, Error} ->
+ lager:error("Error in index wm: ~p", [Error]),
+ Body = ["\r\n--", Boundary, "\r\n",
+ "Content-Type: text/plain\r\n\r\n",
+ "there was an error..."],
+ {iolist_to_binary(Body), done}
@engelsanchez Collaborator

I'm no HTTP expert so I'm not sure what the best way to return an error while streaming index results is. Did you check with Mr. @seancribbs ?

@engelsanchez Collaborator

Discussed with Sean Cribbs offline. The recommendation is to definitely return a JSON response here. Something reasonable would be:
{ "error" : "Some reason" }

@seancribbs Owner

loese

@russelldb Owner

behoove always wins

@engelsanchez engelsanchez commented on the diff
src/riak_kv_wm_index.erl
((95 lines not shown))
+ mochijson2:encode(JsonKeys2);
+encode_results(false, Results, Continuation) ->
+ JustTheKeys = filter_values(Results),
+ JsonKeys1 = {struct, [{?Q_KEYS, JustTheKeys}] ++ mochify_continuation(Continuation)},
+ mochijson2:encode(JsonKeys1).
+
+mochify_continuation(undefined) ->
+ [];
+mochify_continuation(Continuation) ->
+ [{?Q_2I_CONTINUATION, Continuation}].
+
+filter_values([]) ->
+ [];
+filter_values([{_, _} | _T]=Results) ->
+ [K || {_V, K} <- Results];
+filter_values(Results) ->
@engelsanchez Collaborator

What other type of results are you expecting here?

@russelldb Owner

Erm…asking the past…maybe just keys from a backend that isn't v2 yet?

@engelsanchez engelsanchez commented on the diff
src/riak_kv_index_fsm.erl
@@ -77,14 +82,58 @@ req(Bucket, ItemFilter, Query) ->
%% should cover, the service to use to check for available nodes,
%% and the registered name to use to access the vnode master process.
init(From={_, _, _}, [Bucket, ItemFilter, Query, Timeout]) ->
+ %% http://erlang.org/doc/reference_manual/expressions.html#id77404
+ %% atom() > number()
@engelsanchez Collaborator

I'd recommend not to leave this note here. I won't stop you from tattooing this on your wrist though.

@russelldb Owner

Previous reviewer recommended adding it, you guys have me in a spin!

@engelsanchez Collaborator

Ha! Bah, ignore me then. It doesn't matter.

src/riak_kv_index_fsm.erl
((16 lines not shown))
+plan(CoverageVNodes, State) ->
+ {ok, State#state{merge_sort_buffer=sms:new(CoverageVNodes)}}.
+
+process_results(_VNode, {error, Reason}, _State) ->
+ {error, Reason};
+process_results(_VNode, {From, _Bucket, _Results}, State=#state{max_results=X, results_sent=Y}) when Y >= X ->
+ riak_kv_vnode:stop_fold(From),
+ {done, State};
+process_results(VNode, {From, Bucket, Results}, State) ->
+ {ok, State2} = process_results(VNode, {Bucket, Results}, State),
+ riak_kv_vnode:ack_keys(From),
@engelsanchez Collaborator

I'm not sure if I got this right, but wouldn't it make sense to check for Count reaching max_results here and call stop_fold instead of ack_keys in that case? It seems we are waiting for the next batch of results to check the count.

@russelldb Owner

I guess that would be an optimisation, yes. I'll do it!

@russelldb Owner

Better still would be to tell all vnodes to stop_fold at once.

src/riak_kv_index_fsm.erl
((18 lines not shown))
+ {ok, State#state{merge_sort_buffer=sms:new(CoverageVNodes)}}.
+
+process_results(_VNode, {error, Reason}, _State) ->
+ {error, Reason};
+process_results(_VNode, {From, _Bucket, _Results}, State=#state{max_results=X, results_sent=Y}) when Y >= X ->
+ riak_kv_vnode:stop_fold(From),
+ {done, State};
+process_results(VNode, {From, Bucket, Results}, State) ->
+ {ok, State2} = process_results(VNode, {Bucket, Results}, State),
+ riak_kv_vnode:ack_keys(From),
+ {ok, State2};
+process_results(VNode, {_Bucket, Results}, State) ->
+ #state{merge_sort_buffer=MergeSortBuffer,
+ from={raw, ReqId, ClientPid}, results_sent=ResultsSent, max_results=MaxResults} = State,
+ %% add new results to buffer
+ BufferWithNewResults = sms:add_results(VNode, lists:reverse(Results), MergeSortBuffer),
@engelsanchez Collaborator

So the FSM reverses results as they are received, and the client reverses the entire list once it has received everything (https://github.com/basho/riak_kv/pull/540/files#L1R584). Without really thinking too much about the details, I wonder if we can optimize the code to minimize that.

@russelldb Owner

I'm open to suggestions. I admit it seems sort of wasteful and is apparently suboptimal…I haven't measured the cost.

@engelsanchez Collaborator

I think it's OK to put my little optimization notes in the backburner. Once we verify this as correct and are happy with the test coverage, we can try to squeeze the extra cycles in a different PR.

src/riak_kv_index_fsm.erl
((31 lines not shown))
+ from={raw, ReqId, ClientPid}, results_sent=ResultsSent, max_results=MaxResults} = State,
+ %% add new results to buffer
+ BufferWithNewResults = sms:add_results(VNode, lists:reverse(Results), MergeSortBuffer),
+ ProcessBuffer = sms:sms(BufferWithNewResults),
+ {NewBuffer, Sent} = case ProcessBuffer of
+ {[], BufferWithNewResults} ->
+ {BufferWithNewResults, 0};
+ {ToSend, NewBuff} ->
+ DownTheWire = case (ResultsSent + length(ToSend)) > MaxResults of
+ true ->
+ lists:sublist(ToSend, MaxResults - ResultsSent);
+ false ->
+ ToSend
+ end,
+ ClientPid ! {ReqId, {results, DownTheWire}},
+ {NewBuff, length(DownTheWire)}
@engelsanchez Collaborator

As a minor optimization you can avoid computing the O(n) length of the list here again (pass the length in the case statement above).

src/riak_kv_pb_csbucket.erl
@@ -0,0 +1,122 @@
+%% -------------------------------------------------------------------
+%%
+%% riak_kv_pb_index: Expose secondary index queries to Protocol Buffers
+%%
+%% Copyright (c) 2012 Basho Technologies, Inc. All Rights Reserved.
@engelsanchez Collaborator

Already outdated!

src/riak_kv_pb_csbucket.erl
((23 lines not shown))
+%% This covers the following request messages:</p>
+%%
+%% <pre>
+%% 40 - RpbCSBucketReq
+%% </pre>
+%%
+%% <p>This service produces the following responses:</p>
+%%
+%% <pre>
+%% 41 - RpbCSBucketResp
+%% </pre>
+%% @end
+
+-module(riak_kv_pb_csbucket).
+
+-include_lib("../../riak_pb/include/riak_kv_pb.hrl"). %% @TODO reset this
@engelsanchez Collaborator

You should be able to use -include_lib("riak_pb/include/riak_kv_pb.hrl"). here, no?

@russelldb Owner

yes, see the TODO, I shoulda todone it

src/riak_kv_pb_csbucket.erl
((74 lines not shown))
+ {error, {format, Reason}, State};
+maybe_perform_query({ok, Query}, Req, State) ->
+ #rpbcsbucketreq{bucket=Bucket, max_results=MaxResults} = Req,
+ #state{client=Client} = State,
+ {ok, ReqId} = Client:stream_get_index(Bucket, Query, [{max_results, MaxResults}]),
+ {reply, {stream, ReqId}, State#state{req_id=ReqId, req=Req}}.
+
+
+%% @doc process_stream/3 callback. Handle streamed responses
+process_stream({ReqId, done}, ReqId, State=#state{req_id=ReqId,
+ continuation=Continuation,
+ req=Req,
+ result_count=Count}) ->
+ %% Only add the continuation if there may be more results to send
+ #rpbcsbucketreq{max_results=MaxResults} = Req,
+ Resp = case is_integer(MaxResults) andalso Count =:= MaxResults of
@engelsanchez Collaborator

I believe this is correct, but consider doing Count >= MaxResults for total future proofing paranoia. Same with riak_kv_pb_index actually.

@engelsanchez engelsanchez commented on the diff
src/riak_kv_pb_csbucket.erl
((101 lines not shown))
+ Results = [encode_result(Bucket, {K, V}) || {o, K, V} <- Results0],
+ {reply, #rpbcsbucketresp{objects=Results},
+ State#state{continuation=Continuation, result_count=Count2}};
+process_stream({ReqId, Error}, ReqId, State=#state{req_id=ReqId}) ->
+ {error, {format, Error}, State#state{req_id=undefined}};
+process_stream(_,_,State) ->
+ {ignore, State}.
+
+encode_result(B, {K, V}) ->
+ RObj = riak_object:from_binary(B, K, V),
+ Contents = riak_pb_kv_codec:encode_contents(riak_object:get_contents(RObj)),
+ VClock = pbify_rpbvc(riak_object:vclock(RObj)),
+ GetResp = #rpbgetresp{vclock=VClock, content=Contents},
+ #rpbindexobject{key=K, object=GetResp}.
+
+pbify_rpbvc(Vc) ->
@engelsanchez Collaborator

This will need revisiting once the Vclock encoding is merged! Add a TODO note somewhere. We're going to merge that one today.

@engelsanchez Collaborator

OK, vclock encoding is on master now. Take a look at riak_pb_object.erl for the new way of doing this (basically use riak_object encode_vclock/decode_vclock).

@russelldb Owner

But that means merging with master to test. Sheesh! Or maybe…should I rebase this branch and push??

@engelsanchez
Collaborator

There is one funny side effect of the way max results and continuations work: Currently we make no attempt to deduplicate results from a 2i query. So, if item 1 is indexed on field f1 for values 1,2,3, and you do a range query from 1 to 3, you will get the same item 3 times. Now, if you specify max results as 2, you get the value twice, plus a continuation. If you query the continuation, you get nothing. Since we are now sorting the results, deduplicating becomes very cheap. I'm wondering if we should start doing it. On the other hand, if users rely on this property and use the number of duplicates for something, continuations might break that. Thoughts?

@engelsanchez
Collaborator

More on the duplicates being dropped behaviour: if return_terms is true the duplicates that we are dropping may contain information useful to the client. Or at least it's easier to imagine they might care about that, since some terms will sometimes be retrieved, sometimes lost.

@russelldb
Owner

I don't understand. Can you provide a test or some minimal set of data and a query that reproduce the condition, please?

@engelsanchez
Collaborator

https://gist.github.com/engelsanchez/5630275
While I was playing with this I noticed the above: a single entry indexed by 3 values (1,2,3). A range query from 1-3 will return 3 items. With return_terms, each item is tagged with one of (1,2,3). With max results = 2, you get two of those, but of course the continuation won't get that key again, so you don't get the third one.

I'm not saying this is bad or unintended. Just that it might interfere with what the users of return_terms might want to do with that. Maybe ask them directly if it matters?

@russelldb
Owner

It is bad and unintended. I really hadn't noticed it or thought of it. I'm not sure what the answer is, except to expose the start_key_incl option to the end user, and even then it is a bit of a guess on their part.

@russelldb
Owner

OK. I think this is a straight up bug from considering the key and the index value in isolation. Let me work on it this morning, I have some ideas.
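The bug can be reproduced with a toy model: one key indexed under three terms, so a range query yields three (term, key) pairs. A continuation that records only the key loses the remaining terms on resume, while a continuation over the full (term, key) pair resumes correctly. Python sketch with illustrative names, not the riak_kv code:

```python
# One key ("item1") indexed under terms 1, 2 and 3; a range query 1..3
# yields three (term, key) pairs.
RESULTS = sorted([(1, "item1"), (2, "item1"), (3, "item1")])

def page_by_key(max_results, last_key=None):
    """Buggy resume: considers the key in isolation."""
    remaining = [r for r in RESULTS if last_key is None or r[1] > last_key]
    return remaining[:max_results]

def page_by_term_and_key(max_results, last=None):
    """Fixed resume: considers term and key in concert."""
    remaining = [r for r in RESULTS if last is None or r > last]
    return remaining[:max_results]

first = page_by_term_and_key(2)                 # [(1, 'item1'), (2, 'item1')]
print(page_by_key(2, last_key=first[-1][1]))    # [] -- third term lost
print(page_by_term_and_key(2, last=first[-1]))  # [(3, 'item1')] -- correct
```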

@russelldb russelldb Consider key and term in concert when deciding if an item is in range
A bug found by @EngelSanchez, where `start_incl` being false
(when using a continuation, for example) led to in range values
being excluded as the key was checked in isolation.
c2d6d2e
@russelldb
Owner

Pooshed a fix for the dropped-duplicate-key-after-continuation bug (good catch BTW! Glad we didn't ship that). Pooshed a test for same to riak_test.

Would you say now is a good time to address your other comments (i.e. are you through with the majority of the review?)

@engelsanchez
Collaborator

I've verified the fix works on my example. Good work man! Simple, small and effective fix. Will keep banging on it to see what else I can find.

russelldb added some commits
@russelldb russelldb Address many review comments (see body of commit message)
Clean-up and slightly optimise the index fsm to not recalculate
the same list length more than once. Shortcut sending the `stop_fold`
message where possible (i.e. when we notice we have enough results).

Add JSON error to 2i HTTP stream
Update all copyright notices
Normalise boolean parameters to strict true | false
94f14fd
@russelldb russelldb Rename key fold to general fold, since it does objects too now 533771c
@engelsanchez
Collaborator

A couple more things: remember to look at my branch adding an EQC test for sms, which also contains a couple of proposed edge case fixes: https://github.com/basho/riak_kv/commits/eas-add-sms-eqc

And you will need to merge from master before the PR gets its green button again (there is a little conflict in riak_kv_app where capabilities are added). Please do that before the final :+1: so that I can go over the final version of the code.

@engelsanchez
Collaborator

I'm having trouble using max_results and return_terms together through the HTTP interface

curl http://localhost:10018/buckets/bucket/index/i1_int/1/4?return_terms=true&max_results=2
{"results":[{"1":"1"},{"2":"1"},{"2":"2"},{"3":"1"},{"3":"2"},{"4":"2"}]}
curl http://localhost:10018/buckets/bucket/index/i1_int/1/4?max_results=2&return_terms=true
{"keys":["1","1"],"continuation":"g2gCYQJtAAAAATE="}

Also, validation doesn't seem to kick in for the second parameter. Maybe I'm doing something stupid, but I haven't figured it out. I'll debug this more right after lunch.

@russelldb
Owner

This is why I hate changes on review. I tested this stuff in the past. Ok. I'll add some more tests to riak_test and see what is up. Thanks.

@russelldb
Owner

Huh. Works for me. I enquoted my url+query params thussly:

curl "http://localhost:8098/buckets/mybucket/index/field1_int/1/4?max_results=2&return_terms=true"
{"results":[{"1":"1"},{"2":"2"}],"continuation":"g2gCYQJtAAAAATI="}

and

curl "http://localhost:8098/buckets/mybucket/index/field1_int/1/4?return_terms=true&max_results=2"
{"results":[{"1":"1"},{"2":"2"}],"continuation":"g2gCYQJtAAAAATI="}

and

curl "http://localhost:8098/buckets/mybucket/index/field1_int/1/4?max_results=2&return_terms=bum"
Invalid "return_terms". "bum" is not a boolean

and

curl "http://localhost:8098/buckets/mybucket/index/field1_int/1/4?return_terms=false&max_results=hippo"
Invalid "max_results". "hippo" is not a positive integer

Are all your branches up-to-date? I'll push the latest, which merges master and your EQC test (thanks for that, great work!)

/me shrugs

@engelsanchez
Collaborator

Yep. I tried on the browser and it works fine too. Nothing to see here, move along, nevermind... Sorry
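
The likely culprit here, for readers following along, is shell quoting rather than the API: an unquoted `&` ends the command, so `max_results` never reaches curl. A minimal sketch of the difference (the URL is illustrative, and `echo` stands in for curl so it runs anywhere):

```shell
url='http://localhost:10018/buckets/bucket/index/i1_int/1/4'

# Unquoted, the shell treats '&' as a command separator: it backgrounds
# `curl $url?return_terms=true` and then runs `max_results=2` as a plain
# variable assignment, so the server never sees max_results.
#
#   curl $url?return_terms=true&max_results=2
#
# Quoted, the whole query string is a single word and curl sees both
# parameters:
echo "$url?return_terms=true&max_results=2"
```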

@russelldb
Owner

Heh, better safe, no worries.

@engelsanchez
Collaborator

Ok, I think we can call it a day. I believe all comments were addressed. New testing has not uncovered any new problems. Ship it!
:+1: :dancer: :boat:

russelldb and others added some commits
@russelldb russelldb Fix bad match in query upgrade 2ccaae9
@russelldb russelldb Merge branch 'master' into pt29-2i-pagination
Conflicts:
	src/riak_kv_app.erl
	src/riak_kv_wm_raw.hrl
ee08440
@slfritchie slfritchie Fix test expectation, after index query changes return term in Key's place

Failing query test case:

    V1 = backend_eqc:init_backend(riak_kv_eleveldb_backend, false, [{data_root, "test/eleveldb-backend"}]).
    riak_kv_eleveldb_backend:put(<<"b2">>, <<"k1">>, [{add, <<"k_int">>, 0}], <<110>>, V1).
    riak_kv_eleveldb_backend:fold_keys(fun(B, K, Acc) -> [{B, K}|Acc] end, [],
        [{index,<<"b2">>,{range,<<"k_int">>,0,0}}], V1).

@rdb Please check to see how much effort vs. value might be required
to address the TODO comment.  If it's too much work now, then please
create an issue for fixing & removing that TODO item so that the matter
doesn't get lost.

After this commit, tested via:

    P = backend_eqc:property(riak_kv_eleveldb_backend, false, [{data_root, "test/eleveldb-backend"}]).
    eqc:quickcheck(eqc:testing_time(4*300, P)).
    [...]
    OK, passed 9670 tests
6934d65
@slfritchie
Owner

From the commit log:

@rdb Please check to see how much effort vs. value might be required
to address the TODO comment. If it's too much work now, then please
create an issue for fixing & removing that TODO item so that the matter
doesn't get lost.

@slfritchie
Owner

Oh, and the bug fixed by commit 6934d65 is something that only pops up every few minutes, in my experience with the backend_eqc test. The namespace (bucket + key + index) is likely too big to hit a case where a bucket + key + index is written and then sometime later a fold_keys is attempted with a query on the exact same index.

So, whenever a developer plays with areas that would affect the code that this test exercises, it would be extremely wise to run the backend_eqc test for a much longer period of time than the usual "make test" does. See my comment on that commit for an example.

@slfritchie slfritchie merged commit 6934d65 into from
@russelldb
Owner

Awesome @slfritchie. Thanks for finding and fixing. I think I'll fix that test properly during this testing cycle (add all the new query types, make it use V2 queries etc.) Thanks again. It was on the list to update the test, but time is cruel.

@engelsanchez engelsanchez deleted the branch
Commits on May 1, 2013
  1. @russelldb

    Feature request, optionally return indexed values with keys

    russelldb authored
    A 2i query's results are a list of keys. This change adds
    an optional request parameter to PB and HTTP that, when set, will
    cause pairs of `key:index_value` to be returned as the result of
    a 2i query.
    
    Has no effect on MapReduce. Uses capabilities for rolling upgrade
    situations.
  2. @russelldb

    Add pagination to 2i results

    russelldb authored
    Adds a `max_results` option to PB and HTTP 2i endpoints for limiting
    the number of results returned for a 2i query.  Will return an opaque
    `continuation` if there are at least `max_results` results returned. A
    subsequent query that includes the `continuation` as a parameter will
    return the next `max_results` results.
    
    The results will be returned sorted ascending. In the case of $key
    and $bucket queries, and equals queries, the results will be sorted by
    primary key. In the case of range queries the results will be sorted by
    index value. Since the values must be returned from the backend for
    sorting, there is a further option `return_terms` that will cause the
    indexed terms to be returned to the user also, where relevant (range
    queries on index fields).
    
    Sorting is performed by the module `sms.erl` originally written by Reid
    Draper. It is used as a buffer by `riak_kv_index_fsm`.
    
    A new vnode function `stop_fold` has been added so the fsm can tell
    the backend to break out of the fold when enough results have been
    gathered.
    
    The option to stream 2i results has also been added. Adding the
    `stream` option to either HTTP or PB queries will cause the API
    endpoints to return chunks of results as they are received from the
    fsm. This can lead to much lower perceived latencies as the application
    receives the first results of a large set much sooner.
    
    An example HTTP query: `curl
    "http://127.0.0.1:10018/buckets/2ibucket/index/field2_int/1/100?return_terms=true"`
    will return the first 100 {value, key} pairs for the `field2_int`
    index, sorted by value.  If there are 100 results returned, it will
    also return an opaque `continuation` that may be submitted to continue
    the query: `curl
    "http://127.0.0.1:10018/buckets/2ibucket/index/field2_int/1/100?return_terms=true&continuation=some_gibberish_looking_value"`
    A further `continuation` may be returned with the results.
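    The continuation flow described above can be sketched as a shell loop.
    `fetch_page` is a hypothetical stand-in that stubs two pages of
    responses; against a real node it would be `curl -s` with the same
    query string appended to the index URL:

```shell
# Stub for the 2i endpoint: first call returns a page with a
# continuation, second call (which carries the continuation) returns
# the final page. The JSON bodies mirror the shapes shown above.
fetch_page() {
    case "$1" in
        *continuation=*) echo '{"results":[{"3":"k3"}]}' ;;
        *)               echo '{"results":[{"1":"k1"},{"2":"k2"}],"continuation":"g2gC"}' ;;
    esac
}

pages=0
cont=''
while :; do
    q='max_results=2'
    [ -n "$cont" ] && q="$q&continuation=$cont"
    page=$(fetch_page "$q")
    pages=$((pages + 1))
    # Pull the opaque continuation out of the JSON body, if present.
    cont=$(printf '%s' "$page" | sed -n 's/.*"continuation":"\([^"]*\)".*/\1/p')
    [ -z "$cont" ] && break   # no continuation means this was the last page
done
```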
  3. @russelldb

    Remove lager info message

    russelldb authored
    Spotted lager info message left over from dev, excised it.
  4. @russelldb

    Streaming 2i results for HTTP was broken

    russelldb authored
    I have no idea how. Tests passed, then they didn't. I guess a
    rebase went wrong.
  5. @russelldb
  6. @russelldb
  7. @russelldb
  8. @russelldb

    Don't populate index record with bucket name as start and end term

    russelldb authored
    We use keys for the range on buckets
Commits on May 3, 2013
  1. @russelldb
Commits on May 7, 2013
  1. @russelldb
  2. @russelldb

    Fix bug where legacy queries never match in eleveldb

    russelldb authored
    The wrong parameter was used to match in the function head for
    `to_first_key`.
  3. @russelldb
Commits on May 9, 2013
  1. @russelldb

    Add cs return body support to `memory` backend

    russelldb authored
    Even though not required, backend feature parity makes sense,
    more so if the features are to be added to `backend_eqc`.
    
    Refactor `index | object _key_in_range` funs for common use.
    Removed stray debugging `lager:info/1` message.
Commits on May 21, 2013
  1. @engelsanchez
  2. @engelsanchez

    Fix 2 edge cases found during testing

    engelsanchez authored
    * Allow add_results to be called repeatedly with `done`; it's harmless.
    * Allow sms to be called when all sources are done without crashing.
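
    For context on what `sms` does here: it merges per-vnode result
    streams, each already sorted, into one globally sorted stream,
    buffering until every source has contributed. A rough coreutils
    analogue of the merge step (file names are illustrative):

```shell
# Each vnode emits its results already sorted by {index term, key}.
printf '1 k1\n3 k3\n' > vnode_a.txt
printf '2 k2\n4 k4\n' > vnode_b.txt

# sort -m merges pre-sorted inputs without re-sorting them, which is
# the same property the sms buffer relies on.
merged=$(sort -m vnode_a.txt vnode_b.txt)
echo "$merged"

rm vnode_a.txt vnode_b.txt
```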
Commits on May 23, 2013
  1. @russelldb

    Consider key and term in concert when deciding if an item is in range

    russelldb authored
    A bug found by @EngelSanchez, where `start_incl` being false
    (when using a continuation, for example) led to in range values
    being excluded as the key was checked in isolation.
Commits on May 24, 2013
  1. @russelldb

    Address many review comments (see body of commit message)

    russelldb authored
    Clean-up and slightly optimise the index fsm to not recalculate
    the same list length more than once. Shortcut sending the `stop_fold`
    message where possible (i.e. when we notice we have enough results).
    
    Add JSON error to 2i HTTP stream
    Update all copyright notices
    Normalise boolean parameters to strict true | false
  2. @russelldb
  3. @russelldb
  4. @russelldb

    Merge branch 'master' into pt29-2i-pagination

    russelldb authored
    Conflicts:
    	src/riak_kv_app.erl
  5. @russelldb
  6. @russelldb

    Merge branch 'master' into pt29-2i-pagination

    russelldb authored
    Conflicts:
    	src/riak_kv_app.erl
    	src/riak_kv_wm_raw.hrl
Commits on May 25, 2013
  1. @slfritchie

    Fix test expectation, after index query changes return term in Key's place

    slfritchie authored
    
    Failing query test case:
    
        V1 = backend_eqc:init_backend(riak_kv_eleveldb_backend, false, [{data_root, "test/eleveldb-backend"}]).
        riak_kv_eleveldb_backend:put(<<"b2">>, <<"k1">>, [{add, <<"k_int">>, 0}], <<110>>, V1).
        riak_kv_eleveldb_backend:fold_keys(fun(B, K, Acc) -> [{B, K}|Acc] end, [],
            [{index,<<"b2">>,{range,<<"k_int">>,0,0}}], V1).
    
    @rdb Please check to see how much effort vs. value might be required
    to address the TODO comment.  If it's too much work now, then please
    create an issue for fixing & removing that TODO item so that the matter
    doesn't get lost.
    
    After this commit, tested via:
    
        P = backend_eqc:property(riak_kv_eleveldb_backend, false, [{data_root, "test/eleveldb-backend"}]).
        eqc:quickcheck(eqc:testing_time(4*300, P)).
        [...]
        OK, passed 9670 tests