Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix MR/bucket types incompatibility #703

Merged
merged 5 commits into from Oct 22, 2013

Conversation

engelsanchez
Copy link
Contributor

This adds a special case to keep the old format for bkey streams when
using the default bucket. It also changes the order of jsonified lists
for bkey/keydata items that have a bucket type, moving the bucket type
last. That should make all existing JS and Erlang MR reduce functions
compatible as long as they don't have to deal with bucket types (within
the same bucket, etc). The functions will only need updating to deal
with bucket types by accessing the 4th item in the list. This is
compatible with the old format, as it will be null in pre-bucket types
Riak. That way no complicated checks for list length are needed to
determine the order of the items. Keydata, the 3rd item in the list,
will be null when a bucket type is present as a 4th member. Erlang
reduce functions will need a new clause if bucket types will be present,
but not if only default. All MR riak tests should pass with this now.

/cc @Vagabond @beerriot

This adds a special case to keep the old format for bkey streams when
using the default bucket.  It also changes the order of jsonified lists
for bkey/keydata items that have a bucket type, moving the bucket type
last.  That should make all existing JS and Erlang MR reduce functions
compatible as long as they don't have to deal with bucket types (within
the same bucket, etc). The functions will only need updating to deal
with bucket types by accessing the 4th item in the list. This is
compatible with the old format, as it will be null in pre-bucket types
Riak. That way no complicated checks for list length are needed to
determine the order of the items.  Keydata, the 3rd item in the list,
will be null when a bucket type is present as a 4th member. Erlang
reduce functions will need a new clause if bucket types will be present,
but not if only default. All MR riak tests should pass with this now.
@beerriot
Copy link

I am generally in favor of this change. Keeping the current array assignments is quite nice. Just a couple things:

If we're going to use [Bucket, Key, KeyData, Type] as output, we should also accept it as input: https://github.com/basho/riak_kv/blob/develop/src/riak_kv_mapred_json.erl#L153

It looks like we may have also missed adding the 4-element representation to riak_kv_pipe_get the first time around: https://github.com/basho/riak_kv/blob/develop/src/riak_kv_pipe_get.erl#L157-175

@seancribbs
Copy link
Contributor

Umm, in other places we've assumed that Type would come first, specifically in the inputs: {{{Type,Bucket},Key},Keydata} Does this break things?

@engelsanchez
Copy link
Contributor Author

Sean, notice that we are leaving the erlang representation the same. it goes across nodes in a cluster and such, and it's handled internally, functions don't see it unless there is no map phase. For that, we are just simplifying it if bucket is default. The order would be for JS function compatibility, when they become JS lists.

@seancribbs
Copy link
Contributor

Ok, but it's still inconsistent with the input format used in the JSON,
which puts the Type first. I don't feel strongly one way or the other, I
just think we should pick one.

On Thu, Oct 17, 2013 at 4:35 PM, Engel A. Sanchez
notifications@github.comwrote:

Sean, notice that we are leaving the erlang representation the same. it
goes across nodes in a cluster and such, and it's handled internally,
functions don't see it unless there is no map phase. For that, we are just
simplifying it if bucket is default. The order would be for JS function
compatibility, when they become JS lists.


Reply to this email directly or view it on GitHubhttps://github.com//pull/703#issuecomment-26517877
.

Sean Cribbs sean@basho.com
Software Engineer
Basho Technologies, Inc.
http://basho.com/

@engelsanchez
Copy link
Contributor Author

I'm going to disagree on this instance. I would love to achieve consistency, but my argument here is that it is more important to have a seamless transition to 2.0. Messing with existing MR function libraries now doesn't seem sane. Especially since we are de-emphasizing it a bit and bucket types probably doesn't get much for most current uses.

Fix parsing of not found including bucket type.
Fix erlification of JS lists containing bucket type.
@engelsanchez
Copy link
Contributor Author

I messed around with MR and bucket types today and added some fixes. This should address Bryan's concerns and also handle not founds involving buckets with types. I'm adding stuff to riak test here: basho/riak_test#417

I'll try to beef the coverage of that more tomorrow.

Changing from type to bucket_type and ensuring serialization round trip
is included in the unit test.
There is a bucket type cleanup item to make sure we don't use the word
type as a property of an object, but bucket_type instead. This does that
for the JS objects used in MR.
@Vagabond
Copy link
Contributor

I'm grabbing this from the vacationing Bryan.

@@ -319,6 +319,12 @@ parse_query(Invalid, _Accum) ->
" ",mochijson2:encode(Invalid),"\n"]}.

dejsonify_not_found({struct, [{<<"not_found">>,
{struct, [{<<"bucket_type">>, BType},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If any fields are added or the ordering changes all these pattern matches will break...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I guess other clauses were already like this...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, existing code does the same. I believe this is only internal decodec, so we know the order well.

@Vagabond
Copy link
Contributor

riak_test http_bucket_types fails with this patch on line 321. A bad_type is being returned from somewhere and it is causing a badmatch:

2013-10-21 17:58:46 =ERROR REPORT====
Error in process <0.4058.0> on node 'dev1@127.0.0.1' with exit value: {function_clause,[{proplists,lookup,[chash_keyfun,{error,no_type}],[{file,"proplists.erl"},{line,143}]},{riak_core_util,chash_key,1,[{file,"src/riak_core_util.erl"},{line,235}]},{riak_pipe_vnode,queue_work,4,[{file,"src/ri...

2013-10-21 17:58:46 =SUPERVISOR REPORT====
     Supervisor: {local,riak_pipe_builder_sup}
     Context:    child_terminated
     Reason:     {function_clause,[{proplists,lookup,[chash_keyfun,{error,no_type}],[{file,"proplists.erl"},{line,143}]},{riak_core_util,chash_key,1,[{file,"src/riak_core_util.erl"},{line,235}]},{riak_pipe_vnode,queue_work,4,[{file,"src/riak_pipe_vnode.erl"},{line,235}]},{riak_kv_mrc_pipe,'-send_inputs/3-lc$^0/1-0-',2,[{file,"src/riak_kv_mrc_pipe.erl"},{line,580}]},{riak_kv_mrc_pipe,send_inputs,3,[{file,"src/riak_kv_mrc_pipe.erl"},{line,580}]},{riak_kv_mrc_pipe,'-send_inputs_async/3-fun-0-',3,[{file,"src/riak_kv_mrc_pipe.erl"},{line,557}]}]}
     Offender:   [{pid,<0.4054.0>},{name,undefined},{mfargs,{riak_pipe_builder,start_link,undefined}},{restart_type,temporary},{shutdown,brutal_kill},{child_type,worker}]

Except that the type in question exists and is listed in riak-admin bucket-type list.

Also, this does not fail over the pb interface, which does the same test.

@engelsanchez
Copy link
Contributor Author

@Vagabond I forgot to look in the Riak Http client to make it follow this PR. You will need basho/riak-erlang-http-client#26 to test this.

@Vagabond
Copy link
Contributor

+1

engelsanchez added a commit that referenced this pull request Oct 22, 2013
…lity

Fix MR/bucket types incompatibility
@engelsanchez engelsanchez merged commit d956422 into develop Oct 22, 2013
@engelsanchez engelsanchez deleted the bugfix/bucket-types-mr-incompatibility branch October 22, 2013 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants