ReQL proposal: `r.merge` should accept a function #1345

Closed
coffeemug opened this Issue Aug 22, 2013 · 13 comments

Projects

None yet

4 participants

@coffeemug
Member

A rather common use case is to go through a sequence and for each document, merge another document based on a subquery that depends on the original document. E.g.:

r.table('foo').merge{|x| { foo => r.table('bar').get(x['some_id']) } }

Currently merge doesn't accept a function, which means the query above has to be rewritten with a map and an inner merge. I ran into this a few times and it has been quite annoying. I propose allowing merge to accept a function.

@neumino
Member
neumino commented Aug 22, 2013

Slightly related to #1089

@mlucy mlucy was assigned Oct 24, 2013
@mlucy
Member
mlucy commented Nov 27, 2013

I agree that we should do this. I'll mark this as API_settled at the end of the week if nobody objects.

@mlucy
Member
mlucy commented Nov 28, 2013

So, even though the API isn't settled I went ahead and put this up in review 1063 (I won't merge it in (haw!) until we've settled on an API). I put it up early because I had some free time, and because there are actually a few subtleties in the implementation related to accepting r.literal in function bodies and not compiling functions n times.

@coffeemug
Member

Since we're talking about this stuff elsewhere, I think r.merge should have an implicit default. Hides under a desk

@mlucy
Member
mlucy commented Dec 3, 2013

What should that implicit default be?

@jdoliner
Contributor
jdoliner commented Dec 3, 2013

What @coffeemug said doesn't actually make any sense but I'm guessing he's thinking about dropping rows from merge that are missing an attribute. We really shouldn't do that. It's pretty nice to have the invariant that merge doesn't change the size of a stream. This is the same reason we don't drop rows from pluck I believe.

@coffeemug
Member

Here's precisely what I mean:

r.expr([{a: 1}, {b: 2}]).merge(function(doc) { return { c: doc('a').add(1) } })
=> [{a: 1, c: 2}
    {b: 2}]

I.e. we don't drop the row, we drop the "merge", so the default should be an empty object.

@mlucy
Member
mlucy commented Dec 4, 2013

That's not a totally unreasonable idea, but I would rather open a separate issue for it. (I would also probably not put it into 1.12, since my guess is that it will require some discussion, whereas the basic "make merge polymorphic" proposal has broad support.)

@coffeemug
Member

Since I added this to the API late, I don't mind opening a separate issue and moving it out of 1.12. But could we do a quick straw poll first? Is anyone opposed to this/would like to discuss this further?

@mlucy
Member
mlucy commented Dec 4, 2013

I would definitely like to discuss it further; my initial reaction is skepticism. If you try to reference a non-existent field in e.g. an update you get an error. This is important because if we just silently skip merging elements like this, people might think that they're performing a write when in fact nothing gets written to disk. I might be able to be convinced otherwise, though.

Specifically, I'm worried about your semantics for queries like:

table.update{|row| {c: row['a']+1} # error
table.update{|row| {subrows: row['subrows'].merge{|subrow| {c: subrow['a']+1}}}} # not an error
@coffeemug
Member

Ok, see #1737 for discussion on this. Consider it out of the scope of this issue.

@mlucy
Member
mlucy commented Dec 5, 2013

There seems to be agreement on the scope of this. Marking it as settled.

@mlucy
Member
mlucy commented Dec 5, 2013

This is in next.

@mlucy mlucy closed this Dec 5, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment