Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a ReQL command to interate over an array. #2574

Closed
robert-zaremba opened this issue Jun 19, 2014 · 60 comments
Closed

Add a ReQL command to interate over an array. #2574

robert-zaremba opened this issue Jun 19, 2014 · 60 comments

Comments

@robert-zaremba
Copy link

The use case is to map values of an array knowing both index of an element and element itself.

Let's say we want to merge two arrays by summing appropriate elements, something we could do in Python:

def sumLists(l1, l2):
    return [l1[i] + l2[i] for i in range(len(l1))]

Some idea might be:

r.map(l1, function(idx, elem){...})

//or
r.map(L1, L2, function(elemL1, elemL2){...})
@srh
Copy link
Contributor

srh commented Jun 19, 2014

I agree. However, I think the name map should not be overloaded for either of these things. Also, I think the parallel-mapping version shouldn't be variadic -- it should only take two sequences. This is to keep things simple. It could be painlessly extended later to support variadic arguments.

I think, also, an r.range(begin, end) command would be nice to have (especially as a developer inserting dummy test data into tables). It might be gross if it supported infinite ranges, though, and a finite-range-emitting version of the command would not, in combination of a parallel-mapping map command, be enough to implement the map-with-index command you've asked for.

However, if we wanted an infinitely-generating r.range command (that produces the ordered sequence 0, 1, 2, 3, ...), and if we had parallel-map, that would be one reason not to add map-with-index functionality (because it's redundant). (I think infinite sequences would be okay as long as we disallow a query from returning an infinite sequence. Perhaps unless the user says they're explicitly allowed.)

Also, sequence.mapWithNumericPosition(function(idx, elem){...}) is a bit more convenient than sequence.doAParallelMapPlease(r.range(0), function(elem, idx){...}). (We need some non-bogus ideas for what these commands would be named, too.)

Also, note that parallel-map only makes sense on arrays and other ordered sequences, but not on unordered result sets like an r.table('foo') that doesn't have an orderBy attached.

@mlucy
Copy link
Member

mlucy commented Jun 19, 2014

Ruby has an each_with_index function; we could have map_with_index by analogy. (Also, just to be clear, why do you think making map variadic would be bad? I'm not sure I disagree, but it isn't obvious to me.)

@srh
Copy link
Contributor

srh commented Jun 19, 2014

Also, just to be clear, why do you think making map variadic would be bad?

I think it might be bad. Unknown unknowns. r.args. The fact that many drivers aren't good at that sort of thing. The code would be more complicated. The fact that in all the Haskell use I've had, almost never have I needed wanted zip3. (Edit: wanted. You never need zip3.)

Instead of a two-argument map, we might just want a plain parallel-merge function that takes two sequences and merges them into a sequence of objects {left: ..., right: ...}. One sequence can be shorter than another, and this provides a non-ambiguous way to communicate that information, in the case where the sequences have unequal length, by making the last documents have only a {left: } or {right: } field.

@mlucy
Copy link
Member

mlucy commented Jun 19, 2014

Too bad we already use zip for the opposite of zip.

@mlucy mlucy modified the milestones: subsequent, 1.14 Jun 19, 2014
@mlucy
Copy link
Member

mlucy commented Jun 19, 2014

I'm putting this in subsequent since @coffeemug is out of town and I don't want to forget about it.

@mlucy mlucy modified the milestones: 1.14, subsequent Jun 20, 2014
@mlucy
Copy link
Member

mlucy commented Jun 20, 2014

Actually, moving this into 1.14. We might not get to it, but I'd like to talk about it this proposal-cycle because it's a major thing people can't do right now (I just saw another example of this today).

@coffeemug
Copy link
Contributor

Moving to 1.14-polish. This isn't as important as major features, and I'd like to agree on those first. We can tackle this later.

@coffeemug coffeemug modified the milestones: 1.14-polish, 1.14 Jun 24, 2014
@robert-zaremba
Copy link
Author

What is the chance that we will find it in 1.14? For now list manipulation is really awkward.

@neumino
Copy link
Member

neumino commented Jul 21, 2014

This work around is quite verbose, but I can't think of something better:

r.expr([10, 11, 12]).do(function(sequence) {
  return sequence.indexesOf(function() { return true }).map(function(index) {
    return sequence.nth(index).add(1)
  })
})

@coffeemug
Copy link
Contributor

Here is what I think a full proper solution to this would be:

  • Implement collapse (Proposal: r.collapse #1855) and drop zip
  • Redefine zip to mean what people expect (i.e. take two sequences and zip them into a single sequence)
  • Implement a range operator (Proposal: r.range #875)
  • At this point one could do this via seq.zip(r.range(0)).map(...) (assuming omitting the second argument to range gives a lazy infinite range)
  • Add a porcelain command (either as an optarg to map or a new command mapWithIndex) to make this easier for users.

Of course we don't have to wait for the early steps to do the last step. I'll see if we can get this into the next release.

@robert-zaremba
Copy link
Author

Will be great. I think people more and more relay on a database code (to go around consistency limits). In our project we are sometimes making a temporary solutions and waiting for more sophisticated ways to manipulate documents to keep some level of consistency.

@robert-zaremba
Copy link
Author

Other useful use case might be a Map map - the function which maps over an object/map/dictionary with access to it's key and value:

obj.mapObj(function(key, val){ return val.add(otherObj('key')); }

@neumino
Copy link
Member

neumino commented Jul 29, 2014

@robert-zaremba -- to map over the keys, you can for now do

obj.do(function(arg) {
   return arg.keys().map(function(key) {
      // arg("key") is obj[key]
   }) 
});

@thelinuxlich
Copy link

+1

@coffeemug coffeemug modified the milestones: 1.14-polish, 1.15-polish Aug 13, 2014
@coffeemug coffeemug modified the milestones: 1.15-polish, reql-discussion Aug 20, 2014
@mlucy mlucy removed the tp:active label Sep 19, 2014
@coffeemug
Copy link
Contributor

👍

@robert-zaremba
Copy link
Author

When we can expect this feature?

@mlucy
Copy link
Member

mlucy commented Sep 27, 2014

@robert-zaremba: my buest guess is "either in the next release or the one after".

@coffeemug coffeemug modified the milestones: reql-discussion, 1.16-polish Sep 29, 2014
@VeXocide
Copy link
Member

VeXocide commented Oct 6, 2014

Grabbing this one, thus with a little luck it should make it into 1.16.

@mlucy
Copy link
Member

mlucy commented Oct 14, 2014

Alright, an update: since r.range is already in and it's not totally trivial to check the arity of a JS function, we're leaving out the version where people pass a function of arity one higher than the number of streams. If people complain a lot about adding r.range, we can put that version back in but since we were on the fence to begin with I would guess we won't.

@VeXocide
Copy link
Member

We're currently implementing r.map in such a way that when it is passed a single stream it's evaluated lazily as before. However, when it's passed multiple streams we have to evaluate them in an eager fashion, which is slightly less efficient.

Since we can't trivially check the arity of a JS function the only option was to always pass the extra index stream, and JS will happily ignore it when it's superfluous. This would mean that every r.map which is passed a JS function would be evaluated eagerly though, and we're not willing to accept this performance hit at the moment.

Additionally, adding this functionality at a later date won't break backwards compatibility thus we're holding off on it for the moment.

@neumino
Copy link
Member

neumino commented Oct 14, 2014

In JavaSript, you can check the number of arguments with

function lenArgs() {
    return arguments.length
}
lenArgs() // return 0
lenArgs("foo") // return 1
lenArgs("foo", "bar") // return 2
lenArgs("foo", "bar", "buzz") // return 3

In CoffeeScript, you can use lenArgs = (args...) -> args.length.

@asakatida
Copy link
Contributor

The arity problem shows up in Lua as well. I am using the number of arguments to do to guess at arity. I expect the same hack can be applied to map after this is implemented.

@danielmewes
Copy link
Member

@neumino The problem here is actually determining the number of arguments from a string passed into r.js. As in table.map(r.js("function (x, c) { return "How many arguments does this function have?"; } "))

@neumino
Copy link
Member

neumino commented Oct 14, 2014

@danielmewes -- Angular calls (used to?) toString() on a function and then parse the number of arguments with a regex (that's how they are doing injection).
The data explorer does the same to keep track of the VAR terms.

@danielmewes
Copy link
Member

@neumino That's interesting. I'm not sure if we should do that though. It sounds like it would be a rather complex (and accordingly error-prone) regex.

Generally, as @VeXocide mentioned, we have the option of adding the implicit count-argument late without breaking compatiblity.

@VeXocide
Copy link
Member

In CR 2200 by @gchpaco.

@VeXocide
Copy link
Member

I'll create a new issue for the implicit index, this clearly needs to be given some more thought.

@marshall007
Copy link
Contributor

@neumino @danielmewes in JavaScript, a function's length property indicates how many arguments it takes. Example:

(function (x, c) { return "How many arguments does this function have?"; }).length // => 2

@robert-zaremba
Copy link
Author

Maybe the option will be, as discussed above, to add other function with explicit argument (mapWithIndex, zipWith ...)?

@VeXocide
Copy link
Member

Merged into next via commit 54fa558.

@AtnNn AtnNn modified the milestones: 1.16, 1.16-polish Oct 22, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests