This repository has been archived by the owner on Nov 11, 2020. It is now read-only.
I'm trying to perform a mapreduce in which the output is stored in another collection. I've set the configuration for the 'out' option. The mapreduce works fine when I run execute, but it also seems to _find / retrieve_ the results from the results collection after executing the map/reduce. This is unnecessary overhead.
Ideally, it should simply run the mapreduce and do nothing or return the statistics of the mapreduce (time taken, emits, etc). Instead, it queries for all the results of the mapreduce.
<?php
// $qb is the query builder
$qb->map('function(){ emit(this.from.userId, { sex: this.from.sex, circle: this.from.circle, age: this.from.age }); }')
    ->reduce('function(k, values){ return values[values.length - 1]; }')
    ->out(array('replace' => 'tmp.mr.ActiveUsers_foo'));
$query = $qb->getQuery();
var_dump($query->execute()); // is a LoggableCursor, why?
If I log the commands / queries executed, there's a MapReduce command _followed by a "find" on the collection tmp.mr.ActiveUsers_foo_. The second query shouldn't happen. Is this intentional behavior? If so, how do I prevent it from happening?
I dug into the code and discovered this. I realized the ->find() function is being called after every mapreduce. What is the thought behind it? Read my comments inline in the code below.
<?php
protected function doMapReduce($map, $reduce, array $out, array $query, array $options)
{
    // ..........
    if (isset($out['inline']) && $out['inline'] === true) {
        return new ArrayIterator($result['results']);
        // ^^^ this is as expected since results are asked for inline
    }
    return $this->database->selectCollection($result['result'])->find();
    // ^^^ why are we doing this find(..) ? the user has categorically mentioned that he wants the output go to a particular collection.
}
Don't you think it would be better to do away with the ->find(..) altogether?
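To make the request concrete, here is a minimal sketch of the alternative being asked for: when the output goes to a collection, hand back the command's statistics instead of opening a cursor. The function `summarizeMapReduce()` is hypothetical and not part of Doctrine MongoDB. The `timeMillis` and `counts` fields are an assumption about the shape of the mapReduce command response on servers of that era, and the sample values are made up.

```php
<?php
// Hypothetical helper, NOT part of Doctrine MongoDB.
// $result stands in for the raw response document of a mapReduce command.
function summarizeMapReduce(array $out, array $result)
{
    if (isset($out['inline']) && $out['inline'] === true) {
        // Inline output: the documents are already in the response.
        return array('inline' => true, 'results' => $result['results']);
    }

    // Collection output: report where the data went plus timing and counts,
    // without issuing a find() on the output collection.
    return array(
        'inline'     => false,
        'collection' => $result['result'],
        'timeMillis' => isset($result['timeMillis']) ? $result['timeMillis'] : null,
        'counts'     => isset($result['counts']) ? $result['counts'] : array(),
    );
}

// Hand-written sample response; the field names are assumed, the values invented.
$sample = array(
    'result'     => 'tmp.mr.ActiveUsers_foo',
    'timeMillis' => 12,
    'counts'     => array('input' => 4, 'emit' => 4, 'reduce' => 1, 'output' => 3),
    'ok'         => 1.0,
);

$stats = summarizeMapReduce(array('replace' => 'tmp.mr.ActiveUsers_foo'), $sample);
```

A caller that still wants the documents could select `$stats['collection']` and query it explicitly, which keeps the read an opt-in step.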
@epicwhale: Having researched this, I don't think we should change the behavior in 1.x, as it'd be a significant BC break for anyone relying on this. That said, although the method does create a cursor, there should be no real overhead unless you start iterating on it (cursors don't actually hit MongoDB with a query until you request the first result).
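The deferred execution described above can be modelled with a toy class. This is only an illustration of the idea, not the driver's cursor or Doctrine's `LoggableCursor`: `LazyCursor`, its `$fetch` callback and the `queriesSent` counter are all hypothetical names, and the callback stands in for the round trip to the server.

```php
<?php
// Toy model of a lazily executed cursor (hypothetical, for illustration only).
class LazyCursor implements IteratorAggregate
{
    public $queriesSent = 0; // how many times the simulated query has run
    private $fetch;

    public function __construct(callable $fetch)
    {
        // Creating the cursor only stores the callback; nothing is executed.
        $this->fetch = $fetch;
    }

    public function getIterator(): \Traversable
    {
        // The simulated query runs only when iteration starts.
        $this->queriesSent++;
        return new ArrayIterator(call_user_func($this->fetch));
    }
}

$cursor = new LazyCursor(function () {
    // Stand-in for fetching documents from the output collection.
    return array(array('_id' => 1), array('_id' => 2));
});

$sentBefore = $cursor->queriesSent; // read before any iteration

$ids = array();
foreach ($cursor as $doc) {
    $ids[] = $doc['_id'];
}

$sentAfter = $cursor->queriesSent; // read after iterating once
```

In this model, building the cursor costs nothing beyond an object allocation; the fetch happens on the first `foreach`. Whether a query logger records an entry at cursor creation or at first iteration is a separate question that this sketch does not answer.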
Quoting @epicwhale from: doctrine/mongodb-odm#146
In _Doctrine/MongoDB/Collection.php_ - Line _135_ ( https://github.com/doctrine/mongodb/blob/master/lib/Doctrine/MongoDB/Collection.php#L375 )