Support incremental map/reduce #1118
How bad would it be if incremental map/reduce jobs could only be registered on a single table? If we limited ourselves to that, this would actually become a much simpler problem to solve in the backend.
Hmm, I have to think about it. It might be sufficient for most real use cases, but at first glance this makes me feel really uneasy. One thing MongoDB does that people find extremely annoying is introduce features that don't work with other features of the database. For example, they have plenty of collection types that cannot be sharded, which makes the user experience really frustrating since it moves the burden from developers to users. People can't just use the features they want and have confidence that they will work. (I don't think this necessarily means we should restrict functionality, just that this tradeoff is frustrating for users, so we should think carefully before we choose to do it this way.)
Like, a single incremental job could only operate on data from a single table? Or that each database could only have one table on which incremental jobs could be registered? For the use case I had in mind when I asked the HN question that I think prompted this ticket, the former would be acceptable, but the latter wouldn't. I have no idea what other uses people have in mind, though.
@apendleton the former was what I meant. To give people an idea of how much easier it is: I think I could probably do the one-table case in less than a month, while the general case would probably take as many as 4-5 months all told. I think it's a feature on about the same scale as secondary indexes, which took about that long. I actually think we should ship the one-table case sometime semi-soon (post 2.0, probably), gauge people's response to it, and then expand from there. Also, if we had triggers, the one-table limitation really wouldn't be that bad, because you could write triggers to push data from wherever you wanted into your single table, where it would get map/reduced. We could add some sugar on top of that and it could actually be really nice. On top of that, a lot of the features for managing tables are things you actually want for this incremental map/reduce stuff as well: redundancy will make the computed value more available, and sharding can help it scale better.
@jdoliner -- when you get the chance, could you explain the design for each of the options? (i.e. the single-table option and the multi-table option). I'd like to understand how you envision each version would work and where the factor of four-five difference in complexity comes from. (Obviously not urgent since we aren't doing this now.)
@jdoliner yeah, that all sounds awesome. We have a currently-Postgres database that I think I want to eventually replace with something-not-Postgres TBD, and we build aggregates on a whole bunch of tables that are very expensive to compute, and currently recompute everything from scratch on updates (additions, deletions, and changes of records). There's occasionally inter-table stuff, but 90% or more is probably single-table, and if we could change records and get new aggregates without recomputing everything from scratch, that would be a huge boon. I think you're absolutely right, too, that that use case is probably much more common than a complicated multi-table MR situation, and that in the interest of 80%-20% solutions, getting the single-table case out the door early would be totally worthwhile.
@coffeemug actually, having thought about this a bit more, I think the multi-table version of this is less a question of being complicated from an engineering perspective and more a question of being algorithmically untenable. You can imagine even a fairly simple multi-table map/reduce such as:

I definitely agree that it's annoying to have 2 features which aren't compatible, but I think the reality is this is a situation where you can't sugarcoat the algorithmic limitations. Doing so is just going to lead to people bumping into the limitations as exponential runtimes, which is clearly a lot worse. My conclusion here is that the easier thing of having map/reduce jobs rooted on a single table is actually the right thing to do, because it's something I know we can make fast and make into a very useful feature. Also, it's really a very doable thing, because almost all the annoying parts of it are already written and "working" for secondary indexes, and the system was designed to be easily extended to support incremental map/reduce. I'll write up a full proposal for this at some point in the near future.
@jdoliner -- this makes a lot of sense. I changed my mind -- I think it's ok to make this feature work on a single table, and it's probably ok to never make it work on multiple tables. Actually, we already have precedent where we do best effort on non-deterministic queries, and generally handle them differently from deterministic ones. This would be no different.
Moving to 1.14. We should debate the ReQL aspects of it, in case we decide to do it. Very roughly:

```js
r.table('users').avg('age').changes()
r.table('users').group('city').avg('age').changes()
r.table('users').group('city').reduce(reduction_fn, antireduction_fn).changes()
```

#2542 has some discussion of what this should return. I think:
That doesn't seem like incremental map/reduce to me. I would expect it to involve some kind of persisted thing that you can query on any connection, not something that requires a live changefeed to be open.
@srh yes, that was what I meant when I asked about it on HN last year; it's what Couch has and refers to by that name. You basically register a map/reduce job and its results are kept up to date automatically as the records it ran over are changed/deleted/added.
For the moment, I'm shooting for something very different with this feature. The spec above would give people the ability to get instantaneous updates to the values of many different types of queries. They wouldn't persist on restart (or even on disconnect), but for a variety of reasons, I think that's sufficient for v1. It would require a bunch of infrastructure work, and would leave the door open to later include persistent incremental map/reduce support (where the user would save the query), but I think we should do that separately in future releases. I've opened #2587 to track that.
It's worth noting that doing this without persistence will make it very hard to track changes on large tables unless you're 100% sure the client will never get disconnected.
I think that's ok. We wouldn't market this feature as incremental map/reduce -- we'd market it as instantaneous updates to the result of a query (well, not quite like this, we'd have to find better wording, but you get the idea). Essentially, you pay the price of running a query, and then get any updates in realtime. We'll phrase it in such a way as to not confuse people, and not have them expect things that aren't quite true yet. We can then deal with large tables in #2587.
Related to #2542. |
Talked to @mlucy in person:
What I find interesting about CouchDB's implementation is that they don't require an inverse reduction function. Instead, they seem to store the intermediate reduction results. For example, if your reduction function is

(i.e. build a binary reduction tree and store the intermediate results at each node), then if we, let's say, update the first value from 1 to 10, they only have to recompute

This makes it more convenient for the user, since they don't have to come up with an inverse function (which might also be wrong, in a way we can't detect). It's definitely more difficult to implement.
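To illustrate the tree idea (this is a minimal JavaScript sketch, not CouchDB's actual code), the intermediate reduction results can be stored in an array-backed binary tree so that updating one value only recomputes the nodes on its path to the root:

```js
// Illustrative sketch: a binary reduction tree over an array of values,
// storing the intermediate reduction result at every internal node.
// Updating one leaf recomputes only O(log n) nodes on the root path.

function buildTree(values, reduce) {
  const n = values.length;
  const tree = new Array(2 * n);          // tree[n..2n-1] hold the leaves
  for (let i = 0; i < n; i++) tree[n + i] = values[i];
  for (let i = n - 1; i >= 1; i--) tree[i] = reduce(tree[2 * i], tree[2 * i + 1]);
  return { tree, n, reduce };
}

function updateLeaf(t, index, value) {
  let i = t.n + index;
  t.tree[i] = value;
  for (i >>= 1; i >= 1; i >>= 1) {        // walk up, recomputing parents
    t.tree[i] = t.reduce(t.tree[2 * i], t.tree[2 * i + 1]);
  }
}

const sum = (a, b) => a + b;
const t = buildTree([1, 2, 3, 4], sum);
console.log(t.tree[1]);                   // root before the update: 10
updateLeaf(t, 0, 10);                     // change the first value from 1 to 10
console.log(t.tree[1]);                   // root after the update: 19
```

No inverse function is needed here: the update replays the forward reduction along one root path instead of subtracting the old value out.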
So, there are advantages to both designs. Here are my thoughts on maintaining a tree:

Pros:
Cons:
I would lean toward the inverse solution because it's easier, it scales better, and I would guess most people will be using our aggregators anyway.
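For contrast with the tree design, here is a minimal sketch of the inverse-reduction approach under discussion, in plain JavaScript (the function and event names are illustrative, not the proposed ReQL API): only the current reduced value is kept, the reduction is applied for inserted rows, and the user-supplied inverse is applied for deleted rows.

```js
// Illustrative sketch of incremental aggregation via an inverse reduction.
// `change` events mimic changefeed output: {old_val, new_val}.

function incrementalAggregate(reduce, inverse, map, initial) {
  let acc = initial;
  return {
    apply(change) {
      if (change.old_val != null) acc = inverse(acc, map(change.old_val));
      if (change.new_val != null) acc = reduce(acc, map(change.new_val));
      return acc;
    },
    value: () => acc,
  };
}

// Example: keep a running sum of `age` across a table.
const sumOfAges = incrementalAggregate(
  (a, b) => a + b,    // reduction
  (a, b) => a - b,    // inverse reduction
  (row) => row.age,
  0
);

sumOfAges.apply({ old_val: null, new_val: { age: 30 } });         // insert
sumOfAges.apply({ old_val: null, new_val: { age: 40 } });         // insert
sumOfAges.apply({ old_val: { age: 30 }, new_val: { age: 35 } });  // update
console.log(sumOfAges.value());  // 75
```

The storage cost is O(1) per aggregate, versus O(n) for the tree, which is the scaling advantage mentioned above; the cost is that a wrong inverse silently produces wrong results.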
Yes, that's the idea :)
If I could do a changefeed on an aggregate that calculates NPS, that would be amazing. I have data that looks something like this:

```js
const npsData = [
  { "component_id": 1, "number": 10 },
  { "component_id": 1, "number": 10 },
  { "component_id": 2, "number": 8 },
  { "component_id": 1, "number": 9 },
  { "component_id": 2, "number": 2 },
  ...
];
```

And my query looks something like this:

```js
r.expr(npsData)
  .group('component_id', 'number').count()
  .ungroup()
  .map((row) => {
    const number = row('group').nth(1);
    const ret = r.expr({
      component_id: row('group').nth(0),
      distribution: [{number: number, total: row('reduction')}],
      total_answers: row('reduction'),
      detractors: 0,
      passives: 0,
      promoters: 0
    });
    return r.branch(
      number.eq(9).or(number.eq(10)),
      ret.merge({promoters: ret('promoters').add(row('reduction'))}),
      number.eq(7).or(number.eq(8)),
      ret.merge({passives: ret('passives').add(row('reduction'))}),
      ret.merge({detractors: ret('detractors').add(row('reduction'))})
    );
  })
  .group('component_id')
  .reduce((left, right) => ({
    component_id: left('component_id'),
    total_answers: left('total_answers').add(right('total_answers')),
    detractors: left('detractors').add(right('detractors')),
    passives: left('passives').add(right('passives')),
    promoters: left('promoters').add(right('promoters')),
    distribution: left('distribution').add(right('distribution')),
  }))
  .do((datum) => {
    const passivesPercentage = datum('passives').div(datum('total_answers')).mul(100);
    const promotersPercentage = datum('promoters').div(datum('total_answers')).mul(100);
    const detractorsPercentage = datum('detractors').div(datum('total_answers')).mul(100);
    return {
      distribution: datum('distribution'),
      passives_percentage: passivesPercentage,
      promoters_percentage: promotersPercentage,
      detractors_percentage: detractorsPercentage,
      score: promotersPercentage.sub(detractorsPercentage)
    };
  })
  .ungroup()
  .map(row => ({
    component_id: row('group'),
    distribution: row('reduction')('distribution'),
    passives_percentage: row('reduction')('passives_percentage'),
    promoters_percentage: row('reduction')('promoters_percentage'),
    detractors_percentage: row('reduction')('detractors_percentage'),
    score: row('reduction')('score')
  }));
```

Could the above be achieved with this new API?
@danielmewes Excellent!
@meenie It depends on whether or not you can express it as a
@danielmewes So I wouldn't be able to use
@meenie You might be able to rewrite the grouping into a reduction, in which case it would work. You would basically maintain an object
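One way the "maintain an object" idea could look, sketched in plain JavaScript (the ReQL version would express the same merge as a reduction with an inverse; all names here are illustrative): each row maps to a one-key object `{group: count}`, the reduction merges such objects by summing per-key, and the inverse reduction subtracts.

```js
// Illustrative sketch: rewriting a group().count() as a reduction over
// objects keyed by group, so it fits the reduce-with-inverse model.

const mapRow = (row) => ({ [row.component_id]: 1 });

const mergeCounts = (a, b) => {           // the reduction
  const out = { ...a };
  for (const k of Object.keys(b)) out[k] = (out[k] || 0) + b[k];
  return out;
};

const unmergeCounts = (a, b) => {         // the inverse reduction
  const out = { ...a };
  for (const k of Object.keys(b)) {
    out[k] -= b[k];
    if (out[k] === 0) delete out[k];      // drop empty groups
  }
  return out;
};

let counts = {};
[{ component_id: 1 }, { component_id: 1 }, { component_id: 2 }]
  .forEach((row) => { counts = mergeCounts(counts, mapRow(row)); });
console.log(counts);                      // { '1': 2, '2': 1 }

counts = unmergeCounts(counts, mapRow({ component_id: 2 }));  // row deleted
console.log(counts);                      // { '1': 2 }
```

The trade-off noted above still applies: the whole object is the reduced value, so a change to any one group re-emits the entire object.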
@danielmewes: Ya, that makes total sense. For now, we need every bit of efficiency we can get, so I'll be experimenting with rewriting our queries to use changefeeds, but I won't utilise this in production until it's at parity speed-wise.
Since the reverse function is only needed because it's a changefeed, it seems like the first one makes sense. But then again, it's not clear which function it's reversing if they're separated. I guess it feels kind of wrong to me to require something extra when you're doing

(to clarify, I know why we have to do it in this case, but it pulls me towards option 1 over option 2)
The way I think of it is that this is like needing to have the

I don't like the first syntax because it seems limiting and different from what we do anywhere else. What if in the future we allow changefeeds on queries that contain multiple

Or what if you have a query that looks like this:
Yeah, it seems like the best way is to provide the reverse function to the reduce term. I'm assuming if you don't tack on .changes the reverse optarg will just be a no-op (vs. erroring)?
Yeah, that's what I thought. That way you can run the same query with and without
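The no-op semantics being agreed on here can be sketched in plain JavaScript (hypothetical names, not the real ReQL implementation): a one-shot reduction simply accepts and ignores the reverse function rather than erroring, so the same query spec works with and without a changefeed.

```js
// Illustrative sketch: in a batch (non-changefeed) evaluation, the
// `reverse` optarg is accepted but unused -- a no-op, not an error.

function reduceOnce(values, reduce, opts = {}) {
  // opts.reverse would only be consulted by a changefeed evaluation.
  return values.reduce(reduce);
}

const total = reduceOnce(
  [1, 2, 3],
  (a, b) => a + b,
  { reverse: (a, b) => a - b }  // ignored here
);
console.log(total);  // 6
```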
There's a lot of discussion above. Here's my understanding of the
A few other things:
Also, on the subject of implementation, it probably wouldn't actually be
@mlucy Thanks for the summary of the current proposal. That matches what I had in mind for 2.4. I'd like to add

The three extensions that you're suggesting all sound really cool to me. As far as I can tell, 1 would be easy to implement even as a pure

My impression is that 2 (

Since we have limited remaining development resources for 2.4 considering the other things we are working on, my suggestion would be that we agree on a minimal proposal, and keep extensions 2 and 3 out of the proposal for now. If we end up having extra time, we can still implement the more efficient algorithm (3) or discuss grouped changefeeds separately.
Great question.
Also, I would like to add that I'm extremely excited about this feature! It's going to be so amazing :-)
Can we have 2 and 3 in 2.5? :D
@v3ss0n I think so :)
@danielmewes -- leaving 2 and 3 for later sounds good to me. I don't think 2's representation would be a particularly involved discussion, though -- I was imagining we'd just emit the entire grouped data every time it changed (so

On
Marking settled as:
For 2.4 we will implement the slower variant that performs the initial reduction on the parsing node rather than distributing it. |
This is gonna be great! Note that supporting
This is in review 3714, except for coerce_to("array")
As this has been transferred to milestone 2.4-polish (for obvious reasons), I just wanted to emphasize what @danielmewes wrote in this comment as a work-around for the time being. I suggest the following to support the other common aggregation operations:
Any thoughts on this would be appreciated. 😃
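As a concrete illustration of the kind of work-around being discussed (a plain-JavaScript sketch with hypothetical names, not the ReQL API): an aggregation like avg, which has no direct inverse, can still be maintained incrementally by expressing it in terms of a sum and a count, each of which does have one.

```js
// Illustrative sketch: maintaining avg() incrementally as sum/count,
// since addition has a straightforward inverse (subtraction).

function incrementalAvg() {
  let sum = 0, count = 0;
  return {
    insert(x) { sum += x; count += 1; },
    remove(x) { sum -= x; count -= 1; },  // inverse of insert
    value() { return count === 0 ? null : sum / count; },
  };
}

const avgAge = incrementalAvg();
avgAge.insert(20);
avgAge.insert(30);
avgAge.insert(40);
console.log(avgAge.value());  // 30
avgAge.remove(20);            // a row was deleted
console.log(avgAge.value());  // 35
```

The same decomposition idea extends to other aggregations that are algebraic over their inputs, e.g. variance via running sums of x and x².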
We had this on our radar for a while, but didn't have an issue to track it. Since some people have been asking for an official issue to track, I'm adding this to GitHub.
I'm going to write up a specific proposal a bit later. This is in backlog as it's obviously a medium-term priority feature.