Add 'mapReduce' support to Model #75

Closed
RagibHasin opened this issue May 15, 2017 · 9 comments

Comments

@RagibHasin
Contributor

mapReduce is an often-used collection function, and other ODMs already support it.

It would be great to see it in this project. I'm ready to work on a PR for this (how do I instantiate a Model from a native MongoDB collection?)

@notheotherben
Member

Hi RagibHasin, thanks for pointing that out - I think it'd make a great addition if we can flesh out a decent API for the approach.

Specifically, mapReduce operates on a collection, accepts two functions and a query as arguments, and outputs the results to another collection. In Iridium's world, you'd want a model for the destination collection type to provide all the type assistance a user expects for elements there, as well as a set of typed function arguments to mapReduce to ensure that their results match what one would expect from the target collection.
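For illustration, here is a minimal in-memory sketch of how typed map and reduce functions could constrain the shape of the output documents. It is not Iridium or MongoDB driver code: `mapReduceInMemory`, `Reduced`, and the `Order` fields are all hypothetical names, and `emit` is passed as a parameter here rather than being a server-side global.

```typescript
// Hypothetical in-memory model of mapReduce, for illustrating the types only.
interface Order {
  date: Date;
  total: number;
  completed: boolean;
}

// Shape of each output document: one per distinct emitted key.
interface Reduced<K, V> {
  _id: K;
  value: V;
}

function mapReduceInMemory<T, K, V>(
  docs: T[],
  map: (doc: T, emit: (key: K, value: V) => void) => void,
  reduce: (key: K, values: V[]) => V,
  query: (doc: T) => boolean = () => true
): Reduced<K, V>[] {
  // Group emitted values by a string form of the key.
  const groups = new Map<string, { key: K; values: V[] }>();
  docs.filter(query).forEach((doc) => {
    map(doc, (key, value) => {
      const id = JSON.stringify(key);
      const group = groups.get(id) ?? { key, values: [] };
      group.values.push(value);
      groups.set(id, group);
    });
  });
  // Reduce each group down to a single output document.
  const results: Reduced<K, V>[] = [];
  groups.forEach((group) => {
    results.push({ _id: group.key, value: reduce(group.key, group.values) });
  });
  return results;
}

const MS_PER_DAY = 86400000;
const orders: Order[] = [
  { date: new Date("2017-05-15T09:00:00Z"), total: 10, completed: true },
  { date: new Date("2017-05-15T17:30:00Z"), total: 5, completed: true },
  { date: new Date("2017-05-16T08:00:00Z"), total: 7, completed: true },
  { date: new Date("2017-05-16T12:00:00Z"), total: 99, completed: false },
];

const grossIncomeByDay = mapReduceInMemory<Order, string, number>(
  orders,
  (order, emit) => {
    // Key each order by the start of its UTC day.
    const ms = order.date.valueOf();
    emit(new Date(ms - (ms % MS_PER_DAY)).toISOString(), order.total);
  },
  (_key, values) => values.reduce((a, b) => a + b, 0),
  (order) => order.completed
);
```

Because the key and value types are fixed by the type arguments, a map function that emitted a string total or a reduce function that returned something other than a number would fail to compile, which is the kind of assistance a destination model could provide.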

Before we get started implementing anything, let's think about how we'd like to work with mapReduce in a TypeScript environment and the sort of API we wish to provide.

Personally, I'd love to be able to completely ignore the fact that there's an intermediary collection in most cases and simply run something like the following:

core.Orders.mapReduce(function() {
  // Key each order by the start of its UTC day (86400000 ms per day)
  emit(new Date(this.date.valueOf() - this.date.valueOf() % 86400000), this.total)
}, function(key, values) {
  return Array.sum(values)
}, { completed: true }).forEach(grossIncomeByDay => {
  console.log("%s  - $%d", grossIncomeByDay._id, grossIncomeByDay.value);
});

Of course, that approach doesn't map well to situations where you wish to memoize the results of your mapReduce...

Maybe something like this instead?

interface MapReducedDocument {
  _id: Date;
  value: number;
}

@Iridium.Collection("map_reduced")
@Iridium.MapReduce<Order>(function() {
  // Key each order by the start of its UTC day (86400000 ms per day)
  emit(new Date(this.date.valueOf() - this.date.valueOf() % 86400000), this.total)
}, function(key, values) {
  return Array.sum(values)
})
class MapReduced extends Iridium.Instance<MapReducedDocument, MapReduced> {
  _id: Date;
  value: number;
}

core.Orders.mapReduce(MapReduced, { completed: true });

core.MapReduced.find().forEach(grossIncomeByDay => {
  console.log("%s  - $%d", grossIncomeByDay._id, grossIncomeByDay.value);
});

What are your thoughts there? The latter one seems like the better solution for implementation purposes, but it obviously requires quite a bit more effort on the developer's part. The flip side of that is that they can interact with that collection using all the standard Iridium collection functionality and treat documents in it as normal Iridium instances if they wish.

@RagibHasin
Contributor Author

RagibHasin commented May 15, 2017

The latter obviously seems the better option. I have made some progress on an inline version of mapReduce but could not figure out how a collection could be returned. Thanks. I am on it and will hopefully make the PR soon.

BTW, I only found this project today, so I may need help. For now:
Why is { completed: true } there? Sorry for my ignorance.

@notheotherben
Member

notheotherben commented May 15, 2017

So Iridium doesn't really move collections around, instead we've got a class which maps to a collection and provides type-safe methods through which to access it (as well as being aware of the wrappers we use and knowing how to do validation, hooks etc). That wrapper is the Iridium.Model class. On that, you'll see a collection property which you can use to access the underlying MongoDB collection object. You can also use the core if you need to get access to the underlying connection.

My recommendation, however, is to avoid returning a raw MongoDB Collection object because it immediately breaks the chain of type-safe return values that Iridium strives so hard to maintain. While it will likely work, it's not really a great developer experience.

As for { completed: true }, that's just a MongoDB query that, in my example, looks for orders that were completed (as opposed to those that got abandoned/cancelled). When running mapReduce you'll see that you can provide a query option, that's all the { completed: true } is.

Let me know if there's anything you'd like a hand with. There are plenty of examples of how to build decorators in the lib/Decorators.ts file, and you'll see that most of them simply set static properties on the Instance type - that way someone who doesn't wish to use decorators can simply set static properties on their class to accomplish the same thing.
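As a rough sketch of that pattern (not the actual contents of lib/Decorators.ts), a decorator factory can simply record its arguments as a static property on the class it decorates. The names `MapReduce`, `mapReduceSpec`, `MapReduceSpec`, and `MapReduceTarget` below are hypothetical, and the decorator is applied as a plain function call so the snippet compiles without any decorator compiler flags.

```typescript
// Hypothetical names throughout; this is not Iridium's actual decorator code.
interface MapReduceSpec {
  map: Function;
  reduce: Function;
}

// A constructor that may carry the static property the decorator sets.
interface MapReduceTarget {
  new (...args: any[]): object;
  mapReduceSpec?: MapReduceSpec;
}

// Decorator factory: returns a class decorator that records the functions
// as a static property on the decorated class.
function MapReduce(map: Function, reduce: Function) {
  return (target: MapReduceTarget): void => {
    target.mapReduceSpec = { map, reduce };
  };
}

const mapByDay = function () {
  /* on the server this would call emit(...) */
};
const sum = (_key: string, values: number[]) =>
  values.reduce((a, b) => a + b, 0);

// Option 1: apply the decorator. The plain call below has the same effect as
// writing @MapReduce(mapByDay, sum) above the class.
class DecoratedResult {
  static mapReduceSpec?: MapReduceSpec;
}
MapReduce(mapByDay, sum)(DecoratedResult);

// Option 2: skip the decorator and set the static property directly.
class PlainResult {
  static mapReduceSpec: MapReduceSpec = { map: mapByDay, reduce: sum };
}
```

Either way, code that later runs the operation only needs to read the static property, so it does not care which style the developer chose.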

@RagibHasin
Contributor Author

Should I commit to the release or master branch?

@notheotherben
Member

master if you don't mind, I really need to rename them at some point from release->master and master->develop

@RagibHasin
Contributor Author

What if we make the mapReduce method accept a generic parameter of the Instance type and, instead of using a decorator to define the map and reduce functions, pass them directly to mapReduce?

Like

interface MapReducedDocument {
    _id: Date
    value: number
}

@Iridium.Collection("map_reduced")
class MapReduced extends Iridium.Instance<MapReducedDocument, MapReduced> {
    _id: Date
    value: number
}

core.Orders.mapReduce<MapReduced>(
    function () {
        // Key each order by the start of its UTC day (86400000 ms per day)
        emit(new Date(this.date.valueOf() - this.date.valueOf() % 86400000), this.total)
    },
    function (key, values) {
        return Array.sum(values)
    },
    { query: { completed: true } }
).then(reducedModel => {
    reducedModel.find().forEach(grossIncomeByDay => {
        console.log("%s  - $%d", grossIncomeByDay._id, grossIncomeByDay.value)
    })
})

Besides, with this approach an inline mapReduce also becomes possible by omitting the generic parameter. I think it will feel more natural.

@notheotherben
Member

The tricky part is that the generic parameter isn't accessible at runtime, so you wouldn't be able to pass an actual instance of reducedModel: Iridium.Model<MapReducedDocument, MapReduced> to the promise. So to work around that, you probably need to accept the MapReduced model type as a parameter as well.
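To make the runtime limitation concrete, here is a small sketch using stand-in names (`MapReduced` here is a plain class, and `createFromGeneric` and `createFromValue` are hypothetical helpers, not Iridium functions). TypeScript erases type parameters during compilation, so only the version that receives the class as a value can construct anything.

```typescript
// Hypothetical stand-in; not Iridium's real Instance class.
class MapReduced {
  static collectionName = "map_reduced";
}

// T exists only at compile time. Writing `new T()` here does not compile,
// so with the generic alone there is nothing to construct or inspect.
function createFromGeneric<T>(): T | undefined {
  return undefined;
}

// Accepting the constructor as a value gives the function something that
// still exists after compilation to JavaScript.
function createFromValue<T>(ctor: new () => T): T {
  return new ctor();
}

const erased = createFromGeneric<MapReduced>();
const resolved = createFromValue(MapReduced);
```

In the second call the type argument is inferred from the value, so the caller does not have to write the class name twice.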

The next part is that a model like MapReduced should probably be pretty well defined in terms of how one generates it. Specifying that as a decorator on the model keeps that information "close" to the model's definition and therefore easier to maintain in the long run (someone can immediately see that a model is generated through a mapReduce operation and how that happens).

As a final point, we try to avoid dynamically generating models in Iridium because (while it is possible to do so) it hides type information during development time and makes it harder to reason about how the code works. I'd be somewhat worried that by returning a model from mapReduce() we break this guarantee and open the door to some added confusion.

@notheotherben
Member

I've just pushed out v7.2.0 with your PR, thanks for the contribution and please let me know if there's anything that should be improved in the docs supporting it.

@RagibHasin
Contributor Author

Nothing at the moment. Thanks for being so informative and helpful. I am grateful to be able to contribute to the community for the first time. Collaborating with you was just awesome.

And finally, thanks for this v7.2.0 release.
