Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] populate indexes for has-Many associations #5477

Closed
mphasize opened this issue Jul 17, 2014 · 40 comments
Closed

[Feature Request] populate indexes for has-Many associations #5477

mphasize opened this issue Jul 17, 2014 · 40 comments

Comments

@mphasize
Copy link
Contributor

I want to propose a method that would populate hasMany associations with an array of primary keys instead of fully fetched objects.

Reason why?

  1. Performance: Using relational databases we only have to look up the keys in the join table without performing an expensive full join
  2. Standards: A lot of popular JSON API implementations accept or even expect relationships to be expressed as an array of primary keys, see JSON API specs or Ember Data's RESTAdapter

Current situation:
When using associations in Sails / Waterline and we query for records, we currently have two choices:
with populate: associated records are retrieved and embedded
without populate: belongsTo (model) associations are represented by their primary key and hasMany (collection) associations are ignored (in fact the JSON representation is even missing the properties)

Use case:
In order to build a performant solution for sideloading records, I want to query my Models with associations included as primary keys. Then I want to filter duplicated associations before fetching the full records for the associations.

Example:
I would like Group.find().populate("members") to return something like { topic: "...", members: [1,2,3,4] } instead of the fully fetched objects.

I am not sure if Waterline is already optimizing for duplicated records, but if it is, I couldn't find any exposure of this functionality to tap into.

@luislobo
Copy link
Contributor

+1 I second this one, I'm on the same situation

@blah238
Copy link

blah238 commented Aug 28, 2014

👍 * 💯

@thomasdashney
Copy link

+1

@mphasize
Copy link
Contributor Author

Any ideas where to begin? I assume this would have to be implemented in Waterline and all of the adapters?

@particlebanana Could you help out with a broad description where stuff would have to go? I'm thinking of starting a PR for this...
Thanks!

@particlebanana
Copy link
Contributor

This would touch a lot of pieces. I would suggest looking into https://github.com/balderdashy/waterline/blob/master/lib/waterline/query/deferred.js#L90 as a start. Might be best to use something other than populate.

@ghost
Copy link

ghost commented Oct 1, 2014

Hello,
For Ember Data users: since 1.0 beta9, Ember Data supports embedded records, which makes it much more Waterline-friendly.
http://emberjs.com/blog/2014/08/18/ember-data-1-0-beta-9-released.html

@tjwebb
Copy link
Contributor

tjwebb commented Nov 14, 2014

This would be a useful feature; Backbone Relational also expects this.

@particlebanana
Copy link
Contributor

Moving to the feature-request milestone.

@erkstruwe
Copy link

+1 because this would make caching in my current angular project A LOT easier.

@thomasdashney
Copy link

This is far from ideal, but a hack would be to override toJSON in a model where you want to always return an array of the association ids:

toJSON: function() {
  var obj = this.toObject();

  // get association-attributes
  var associations = [];
  var attributes = Object.keys(obj)
  for (var i = 0; i < attributes.length; i++) {
    var attribute = obj[attributes[i]];
    if ((attribute !== null && typeof attribute === 'object')
        && attribute.constructor === Array)
      associations.push(attributes[i]);
  }

  // for each association, make array of ids
  for (var j = 0; j < associations.length; j++) {
    obj[associations[j]] = obj[associations[j]].map(function(rel) {
      return rel.id;
    });
  }

  return obj;
}

Using this would require you to populate any to-many associations in your queries.

@anasyb
Copy link

anasyb commented Mar 2, 2015

@mphasize I think handling the association should go exactly after that _.each here which copies the attributes (but does nothing to associations).

Anyone knows if some work had already been done on this issue or not?

cc @particlebanana

@anasyb
Copy link

anasyb commented Mar 2, 2015

I am guessing this would simplify the deep populate, since foreign keys of 2nd level associations would be available.. it will become a recursion of populate much simpler than the current setup.

@anasyb
Copy link

anasyb commented Mar 29, 2015

isn't this related to (and would be closed by) the toObject issue https://github.com/balderdashy/waterline/issues/317#issuecomment-87274671

@dmarcelino
Copy link
Member

@mphasize, @anasyb, I like this. 👍

Have you thought about how to actually request find() to only provide the fkeys? Something like:

Group.find().populate("members", { fKeys: true } );

// or

Group.find().populateKeys("members");

Comes to mind//. I'm not a 100% sure if this won't need changes or additions in the adapter API to make it more performant.

@anasyb
Copy link

anasyb commented Apr 29, 2015

I definitely vote for this. This will solve multiple use cases.
Traversing relationships is a pain at the moment.. populating the whole record just to get the PK of a relationship is too much. Not to mention the freaky performance issue https://github.com/balderdashy/waterline/issues/944 that I never got the chance to look into. I had to access the join collections directly at multiple occasions (yuk!).
I was thinking more of the second notation Group.find().populateKeys("members");. For the first notation I think that it should be skipped in favor of balderdashy/waterline#952 ... which on the other hand should only access the join table in case the only picked attribute is the id.

@dmarcelino
Copy link
Member

@anasyb, great point regarding skipping the first notation in favor of select/pick which I agree with. Lets stick with the below if no one disagrees:

Group.find().populateKeys("members");

@devinivy
Copy link

I like this. I'd just like to add, it'd be nice to see adapters dealing with this in an efficient way. For example, sequel-interfaced adapters could get away with one join in many-to-many relationships rather than two. Any suggestions how to introduce a shim for this feature within waterline and simultaneously allow adapters to provide their own performant solutions?

@dmarcelino
Copy link
Member

Totally agree! I think this should be done in the same manner as .populate() / adapter.join() currently works:

  • If the adapter doesn't provide a custom .joinKeys() (or whatever) waterline core will provide the results, probably using some subset logic of the existing .populate;
  • If the adapter implements .joinKeys() waterline will call it.

@devinivy
Copy link

Okeedokes, sounds good!

@mphasize
Copy link
Contributor Author

Couldn't agree more and I'm also voting for the .populateKeys() style api. 👍

@billyshena
Copy link

Hello guys, any news about the .populateKeys feature?

Best,

@jardix22
Copy link

jardix22 commented Sep 6, 2015

i have the same question, any news about the .populateKeys feature ?

@sailsbot
Copy link

sailsbot commented Oct 7, 2015

Thanks for posting, @mphasize. I'm a repo bot-- nice to meet you!

It has been 30 days since there have been any updates or new comments on this page. If this issue has been resolved, feel free to disregard the rest of this message. On the other hand, if you are still waiting on a patch, please:

  • review our contribution guide to make sure this submission meets our criteria (only verified bugs with documented features, please; no questions, commentary, or bug reports about undocumented features or unofficial plugins)
  • create a new issue with the latest information, including updated version details with error messages, failing tests, and a link back to the original issue. This allows GitHub to automatically create a back-reference for future visitors arriving from search engines.

Thanks so much for your help!

@sailsbot sailsbot closed this as completed Oct 7, 2015
@devinivy
Copy link

devinivy commented Oct 7, 2015

This is clearly desired, so I'm reopening until someone PRs this into the roadmap file. Anybody feel free!

@devinivy devinivy reopened this Oct 7, 2015
@atiertant
Copy link

this is interresting, not only for primaryKey ... it would be faster than a populate, i vote for :

model.find(1).populate('child', {pluck: 'id'})

@sailsbot
Copy link

Thanks for posting, @mphasize. I'm a repo bot-- nice to meet you!

It has been 30 days since there have been any updates or new comments on this page. If this issue has been resolved, feel free to disregard the rest of this message. On the other hand, if you are still waiting on a patch, please:

  • review our contribution guide to make sure this submission meets our criteria (only verified bugs with documented features, please; no questions, commentary, or bug reports about undocumented features or unofficial plugins)
  • create a new issue with the latest information, including updated version details with error messages, failing tests, and a link back to the original issue. This allows GitHub to automatically create a back-reference for future visitors arriving from search engines.

Thanks so much for your help!

@jcowley
Copy link

jcowley commented Sep 8, 2016

This should be re-opened (and that damned bot should be shut down).

I love sails, but this is my number one complaint. If you're exposing a REST API, you really need the one-to-one and many-to-one ids that are populated directly on the record. You don't need to populate the associated data because clients can then make a second API call to fetch the data for the association if and when they need it.

My current workaround is to populate the relationships twice on every model: once as a 'model' association (for use in custom controller logic) and again as an 'integer' so that the id gets populated without the enormous overhead of populating the associations.

Before I go and fork Waterline and try to hack together a fix, has anyone else solved this?

@atiertant
Copy link

@jcowley you could use populate with select ['id'], but the way waterline and adapters are wrote doesn't let you populate childs ids without the "enormous overhead of populating the associations" ...
the problem is that sequel use union for populate and the cursor is too generic to be fast...
we forked waterline and wrote an adapter who use left join instead of union and a sql optimized cursor to optimize performance see Atlantis-Software/sails-hook-orm-offshore#5

@jcowley
Copy link

jcowley commented Sep 8, 2016

@atiertant: That's helpful, thanks. I'll check it out.

The thing that bothers me is that if you're only interested in the foreign keys on the record itself, you don't need to do a union or a join. There should be an optimization (like the default?) that allows you to forgo any population of relationships but still get the values of any foreign keys that are populated as columns on the records you fetch.

@atiertant
Copy link

@jcowley in a hasMany relation, the foreign keys are in the other side, you can't select parent record and child pk on the same table, so union, join, or a second request...

@jcowley
Copy link

jcowley commented Sep 9, 2016

@atiertant: Ah, I see now I posted to a topic that's specific to @hasmany relationships. As I mentioned, I'm just looking for the one-to-one or one-to-many ids, e.g.:

{ products: [
  {id: 1, category_id: 2, type_id, 5},
  {id: 1, category_id: 2, type_id, 5}
  ...
  ]
}

In the example, I'm talking about returning the category and type ids (one-to-many) without the overhead of joining those tables. I'm not even getting the ids back unless I have populate all on or manually populate them.

@atiertant
Copy link

@jcowley one-to-one and many-to-one are already showing id without need to populate

@jcowley
Copy link

jcowley commented Sep 9, 2016

@atiertant : I'm using @mphasize sails-generate-ember-blueprints for my blueprints (which is what led me to this issue). It's not populating the ids. Mistakenly thought it was a limitation of sails/waterline because of this issue, but it must be a bug in the blueprints I'm using.

@mphasize
Copy link
Contributor Author

mphasize commented Sep 9, 2016

@jcowley The sails-generate-ember-blueprints are pretty much out of date, I can't recommend to use them anymore. I added a custom populateIndexes function to the _actionUtils in conjunction with a hacky "many-to-many through" association described here https://gist.github.com/mphasize/aeaff696132357797e0b in order to make indexing work. Not sure if this is still compatible with the latest Sails.

@jcowley
Copy link

jcowley commented Sep 9, 2016

@mphasize: Yeah, I keep seeing your comments on the blueprints being "out of date", but many of us are still happily using them! The reality is that there is no sails JSON API blueprint yet that's production ready. So I've stayed with the REST adapter in ember and your blueprints — and you can see from the download stats on npm that others have as well. (The benefits of upgrading to a JSON API don't really justify the additional work and risk at this point IMHO.)

Appreciate your creating those blueprints. With the exception of the foreign key issue and a another minor issue that I patched in my fork own of the blueprints, they've been great. And by the way, they do work fine with the latest version of sails (0.12.4).

Thanks for the tip on populateIndexes. I'll take a look!

@mphasize
Copy link
Contributor Author

mphasize commented Sep 9, 2016

@jcowley Nice to hear, that the blueprints are still appreciated and honestly I feel kinda sad, that I don't have time to work on them, but I've moved away from Sails and Node as a backend for the time being. One of the main reasons for that were issues like the one we're talking about here, which I feel are important for a strong API story.

@manuel-reil
Copy link

manuel-reil commented Mar 6, 2017

In desperate need of having only the ids in an array, we wrote a fairly efficient function to only get the ids via the junction table (a bit different to populateIndexes as we went directly for mongo native aggregate).

However, when trying to set these arrays of ids to the model's property, we still get the embedded records magically. @mphasize also ran into this back then:

// get rid of the record's prototype ( otherwise the .toJSON called in res.send would re-insert 
embedded records) record = create( {}, record.toJSON() );

_actionUtils

Now we clone all the parent records via parse/stringify (_.cloneDeep now clones functions as well!). This of course is a bit blocking once we get a number of bigger records to clone.

Is there a better way to set/modify association fields (one and many) without cloning them?
This would be an intermediate fix without that performance penalty.

@luislobo
Copy link
Contributor

luislobo commented Mar 6, 2017

You could query the association table/collection itself.

To get the list of association models that you can use through sails:
run sails console

Object.keys(sails.models)

You get the list of all the models, including association models. You can query them as any other model, and make the join yourself.

@manuel-reil
Copy link

The query is not the problem. More setting the results to the model property.

Example:

  • Models Users and Pets, having a many-to-many relationship designed in their models, like in http://sailsjs.com/documentation/concepts/models-and-orm/associations/many-to-many.
  • We use our custom populateIds function to get all Pets for User "John". The result is something likevar populatedPetsIds = ["id1","id2","id3"];
  • When I try to set this result array to the association property of the User record of "John", it is not there when I log it to console or other functions where it is converted to JSON.

So, johnsUserRecord.pets = populatedPetsIds; does not work.

Instead, we do:

var clonedJohnsUserRecord = JSON.parse( JSON.stringify( johnsUserRecord )); // _.cloneDeep does now clone functions as well, so won't work
clonedJohnsUserRecord.pets =  populatedPetsIds;

That works of course, but doing this for 50 or more heavy records is blocking and more memory and GC-intense.

@luislobo
Copy link
Contributor

luislobo commented Mar 6, 2017

The issue why pets does not work is because it has associated getters and setters. You have to call toObject() on each object to remove that. And then you can assign anything to them. But, if you use model functions, then they will not work.

@raqem raqem transferred this issue from balderdashy/waterline May 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests