Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Populating many-to-many is incredibly slow. #5541

Closed
ddallaire opened this issue Apr 10, 2015 · 16 comments
Closed

Populating many-to-many is incredibly slow. #5541

ddallaire opened this issue Apr 10, 2015 · 16 comments

Comments

@ddallaire
Copy link

I am having a problem with many-to-many associations in waterline/sails, it takes forever to execute my query when I populate a many-to-many relationship.

Here is what my query looks like :

Commit.findOne({ idCommit: req.param("id") })
 .populate('user')
 .populate('project')
 .populate('issues')
 .exec(function(err, commit) {
}

I am using the mysql adapter, the problem comes when I populate the "issues". The relation between commits and issues is many-to-many and issues have relations to other tables as feilds.
The line above is returning a single commit with 6 issues and it is taking 12000ms, this is huge. If I execute the same request on the database without waterline, it takes 1ms,

I tought about querying my commit and then finding its issues once the commit is found but I don't have a model for my commits_issues as it is automatically created by the relationship.

Is there a way to increase performance or bypass this problem? I tought about using a .query but it's not really clean with many-to-many relations.

@devinivy
Copy link

This is a known issue that's being worked-through. We'd love any help or results you've head! Related PR: #924. Related issues: https://github.com/balderdashy/sails-mysql/issues/161 and https://github.com/balderdashy/sails-postgresql/issues/141

@ddallaire
Copy link
Author

I will gladly look into the code when I find some time for it. I tested PR #924 with no improvements. I will need to find a way to make my query faster before looking into that, thanks.

@dmarcelino
Copy link
Member

Unfortunately PR #924 was reverted by PR #938 as it was breaking the API but any help would be appreciated.

@asleepinglion
Copy link

I'm unable to jump in and help here at this time, but I'm curious as to what the outlook is on remedying these performance issues? This alongside the current lack of the pick/omit feature, is really hurting the performance of my teams applications. Is this something that you think will be solved in the near future? Any idea on when? I really like waterline and want to keep using it, but my team is putting pressure on me. Keep up the great work guys! Sorry I can't be more helpful.

@dmarcelino
Copy link
Member

@asleepinglion, there is no pick/omit feature but there is a select, more about it here.

It's really hard to fix this without a benchmark of some sort to reproduce the problem and afterwards confirm it's fixed. So far nobody has come up with one...

@asleepinglion
Copy link

I was referring to the upcoming feature for 0.11: balderdashy/waterline#952. It did seem understood that by being able to pick or omit fields could have serious performance benefits.

As far as a benchmark; is this not already a known issue? Just above you said that PR #924 (which brought some improvements?) was reverted because it had breaking changes. Sorry if I'm reading it wrong, but I interpreted it as though you guys were aware of the issues, and currently working on improving the performance of populate? Which is also suggested in the related issue mentioned above: https://github.com/balderdashy/sails-mysql/issues/161

I also noticed this question on StackOverflow (http://stackoverflow.com/questions/23795389/one-to-many-associations-using-existing-join-table) where @particlebanana said, in response to a question about many to many joins with an existing join table: "Note however this is still in a beta release state so it's not going to be rock solid quite yet. We don't have the correct outer join calls being made on the mysql adapter so you will see three queries being made and then the results will be joined in memory at the application layer. This will be cleaned up to do a single sql query like you would expect once the criteria parser has been updated."

Maybe this isn't the best forum for the question, but I'm just trying to get an understanding of the projected timeline for these critical performance features to be implemented (critical from the standpoint of my team and our project). Don't get me wrong, again, I think you guys are doing great work. :)

@dmarcelino
Copy link
Member

Regarding #952, I can't see it being anymore performant than .find({ select: [ 'att1', 'att2'] })... And I believe the only performance benefit is querying less fields/columns from the database which shouldn't make a huge difference.

It's "known" in the sense that a few people have brought it up. :) Yes, PR #924 brought some improvements that we were able to measure through load tests (PR balderdashy/waterline-adapter-tests#49) but the load tests only shown the changes improved performance, the tests themselves don't run abnormally slow. In other words, the current load tests can tell us if a change improves certain things but don't tell us that there are performance issues.

That particular paragraph is related to sails-mysql adapter which I'm not using and can't comment on performance, in the previous 2 I was referring to waterline core.

Regarding sails-mysql I'm unaware of what efforts are being taken and it's probably best to discuss that in the sails-mysql project. But in any case we could really use better and more tests than the "load tests", something that we could all reproduce in our machines and say "that takes incredibly long". In my opinion that will be the first step to get to the bottom of this and that's where we could really use some help from people experiencing performance issues.

@blah238
Copy link

blah238 commented Apr 23, 2015

@dmarcelino I haven't hide time to look at it since #924 was rolled back and have since resorted to using Bookshelf/Knex instead of Waterline, but as I recall, just changing PETS_NUMBER from 100 to 1000 exponentially increased the time and memory usage for me on Windows and would eventually crash the tests. At 100 it's not a big deal but at 1000 the problem rears its ugly head. Maybe 1000 should be the default?

@asleepinglion
Copy link

Understood. We are using the mysql adapter in our project, and I will attempt to inquire on that project specifically. I just saw this issue and the ones it referenced, and it seemed related to the issues I'm experiencing. Fact of the matter is, the many to many join I'm working with is incredibly fast when I use custom queries, and even other libraries, but it takes several seconds when using Waterline for the same amount of data (talking tables with less than 20 records).

There is a lot I like about Waterline, which is why I'm still trying to take advantage of it, but I may have to look elsewhere. I'll try posting something on the sails-mysql project though. Thanks for quick responses @dmarcelino.

@dmarcelino
Copy link
Member

@blah238, the reason for 100 pets being the default is a pragmatic one. I'd like (dream?) for people to use the load tests frequently, namely when they are about to push changes and a 1000 pets makes it impractical. Anyone can change pets to a 1000 for more extreme testing. If I remember correctly your issues were also related with node 0.12, right?

@asleepinglion, I was just thinking that a good benchmark test for the SQL adapters would be to run some expensive join query with .query() and then run the equivalent with .find().populate(). and compare the time taken by the 2. That could be an additional integration test ran by those adapters.

You don't need to look elsewhere, help us profile, improve the performance and make waterline even better! 😃

@blah238
Copy link

blah238 commented Apr 24, 2015

@dmarcelino I agree it wouldn't be very nice to blow up people's machines every time they run the tests, but the thing is, querying 1000 records shouldn't be considered extreme at all in the first place! Should be sub-second for any of those adapters.

Bookshelf/Knex is able to do the same join that Waterline took 12+ minutes to do in less than 1 second and keeps the memory under 100MB where Waterline takes up 1.5+GB. Also I'm not using 0.12 anymore, 0.10 is all I have now (but there was no effective performance difference in my experience, just 0.12 took up a bit more memory).

For me the pragmatic approach is to just not use Waterline, unfortunately.

@asleepinglion
Copy link

@dmarcelino, I haven't given up and will certainly jump in and try to see what I can do in the near future. I've integrated waterline as a more abstracted adapter into a framework I'm building which is still in heavy development at the moment. And I'm developing the framework alongside the company project that's using it, so I don't have the time to jump into waterline at the moment.

I just built a Knex based adapter for my framework last nite that uses the same JSON based query language as waterline and nearly identical model structure and the performance is dramatically better. I'm probably going to use that instead for now so my company project can move forward. That said, I want waterline to be a solid option for my framework due to the versatility it provides. I should be able to find some time in a month or so.

The one thing about the Knex sql builder which is nice is that even though I can create the essential crud methods for find, update, etc; I can quickly drop down to create custom queries without having to write raw SQL. I think I saw somewhere that sails-mysql might switch to Knex, which would be a plus. Maybe when I'm a little further on my adapter, the code that I'm writing to translate the JSON query into SQL can be contributed. :)

@dmarcelino
Copy link
Member

@blah238, hopefully we'll get to the point where querying 1000 records takes less than a second, that's what we all want. I don't know of any active development on this currently and we appreciate all the help we can get.

On another note, I'm surprised Bookshelf/Knex takes less than 1 sec to returns 100 query requests, each one with a join operation, is it doing some sort of caching so it doesn't hit the DB 100 times? (100 is the number of requests the load tests do)

Yes @asleepinglion, that's issue #605, maybe we can re-use what you're doing as I believe the best approach would be to replace waterline-sequel (helper project that translates waterline inputs into SQL) with something using knex which we could call waterline-knex :) And, similarly to other adapters, we could expose knex to the user to give him more options.

@blah238
Copy link

blah238 commented Apr 26, 2015

@dmarcelino No, sorry, I was referring to the scenario in the original post on https://github.com/balderdashy/sails-postgresql/issues/141. Although I think it would be great to add a Bookshelf/Knex load test case to use as a baseline.

@lutsen
Copy link

lutsen commented Jul 16, 2015

I experienced the same problems, but I just updated my sails-mysql adapter to 0.11.0, and this significantly improves performance.

@sailsbot
Copy link

Thanks for posting, @ddallaire. I'm a repo bot-- nice to meet you!

It has been 60 days since there have been any updates or new comments on this page. If this issue has been resolved, feel free to disregard the rest of this message. On the other hand, if you are still waiting on a patch, please:

  • review our contribution guide to make sure this submission meets our criteria (only verified bugs with documented features, please; no questions, commentary, or bug reports about undocumented features or unofficial plugins)
  • create a new issue with the latest information, including updated version details with error messages, failing tests, and a link back to the original issue. This allows GitHub to automatically create a back-reference for future visitors arriving from search engines.

Thanks so much for your help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

7 participants