Fix performance of including relationships #1800
Conversation
Great find! However, the tests must pass before this gets merged. Can you take a look at it? It's probably just a matter of changing the expected query.
@ricardograca Will do. I just got my environment all set up for testing.
@ricardograca All of the tests pass on my local machine. Have you ever run into issues where tests differ between machines?
That's weird. Looking at the failed test, it seems like the two expected objects in the response array have their order swapped, even though they are correct individually. I've never seen this particular error before. Can you check if there could be some kind of race condition here? It's important that any async tests always return promises or call the done callback. Also, what Node version are you using?
I am using Node 8.9.2.
Looking at the test more closely, it's a bit weak since there shouldn't be any requirement (or guarantee) that the columns are retrieved in any particular order. I'll ignore that failed test, but I still need to review this more carefully to make sure no new bugs or breaking changes are introduced, and I don't have time for that right now.
It is very strange; the test that is failing here passes consistently on my end.
I'm merging this even with the tests failing, because they only fail for PostgreSQL and only because the expected results come back in a different order. There's also no ORDER BY clause on the queries, so some randomness isn't too surprising.
The issue of failing tests will be fixed with a later PR.
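For illustration only, the nondeterminism mentioned above can be avoided on the test side with an explicit ORDER BY. A minimal knex sketch (the table name is hypothetical, not the project's actual test code):

```js
const knex = require('knex')({ client: 'pg', connection: process.env.DATABASE_URL });

// Without an ORDER BY, PostgreSQL is free to return rows in any order,
// so asserting on the exact order of the result array can fail at random.
knex('authors')
  .select('authors.*')
  .orderBy('authors.id') // an explicit ordering makes the expected array stable
  .then((rows) => {
    console.log(rows.map((row) => row.id));
  });
```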
Apologies if this is not the place to put this - thank you for this great fix, but one (rare) GOTCHA is that with DISTINCT you can no longer ORDER BY an expression that isn't in the select list. My solution was to create a "computed" column, so that

select *
from table
order by
case when col1 is not null then col1 else col2 end

becomes

select *,
case when col1 is not null then col1 else col2 end as order_col
from table
order by order_col

(obviously all achieved via knex, not raw sql)
@jwhitmarsh Can you share your actual code solution to the issue you describe?
No probs. In the instance I'm fixing it's a fairly complicated query, but this is the key bit:
For reference, this used to look like:
The "fixed" way is actually much neater and, like I said, I only wanted to post this in case someone stumbles across it in the future (relevant XKCD: https://xkcd.com/979/)
The extra distinct table.* causes problems with JSON fields: "could not identify an equality operator for type json". See #1941.
@ricardogama do you think this change is adequate? It seems to me that this particular query could instead be optimized with .query(), or DISTINCT could be added only to queries generated by .through(). See also #1941.
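One possible reading of that suggestion, sketched with hypothetical models rather than code from this PR: refine only the through relation with .query() so DISTINCT is applied just where the duplicate rows appear.

```js
const knex = require('knex')({ client: 'pg', connection: process.env.DATABASE_URL });
const bookshelf = require('bookshelf')(knex);

const Invoice = bookshelf.Model.extend({ tableName: 'invoices' });
const Supplier = bookshelf.Model.extend({ tableName: 'suppliers' });

const Company = bookshelf.Model.extend({
  tableName: 'companies',
  suppliers() {
    // Only this relation's query is refined; queries for other relations
    // keep their original shape and never see DISTINCT.
    return this.hasMany(Supplier)
      .through(Invoice)
      .query((qb) => {
        qb.distinct('suppliers.*');
      });
  },
});
```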
Including a relationship can cause very slow performance. I first noticed this when including a relationship that had a .through; it was sometimes taking over 2 seconds to return from bookshelf. I did some digging and found that one of the queries that was supposed to return only 2 results returned over 4k results, all of them duplicate rows. Here is an example of a query that was generated for the following relationship:
Since my invoices table had thousands of rows, this query returns many duplicates of the same supplier.
Making the query use the distinct keyword ensures that the result only includes the desired rows, without duplicates.
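A hedged sketch of the kind of setup described, with hypothetical model and table names (suppliers reached through a large invoices table):

```js
const knex = require('knex')({ client: 'pg', connection: process.env.DATABASE_URL });
const bookshelf = require('bookshelf')(knex);

const Supplier = bookshelf.Model.extend({ tableName: 'suppliers' });
const Invoice = bookshelf.Model.extend({ tableName: 'invoices' });

const Company = bookshelf.Model.extend({
  tableName: 'companies',
  suppliers() {
    // Eager loading this relation joins suppliers through the invoices table.
    return this.hasMany(Supplier).through(Invoice);
  },
});

// Without DISTINCT, the eager-load query returns one row per matching invoice,
// so a supplier related through thousands of invoices comes back thousands of
// times. With DISTINCT (this PR's change), the duplicates are collapsed and
// only the handful of distinct suppliers is returned.
Company.forge({ id: 1 })
  .fetch({ withRelated: ['suppliers'] })
  .then((company) => {
    console.log(company.related('suppliers').length);
  });
```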