Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested Joins #10695

Closed
andrew-st-angelo-77media opened this issue Apr 3, 2019 · 15 comments
Closed

Nested Joins #10695

andrew-st-angelo-77media opened this issue Apr 3, 2019 · 15 comments
Labels
hard For issues and PRs. status: understood For issues. Applied when the issue is understood / reproducible. type: feature For issues and PRs. For new features. Never breaking changes.

Comments

@andrew-st-angelo-77media
Copy link

andrew-st-angelo-77media commented Apr 3, 2019

What is the Issue

Sequelize does not support nested JOIN statements. This can limit some of SQL's power to eliminate unwanted datasets, and it both places the burden on the application layer to weed out unwanted data, as well as tax the I/O and transport, in sending extra data.

As some may not be familiar with Nested Joins, one of the big advantages is the opportunity to include/exclude the subsequently-joined data ONLY IF the top level is joined. This really stands apart when your top-level JOIN is a LEFT OUTER JOIN and your nested joins are INNER JOINs.

I couldn't find a perfect write-up on nested joins (their usage and importance) but I found this blog article comparing their use to a reversed query with a RIGHT OUTER JOIN

Example in SQL -- This is what I'm talking about

Setup

** Note: I'm using temp tables for copy/paste purposes

CREATE TABLE Persons (id INT, name VARCHAR(50))
CREATE TABLE Phones (id INT, number VARCHAR(20))
CREATE TABLE PersonPhones (personId INT, phoneId INT) 

INSERT INTO Persons VALUES (1,'Adam'), (2, 'Bob'), (3, 'Carl')
INSERT INTO Phones VALUES (1, '555-1234'), (2, '555-2345'), (3, '555-3456')
INSERT INTO PersonPhones VALUES (1,1), (1,3), (2, 4)

-- The following are not the same queries

-- Query 1
SELECT *
FROM Persons p
LEFT JOIN PersonPhones pp 
    INNER JOIN Phones ph ON ph.id = pp.phoneID
    ON pp.personID = p.ID

-- Query 2
SELECT *
FROM Persons p
LEFT JOIN PersonPhones pp ON pp.personID = p.ID
INNER JOIN Phones ph ON ph.id = pp.phoneID

Results

Query #1

id name personId phoneId id number
1 Adam 1 1 1 555-1234
1 Adam 1 3 3 555-3456
2 Bob NULL NULL NULL NULL
3 Carl NULL NULL NULL NULL

Query #2

id name personId phoneId id number
1 Adam 1 1 1 555-1234
1 Adam 1 3 3 555-3456

As you can see, there is no need to validate Bob and Carl here. They simply are eliminated from the final dataset, based on the fact that their nested INNER JOINs failed.

What is Sequelize doing

Person.findAll({
  include: [{
    model: PersonPhone,
    required: false,
    include: [{
      model: Phone,
      required: true
    }]
  }]
});

This creates a query, such as Query 2, above.

Why this is not Arbitrary

Imagine a more complex model, with further tables joined downward. The join conditions on a lower-level join could dictate whether any of the data above it (heading back toward the top-level join) are returned.

Suggestion

Add a nested attribute to model(s) passed in include. These models will be nested below the parent model rather than "flatly" joined (for lack of a better term).


Dialect: mssql / any
Sequelize version: 3.30.4 (but after talking with @janmeier , on Slack, he didn't think Sequelize was currently capable of this)
Tested with latest release: No

@andrew-st-angelo-77media
Copy link
Author

Before this issue gets mothballed, I thought I'd check back in and see if anyone had any thoughts/interest?

@andrew-st-angelo-77media
Copy link
Author

This is still something I have interest in seeing a resolution to. I've noticed that, from time-to-time, queries will structure themselves in this manner. However, I have not found a way to force/ensure this behavior.

@papb
Copy link
Member

papb commented Oct 30, 2019

Hello, thank you for the well-written request :) I am convinced that this should definitely be supported.

@papb papb added hard For issues and PRs. status: understood For issues. Applied when the issue is understood / reproducible. type: feature For issues and PRs. For new features. Never breaking changes. labels Oct 30, 2019
@papb
Copy link
Member

papb commented Oct 30, 2019

In order to grasp how complex is this, can you show me an example with three levels of includes, instead of only two?

@papb papb added status: awaiting response For issues and PRs. OP must respond (or change something, if it is a PR). Maintainers have no action and removed status: understood For issues. Applied when the issue is understood / reproducible. labels Oct 30, 2019
@papb
Copy link
Member

papb commented Oct 30, 2019

I've noticed that, from time-to-time, queries will structure themselves in this manner. However, I have not found a way to force/ensure this behavior.

Hmmm, interesting, do you mean that sometimes sequelize does actually do what you want? But in a way that you didn't figure out how to reproduce yet?

@andrew-st-angelo-77media
Copy link
Author

I've noticed that, from time-to-time, queries will structure themselves in this manner. However, I have not found a way to force/ensure this behavior.

Hmmm, interesting, do you mean that sometimes sequelize does actually do what you want? But in a way that you didn't figure out how to reproduce yet?

That is correct. I can't give you any further information, as I've only seen it once. Ran my function call a few times to verify it wasn't an abortion. But since then, in my testing, I have not been able to structure a query this way on-purpose.

@andrew-st-angelo-77media
Copy link
Author

In order to grasp how complex is this, can you show me an example with three levels of includes, instead of only two?

I don't know if this is the most useful example or not. I was really just trying to extend what I had above. But the point is that the nuance of using a Nested Join allows you to return only those records that you care about, through whatever associations you may throw at it. This becomes a much more useful technique when you have large queries and large datasets.

CREATE TABLE Persons (personId INT, name VARCHAR(50), bossId INT NULL)
CREATE TABLE Phones (phoneId INT, number VARCHAR(20))
CREATE TABLE PersonPhones (personId INT, phoneId INT) 

INSERT INTO Persons VALUES (1,'Adam', 4), (2, 'Bob', 4), (3, 'Carl', 5), (4, 'Deborah', NULL), (5, 'Ellen', 4), (6, 'Frannie', NULL), (7, 'Gretta', 6)
INSERT INTO Phones VALUES (1, '555-1234'), (2, '555-2345'), (3, '555-3456'), (4, '555-4567'), (5, '555-5678')
INSERT INTO PersonPhones VALUES (1,1), (1,3), (2, 4), (3,9), (4, 4), (6,5), (7,2)

-- The following are not the same queries

-- Query 1
SELECT *
FROM Persons boss
LEFT JOIN PersonPhones bossPersonPhone 
    INNER JOIN Phones ph ON ph.phoneId = bossPersonPhone.phoneID
    ON bossPersonPhone.personID = boss.personId
LEFT JOIN Persons subordinate 
	INNER JOIN PersonPhones subordinatePersonPhone 
		INNER JOIN Phones ph2 ON ph2.phoneId = subordinatePersonPhone.phoneID
		ON subordinatePersonPhone.personID = subordinate.personId
	ON subordinate.bossId = boss.personId

-- Query 2
SELECT *
FROM Persons boss
LEFT JOIN PersonPhones bossPersonPhone 
    INNER JOIN Phones ph ON ph.phoneId = bossPersonPhone.phoneID
    ON bossPersonPhone.personID = boss.personId
LEFT JOIN Persons subordinate 
	LEFT JOIN PersonPhones subordinatePersonPhone 
		INNER JOIN Phones ph2 ON ph2.phoneId = subordinatePersonPhone.phoneID
		ON subordinatePersonPhone.personID = subordinate.personId
	ON subordinate.bossId = boss.personId

-- Query 3
SELECT *
FROM Persons boss
LEFT JOIN PersonPhones bossPersonPhone ON bossPersonPhone.personID = boss.personId
INNER JOIN Phones ph ON ph.phoneId = bossPersonPhone.phoneID
LEFT JOIN Persons subordinate ON subordinate.bossId = boss.personId
LEFT JOIN PersonPhones subordinatePersonPhone ON subordinatePersonPhone.personID = subordinate.personId
INNER JOIN Phones ph2 ON ph2.phoneId = subordinatePersonPhone.phoneID

DROP TABLE Persons
DROP TABLE Phones
DROP TABLE PersonPhones

Results

Query 1

personId name bossId personId phoneId phoneId number personId name bossId personId phoneId phoneId number
1 Adam 4 1 1 1 555-1234 NULL NULL NULL NULL NULL NULL NULL
1 Adam 4 1 3 3 555-3456 NULL NULL NULL NULL NULL NULL NULL
2 Bob 4 2 4 4 555-4567 NULL NULL NULL NULL NULL NULL NULL
3 Carl 5 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
4 Deborah NULL 4 4 4 555-4567 1 Adam 4 1 1 1 555-1234
4 Deborah NULL 4 4 4 555-4567 1 Adam 4 1 3 3 555-3456
4 Deborah NULL 4 4 4 555-4567 2 Bob 4 2 4 4 555-4567
5 Ellen 4 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
6 Frannie NULL 6 5 5 555-5678 7 Gretta 6 7 2 2 555-2345
7 Gretta 6 7 2 2 555-2345 NULL NULL NULL NULL NULL NULL NULL

Query 2

personId name bossId personId phoneId phoneId number personId name bossId personId phoneId phoneId number
1 Adam 4 1 1 1 555-1234 NULL NULL NULL NULL NULL NULL NULL
1 Adam 4 1 3 3 555-3456 NULL NULL NULL NULL NULL NULL NULL
2 Bob 4 2 4 4 555-4567 NULL NULL NULL NULL NULL NULL NULL
3 Carl 5 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
4 Deborah NULL 4 4 4 555-4567 1 Adam 4 1 1 1 555-1234
4 Deborah NULL 4 4 4 555-4567 1 Adam 4 1 3 3 555-3456
4 Deborah NULL 4 4 4 555-4567 2 Bob 4 2 4 4 555-4567
4 Deborah NULL 4 4 4 555-4567 5 Ellen 4 NULL NULL NULL NULL
5 Ellen 4 NULL NULL NULL NULL 3 Carl 5 NULL NULL NULL NULL
6 Frannie NULL 6 5 5 555-5678 7 Gretta 6 7 2 2 555-2345
7 Gretta 6 7 2 2 555-2345 NULL NULL NULL NULL NULL NULL NULL

Query 3

personId name bossId personId phoneId phoneId number personId name bossId personId phoneId phoneId number
4 Deborah NULL 4 4 4 555-4567 1 Adam 4 1 1 1 555-1234
6 Frannie NULL 6 5 5 555-5678 7 Gretta 6 7 2 2 555-2345
4 Deborah NULL 4 4 4 555-4567 1 Adam 4 1 3 3 555-3456
4 Deborah NULL 4 4 4 555-4567 2 Bob 4 2 4 4 555-4567

@papb
Copy link
Member

papb commented Nov 10, 2019

Thank you very much for taking the time to write all this. It will help a lot whoever manages to pick this :)

EDIT: By the way, would you be willing to work on a PR for this?

@papb papb added status: understood For issues. Applied when the issue is understood / reproducible. and removed status: awaiting response For issues and PRs. OP must respond (or change something, if it is a PR). Maintainers have no action labels Nov 10, 2019
@andrew-st-angelo-77media
Copy link
Author

Good new/Bad news: At least by version 5, this particular issue seems to be fixed (mostly). We skipped from Sequelize v3 to v5, and nested joins are supported. 🎉

The bad news is that there are situations in which there is a bug with this structure. I've notated this in #12058. We have a lot of paginated lists, requiring the distinct flag, limit attribute, and often conditions on nested joins. This precise combination is where the bug seems to manifest. ☹️

I'll close this ticket. @papb, if you have any time, please take a look at this new ticket and let me know if there's anything I could do to buff it out.

@piotr-pawlowski
Copy link

piotr-pawlowski commented Mar 9, 2021

@andrew-st-angelo-77media @papb
Are you able to make nested joins? It is fixed?

@andrew-st-angelo-77media
Copy link
Author

@piotr-pawlowski
As stated in my comment above, the upgrade from v3 to v5 resolved the issue but introduced a new one -- #12058 . I have not, yet, had the opportunity to test out the scenario in v6.

@piotr-pawlowski
Copy link

Sequelize V5 or V6 didn't resolve nested joins issue. There is no way to manage nesting of queries. There is only required flag to set as inner or left join.

@andrew-st-angelo-77media
Copy link
Author

@piotr-pawlowski
I have to disagree. Having looked at/debugged many of the raw queries that Sequelize v5 produces, I can assure you that placing an INNER JOIN (required: true) inside a LEFT JOIN (required: false), results in a nested-join structure; at least in MS SQL. It even goes so far as to place the nested JOIN statements inside parentheses, which is just an interesting syntax. Not invalid, just quirky.

Now, if you are placing them on the same topographic level, they will obviously not be nested. But that is as expected.

If you are still running into issues structuring your queries, feel free to ask for assistance on StackOverflow or create a new Issue with examples, if you think there is a bug. I have also found a lot of good advice on the Slack channel.

@piotr-pawlowski
Copy link

Ok, but placing left join inside left join leads to flat structure instead of nested joins. I think there should be a flag to manage that behavior. It is very important as you noticed in your first post.

@alihassan161820
Copy link

any one solved this nested relation issue ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hard For issues and PRs. status: understood For issues. Applied when the issue is understood / reproducible. type: feature For issues and PRs. For new features. Never breaking changes.
Projects
None yet
Development

No branches or pull requests

4 participants