Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit is broken for findAll and findAndCountAll with include #7344

Closed
Alexsey opened this issue Mar 8, 2017 · 61 comments
Closed

Limit is broken for findAll and findAndCountAll with include #7344

Alexsey opened this issue Mar 8, 2017 · 61 comments
Labels
topic: limit & offset For issues and PRs. Things that involve queries with limit and/or offset. type: bug

Comments

@Alexsey
Copy link
Contributor

Alexsey commented Mar 8, 2017

Consider we have a two simple models "Events" and "Tags" with many-to-many relation and we want to get all events with specific tag. And we want to limit maximum number of response items.

(Consider you are using NodeJS 7.6+ with async/await or Babel with async/await plugin)

'use strict'

const Sequelize = require('sequelize')

const dbConfig = {
    type: 'mysql', user: 'root', password: 'root',
    database: 'db', host: 'localhost'
}

;(async () => {
    const sequelize = new Sequelize(
        dbConfig.database, dbConfig.user, dbConfig.password, {
            host: dbConfig.host,
            dialect: dbConfig.type,
            pool: {max: 100, min: 0, idle: 10000},
            dialectOptions: {
                charset: 'utf8_general_ci',
            }
        }
    )
    await sequelize.authenticate()

    const Events = sequelize.define('events', {
        id: {type: Sequelize.INTEGER, primaryKey: true, autoIncrement: true},
        title: {type: Sequelize.STRING}
    }, {timestamps: false})

    const Tags = sequelize.define('tags', {
        name: {type: Sequelize.STRING}
    }, {timestamps: false})

    const EventsTags = sequelize.define('events_tags', {}, {timestamps: false})

    Events.belongsToMany(Tags, {through: EventsTags})
    Tags.belongsToMany(Events, {through: EventsTags})

    await sequelize.sync()

    await Promise.all([
        Events.bulkCreate([
            {title: 'event_1'}, {title: 'event_2'},
            {title: 'event_3'}, {title: 'event_4'}
        ]),
        Tags.bulkCreate([
            {name: 'tag_1'}, {name: 'tag_2'},
            {name: 'tag_3'}, {name: 'tag_4'},
        ])
    ])
    await EventsTags.bulkCreate([
        {eventId: 1, tagId: 1}, {eventId: 2, tagId: 2},
        {eventId: 3, tagId: 3}, {eventId: 4, tagId: 1}
    ])

    const eventsWithTag1 = await Events.findAndCountAll({
        limit: 3,
        include: [{
            model: Tags,
            where: {name: 'tag_1'}
        }]
    })

    console.log('found events with tag_1:', eventsWithTag1.rows.length) // 1
    console.log('count events with tag_1:', eventsWithTag1.count) // 2

    await sequelize.close()
})()

This query will found only one event:
console.log('found events with tag_1:', eventsWithTag1.rows.length) is logging 1

but it should found two:
console.log('found events with tag_1:', eventsWithTag1.rows.length) must log 2
because there are two of them and limit is 3

If we remove limit or set it to 4+ or change the sequence of EventsTags creation to

EventsTags.bulkCreate([
  {eventId: 1, tagId: 1}, {eventId: 2, tagId: 2},
  {eventId: 4, tagId: 1}, {eventId: 3, tagId: 3}
])

results in that the request will found both events
console.log('found events with tag_1:', eventsWithTag1.rows.length) will log 2

So limit should limit the maximum number of records in the result, but it limits the number of records in intermediate query and final result is unpredictable.

If you put findAll instead of findAndCountAll you will face the same problem

This is generated SQL for the last select query:

SELECT
  `events`.*, `tags`.`id` AS `tags.id`, `tags`.`name` AS `tags.name`,
  `tags.events_tags`.`eventId` AS `tags.events_tags.eventId`,
  `tags.events_tags`.`tagId` AS `tags.events_tags.tagId`
FROM
  (
    SELECT
      `events`.`id`, `events`.`title`
    FROM
      `events` AS `events`
    WHERE
      (
        SELECT
          `events_tags`.`eventId`
        FROM
          `events_tags` AS `events_tags` 
        INNER JOIN
            `tags` AS `tag`
          ON
            `events_tags`.`tagId` = `tag`.`id`
        WHERE
          (
            `events`.`id` = `events_tags`.`eventId`
          ) LIMIT 1 
      ) IS NOT NULL LIMIT 3
  ) AS `events`
INNER JOIN
    (
      `events_tags` AS `tags.events_tags`
      INNER JOIN
          `tags` AS `tags`
        ON
          `tags`.`id` = `tags.events_tags`.`tagId`
    )
  ON
      `events`.`id` = `tags.events_tags`.`eventId`
    AND
      `tags`.`name` = 'tag_1';

I have try this on MySQL 5.7 and Sequelize 4.0.0-2 and 3.30.2

@Alexsey Alexsey changed the title Limit is broken for findAll and findAllAndCount with include Limit is broken for findAll and findAndCountAll with include Mar 9, 2017
@Alexsey
Copy link
Contributor Author

Alexsey commented Mar 9, 2017

Hm... it looks like to fix the query we need to move AND `tags`.`name` = 'tag_1' from the tail to the inner query before the LIMIT 1 (renaming `tags` to `tag`)

Looks like this

SELECT `events`.*, `tags`.`id` AS `tags.id`, `tags`.`name` AS `tags.name`, 
            `tags.events_tags`.`eventId` AS `tags.events_tags.eventId`, 
            `tags.events_tags`.`tagId` AS `tags.events_tags.tagId` 
FROM (SELECT `events`.`id`, `events`.`title` 
    FROM `events` AS `events` WHERE (
        SELECT `events_tags`.`eventId` 
        FROM `events_tags` AS `events_tags` 
        INNER JOIN `tags` AS `tag` ON `events_tags`.`tagId` = `tag`.`id` 
        WHERE (`events`.`id` = `events_tags`.`eventId`) AND `tag`.`name` = 'tag_1' LIMIT 1 
    ) IS NOT NULL LIMIT 2
) AS `events` 
INNER JOIN (`events_tags` AS `tags.events_tags` 
    INNER JOIN `tags` AS `tags` ON `tags`.`id` = `tags.events_tags`.`tagId`)
ON `events`.`id` = `tags.events_tags`.`eventId`

is the right query. Some one who know the code may fix this. I think it is the same issue as #6601 and may be some others

@Alexsey
Copy link
Contributor Author

Alexsey commented Mar 10, 2017

I have looked a little bit in the source code and bug is probable place of bug is in this function, somewhere in include processing. Owner of the code base should understand it better. Just run my example (it is complete and dependableness)

@lassombra
Copy link

This also affected me. As a workaround I did an unlimited, unsorted initial query and then introduced pagination on the result set, which eliminates any server performance benefits from pagination, especially if we get to scale (I'm expecting to stop around 500 records at any time, but this could really slow down with much more than that, especially with nested filters)

@Alexsey
Copy link
Contributor Author

Alexsey commented Jun 8, 2017

Checked on 4.0.0 (final) - bug is still there

@Alexsey
Copy link
Contributor Author

Alexsey commented Jun 9, 2017

As mentioned in #4716 - workaround is to add subQuery: false. Also #5715 is looks like related

@rrjohnson85
Copy link

@Alexsey subQuery: false can help, but can lead to other issues. For example, I'm noticing an incorrect count for some of my queries now.

@jasonray716
Copy link

I am also getting same issue with limit in findAll() even though I have added subQuery: false. Are there any ways for me to get correct number of results?

@Alexsey
Copy link
Contributor Author

Alexsey commented Jun 28, 2017

@nicholasli137 when I met this bug I have ended up writing raw SQL query. If you will find any other solution - please write here about it

@Alexsey
Copy link
Contributor Author

Alexsey commented Jun 28, 2017

As I mentioned above the problem is that limit is applying to subquery instead of query. @mickhansen if we change in your code

// Add LIMIT, OFFSET to sub or main query
const limitOrder = this.addLimitAndOffset(options, mainTable.model);
if (limitOrder && !options.groupedLimit) {
  if (subQuery) {
    subQueryItems.push(limitOrder);
  } else {
    mainQueryItems.push(limitOrder);
  }
}

to

// Add LIMIT, OFFSET to main query
const limitOrder = this.addLimitAndOffset(options, mainTable.model);
if (limitOrder && !options.groupedLimit) {
  mainQueryItems.push(limitOrder);
}

the bug goes away. Can you please comment may this be the right solution or will this lead to a new bugs?

@VladimirPal
Copy link

Hi guys! Any news about this issue?

VladimirPal added a commit to web-pal/sequelize that referenced this issue Aug 16, 2017
@Ieremchuk
Copy link

bump

@DavidVeloso
Copy link

Hi guys! Any news about this issue?

@austinksmith
Copy link

I am also encountering this issue, it's becoming a real pain point with the current application stack I am working on, query is necessary to have the includes for proper tagging and filtering functionality but this throws off the limit param to essentially be useless and random. Removing subQuery: false will resolve the limit issues but will then introduce issues with the tag include not working and returning invalid counts. Eg. Count: 1 but no rows.

File.findAndCountAll({
		distinct: true,
		subQuery: false, // This option is required for tag (include) functionality to work, 100% undocumented in sequalize. - Austin K. Smith 10/9/17
		offset: (req.query.start ? parseInt(req.query.start, 10) : 0),
		limit: (req.query.number ? parseInt(req.query.number, 10) : 12),
		where: _where,
		order: [
			[req.query.sort, req.query.order]
		],
		include: [
			{ 
				model: FileCategory, 
				as: 'categories', 
				through: 'fileFileCategories', 
				foreignKey: 'fileId', 
				where: _filters.category ? {
					id: _filters.category.id 
				} : null },
			{ 
				model: FileTag, 
				as: 'tags', 
				through: 'fileFileTags', 
				foreignKey: 'fileId',
				where: _filters.tag ? {
					id: _filters.tag.id 
				} : null },
			{ 
				model: Role, 
				as: 'roles', 
				through: 'fileRoles', 
				foreignKey: 'fileId', 
				where: _roleWhere, 
				required: false 
			}
		]
	}).then(function (files) {
		if (!files) {
			return res.status(404).send({
				message: 'No files found'
			});
		} else {
			return res.json(files);
		}
	}).catch(function (err) {
		winston.error('files.server.controller', err);
		// not secure send only generic message
		return res.status(500).send('files.server.controller error');
	});

Sub Query false is not an acceptable solution as this also causes problems with distinct: true not working properly and thus ending up with duplicate records that I have to manually strip out of the final result set, this is both inefficient and problematic as it then makes the pagination "remaining" count inaccurate.

@austinksmith
Copy link

Submitted a PR for this, let's see if we can get this rolled in. I think? @Alexsey 's solution will work just fine.

@fabienvauchelles
Copy link

+1

2 similar comments
@DavidVeloso
Copy link

+1

@lucasmendes2105
Copy link

+1

@pbrink231
Copy link

referencing my fix here that was more a temporary solution to close that issue. I am still having the issue.
#9563
I am currently on v4.37.10

@radoslavpetranov
Copy link

radoslavpetranov commented Apr 19, 2019

I am facing the same issue. I took a quick look at the sequelize code and it seems to me the easiest way to allow us to set a limit on the main query instead of the inner subquery is just an extra option, say isGlobalLimit: boolean.

If you take a look here

https://github.com/sequelize/sequelize/blob/master/lib/dialects/abstract/query-generator.js#L1329

notice how if subquery is true (which is the case when we have associations) we add the limit to the subQueryItems and if it's false, we add it to mainQueryItem.

The easiest way to resolve this problem seems to just change that check to something like

if (subQuery && !options.isGlobalLimit) {
    subQueryItems.push(limitOrder);
} else {
    mainQueryItems.push(limitOrder);
}

so that if the user added isGlobalLimit: true to their options, the limit would still get applied to the main query because that's the only possible way you can get pagination working right with sequelize if you have any non-trivial filtering and sorting conditions.

@kimobrian
Copy link

kimobrian commented May 8, 2019

My solution was to use a WITH operator on the main/parent model before the joins. I have a model called Board which stores bulletin board posts and a post can have a reactions(emoji reaction). I was getting less results so I ended up using a query that looked like this:

    WITH p as (
    SELECT "Board"."id" as id
    FROM "Boards" AS "Board"
    ORDER BY "Board"."createdAt" DESC
    LIMIT ${limit}
        OFFSET ${offset}
        )
    SELECT "Board"."id",
           "Board"."post",
           "Board"."postBy",
           "Board"."createdAt",
           "Board"."updatedAt",
           "creator"."id" AS "creator.id",
           "creator"."profilePic" AS "creator.profilePic",
           "creator"."name" AS "creator.name",
           "reactions"."id" AS "reactions.id",
           "reactions"."reaction" AS "reactions.reaction",
           "reactions"."reactionBy" AS "reactions.reactionBy",
           "reactions->creator"."id" AS "reactions.creator.id",
           "reactions->creator"."profilePic" AS "reactions.creator.profilePic",
           "reactions->creator"."name" AS "reactions.creator.name",
           "photos"."id" AS "photos.id",
           "photos"."photoUrl" AS "photos.photoUrl",
           "links"."id" AS "links.id",
           "links"."link" AS "links.link",
           "property"."id" AS "property.id",
           "property"."name" AS "property.name"
    FROM p  
    INNER JOIN "Boards" AS "Board" on p.id = "Board"."id"
    LEFT OUTER JOIN "Users" AS "creator" ON "Board"."postBy" = "creator"."id"
    LEFT OUTER JOIN "Reactions" AS "reactions" 
    ON "Board"."id" = "reactions"."postId"
    LEFT OUTER JOIN "Users" AS "reactions->creator"
    ON "reactions"."reactionBy" = "reactions->creator"."id"
    LEFT OUTER JOIN "Photos" AS "photos" ON "Board"."id" = "photos"."postId"
    LEFT OUTER JOIN "Links" AS "links" ON "Board"."id" = "links"."postId"
    LEFT OUTER JOIN "Properties" AS "property" 
    ON "Board"."propertyId" = "property"."id"
    GROUP BY "Board"."id",
             "reactions"."id",
             "reactions->creator"."id",
             "photos"."id",
             "links"."id",
             "property"."id",
             "creator"."id"
    ORDER BY "Board"."createdAt" DESC

then continue with the rest of the logic. All reactions are included and I also get the specified number of posts:

const options = {
      hasJoin: true,
      include
    };

    Board._validateIncludedElements(options);

    return sequelize.query(query, options).then(result).... // query is the query above

@iszlailorand
Copy link

iszlailorand commented Sep 24, 2019

If I understand well, your limit and offset is broken by the join, I think you can use group by Events.id and you will get a good result, but instead of count you will get an array of counts and the real count is the length of the array. something like this:

{ count: [ { count: '3' }, { count: '1' }, { count: '1' } ], rows: [ { dataValues: [Object], Tags: [Array] }, { dataValues: [Object], Tags: [Array] }, { dataValues: [Object], Tags: [Array] } ] }

I know I'm late to the party, but I think is the only simple way to solve this... I tried it on sequelize v5.10.0

@JamesMGreene
Copy link

There is a workaround in my issue that may be useful: #10557 (comment)

@yashdabi
Copy link

#9869 (comment)

@papb papb added the topic: limit & offset For issues and PRs. Things that involve queries with limit and/or offset. label Nov 11, 2019
@BartusZak
Copy link

BartusZak commented Dec 2, 2019

In my scenario adding subQuery: false helped.
That should be documented somehow coz I spent 6h looking for a issue...

EDIT:
ok, it fixed 500 Error but is not showing correct data (e.g { count: 1, rows: []}

EDIT2:
It is Related #11468

@radoslavpetranov
Copy link

Yeah for the record adding subQuery: false resolved my problem as well.

TimMensch added a commit to TimMensch/sequelize that referenced this issue Feb 1, 2020
* Calling count() for a query with distinct will generate invalid SQL. Sequelize should endeavor to _not_ generate invalid SQL.
* findAndCountAll() similarly calls count(), and will generate invalid SQL unless you pass `col` to
* Sequelize + FeathersJS, to take one example, will generate a call to findAndCountAll() with all rows. findAndCountAll() should just work as expected.

See:

sequelize#7344
sequelize#6418 (comment at the end)
sequelize#6404
sequelize#2713

And I seem to run into this issue every time I use Sequelize. Please merge ASAP and fix.
TimMensch added a commit to TimMensch/sequelize that referenced this issue Feb 1, 2020
Add code to count() to cause it to create valid SQL when distinct=true and col is unspecified or "*". Otherwise the SQL generated is COUNT(DISTINCT(*)), which is invalid. Issues referenced below were already closed, but the issue hadn't actually been fixed, and it _keeps coming up_.

Closes: sequelize#2713, sequelize#6404, sequelize#6418, sequelize#7344
TimMensch added a commit to TimMensch/sequelize that referenced this issue Feb 1, 2020
Add code to count() to cause it to create valid SQL when distinct=true and col is unspecified or "*". Otherwise the SQL generated is COUNT(DISTINCT(*)), which is invalid. Issues referenced below were already closed, but the issue hadn't actually been fixed, and it _keeps coming up_.

Closes: sequelize#2713, sequelize#6404, sequelize#6418, sequelize#7344
@vishavnokhwal
Copy link

vishavnokhwal commented Feb 11, 2020

Still an issue consider I have 3 tables Contact, Company, Individual

class Contact extends Sequelize.Model {
  static init(sequelize, DataTypes) {
    return super.init(
      {
        contactType: {
          type: DataTypes.STRING,
          allowNull: false,
        }
      },
      { sequelize }
    );
  }
  static associate(models){
    this.hasOne(models.Company);
    this.hasMany(models.Individual);
  }
}


class Company extends Sequelize.Model {
  static init(sequelize, DataTypes) {
    return super.init(
      {
        name: {
          type: DataTypes.STRING
          },
        compType: {
          type: DataTypes.STRING
        },
        affType: {
          type: DataTypes.STRING
        },
        lsNo: {
          type: DataTypes.STRING
        }
      },
      { sequelize }
    );
  }
  static associate(models){
    this.belongsTo(models.Contact);
  }
}


class Individual extends Sequelize.Model {
  static init(sequelize, DataTypes) {
    return super.init(
      {
        firstName: {
          type: DataTypes.STRING
          }
      },
      { sequelize }
    );
  }
  static associate(models){
    this.belongsTo(models.Contact);
  }
}

and this is my code

const results= await Contact.findAndCountAll({
  where:{
    [Op.and]:[
    {
      contactType:{
      [Op.iLike]:{
        [Op.any]:[
        "abc",
        "def"
        ]
      }
    },
    },
    {
      [Op.or]:[
      sequelize.where(sequelize.fn('lower',sequelize.cast(sequelize.col('Contact.name'), 'TEXT')),{ [Op.like]: `%as%` }),
      sequelize.where(sequelize.fn('lower',sequelize.cast(sequelize.col('Company.compType'), 'TEXT')),{ [Op.like]: `%as%` }),
      sequelize.where(sequelize.fn('lower',sequelize.cast(sequelize.col('Company.lsNo'), 'TEXT')),{ [Op.like]: `%as%` }),
      sequelize.where(sequelize.fn('lower',sequelize.cast(sequelize.col('Company.affType'), 'TEXT')),{ [Op.like]: `%as%` })
      sequelize.where(sequelize.fn('lower',sequelize.cast(sequelize.col('Individual.firstName'), 'TEXT')),{ [Op.like]: `%as%` })
      ]
    }
    ]
  },
col:['id'],
distinct:true,
limit:10,
offset:0,
include:[
{ 
  model:Company,
  attributes:['id','name','compType','affType','lsNo'],
  required:true,
  where:{
    affType:{
      [Op.iLike]:{
        [Op.any]:['abc','def']
      }
    }
  }
},
{
  model:Individual,
  attributes:['id','firstName']
}
],
order:[[
'Company',
'name',
'DESC'
]]
});

On execution of above code I am getting this error
missing FROM-clause entry for table "Individual"

Raw Query

SELECT          "Contact".*, 
                "Individuals"."id"        AS "Individuals.id", 
                "Individuals"."firstName" AS "Individuals.firstName" 
FROM            ( 
                           SELECT     "Contact"."contactType", 
                                      "Contact"."id", 
                                      "Contact"."createdAt", 
                                      "Contact"."updatedAt", 
                                      "Company"."id"            AS "Company.id",
                                      "Company"."name"          AS "Company.name",
                                      "Company"."compType"   AS "Company.compType",
                                      "Company"."affiliateType" AS "Company.affiliateType",
                                      "Company"."lsNo"     AS "Company.lsNo"
                           FROM       "Contacts"                 AS "Contact" 
                           INNER JOIN "Companies"         AS "Company"
                           ON         "Contact"."id" = "Company"."ContactId"
                           AND        "Company"."affiliateType" ilike ANY (array['abc','def'])
                           WHERE      ( 
                                                 "Contact"."contactType" ilike ANY (array['abc','def'])
                                      AND        ( 
                                                            lower(cast("Company"."name" AS text)) LIKE '%as%'
                                                 OR         lower(cast("Company"."compType" AS text)) LIKE '%as%'
                                                 OR         lower(cast("Company"."lsNo" AS text)) LIKE '%as%'
                                                 OR         lower(cast("Company"."affiliateType" AS text)) LIKE '%as%'
                                                 OR         lower(cast("Individuals"."firstName" AS text)) LIKE '%as%')) limit 10 offset 0) AS "Contact"
LEFT OUTER JOIN "Individuals"                                                                                                         AS "Individuals"
ON              "Contact"."id" = "Individuals"."ContactId" 
ORDER BY        "Company"."name" DESC;

On adding subQuery=false issue is resolved but limit is being applied on Individuals not on Contact
Here is raw query after adding subQuery=false:-

SELECT     "Contact"."contactType", 
                "Contact"."id", 
                "Contact"."createdAt", 
                "Contact"."updatedAt", 
                "Company"."id"            AS "Company.id", 
                "Company"."name"          AS "Company.name", 
                "Company"."compType"   AS "Company.compType", 
                "Company"."affType" AS "Company.affType", 
                "Company"."lsNo"     AS "Company.lsNo", 
                "Individuals"."id"        AS "Individuals.id", 
                "Individuals"."firstName" AS "Individuals.firstName" 
FROM            "Contacts"    AS "Contact" 
INNER JOIN      "Companies"         AS "Company" 
ON              "Contact"."id" = "Company"."ContactId" 
AND             "Company"."affType" ilike ANY (array['None']) 
LEFT OUTER JOIN "Individuals" AS "Individuals" 
ON              "Contact"."id" = "Individuals"."ContactId" 
WHERE           ( 
                                "Contact"."contactType" ilike ANY (array['abc','def'])
                AND             ( 
                                                lower(cast("Company"."name" AS text)) LIKE '%as%'
                                OR              lower(cast("Company"."compType" AS text)) LIKE '%as%'
                                OR              lower(cast("Company"."lsNo" AS text)) LIKE '%as%'
                                OR              lower(cast("Company"."affType" AS text)) LIKE '%as%'
                                OR              lower(cast("Individuals"."firstName" AS text)) LIKE '%as%'))
ORDER BY    "Company"."name" DESC limit 10 offset 0;

@Alexsey
Copy link
Contributor Author

Alexsey commented Feb 11, 2020

@vishavnokhwal you should better create a new issue with the bug report if you expect it to be processed

@mopcweb
Copy link

mopcweb commented Feb 18, 2020

After about a week of hell found acceptable workaround for my case. Believe it would be helpful as found a lot of unanswered topics/issues on github.

TL;DR; actual solution is at the end of post, just the last piece of code.

The main idea is that Sequelize builds correct SQL query, but when having left joins we produce carthesian product, so there will be a lot of rows as query result.

Example: A and B tables. Many to many relation. If we want to get all A joined with B we will receive A * B rows, so there will be a lot of rows for each record from A with different values from B.

CREATE TABLE IF NOT EXISTS a (
	id INTEGER PRIMARY KEY NOT NULL,
	title VARCHAR
)

CREATE TABLE IF NOT EXISTS b (
	id INTEGER PRIMARY KEY NOT NULL,
	age INTEGER
)

CREATE TABLE IF NOT EXISTS ab (
	id INTEGER PRIMARY KEY NOT NULL,
	aid INTEGER,
	bid INTEGER
)

SELECT *
FROM a
LEFT JOIN (ab JOIN b ON b.id = ab.bid) ON a.id = ab.aid

In sequelize syntax:

class A extends Model {}
A.init({
	id: {
      type: Sequelize.INTEGER,
      autoIncrement: true,
	  primaryKey: true,
	},
	title: {
      type: Sequelize.STRING,
	},
});

class B extends Model {}
B.init({
	id: {
	  type: Sequelize.INTEGER,
      autoIncrement: true,
	  primaryKey: true,
	},
	age: {
      type: Sequelize.INTEGER,
	},
});

A.belongsToMany(B, { foreignKey: ‘aid’, otherKey: ‘bid’, as: ‘ab’ });
B.belongsToMany(A, { foreignKey: ‘bid’, otherKey: ‘aid’, as: ‘ab’ });

A.findAll({
	distinct: true,
	include: [{ association: ‘ab’ }],
})

Everything works ok.

So, imagine i want to receive 10 records from A with mapped to them records from B.
When we put LIMIT 10 on this query, Sequelize build correct query but LIMIT is applied to whole query and as result we receive only 10 rows , where all of them could be for only one record from A. Example:

A.findAll({
	distinct: true,
	include: [{ association: ‘ab’ }],
	limit: 10,
})

Which will be converted into:

SELECT *
FROM a
LEFT JOIN (ab JOIN b ON b.id = ab.bid) ON a.id = ab.aid
LIMIT 10

id  |  title    |   id  |  aid  |  bid  |  id   |  age
--- |  -------- | ----- | ----- | ----- | ----- | -----
1	|	first	| 	1	|	1	|	1   |	1   |	1
1	|	first	| 	2	|	1	|	2   |	2   |	2
1	|	first	| 	3	|	1	|	3   |	3   |	3
1	|	first	| 	4	|	1	|	4   |	4   |	4
1	|	first	| 	5	|   1	|	5   |	5   |	5
2	|	second	| 	6	|	2	|	5   |	5   |	5
2	|	second	| 	7	|	2	|	4   |	4   |	4
2	|	second	| 	8	|	2	|	3   |	3   |	3
2	|	second	| 	9	|	2	|	2   |	2   |	2
2	|	second	| 	10	|	2	|	1   |	1   |	1

After output is received, Seruqlize as ORM will make data mapping and over query result in code will be:

[
 {
  id: 1,
  title: 'first',
  ab: [
   { id: 1, age:1 },
   { id: 2, age:2 },
   { id: 3, age:3 },
   { id: 4, age:4 },
   { id: 5, age:5 },
  ],
 },
  {
  id: 2,
  title: 'second',
  ab: [
   { id: 5, age:5 },
   { id: 4, age:4 },
   { id: 3, age:3 },
   { id: 2, age:2 },
   { id: 1, age:1 },
  ],
 }
]

Obviously NOT what we wanted. I wanted to receive 10 records for A, but received just 2, while i know that there are more in database.

So we have correct SQL query but still received incorrect result.

Ok, i had some ideas but the easiest and most logical is:

  1. Make first request with joins, and group results by source table (table on which we are making query and to which making joins) 'id' property. Seems easy.....

    To make so we need to provide 'group' property to Sequelize query options. Here we have some problems. First - Sequelize makes aliases for each table while generating SQL query. Second - Sequelize puts all columns from JOINED table into SELECT statement of its query and passing 'attributes' = [] won't help. In both cases we'll receive SQL error.

    To solve first we need to convert Model.tableName to singluar form of this word (this logic is based on Sequelize). Just use pluralize.singular(). Then compose correct property to GROUP BY:

    const tableAlias = pluralize.singular('Industries') // Industry
    
    {
     ...,
     group: [`${tableAlias}.id`]
    }

    To solve second (it was the hardest and the most ... undocumented). We need to use undocumented property 'includeIgnoreAttributes' = false. This will remove all columns from SELECT statement unless we specify some manually. We should manually specify attributes = ['id'] on root query.

  2. Now we will receive correctly output with only necessary resources ids. Then we need to compose seconf query WITHOUT limit and offset, but specify additional 'where' clause:

{
 ...,
 where: {
  ...,
  id: Sequelize.Op.in: [array of ids],
 }
}
  1. With query about we can produce correct query with LEFT JOINS.

Solution
Method receives model and original query as arguments and returns correct query + additionally total amount of records in DB for pagination. It also correctly parse query order to provide ability to order by fields from joined tables:

/**
   *  Workaround for Sequelize illogical behavior when quering with LEFT JOINS and having LIMIT / OFFSET
   *
   *  Here we group by 'id' prop of main (source) model, abd using undocumented 'includeIgnoreAttributes'
   *  Sequelize prop (it is used in its static count() method) in order to get correct SQL request
   *  Witout usage of 'includeIgnoreAttributes' there are a lot of extra invalid columns in SELECT statement
   *
   *  Incorrect example without 'includeIgnoreAttributes'. Here we will get correct SQL query
   *  BUT useless according to business logic:
   *
   *  SELECT "Media"."id", "Solutions->MediaSolutions"."mediaId", "Industries->MediaIndustries"."mediaId",...,
   *  FROM "Medias" AS "Media"
   *  LEFT JOIN ...
   *  WHERE ...
   *  GROUP BY "Media"."id"
   *  ORDER BY ...
   *  LIMIT ...
   *  OFFSET ...
   *
   *  Correct example with 'includeIgnoreAttributes':
   *
   *  SELECT "Media"."id"
   *  FROM "Medias" AS "Media"
   *  LEFT JOIN ...
   *  WHERE ...
   *  GROUP BY "Media"."id"
   *  ORDER BY ...
   *  LIMIT ...
   *  OFFSET ...
   *
   *  @param model - Source model (necessary for getting its tableName for GROUP BY option)
   *  @param query - Parsed and ready to use query object
   */
  private async fixSequeliseQueryWithLeftJoins<C extends Model>(
    model: ModelCtor<C>, query: FindAndCountOptions,
  ): IMsgPromise<{ query: FindAndCountOptions; total?: number }> {
    const fixedQuery: FindAndCountOptions = { ...query };

    // If there is only Tenant data joined -> return original query
    if (query.include && query.include.length === 1 && (query.include[0] as IncludeOptions).model === Tenant) {
      return msg.ok({ query: fixedQuery });
    }

    // Here we need to put it to singular form,
    // because Sequelize gets singular form for models AS aliases in SQL query
    const modelAlias = singular(model.tableName);

    const firstQuery = {
      ...fixedQuery,
      group: [`${modelAlias}.id`],
      attributes: ['id'],
      raw: true,
      includeIgnoreAttributes: false,
      logging: true,
    };

    // Ordering by joined table column - when ordering by joined data need to add it into the group
    if (Array.isArray(firstQuery.order)) {
      firstQuery.order.forEach((item) => {
        if ((item as GenericObject).length === 2) {
          firstQuery.group.push(`${modelAlias}.${(item as GenericObject)[0]}`);
        } else if ((item as GenericObject).length === 3) {
          firstQuery.group.push(`${(item as GenericObject)[0]}.${(item as GenericObject)[1]}`);
        }
      });
    }

    return model.findAndCountAll<C>(firstQuery)
      .then((ids) => {
        if (ids && ids.rows && ids.rows.length) {
          fixedQuery.where = {
            ...fixedQuery.where,
            id: {
              [Op.in]: ids.rows.map((item: GenericObject) => item.id),
            },
          };
          delete fixedQuery.limit;
          delete fixedQuery.offset;
        }

        /* eslint-disable-next-line */
        const total = (ids.count as any).length || ids.count;

        return msg.ok({ query: fixedQuery, total });
      })
      .catch((err) => this.createCustomError(err));
  }

TimMensch added a commit to TimMensch/sequelize that referenced this issue Feb 20, 2020
Add code to count() to cause it to create valid SQL when distinct=true and col is unspecified or "*". Otherwise the SQL generated is COUNT(DISTINCT(*)), which is invalid. Issues referenced below were already closed, but the issue hadn't actually been fixed, and it _keeps coming up_.

Closes: sequelize#2713, sequelize#6404, sequelize#6418, sequelize#7344
adzialocha added a commit to adzialocha/hoffnung3000 that referenced this issue Mar 25, 2020
We have a problem with Sequelize and how it handles distinct SQL queries causing slots (JOIN table) to be empty
when fetching many events. This breaks the calendar.

Related issue: sequelize/sequelize#7344
Possible PR/Fix: sequelize/sequelize#11946
@phil714
Copy link

phil714 commented Feb 7, 2022

This is still a problem in sequelize v6 and v7, any updates on this issue ?

@ephys
Copy link
Member

ephys commented Feb 11, 2022

Is this the same issue as #9940?

@Tapendrakmr
Copy link

I am still facing the same issue with sequelize v"^6.26.0", I have used the findAndCountAll method which returns the correct count but the wrong numbers of row data. I passed the limit for 10 rows but it returns only 4 row

const users=await A.findAndCountAll({
        include:[
            {model:B,attributes:['id', 'name'] },
            {model:C,attributes:['id', 'name']},
            {model: D,as:'Area', attributes:['id', 'name'],
                through: {attributes: ['id'],
                }
            }
        ],
        where:whereCondition,
        order:[sortOrder],
        distinct: true,
        subQuery: false,
        offset: 0, limit: 10,
      })

@Tapendrakmr
Copy link

Is this the same issue as #9940?

yes

@taylanyildiz
Copy link

#9940 Same Issue,

Job.belongstoMany(
    TechnicianUser,{
        as:'technicians',
        through: TechnicianJob,
        onDelete:'',
        onUpdate:'',
});

TechnicianUser.belongsToMany(
    Job,{
        as:'jobs',
        through: TechnicianJob,
        onDelete:...,
        onUpdate:...,
});

/// Find jobs
Job.findCountAll({
    limit:limit && parseInt(limit),
    offset:offset && parseInt(limit),
    distinct:true,
    col:'id',
    subquery:false,
    where:{
        '$technicians.first_name$':{[Op.subString]:keyword},
    },
    include:[
        {
            model:TechnicianUser,
            as:'technicians',
            through:{
                ....
            }
        }
    ]
});

-> Expected result

  •  count:6,
      rows:[
          {...},<!-- 1 -->
          {...},<!-- 2 -->
          {...},<!-- 3 -->
          {...},<!-- 4 -->
          {...},<!-- 5 -->
      ]
    

-> Actual result

  •  count:6,
      rows:[
          {...},<!-- 1 -->
          {...},<!-- 2 -->
          {...},<!-- 3 -->
          {...},<!-- 4 -->
      ]
    

Some fileds missing but count same

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: limit & offset For issues and PRs. Things that involve queries with limit and/or offset. type: bug
Projects
None yet
Development

No branches or pull requests