Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Struggling to Reliably Count Associated Records #5777

Closed
bookercodes opened this issue Apr 21, 2016 · 9 comments
Closed

Struggling to Reliably Count Associated Records #5777

bookercodes opened this issue Apr 21, 2016 · 9 comments

Comments

@bookercodes
Copy link
Contributor

What you are doing?

I am building a poll similar to Straw Poll. I have three tables: Poll, PollOption and Vote. Here are their definitions:

var Poll = connection.define('Poll', {
  question: {
    type: Sequelize.STRING,
    allowNull: false
  }
}, {
  timestamps: false
})

var PollOption = connection.define('PollOption', {
  text: Sequelize.STRING
}, {
  timestamps: false
})

var Vote = connection.define('Vote', {
}, {
  timestamps: false
})

Poll.hasMany(PollOption)
PollOption.belongsTo(Poll)
PollOption.hasMany(Vote)
Vote.belongsTo(PollOption)

And here is the EER diagram:

I can create a sample Poll, some PollOptions and a Vote:

function createTestPoll() {
  return Poll.create({
    question: 'What is the best Node.js ORM?'
  }).then(function() {
    return PollOption.bulkCreate([
      { PollId: 1, text: 'Sequelize' },
      { PollId: 1, text: 'Knex' }
    ])
  }).then(function() {
    return Vote.create({
      PollOptionId: 1
    })
  })
}

The problem occurs when I want to query that information back again:

function queryTestPoll() {
  return Poll.findOne({
    where: {
      id: 1
    },
    group: ['PollOptions.Votes.PollOptionId'],
    attributes: [
      'question',
    ],
    include: {
      attributes: [
        'text',
        [Sequelize.fn('count', 'PollOptions.Votes.PollOptionId'), 'votes']
      ],
      model: PollOption,
      include: {
        model: Vote,
        attributes: []
      }
    }
  }).then(function(foundPoll) {
    console.log(JSON.stringify({
      question: foundPoll.dataValues.question,
      options: foundPoll.dataValues.PollOptions.map(function(option) {
        return option.dataValues
      })
    }, null, 2))
  })
}

What do you expect to happen?

Based on what I learned from #222 I wanted:

{
  "question": "What is the best Node.js ORM?",
  "options": [
    {
      "text": "Knex",
      "votes": 0
    },
    {
      "text": "Sequelize",
      "votes": 1
    }
  ]
}

What is actually happening?

But the output was:

{
  "question": "What is the best Node.js ORM?",
  "options": [
    {
      "text": "Knex",
      "votes": 1
    },
    {
      "text": "Sequelize",
      "votes": 1
    }
  ]
}

votes is reported to be 1 for both records even though I only voted for one of them 😕.

If I do not store a Vote in the database the result is:

{
  "question": "What is the best Node.js ORM?",
  "options": [
    {
      "text": "Sequelize",
      "votes": 2
    }
  ]
}

... Which is also not what you would expect 😢

Extra

I can make the code work by omitting the group and attributes fields but the code is inefficient:

const foundPoll = await db.models.Poll.findOne({
  where: {
    id: req.params.pollId
  },
  include: {
    model: db.models.PollOption,
    include: {
      model: db.models.Vote
    }
  }
})

 console.log(JSON.stringify({
  question: foundPoll.dataValues.question,
  options: foundPoll.dataValues.PollOptions.map(function (option) {
    return {
      text: option.text,
      votes: option.Votes.length // TODO: Calcualte in query
    }
  })
}

SSCCE

var Sequelize = require('sequelize')

var connection = new Sequelize('sequelize_question_schema', 'root', '')

var Poll = connection.define('Poll', {
  question: {
    type: Sequelize.STRING,
    allowNull: false
  }
}, {
  timestamps: false
})

var PollOption = connection.define('PollOption', {
  text: Sequelize.STRING
}, {
  timestamps: false
})

var Vote = connection.define('Vote', {
}, {
  timestamps: false
})

Poll.hasMany(PollOption)
PollOption.belongsTo(Poll)
PollOption.hasMany(Vote)
Vote.belongsTo(PollOption)

function createTestPoll() {
  return Poll.create({
    question: 'What is the best Node.js ORM?'
  }).then(function() {
    return PollOption.bulkCreate([
      { PollId: 1, text: 'Sequelize' },
      { PollId: 1, text: 'Knex' }
    ])
  }).then(function() {
    return Vote.create({
      PollOptionId: 1
    })
  })
}

function queryTestPoll() {
  return Poll.findOne({
    where: {
      id: 1
    },
    group: ['PollOptions.Votes.PollOptionId'],
    attributes: [
      'question',
    ],
    include: {
      attributes: [
        'text',
        [Sequelize.fn('count', 'PollOptions.Votes.PollOptionId'), 'votes']
      ],
      model: PollOption,
      include: {
        model: Vote,
        attributes: []
      }
    }
  }).then(function(foundPoll) {
    console.log(JSON.stringify({
      question: foundPoll.dataValues.question,
      options: foundPoll.dataValues.PollOptions.map(function(option) {
        return option.dataValues
      })
    }, null, 2))
  })
}

connection.sync({ force: true })
  .then(createTestPoll)
  .then(queryTestPoll)

Dialect: mysql
Database version: 15.1
Sequelize version: 3.21.0

@bookercodes bookercodes changed the title Struggling to Relaible Count Related Records Struggling to Reliably Count Related Records Apr 21, 2016
@bookercodes bookercodes changed the title Struggling to Reliably Count Related Records Struggling to Reliably Count Associated Records Apr 21, 2016
@mickhansen
Copy link
Contributor

You're counting and grouping on the same column, that would generally be an error.
I'm guessing you probably want to count on votes.id.

It would be great to have runnable/shareable snippets, doesn't look like runnable.com supports postgres though.

@bookercodes
Copy link
Contributor Author

You're counting and grouping on the same column, that would generally be an error.

I don't know 😢

I think one of the reasons I'm confused is because I wrote a query that sortof works:

USE sequelize_question_schema;

SELECT
    `Poll`.`question` AS Question,
    `PollOption`.`id` AS `PollOptionId`,
    `PollOption`.`text` AS `PollOptionText`,
    COUNT(`Vote`.`PollOptionId`) AS `Votes`
FROM `Polls` AS `Poll`
JOIN `PollOptions` AS `PollOption`
    ON `PollOption`.`PollId` = `Poll`.`id`
LEFT OUTER JOIN `Votes` AS `Vote`
    ON `Vote`.`PollOptionId` = `PollOption`.`id`
WHERE `Poll`.`id` = 1
GROUP BY `Vote`.`PollOptionId`

I say sortof because it works when there's at least one Vote in the table:

But when there are no Votes it fails:

I really appreciate your input, @mickhansen. You are probably correct but I'm confused because the above query counts and groups by the same column and works... sortof.

It would be great to have runnable/shareable snippets, doesn't look like runnable.com supports postgres though.

Yeah, that would be ideal. In the mean time, I like to post a Short, Self Contained, Correct (Compilable), Example (SSCCE). It should be possible to download and run mine in Node as-is.

What do you think?

Thanks again!

@bookercodes
Copy link
Contributor Author

By the way, I tried to count PollOptions.Votes.id:

But I get an error:

Note: I tried counting PollOptions.Votes.id, PollOption.Votes.id, PollOptions.Vote.id, PollOption.Vote.id Votes.id, Vote.id, etc. and I get the same error 😕

@mickhansen
Copy link
Contributor

I'm working on a small sscce-template we can put in the project and run with docker.
I'll run your SSCCE after and see if i can find a quick fix.

@bookercodes
Copy link
Contributor Author

I really appreciate that, thank you!

@mickhansen
Copy link
Contributor

MySQL is acting weird in this, i haven't tried to count joined rows in mysql in a while (primarily use postgres) and i don't have anymore time to debug this right now.

You might want to see if you can make it work with a raw query first and then i can help you convert that to a sequelize call.

@bookercodes
Copy link
Contributor Author

@mickhansen I totally appreciate that. I'm working on the query now. Thanks.

@bookercodes
Copy link
Contributor Author

Woohoo 🎉

In response to you, @mickhansen, I got the query working:

USE sequelize_question_schema;

SELECT
    `Poll`.`question` AS Question,
    `PollOption`.`id` AS `PollOptionId`,
    `PollOption`.`text` AS `PollOptionText`,
    COUNT(`Vote`.`id`) AS `Votes`
FROM `Polls` AS `Poll`
LEFT JOIN `PollOptions` AS `PollOption`
    ON `PollOption`.`PollId` = `Poll`.`id`
LEFT JOIN `Votes` AS `Vote`
    ON `Vote`.`PollOptionId` = `PollOption`.`id`
WHERE `Poll`.`id` = 1
GROUP BY `PollOption`.`id`

I wanted to have a stab at this myself before asking for help and voila:

function queryTestPoll() {
  return Poll.findOne({
    where: {
      id: 1
    },
    group: ['PollOptions.id'],
    attributes: [
      'question',
    ],
    include: {
      attributes: [
        'text',
        [connection.fn('COUNT', connection.col('PollOptions.Votes.id')), 'votes']
      ],
      model: PollOption,
      include: {
        model: Vote
      }
    }
  }).then(function(foundPoll) {
    console.log(JSON.stringify({
      question: foundPoll.dataValues.question,
      options: foundPoll.dataValues.PollOptions.map(function(option) {
        return option.dataValues
      })
    }, null, 2))
  })
}
{
  "question": "What is the best Node.js ORM?",
  "options": [
    {
      "text": "Sequelize",
      "votes": 1,
      "Votes": [
        {
          "id": 1,
          "PollOptionId": 1
        }
      ]
    },
    {
      "text": "Knex",
      "votes": 0,
      "Votes": []
    }
  ]
}

Really it came down to grouping and counting by the right columns so thanks a lot @mickhansen ❤️.

Just in case someone else is struggling with something similar in the future, here are some resources I used:

@mickhansen
Copy link
Contributor

Great :) I tried a similar group but still got weird results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants