New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Many to many relationships problem #104

Closed
LaurentZuijdwijk opened this Issue Oct 25, 2011 · 12 comments

Comments

5 participants
@LaurentZuijdwijk

LaurentZuijdwijk commented Oct 25, 2011

Hi,

I am encountering a problem with many to many relationships. If we have for example the following:

var Media = sequelize.import(__dirname +'/models/media');
var Project = sequelize.import(__dirname +'/models/subChapter');
Project.hasMany(Media);
Media.hasMany(Project);

Then a table called MediaProjects will be created.

When I want to add a Media object to my project, it will read out every Media object, delete them all and then add them back to the database, including the new one. I think that a single insert would begood enough...?

Could you please look into this? This performance penalty is just too great when this happens.

I noticed that this table will have primary keys defined for MediaId and for ProjectId. Why is this?

keep up the good work.

Cheers,
Laurent

@sdepold

This comment has been minimized.

Member

sdepold commented Oct 25, 2011

is it reading every media object or every mediaprohects object?

Am 25.10.2011 um 12:21 schrieb Laurent Zuijdwijk reply@reply.github.com:

Hi,

I am encountering a problem with many to many relationships. If we have for example the following:

var Media = sequelize.import(__dirname +'/models/media');
var Project = sequelize.import(__dirname +'/models/subChapter');
Project.hasMany(Media);
Media.hasMany(Project);

Then a table called MediaProjects will be created.

When I want to add a Media object to my project, it will read out every Media object, delete them all and then add them back to the database, including the new one. I think that a single insert would begood enough...?

Could you please look into this? This performance penalty is just too great when this happens.

I noticed that this table will have primary keys defined for MediaId and for ProjectId. Why is this?

keep up the good work.

Cheers,
Laurent

Reply to this email directly or view it on GitHub:
https://github.com/sdepold/sequelize/issues/104

@LaurentZuijdwijk

This comment has been minimized.

LaurentZuijdwijk commented Oct 25, 2011

every media object. Then deleting every object, and adding them again...

@LaurentZuijdwijk

This comment has been minimized.

LaurentZuijdwijk commented Oct 26, 2011

I wouldn't mind helping ou fix this issue, but I might need a quick overview of the architecture, since I am a bit short on time.

@sdepold

This comment has been minimized.

Member

sdepold commented Oct 26, 2011

i will take a closer look at it soon.

@LaurentZuijdwijk

This comment has been minimized.

LaurentZuijdwijk commented Oct 28, 2011

By the way, the same happens when deleting objects. You can understand that this is a very very slow way to delete or add objects, especially when there will be thousands or millions of rows in a table...

@sdepold

This comment has been minimized.

Member

sdepold commented Oct 28, 2011

yap you are right

Am 28.10.2011 um 16:20 schrieb Laurent Zuijdwijk reply@reply.github.com:

By the way, the same happens when deleting objects. You can understand that this is a very very slow way to delete or add objects, especially when there will be thousands or millions of rows in a table...

Reply to this email directly or view it on GitHub:
https://github.com/sdepold/sequelize/issues/104#issuecomment-2556567

@christopheretcheverry

This comment has been minimized.

christopheretcheverry commented Nov 19, 2012

What is the status of this? I am experiencing the same problem...

@greghart

This comment has been minimized.

greghart commented Apr 18, 2013

Any updates on this? Would be really helpful, in the meantime am having to run custom queries whenever this becomes a performance bottleneck.

@janmeier

This comment has been minimized.

Member

janmeier commented Apr 19, 2013

@greghart do you still experience this issue? As per #204 this should be fixed. I just did a quick test with: Create two media, create a project, associate one medium to the project (addMedium), associate both media to the project (setMedia). Associating one created one query, and so did associating both (since the other association was already created).

If this is still an issue, could you please provide some code to exhibit the error?

@ghost ghost assigned janmeier Apr 19, 2013

@greghart

This comment has been minimized.

greghart commented Apr 19, 2013

Not sure if it's the exact same issue, but similar enough that I thought this was the right place. Let me know if it's a separate issue that I missed in my searches. Incoming code segment (System/Host models taken from other example on here)

var Sequelize = require("sequelize")

var sequelize = new Sequelize('test', 'test', 'test', {
        logging: console.log
        , define: {
        host: 'localhost'
                , underscored: true
                , freezeTableName: true
                , syncOnAssociation: false
                , charset: 'utf8'
                , collate: 'utf8_general_ci'
        }
        , sync: {force: true}
});

var System = sequelize.define("system"
, {
        id: {
                type: Sequelize.INTEGER
                , primaryKey: true
                , autoIncrement: true
        }
        , name: Sequelize.STRING
}
, {
});

var Host = sequelize.define("host"
, {
        id: {
                type: Sequelize.INTEGER
                , primaryKey: true
                , autoIncrement: true
        }
        , name: Sequelize.STRING
}
, {
});

System.hasMany(Host, {joinTableName: 'system_host_manifest'});
Host.hasMany(System, {joinTableName: 'system_host_manifest'});

sequelize.sync({force:true}).complete(function(err, result){
    s = System.build({name: 'system'});
    h = Host.build({name: 'host'});
    h2 = Host.build({name: 'host2'});
    h3 = Host.build({name: 'host3'});

    chainer = new Sequelize.Utils.QueryChainer
    chainer.add(s.save());
    chainer.add(h.save());
    chainer.add(h2.save());
    chainer.add(h3.save());

    chainer.run().complete( function(err, result) {
        // Extraneous inserts (one per association, just do an insert VALUES?)
        s.setHosts([h, h2, h3]).complete(function(err, result){
            // Now setting where there already are associations results in extraneous SELECTs and DELETEs 
            s.setHosts([h]).complete(function(err, result){ console.log("Done"); });
        });
    });
});

On the initial setHosts, sequelize performs a SELECT (presumably to get a diff, but is there a reason it needs to get the whole model?), and then 3 separate INSERTs. Wouldn't ideal situation be one INSERT with 3 separate VALUES?

SELECT `host`.* FROM `host`, `system_host_manifest` WHERE `system_host_manifest`.`system_id`=1 AND `system_host_manifest`.`host_id`=`host`.`id`;
Executing: INSERT INTO `system_host_manifest` (`host_id`,`system_id`,`created_at`,`updated_at`) VALUES (1,1,'2013-04-19 18:57:45','2013-04-19 18:57:45');
Executing: INSERT INTO `system_host_manifest` (`host_id`,`system_id`,`created_at`,`updated_at`) VALUES (2,1,'2013-04-19 18:57:45','2013-04-19 18:57:45');
Executing: INSERT INTO `system_host_manifest` (`host_id`,`system_id`,`created_at`,`updated_at`) VALUES (3,1,'2013-04-19 18:57:45','2013-04-19 18:57:45');

Then, when there already are associations it performs a SELECT on each entry that it is about to DELETE. Why are the SELECTs performed at all, rather than just one DELETE FROM WHERE system_id=1 AND host_id IN (2,3)?

Executing: SELECT `host`.* FROM `host`, `system_host_manifest` WHERE `system_host_manifest`.`system_id`=1 AND `system_host_manifest`.`host_id`=`host`.`id`;
Executing: SELECT * FROM `system_host_manifest` WHERE `system_host_manifest`.`system_id`=1 AND `system_host_manifest`.`host_id`=2 LIMIT 1;
Executing: SELECT * FROM `system_host_manifest` WHERE `system_host_manifest`.`system_id`=1 AND `system_host_manifest`.`host_id`=3 LIMIT 1;
Executing: DELETE FROM `system_host_manifest` WHERE `host_id`=2 AND `system_id`=1 LIMIT 1
Executing: DELETE FROM `system_host_manifest` WHERE `host_id`=3 AND `system_id`=1 LIMIT 1

Thanks for your help!

@janmeier

This comment has been minimized.

Member

janmeier commented Jun 9, 2013

You are totally correct @greghart, doing a single query is way more clever! Thanks to our bulk create / update functionality I have now made a PR for it :)

@janmeier

This comment has been minimized.

Member

janmeier commented Jun 11, 2013

Closed with the merging of #690

@janmeier janmeier closed this Jun 11, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment