Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #47: avoid query where possible in Garner::Mixins::Mongoid::Identity. #48

Closed

Conversation

dblock
Copy link
Contributor

@dblock dblock commented Jul 11, 2013

Try to simplify the conditions depending on how many items we have in mongoid_identity_fields and what they are.

@fancyremarker
Copy link
Contributor

Thanks a ton for catching and fixing this. It was a serious performance problem on larger collections, and great to have it fixed.

I'm inclined to suggest that we stick with just the one-line solution for now, as it provides 99.9...% of the performance improvement, and I'm irrationally cautious against introducing more code changes that could have unanticipated effects that we haven't thought of. Here are the explain() results for the original code, the code that eliminates the sort but keeps the $or, and the code that eliminates both:

# Case 1: Super Slow
> db.artworks.find({"$or":[{_id:"4d8b93ba4eb68a1b2c001c5b"},{_slugs:"51ae0d698b3b81a30200007d"}]}).sort({ _id: 1 }).limit(1).explain()
{
    "cursor" : "BtreeCursor _id_",
    "nscanned" : 85531,
    "nscannedObjects" : 85531,
    "n" : 1,
    "millis" : 206,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "_id" : [
            [
                {
                    "$minElement" : 1
                },
                {
                    "$maxElement" : 1
                }
            ]
        ]
    }
}

# Case Two: Super Fast
> db.artworks.find({"$or":[{_id:"4d8b93ba4eb68a1b2c001c5b"},{_slugs:"51ae0d698b3b81a30200007d"}]}).limit(1).explain()
{
    "clauses" : [
        {
            "cursor" : "BtreeCursor _id_",
            "nscanned" : 0,
            "nscannedObjects" : 0,
            "n" : 0,
            "millis" : 0,
            "nYields" : 0,
            "nChunkSkips" : 0,
            "isMultiKey" : false,
            "indexOnly" : false,
            "indexBounds" : {
                "_id" : [
                    [
                        "4d8b93ba4eb68a1b2c001c5b",
                        "4d8b93ba4eb68a1b2c001c5b"
                    ]
                ]
            }
        },
        {
            "cursor" : "BtreeCursor _slugs_1",
            "nscanned" : 1,
            "nscannedObjects" : 1,
            "n" : 1,
            "millis" : 0,
            "nYields" : 0,
            "nChunkSkips" : 0,
            "isMultiKey" : true,
            "indexOnly" : false,
            "indexBounds" : {
                "_slugs" : [
                    [
                        "51ae0d698b3b81a30200007d",
                        "51ae0d698b3b81a30200007d"
                    ]
                ]
            }
        }
    ],
    "nscanned" : 1,
    "nscannedObjects" : 1,
    "n" : 1,
    "millis" : 0
}

// Case Three: Slightly Super Faster
> db.artworks.find({_slugs:"51ae0d698b3b81a30200007d"}).limit(1).explain()
{
    "cursor" : "BtreeCursor _slugs_1",
    "nscanned" : 1,
    "nscannedObjects" : 1,
    "n" : 1,
    "millis" : 0,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : true,
    "indexOnly" : false,
    "indexBounds" : {
        "_slugs" : [
            [
                "51ae0d698b3b81a30200007d",
                "51ae0d698b3b81a30200007d"
            ]
        ]
    }
}

@dblock
Copy link
Contributor Author

dblock commented Jul 11, 2013

Two index lookups are exactly twice as slow. I don't really see why we would want to do it, but I also see how this looks a big hacky. I'll do what you say and keep the fix here as a monkey patch in our project.

@fancyremarker
Copy link
Contributor

Technically, thanks to the limit(1), it's only slower when the first condition (either :_id or :_slugs, whichever is first in Garner.config.mongoid_identity_fields) is not matched. When the first is a hit, it's exactly as fast:

> db.artworks.find({"$or":[{_slugs:"51ae0d698b3b81a30200007d"},{_id:"4d8b93ba4eb68a1b2c001c5b"}]}).limit(1).explain()
{
    "cursor" : "BtreeCursor _slugs_1",
    "nscanned" : 1,
    "nscannedObjects" : 1,
    "n" : 1,
    "millis" : 0,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : true,
    "indexOnly" : false,
    "indexBounds" : {
        "_slugs" : [
            [
                "51ae0d698b3b81a30200007d",
                "51ae0d698b3b81a30200007d"
            ]
        ]
    }
}

Also, it's not actually "twice as slow" because an { nscanned: 0 } query is faster than an { nscanned: 1 } query.

@dblock
Copy link
Contributor Author

dblock commented Jul 11, 2013

Closing for #49.

@dblock dblock closed this Jul 11, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants