Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Complex Structures #28

Closed
andrewshell opened this issue Oct 7, 2016 · 19 comments
Closed

Optimize Complex Structures #28

andrewshell opened this issue Oct 7, 2016 · 19 comments

Comments

@andrewshell
Copy link
Contributor

I'm working on a site that has a very complex interrelated database structure.

https://github.com/andrewshell/pen-paper-2/tree/0.1.0

Here is the most complex query in my application:
CreatorsAtlasRepository::getCreatorById

The complexity comes in the creator join tables. They can't really be manyToMany because there are multiple relationships off of them.

Creator─RpgBookCreator─┬─RpgBook─┬─Publisher
                       │         │
                       │         └─GameLine
                       │
                       └─Credit

In this particular query, Publisher is a sub-table under 4 different tables. GameLine is also under 4 different tables.

In total, there are 20 different tables that will need to be queried in order to assemble this structure.

I installed DebugBar and on a page like Creator Monte Cook (username pen-paper & password hoopla), it shows that there were 484 queries. Many of those are selecting single rows from publishers or game_lines.

I understand that this is an extreme example of a query with Atlas and completely understand if your response is "This is a weird edge case that I'm not supporting".

But since Paul is the author of "Solving The N+1 Problem In PHP" I thought he'd want to know. ;-)

I would also like to note that even with all those queries the page (with query cache disabled) renders in about 1.6 seconds. 👍

Most likely these pages wouldn't change very often and would be cached so it's not an emergency or anything.

@pmjones
Copy link
Contributor

pmjones commented Oct 7, 2016

Here is the most complex query in my application:
CreatorsAtlasRepository::getCreatorById

I don't know whether to be impressed or horrified. ;-)

Counting up the number of "relationships" being asked for, I would guess there should be only about 40 queries total, not 484. Any chance of sending along a query log so I can see what's going on?

@harikt
Copy link
Contributor

harikt commented Oct 7, 2016

@pmjones Did you missed to login to http://pen-paper.geekity.com/creator/2/ . It has the debug bar which shows the sql queries. Or are you looking for the real sql query log?

@pmjones
Copy link
Contributor

pmjones commented Oct 7, 2016

@harikt I did miss it, but looking at it now, it might still be easier to look at the real log (if possible) -- copying the text out of the debug bar is not as readable as I'd like.

@andrewshell
Copy link
Contributor Author

Is this what you're looking for?

https://dl.dropboxusercontent.com/u/2696023/creator.log

If you need something different let me know. :-)

@pmjones
Copy link
Contributor

pmjones commented Oct 7, 2016

Yes thanks! I might go so far as to install pen-and-paper and try to work through what Atlas is doing myself. Thanks for bringing this to my attention. (Also, I'm happy that Atlas actually works in that very complex scenario.)

@harikt
Copy link
Contributor

harikt commented Oct 7, 2016

@andrewshell my head is whirling.. So it looks first probably need to understand the DB better. There is some where something got missing. The publisher and game liner etc is the one repeating.

@pmjones
Copy link
Contributor

pmjones commented Oct 7, 2016

Yes, @harikt is on the right track here. If I comment out everything but rpg_book_creators, I get 194 queries. If I then comment out the innermost portion of that ...

                'rpg_book_creators' => function ($books) {
                    $books->with([
                        'rpg_book' => function ($book) {
                            // $book->with([
                            //     'publisher',
                            //     'game_line',
                            // ]);
                        },
                        'credit',
                    ]);
                },

... then I get only 4 queries, which is just what I'd expect: the master, plus 3 relateds (rpg_book_creators, rpg_book, credit).

So there's something going on with publisher and game_line.

@andrewshell
Copy link
Contributor Author

Reading through the query log this is what's happening:

First, it queries for the single Creator which only has 1 creator_id (creator_id = 2).
It would then fetch a recordSet for all RpgBookCreators with creator_id = 2;
Then it fetches a recordSet of all RpgBooks from the RpgBookCreators recordSet.
Then it starts going through each RpgBook querying each Publisher and GameLine individually.
Then it fetches a recordSet of all Credits from the RpgBookCreators recordSet.

@andrewshell
Copy link
Contributor Author

I think it's how the relationships work:

Creator 1:* RpgBookCreator *:1 RpgBook *:1 Publisher

Atlas can look up all of the RpgBookCreators which is a oneToMany from Creator.
It can then look up all the RpgBooks which is a manyToOne from RpgBookCreators.
But it gets stuck with Publishers which are a manyToOne from a RpgBook.

There is a single Publisher for a single RpgBook and there is only a single RpgBook for each RpgBookCreator.

@pmjones
Copy link
Contributor

pmjones commented Oct 7, 2016

I wonder if it's because the N+1 mitigation is working only on the layer "just below" the current Record or RecordSet. Something to do with RpgBook having only one Record, so it queries for the one Publisher, but then it goes on to the next RpgBook for RpgBookCreator, and issues a query for that RpgBook. I'll delve a bit more deeply.

@pmjones
Copy link
Contributor

pmjones commented Oct 7, 2016

Yes, that does appear to be what's going on. You get back one record for the first ManyToOne, as you should. But the subsequent ManyToOne is now operating on a single record, not a record set, so it issues a single query to get the subsequent one record.

I'm not sure how to mitigate that; it would require looking at results that have not been collected yet, from an arbitrarily distant subsequent query, and then collating the results into that arbitrarily distant set of objects.

I suspect we'd see the same result for 1:1 having a subsequent 1:1, or *:1 having 1:1, or 1:1 having *:1.

@andrewshell
Copy link
Contributor Author

I didn't think it would be an easy fix. I figured the current performance was already good enough and if I really needed an improvement I could either add a caching layer or do the stitching myself. If we can figure out a solution it would be a killer feature. :-)

@pmjones
Copy link
Contributor

pmjones commented Oct 7, 2016

Good enough. :-)

@pmjones
Copy link
Contributor

pmjones commented Oct 9, 2016

It's a more generalized problem. MapperSelect::fetchRecordsArray() is the culprit, and I will apply a fix. I do not expect an API break from it, at least not at the "outer" layers. Thanks @andrewshell for the report.

@andrewshell
Copy link
Contributor Author

Awesome!

@pmjones
Copy link
Contributor

pmjones commented Oct 10, 2016

@andrewshell Do me a favor: try out the "n1records" branch in your project, and let me know if my new fixes solve the N+1 problem for you.

@andrewshell
Copy link
Contributor Author

@pmjones Looks like it! No breakage and went from 484 queries to 32. Looks like a win!

@pmjones
Copy link
Contributor

pmjones commented Oct 10, 2016

SWEET. I'll merge back to the main branch later on. Thanks for the good reporting!

@pmjones
Copy link
Contributor

pmjones commented Oct 11, 2016

All merged to 0.x -- switch back to that at your convenience. Thanks again!

@pmjones pmjones closed this as completed Oct 11, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants