Scan for empty book series more efficiently #2103

Merged
merged 4 commits into from Sep 17, 2023


selfhost-alt
Contributor

@selfhost-alt selfhost-alt commented Sep 15, 2023

I've been trying to figure out why my server was starting so slowly after the SQLite migration, and I found that the scan for empty book series is where it spends the most time during startup.

I believe the slowness comes from the correlated sub-query in the WHERE clause, which is evaluated once per series row and gets pretty inefficient when you have a large series table (as I do).

This is the original query that gets run before my change:

SELECT `id`, `name`, `nameIgnorePrefix`, `description`, `createdAt`, `updatedAt`, `libraryId`
FROM `series` AS `series`
WHERE (SELECT count(*) FROM bookSeries bs WHERE bs.seriesId = series.id) = 0;

On my dev container, against a snapshot of my DB (10k series entries, 36k bookSeries entries), this takes just under 10 minutes to complete.

With this PR, the query changes to the following:

SELECT
  `series`.`id`,
  `series`.`name`,
  `series`.`nameIgnorePrefix`,
  `series`.`description`,
  `series`.`createdAt`,
  `series`.`updatedAt`,
  `series`.`libraryId`,
  `bookSeries`.`id` AS `bookSeries.id`,
  `bookSeries`.`sequence` AS `bookSeries.sequence`,
  `bookSeries`.`createdAt` AS `bookSeries.createdAt`,
  `bookSeries`.`bookId` AS `bookSeries.bookId`,
  `bookSeries`.`seriesId` AS `bookSeries.seriesId`
FROM `series` AS `series`
LEFT OUTER JOIN `bookSeries` AS `bookSeries`
ON `series`.`id` = `bookSeries`.`seriesId`
WHERE `bookSeries`.`id` IS NULL;

On my dev container against the same DB snapshot, this query takes about 300ms.
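As a rough sanity check that the two query shapes select the same rows, here is a minimal sketch using Python's built-in `sqlite3` module. The two-table schema and the sample rows are simplified assumptions for illustration, not the project's real schema, and only the `id` column is selected to keep the output short.

```python
import sqlite3

# Hypothetical, simplified schema: only the columns the two queries touch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE series (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE bookSeries (id TEXT PRIMARY KEY, bookId TEXT, seriesId TEXT);
    INSERT INTO series VALUES
        ('s1', 'Has books'), ('s2', 'Empty'), ('s3', 'Also empty');
    INSERT INTO bookSeries VALUES
        ('bs1', 'b1', 's1'), ('bs2', 'b2', 's1');
""")

# Original shape: correlated count(*) sub-query in the WHERE clause.
subquery_sql = """
    SELECT id FROM series
    WHERE (SELECT count(*) FROM bookSeries bs WHERE bs.seriesId = series.id) = 0
    ORDER BY id
"""

# New shape: LEFT OUTER JOIN, keeping only series rows with no match.
join_sql = """
    SELECT series.id FROM series
    LEFT OUTER JOIN bookSeries ON series.id = bookSeries.seriesId
    WHERE bookSeries.id IS NULL
    ORDER BY series.id
"""

empty_via_subquery = [row[0] for row in conn.execute(subquery_sql)]
empty_via_join = [row[0] for row in conn.execute(join_sql)]

print(empty_via_subquery)  # ['s2', 's3']
print(empty_via_join)      # ['s2', 's3']
```

This only shows that the results agree on toy data; it says nothing about the timings quoted above, which depend on the real table sizes.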

@selfhost-alt
Contributor Author

btw there are other similar slow queries that run at startup, like the one that checks for books with no media, but the empty series scan was the most painfully slow one for me, so I started with this.

@selfhost-alt selfhost-alt marked this pull request as draft September 15, 2023 20:20
@selfhost-alt
Contributor Author

Actually, doing a COUNT doesn't quite do what we want. I will take another pass at this and see if I can come up with the right query.

@selfhost-alt selfhost-alt marked this pull request as ready for review September 15, 2023 20:35
@advplyr
Owner

advplyr commented Sep 17, 2023

Thanks, I went and fixed the other 2 happening in that function. There are many more SQL queries to optimize as well, unfortunately more complicated than those.

@advplyr advplyr merged commit 4e01722 into advplyr:master Sep 17, 2023
1 check passed
@selfhost-alt
Contributor Author

selfhost-alt commented Sep 17, 2023 via email

@advplyr
Owner

advplyr commented Sep 17, 2023

Your help with that would be great. I know that one of the main issues with the home page & library queries is that they are not using the proper indexes. I've been using the EXPLAIN QUERY PLAN command to check the indexes.
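For anyone who wants to try the same check outside the SQLite shell, here is a small sketch that runs EXPLAIN QUERY PLAN from Python's `sqlite3` module against the two empty-series queries from this PR. The schema is a stripped-down assumption and `query_plan` is a hypothetical helper name. The exact wording of the plan text varies between SQLite versions, so no expected output is shown.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # Each row has four columns; the last one is the description of the step.
    return [row[3] for row in rows]


# Hypothetical, simplified schema (the real tables have more columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE series (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE bookSeries (id TEXT PRIMARY KEY, bookId TEXT, seriesId TEXT);
""")

subquery_sql = """
    SELECT id FROM series
    WHERE (SELECT count(*) FROM bookSeries bs WHERE bs.seriesId = series.id) = 0
"""
join_sql = """
    SELECT series.id FROM series
    LEFT OUTER JOIN bookSeries ON series.id = bookSeries.seriesId
    WHERE bookSeries.id IS NULL
"""

subquery_plan = query_plan(conn, subquery_sql)
join_plan = query_plan(conn, join_sql)

for label, plan in (("sub-query", subquery_plan), ("left join", join_plan)):
    print(label)
    for detail in plan:
        print("   ", detail)
```

In the output, steps that scan a whole table are the ones worth a closer look, while steps that search using an index indicate the planner found one it could use.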

@BlackHoleFox

@selfhost-alt You seem to be running into the same thing I am on the library home page if it's that slow. Wrote about it here since it seemed most relevant, but I suspect something is wrong with how cached images are getting served.

@thomcc

thomcc commented Sep 18, 2023

I wonder if this could also have been fixed by just adding an index on seriesId. I suspect so, but it's hard to say, and it's plausible that this change is faster than that regardless.
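One way to probe the index idea is to compare the query plan of the original sub-query before and after creating an index. This is only a sketch: the schema is simplified, the index name `idx_bookSeries_seriesId` is made up for illustration, and it is an assumption that the real database had no such index already. It shows whether the planner picks the index up, not whether that would match the join's timing on a real library.

```python
import sqlite3

# Hypothetical, simplified schema (the real tables have more columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE series (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE bookSeries (id TEXT PRIMARY KEY, bookId TEXT, seriesId TEXT);
""")

# The original correlated sub-query from this PR, reduced to the id column.
empty_series_sql = """
    SELECT id FROM series
    WHERE (SELECT count(*) FROM bookSeries bs WHERE bs.seriesId = series.id) = 0
"""


def plan_details(sql):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN step."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = plan_details(empty_series_sql)

# Illustrative index name; not taken from the project.
conn.execute("CREATE INDEX idx_bookSeries_seriesId ON bookSeries (seriesId)")

plan_after = plan_details(empty_series_sql)
index_used = any("idx_bookSeries_seriesId" in d for d in plan_after)

print("before:", plan_before)
print("after: ", plan_after)
print("index used by sub-query:", index_used)
```

If the planner does use the index, each per-series count becomes an index lookup instead of a pass over bookSeries, which is the effect the comment above is speculating about.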

> I've been using the EXPLAIN QUERY PLAN command to check the indexes

I used to work on Firefox, and the expert extension https://www.sqlite.org/src/dir?ci=trunk&name=ext/expert was a great help for optimizing the history/bookmarks (places) database.

I believe the extension is now integrated in the SQLite CLI via the .expert command, but I haven't had a lot of experience using it that way. Given that compiling the tool I linked was always kind of a pain, I'd recommend investigating this.
