[Bug Report] Long home page loading time for large libraries #81

Closed
hkalexling opened this issue Jun 23, 2020 · 7 comments
Labels
bug Something isn't working

Comments

@hkalexling
Member

As noted by @cglatot in #79 (comment), the home page takes a long time to load when the library has over 1,000 entries.

With the newly introduced home page in v0.6.0, Mango has to check the reading progress of all entries in the library to display the "continue reading" and "recently added" sections, and so the page takes a long time to load if you have a lot of entries. Perhaps we could improve this by caching the two sections and updating them with a regular background task (similar to the library scan). @jaredlt It would be great if we could have your opinion on this.

@cglatot

cglatot commented Jun 23, 2020

If the caching is server-side that could work, but if it was a local device cache then that kinda prevents switching reading across devices. Unless you are going to scan very often, which is then a waste of resources.

Are you currently scanning every single file when the Home page loads?
Is it possible to change / add to the data model to store reading progress entries elsewhere? That way, rather than having to scan the entire library, you can just load the reading progression entries.
Sorry for not having a more solid solution to this, I don't have any Crystal knowledge and I haven't yet had time to learn and dig into the code, so not sure how you are managing the data model and if this solution would fit into that.

@jaredlt
Collaborator

jaredlt commented Jun 24, 2020

For Recently Added I think a background task and cache is the right approach. In fact, I think it should be bundled in with the library scan (what else is the method call, but discovering new library entries?). So as part of the library scan it would call that method and save the result in the server cache. When a user refreshes the page there is no 'stale' cache, because a new Recently Added entry cannot exist until the next library scan has taken place. For the same reason, further refreshes of Home shouldn't re-run the method, since no new library scan has taken place yet. In hindsight this is actually a much better approach overall :)
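
A minimal sketch of the scan-bound cache described above, written in Python rather than Mango's Crystal. The `Entry` and `Library` classes, the `date_added` field, and the `limit` parameter are all hypothetical names used for illustration, not Mango's actual API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    title: str
    name: str
    date_added: float  # assumed: Unix timestamp recorded when a scan first saw the entry


@dataclass
class Library:
    entries: List[Entry] = field(default_factory=list)
    # Server-side cache of the "recently added" section.
    recently_added: List[Entry] = field(default_factory=list)

    def scan(self, discovered: List[Entry], limit: int = 10) -> None:
        """Replace the entry list and rebuild the cache in the same pass."""
        self.entries = list(discovered)
        self.recently_added = sorted(
            self.entries, key=lambda e: e.date_added, reverse=True
        )[:limit]

    def home_recently_added(self) -> List[Entry]:
        # No recomputation here: the result can only change after another scan.
        return self.recently_added
```

Because the cache is rebuilt only inside `scan`, a page refresh costs a list lookup, and the cached section cannot go stale between scans.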

Regarding Continue Reading this is a little trickier as users probably expect this to always be up to date. Ie. you finish reading a chapter, visit home, you would expect to immediately see the next chapter. If you see the chapter you have just finished it would be confusing. So stale cache becomes an issue here. Options:

  • Implement just the Recently Added update and see if that is the main bottleneck (ie. do nothing different for Continue Reading). I suspect we may still run into issues as users create bigger and bigger libraries
  • Implement a per user, server side cache but also bust the cache on certain event triggers ie. completing an entry (not sure how much this complicates things). Also, depending on the time it still takes for this task to complete users will still see a stale cache on occasion (not ideal).
  • Optimize the code
    • we could review bottlenecks and try to optimize by eg. only opening each json file once and reading all the data for all entries into memory
    • we could try to be more intelligent so rather than iterating through each entry in reverse, we could first grab from the cache, and then only check if that cached entry is still correct, otherwise try the next entry (this would primarily help for titles with a large number of entries)
    • these optimisations are just ideas and we'd have to check where the true bottlenecks are first
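
The second option (a per-user, server-side cache that is busted on events) could look roughly like the Python sketch below. `ContinueReadingCache`, `compute`, and `bust` are hypothetical names; where the "entry completed" event would actually hook in is an assumption left to the caller.

```python
from typing import Callable, Dict, List


class ContinueReadingCache:
    """Per-user cache of the Continue Reading section (illustrative only)."""

    def __init__(self, compute: Callable[[str], List[str]]) -> None:
        self._compute = compute  # the expensive per-user lookup
        self._cache: Dict[str, List[str]] = {}

    def get(self, username: str) -> List[str]:
        # Compute on first access, then serve from memory until busted.
        if username not in self._cache:
            self._cache[username] = self._compute(username)
        return self._cache[username]

    def bust(self, username: str) -> None:
        # Call this from an event trigger, e.g. when the user completes an entry.
        self._cache.pop(username, None)
```

Busting only the affected user's key keeps other users' cached sections intact; the next `get` for that user pays the full recomputation cost once.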

At some point down the line, I think the final solution will be to store all this in the database. Then we could take advantage of indexes etc. But until then some of the options listed above should hopefully help.

@hkalexling
Member Author

@cglatot Yes, I plan to do server-side caching. Currently, the file metadata (e.g., date added, reading progress, last read time, etc.) is stored in a file named info.json inside the corresponding title directory. For example, ~/mango/library/One-Punch Man/info.json stores the metadata of all archive files in ~/mango/library/One-Punch Man/. The reasons why we use JSON files and not a proper DB to store the metadata can be found in #37 (comment).

@jaredlt Good points! Yes, it's reasonable to do it as part of the periodic library scan, but we have to be careful not to make the scan time even longer (#79). I haven't got the time to do a real benchmark, but browsing the source code I suspect the bottleneck is the IO. Each info.json file has to be opened/closed many times so I agree bulk-loading the metadata of all entries in the title is probably the way to go. I will try it out later this week. Appreciate your inputs!
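
To illustrate the bulk-loading idea, here is a small Python sketch that opens each info.json once and answers every per-entry lookup from memory. The JSON layout used here (`{"progress": {username: {filename: page}}}`) and the function names are assumptions for the example; the real info.json schema is not shown in this thread.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_title_info(title_dir: Path) -> dict:
    """Read info.json once; return {} if the title has no metadata yet."""
    info_path = title_dir / "info.json"
    if not info_path.exists():
        return {}
    with info_path.open(encoding="utf-8") as f:
        return json.load(f)


def progress_for_title(
    title_dir: Path, username: str, entry_names: List[str]
) -> Dict[str, int]:
    """Return the saved page for every entry using a single file read."""
    info = load_title_info(title_dir)  # one open/close per title, not per entry
    user_progress = info.get("progress", {}).get(username, {})
    return {name: user_progress.get(name, 0) for name in entry_names}
```

With this shape the number of file opens grows with the number of titles rather than the number of entries, which is where the IO savings would come from.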

@cglatot

cglatot commented Jun 25, 2020

@hkalexling - if the Home page optimizations still do not improve the load time enough, would it not be possible to have a json file strictly for reading progress? That way when loading Home you are only reading a single file for progress instead of having to open every series' JSON. I guess the issue with that is that every time you update the reading progress you would need to do it in 2 places (unless you move it ALL to a reading-progress.json and use that for everything).

@hkalexling
Member Author

@cglatot As mentioned in the original comment #37 (comment), we don't use a DB or a central JSON file because it's difficult to point to a specific file. Let's say you store reading progress in a progress.json file like this:

{
    ...
    "/home/user/mango/library/One-Punch Man/Ch.126.cbz": "100%",
    "/home/user/mango/library/One-Punch Man/Ch.127.cbz": "90%",
    ...
}

If the One-Punch Man/ folder is renamed or moved, the progress of every entry in the folder will be lost.
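
The difference can be shown with a short Python sketch. It assumes, for illustration only, that the per-title info.json keys progress by file name; the helper names and the renamed folder `OPM` are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def lookup_central(progress: Dict[str, str], entry_path: str) -> Optional[str]:
    # Central file: keyed by absolute path, so any rename changes the key.
    return progress.get(entry_path)


def lookup_per_title(info: Dict[str, str], entry_path: str) -> Optional[str]:
    # Per-title file: lives inside the folder and is keyed by file name only,
    # so it travels with the folder when the folder is renamed or moved.
    return info.get(PurePosixPath(entry_path).name)


central = {"/home/user/mango/library/One-Punch Man/Ch.126.cbz": "100%"}
per_title = {"Ch.126.cbz": "100%"}

# The folder "One-Punch Man" is renamed to "OPM".
renamed = "/home/user/mango/library/OPM/Ch.126.cbz"
print(lookup_central(central, renamed))      # None: progress is lost
print(lookup_per_title(per_title, renamed))  # 100%: progress survives
```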

I've made some changes in 9d76ca8, which greatly improves the home page load time. I tested it with a library containing over 10k entries and the page is loaded within 800ms, so I think the bulk-loading optimization is enough.

@hkalexling
Member Author

Fixed in v0.7.2. Thanks for the bug report!

@cglatot

cglatot commented Jul 1, 2020

Can confirm - Home used to take about 8+ seconds to load, now takes about 800ms! Thanks :D
